Tuesday, June 2, 2026

fsck_hfs Cache Exhaustion Bug

Kıvanç Günalp:

fsck_hfs in macOS Sequoia (version hfs-683.x) has a cache exhaustion bug that reports false corruption on large HFS+ volumes. On machines with 8 GB RAM, volumes of 24 TB or larger trigger “Couldn’t read node” errors during the extended attributes check.

[…]

fsck_hfs pre-allocates a cache at startup — a pool of 32KB blocks used for all disk reads. The size of this pool is determined by available system RAM[…]
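To make the pool arithmetic concrete, here is a minimal sketch of how many 32 KB buffers fit in a cache of a given size. The quoted excerpt does not say how the cache size is derived from RAM, so the size is just an input here, and the function name `blocks_in_pool` and the 512 MiB figure are purely illustrative assumptions, not values from fsck_hfs.

```python
BLOCK_SIZE = 32 * 1024  # 32 KB per buffer, as stated in the post


def blocks_in_pool(cache_bytes):
    """Number of whole 32 KB buffers that fit in a cache of cache_bytes."""
    return cache_bytes // BLOCK_SIZE


# Hypothetical cache size chosen only for illustration.
example_cache_bytes = 512 * 1024 * 1024
example_blocks = blocks_in_pool(example_cache_bytes)
print(example_blocks)
```

Whatever the real cache size is, the point is that the pool is finite: once that many buffers are claimed and none are reclaimed, the next read has nowhere to go.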

[…]

BTCheckUnusedNodes races through tens of thousands of free nodes, and every unique disk offset it touches gets a Tag_t structure allocated via calloc and inserted into the cache’s hash table. Each tag claims one 32KB buffer from the pool. When the release path runs, it returns the tag to the LRU list — but the LRU management doesn’t keep up with the rate of allocations.
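The failure mode described above can be illustrated with a toy model. This is not the fsck_hfs source: the class `ToyCache`, the `evict_on_pressure` flag, and `scan_free_nodes` are hypothetical names, and the model exaggerates "LRU management doesn't keep up" into "released buffers are never reclaimed" to keep the sketch short.

```python
from collections import OrderedDict

BLOCK_SIZE = 32 * 1024  # 32 KB buffers, as described in the post


class ToyCache:
    """Toy model of a fixed buffer pool keyed by disk offset (not fsck_hfs code)."""

    def __init__(self, pool_blocks, evict_on_pressure):
        self.free_buffers = pool_blocks
        self.evict_on_pressure = evict_on_pressure  # hypothetical switch
        self.in_use = {}          # offset -> tag currently held by a caller
        self.lru = OrderedDict()  # offset -> released tag that still owns a buffer

    def read(self, offset):
        """Return True if a buffer could be obtained for this offset."""
        if offset in self.in_use:
            return True
        if offset in self.lru:
            # Cache hit on a released tag: take it back off the LRU list.
            self.in_use[offset] = self.lru.pop(offset)
            return True
        if self.free_buffers == 0 and self.evict_on_pressure and self.lru:
            # Reclaim the least recently released tag's buffer.
            self.lru.popitem(last=False)
            self.free_buffers += 1
        if self.free_buffers == 0:
            return False  # the toy analogue of a failed node read
        self.free_buffers -= 1
        self.in_use[offset] = object()  # stand-in for a newly allocated tag
        return True

    def release(self, offset):
        """Return the tag to the LRU list; its buffer stays claimed."""
        self.lru[offset] = self.in_use.pop(offset)


def scan_free_nodes(cache, node_count):
    """Touch node_count unique offsets; return how many were read successfully."""
    for n in range(node_count):
        offset = n * BLOCK_SIZE
        if not cache.read(offset):
            return n
        cache.release(offset)
    return node_count
```

With a pool of 100 buffers and no reclamation, `scan_free_nodes(ToyCache(100, False), 1000)` stops after 100 nodes even though every buffer has been released, while `ToyCache(100, True)` gets through all 1000. In the model, the read fails because the pool is exhausted, not because anything on disk is wrong.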

[…]

The irony: a function designed to verify filesystem integrity is itself broken — reporting phantom corruption on perfectly valid volumes.

I’m surprised that we keep seeing new HFS+ bugs. I would have thought that code would be frozen by now.



crazytrainmatt

Since updating to Sequoia last year, I have lost two HFS+ volumes. In both cases, they failed after I ran a routine disk first aid repair from a Sequoia host. One became unmountable and had to be restored from a backup; it has been working fine for ~9 months since then. The other can be mounted, but the OS complains about corruption. Neither was as large (18 TB and 4 TB) as the volumes in that Medium post, so perhaps there are further problems.
