Thursday, September 19, 2019

APFS Enumeration Performance on Rotational Hard Drives

Mike Bombich:

My APFS-formatted rotational disks have always felt slower than when they were HFS+ formatted. The speed of copying files to them felt about the same, but slogging through folders in the Finder was taking a lot longer. At first I chalked it up to the filesystem being new; “It just needs some tuning, it will come along.” But that performance hasn’t come along, and after running some tests and collecting a lot more data, I’m convinced that Apple made a fundamental design choice in APFS that makes its performance worse than HFS+ on rotational disks. Performance starts out at a significant deficit to HFS+ (Mac OS Extended) and declines linearly as you add files to the volume.

[…]

After the very first simulation, APFS starts at a deficit — APFS takes three times as long to enumerate a million files on a rotational disk compared to HFS+ enumerating the exact same collection of files on the exact same hardware. This result on its own is staggering. As you add and remove files from the volume, however, the performance continues to decline. After just 20 cycles, APFS enumeration performance is 15-20 times worse than HFS+ performance.

This seems to be because APFS doesn’t keep the filesystem metadata contiguous.
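
If you want a rough feel for this on your own disks, timing a full walk of a volume is enough to see the difference. A quick sketch, assuming Python 3; Bombich’s actual test methodology is more careful than this:

    import os
    import sys
    import time

    def enumerate_tree(root):
        """Walk the whole tree, stat'ing every file, which is roughly what
        the Finder or a backup tool must do when enumerating a volume."""
        count = 0
        for dirpath, dirnames, filenames in os.walk(root):
            for name in filenames:
                try:
                    os.lstat(os.path.join(dirpath, name))
                    count += 1
                except OSError:
                    pass  # permission errors etc.; skip and keep walking
        return count

    if __name__ == "__main__":
        start = time.monotonic()
        n = enumerate_tree(sys.argv[1])
        print(f"enumerated {n} files in {time.monotonic() - start:.1f}s")

Run it against an HFS+ volume and an APFS volume holding the same file set (unmount and remount first, so caches don’t flatter the second run) and compare the wall-clock times.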


(Looks at entry-level 2019 iMac specs) Hmmmmmmmmm, curious.

One anecdotal datum: I have a mix of APFS- and HFS+-formatted rotational drives. The HFS+ drives are always quick to mount, whereas the APFS drives take longer and longer to mount over time. It’s to the point now where it takes 10+ minutes for an APFS drive to mount. Will probably just have to reformat.

Catalina’s restriction of booting to APFS-formatted volumes is going to be a serious pain for affordable bootable backups.

@remmah Wow, I think that explains why only certain of my clones take forever to mount. It’s the APFS ones!

Sören Nils Kuklau

This is definitely a disconnect between software engineering (we’ll make a file system for the future, and magnetic storage just isn’t as much of a concern) and hardware marketing (we need to be able to hit certain price points with the iMac, which means we’re still doing magnetic storage on the low end).

But even if that iMac aspect didn’t exist, it doesn’t really answer what their story is on backups. It just isn’t practical to put backups on flash storage (even leaving aside cost, HDDs also tend to have better longevity). Time Capsule is gone, but Time Machine is still a prominent feature. Are they killing it off altogether in favor of iCloud backups?

(Yikes.)

I wonder if this partially explains why Apple still hasn’t enabled support for Time Machine backups to APFS volumes.

With Apple moving more and more to "services", they probably want to hoover up our backups eventually. A scary thought.

I subscribe to Carbon Copy Cloner's RSS feed precisely for this sort of deep dive into drive technology. Love it! Mike Bombich and team always come correct. I probably sound like a broken record by now, but I love, love, love Carbon Copy Cloner. One of my indispensable "If only I still used Mac OS" tools.

Mike Bombich:

No, let's not throw out the baby with the bath water. APFS has loads of really nice features, like snapshots and volume space sharing. Managing volumes within an APFS container is a dream compared to the older method of preallocating space to specific partitions. It's important to understand why we might expect to see performance differences between the two filesystems and when that might impact your use of the filesystem, but this one performance aspect on its own isn't enough reason to avoid it.
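
The volume-management point is concrete: volumes in an APFS container all draw from the container’s shared free space, so adding one is a single command rather than a repartition. A minimal sketch, assuming Python 3 on macOS; the container reference disk1 is a placeholder for your own:

    import subprocess

    def add_apfs_volume(container, name):
        """Add a volume to an existing APFS container via diskutil.
        Nothing is preallocated; the new volume shares the container's
        free space with its sibling volumes."""
        subprocess.run(
            ["diskutil", "apfs", "addVolume", container, "APFS", name],
            check=True)

    add_apfs_volume("disk1", "Backups")  # "disk1" is hypothetical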

The only thing that puzzles me: if I am using spinning platters for external backup, why would I ever want to use APFS for storage? I understand the underlying system is more robust than creaky old HFS Extended, but the performance hits are hard to deny. Deal breaker, in fact. 10+ minutes to mount a drive, as reported in this very thread (admittedly anecdotally), would not be a feasible solution in my use case.

Maybe Disk Utility could default to formatting external hard drives as HFS Extended? What is the default now?

@bob
I still do not understand why there is no real solution to local network backups for Mac OS devices. I was never a Time Capsule fan, but at least it existed once upon a time. Could an inexpensive Apple ARM-powered device not handle backups? Four USB ports, Gigabit Ethernet, and WiFi for $100ish? Designed around media sharing and backups? Then again, as mentioned, if everyone streams everything from/to the Apple cloud, why would anyone use local resources? So short-sighted, but it is what it is.

Look, I can roll my own solution, and did in fact have a networked Carbon Copy Cloner backup solution for my Macs (now it's Linux and a "CCC influenced" rsync script over SSH for my non-Macs), but wouldn't regular people want fast backup and restore without hard drives dangling off their mobile computers? This saddens me.

@Nathan I think the main benefit is that snapshots on the backup drive let you essentially get multiple backups in one. Much better than CCC’s SafetyNet feature. And, secondly, I don’t know whether any backup apps support this yet, but if you have a lot of cloned files, backing up to HFS+ can take a lot more disk space.
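
(For the curious: clones are exposed through clonefile(2), and macOS’s cp can use it via the -c flag. A minimal sketch, assuming Python 3 on macOS with both paths on the same APFS volume; the file names are hypothetical:)

    import subprocess

    def clone(src, dst):
        """Clone src to dst with `cp -c`, which uses clonefile(2) on APFS.
        The clone shares data blocks with the original until either copy is
        modified, so it initially occupies almost no extra space."""
        subprocess.run(["cp", "-c", src, dst], check=True)

    clone("Library.data", "Library backup.data")  # hypothetical names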

@Michael Tsai
I actually get the cloned-file argument, but CCC used to disable snapshots on backup hard drives. Unless this has changed, it seems like the feature is non-functional on hard drives anyway?

Knowledge Base (ccc5): Leveraging Snapshots on APFS Volumes

APFS and snapshots on rotational HDD devices

APFS performs more poorly than HFS+ on rotational media (i.e., traditional hard disk drives) due to the inherently fragmented nature of the APFS filesystem. This performance difference is particularly noticeable when performing snapshot-related activity. Starting in CCC 5.1.10, CCC will only automatically enable snapshot support on an APFS volume backed by a Solid State Device, and only when CCC can determine that the device is a Solid State Device — that assessment is often not possible on external devices. If you are encountering poor performance on an APFS-formatted HDD device, we recommend that you disable snapshot support on that volume and delete any snapshots that are on that volume. We also recommend that you consider purchasing an SSD for making bootable backups of your startup disk.

By my reading, it seems like users can enable the feature, but CCC suggests not using snapshots on hard drives.
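
(If you are curious what macOS thinks a given disk is, diskutil will report it, though, as the KB notes, the answer is often unavailable behind external USB enclosures. A sketch, assuming Python 3.7+ and that diskutil's plist output carries the SolidState key, which as far as I can tell it does whenever macOS can make the call:)

    import plistlib
    import subprocess

    def is_solid_state(device="disk0"):
        """Ask diskutil for a device's info as a plist and read its
        SolidState flag. Returns None when macOS cannot tell, which is
        common for disks behind USB bridges."""
        out = subprocess.run(["diskutil", "info", "-plist", device],
                             check=True, capture_output=True).stdout
        return plistlib.loads(out).get("SolidState")

    print(is_solid_state("disk2"))  # True, False, or None; disk2 is a placeholder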

Honestly, I am somewhat confused by the APFS changeover, even after reading such thoughtful articles from your site, Howard Oakley, and Mike Bombich. It is likely the same reason I have never attempted to move over to Btrfs or ZFS on my Linux boxes. I am just too basic a user to exchange the simplicity of EXT4 for the much more feature-laden complexity of those two systems. I wonder how many normal Mac users take advantage of the advanced features of APFS? Does much of its feature set bubble up in ways regular users can access?

@Nathan I must have missed that change in CCC. I believe SuperDuper does create snapshots for each backup to APFS. One of the issues is that there’s lots that APFS can do that isn’t really exposed to developers yet. There are some advantages; for instance, importing into an EagleFiler library is much faster and uses less disk space due to cloning. I think the main benefit for regular users is the way snapshots work with local Time Machine and software updates.

APFS also helps to ensure the consistency of the backup because it can copy from a fixed snapshot.

@Michael
Thanks for the info!

“The only thing that puzzles me: if I am using spinning platters for external backup, why would I ever want to use APFS for storage?”

I switched to APFS after SuperDuper recommended it.

Even better, if you're backing up to an APFS volume as we encourage, this works on a bootable copy, too! Since v3.0, we've been taking a snapshot of the backup volume before every copy we make. That means there's not just one available backup on the drive—if you've been using Smart Update, there are many! Start up from your backup drive, click the triangle, and you'll be presented with a list of available snapshots. Pick one, "Copy Now", and you've restored a day ago's backup, or a week ago's.

Sören Nils Kuklau

The only thing that puzzles me: if I am using spinning platters for external backup, why would I ever want to use APFS for storage? I understand the underlying system is more robust than creaky old HFS Extended, but the performance hits are hard to deny.

People used to use tape storage for backups.

When it comes to backups, reliability often trumps performance, and while the issue is real (and I hope Apple is working on mitigating it, and it’s even plausible this is part of why they haven’t officially supported APFS as a backup target yet), so are the sometimes-irreparable reliability issues of HFS+.

I still do not understand why there is no real solution to local network backups for Mac OS devices.

I moved to Synology last year, and it wasn’t really that hard.

Not quite user-friendly, but also nowhere near compile-your-kernel levels of frustration.

Sören Nils Kuklau

I wonder how many normal Mac users take advantage of the advanced features of APFS? Does much of its feature set bubble up in ways regular users can access?

Some of it. (And they only introduced it last year; there might be more to come.)

For example, when an OS update gets installed, a snapshot gets created. This is now such a cheap operation that you don’t notice a performance or disk space hit as much (though Purgeable Storage has been known to create issues of its own…).

I believe (not sure on this) macOS even automatically rolls back to before the snapshot if the update fails. Plus, that snapshot gets shipped as a “backup” to Time Machine, as a more efficient evolution of the Mobile Time Machine feature introduced in 10.7.
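
(You can see these snapshots for yourself; tmutil lists them. A small sketch, assuming Python 3.7+ on macOS 10.13 or later:)

    import subprocess

    def local_snapshots(volume="/"):
        """List the local APFS snapshots on a volume via
        `tmutil listlocalsnapshots`."""
        out = subprocess.run(["tmutil", "listlocalsnapshots", volume],
                             check=True, capture_output=True, text=True).stdout
        # A header line, then names like com.apple.TimeMachine.2019-09-19-120000
        return [l for l in out.splitlines() if l.startswith("com.apple")]

    for name in local_snapshots("/"):
        print(name)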

@Sören I wonder whether the scattering of the metadata across the platter hurts reliability by putting so much more stress on the drive mechanism.

Sören Nils Kuklau

I wonder whether the scattering of the metadata across the platter hurts reliability by putting so much more stress on the drive mechanism.

That’s a good point.

Clearly, they made some strategic mistake/correction here, where they originally designed APFS purely with flash storage in mind.

(Maybe they were more optimistic about moving the iMac to flash-only at the time. Or maybe they didn’t really think it through.)

@Adrian
I often find myself confounded by the concept of snapshots. Are these full versions of the state? Or more like hard-linked versions, similar to how Time Machine creates "snapshots"?

I can do the latter with rsync, so I don't really need APFS for that feature. If the former, how do APFS snapshots not take up extra space with each extra snapshot? I mean, one could always run multiple timestamped full backups each time with other tools, so it must be something else.

Please do not take my comments as argumentative; I am truly deficient in my understanding of this particular technology.

@Sören
See, I totally get the use case for snapshots as "backup for potentially failed/buggy upgrades". Windows has been offering something like that for ages. It makes total sense for Apple to have added that feature as an obvious end user advantage.

I also get your point about tape backups. Something akin to "If I think hard drives with APFS are slow, try recovering from tape sometime!" Yeah, it is a pain. An old job had me dealing with tape backups, and while I can understand the potential benefits, we migrated to redundant local backups with rotating offsite backups, all hard drive based. We could buy bunches of drives for redundancy while still saving money, and backup/recovery was much faster. There was no real advantage of tape for this particular client; thank goodness we could ditch it.

This point here is intriguing:

I moved to Synology last year, and it wasn’t really that hard.

Not quite user-friendly, but also nowhere near compile-your-kernel levels of frustration.

I have done something similar for myself and family, with home servers with attached drives and routers with attached drives, respectively. While I prefer the former, the latter also functions nicely and is much simpler configuration-wise in some ways. Either seems to work well enough, but I do not really expect the average person to do the same, as there was not much hand-holding in getting everything configured. Sure, I like to read the Arch Linux wiki and peruse DD-WRT forums, but these are admittedly stranger hobbies.

Thanks for the thoughts!

Sören Nils Kuklau

I also get your point about tape backups. Something akin to “If I think hard drives with APFS are slow, try recovering from tape sometime!”

To be clear, “think of how bad we used to have it” wasn’t really my point.

Just this: everyone hated the speed of tape backups, but people still used them, because sometimes, reliability trumps everything. Leaving aside that Apple isn’t actually recommending APFS for backups at this point (and downright disallowing it for Time Machine), if they *were*, you might argue that, even though directory enumerations are annoyingly slow at this point, the added reliability is a very useful feature to have for a backup.

Or, more succinctly: I don’t think the choice between HFS+ or APFS for a backup destination is a slam dunk for HFS+.

(But, personally, I use btrfs with Time Machine via Synology. Though if Synology had offered/suggested HFS+ or APFS as an option, I might have picked that instead?)

Sören Nils Kuklau

I often find myself confounded by the concept of snapshots. Are these full versions of the state?

Yes.

If the former, how do APFS snapshots not take up extra space with each extra snapshot?

It does, but it does so in a differential format. It only takes up space for the block-level differences.

Snapshots count towards purgeable space, so as they start filling up the volume, macOS will eventually prune them. (Not all apps handle this gracefully or at least have in the past; I’ve had some pretty bad issues between VMware thinking the disk is full and refusing to boot VMs, and macOS saying it isn’t, from its point of view.)

I mean, one could always run multiple timestamped full backups each time with other tools, so it must be something else.

Right. The “something else” ;-) is the efficient storage format. You effectively get multiple points in time out of them, but only the differences are stored.

You mentioned Time Machine’s hardlinks. That’s similar, but they operate at the file level.

Given two snapshots at 16:00:00 and 17:00:00, and you changing a few bytes in a huge file between those timestamps,

Time Machine would have to make a distinct copy of the entire file
APFS snapshots would only store the changed blocks (a block is typically only a few kilobytes)

That’s why people would be excited for Apple to upgrade Time Machine so that it can take advantage of filesystem-level snapshots rather than its hardlink approach.
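
(To make the file-level approach concrete, here is a toy sketch of the hardlink trick, assuming Python 3. It is the general rsync --link-dest / Time Machine idea, not Apple’s actual implementation:)

    import os
    import shutil

    def hardlink_snapshot(source, prev_snap, new_snap):
        """File-level snapshotting: hard-link files that look unchanged to
        the previous snapshot, copy changed files in full. A huge file with
        one changed byte still costs a whole new copy, which is exactly the
        contrast with block-level snapshots."""
        for dirpath, dirnames, filenames in os.walk(source):
            rel = os.path.relpath(dirpath, source)
            os.makedirs(os.path.join(new_snap, rel), exist_ok=True)
            for name in filenames:
                src = os.path.join(dirpath, name)
                old = os.path.join(prev_snap, rel, name)
                dst = os.path.join(new_snap, rel, name)
                st = os.lstat(src)
                if (os.path.isfile(old)
                        and os.lstat(old).st_size == st.st_size
                        and os.lstat(old).st_mtime == st.st_mtime):
                    os.link(old, dst)       # unchanged: new name, same inode
                else:
                    shutil.copy2(src, dst)  # changed or new: full copy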

I have nothing to add, Sören explained it really well :)

(And honestly, I don't know that much about the details, I just assumed that Dave Nanian at Shirt Pocket knew what he was talking about when he recommended APFS.)

@Soren
I love the info you are providing. Very illuminating.

I do have a small quibble, not with your words, mind you, but with the concept of block level snapshots.

Right. The “something else” ;-) is the efficient storage format. You effectively get multiple points in time out of them, but only the differences are stored.

You mentioned Time Machine’s hardlinks. That’s similar, but they operate at the file level.

Given two snapshots at 16:00:00 and 17:00:00, and you changing a few bytes in a huge file between those timestamps,

Time Machine would have to make a distinct copy of the entire file
APFS snapshots would only store the changed blocks (a block is typically only a few kilobytes)

This makes perfect sense and is a step up from the file-and-hard-link approach, of course. Obviously rsync can copy just the deltas within changed files when working over a network (versus whole-file copying on a local share), but I'm pretty sure the timestamped archives themselves are whole files plus hard links to unchanged files.

Everything seems pretty swell with block-level snapshots; however, I would quibble with the notion that there are "multiple copies" of a file under the snapshot approach, because it seems like there are truly only multiple parts of the same file. If you lost some of those block-level snapshots, you would still have part of the file, but not necessarily the whole thing? This was always my confusion with ZFS, Btrfs, APFS, etc.

Clearly I like the concept, and hopefully, with the other data-integrity improvements made in these filesystems, data should be safer. Yet I find the clarity of the description a bit muddled, because either I am totally not understanding block-level deltas, or the marketing side of these tech projects is being slippery with its language. Not maliciously, just a bit glib. I know my own language in such discussions can be similarly imprecise, as I have a tendency to throw out casual references to details, and as we all know, similar terms are not always interchangeable. What I am getting at is a long-winded apology for possibly needless pedantry and, yes, overly belaboring the point.

Thanks again.
😃😃😃😃

@Nathan When deleting a snapshot, APFS does not delete the blocks that are in use by other snapshots. So your other snapshots will still have the complete file.

However, since an unchanging block is only stored once, bit rot of that block will damage the “copy” of the file on all of the snapshots.

@Nathan: Snapshots implemented within the filesystem are, in almost all cases, an add-on to the filesystem being 'copy-on-write'.

Copy-on-write is when the filesystem only writes to free blocks.

A file is the result of a chain of data blocks that starts with a 'superblock' - the beginning of the data chain for the entire filesystem. The superblock points to directory blocks, and the directory blocks point to either subordinate directories or to file blocks. File blocks are the data blocks holding the file data.

Let’s imagine a FILE chain that is SB->DIR1->DIR2->FILE.BLOCK1->FILE.BLOCK2->FILE.BLOCK3, and BLOCK3 has the text "The End" in it. In older filesystems, editing FILE and changing "The End" to "Not quite The End" would cause the filesystem, when attempting to write, to check whether there was room for the modified text in BLOCK3. If there was, it would overwrite BLOCK3 with the new contents, then update various metadata to reflect the changes if needed. At the end of the write, the file structure chain is still SB->DIR1->DIR2->FILE.BLOCK1->FILE.BLOCK2->FILE.BLOCK3.

In Copy-on-Write, the same edit and save causes the filesystem to allocate a new, empty BLOCK4, then write the contents of BLOCK3 plus your changes to BLOCK4. Now BLOCK3 contains "The End" and BLOCK4 contains "Not quite The End".

The filesystem will update the FILE pointer in the Directory block by allocating a new block, DIR3, and writing the contents of DIR2 plus the updated BLOCK4 pointer to it.

This leaves DIR2 and BLOCK3 on the filesystem, but without anything pointing to them. The advantage of this is that updating the pointer in the directory block is a single atomic operation. If it works, you have your new content saved and a good filesystem. If it *fails* (someone trips over the power cord...), then you still have a good filesystem; your file is structurally sound, but you lost your edit. After your successful write, your file is now SB->DIR1->DIR3->FILE.BLOCK1->FILE.BLOCK2->FILE.BLOCK4. The filesystem is then able to free DIR2 and BLOCK3 at its leisure and return them to the free pool.

Having implemented this, engineers noticed that they could cheaply keep a kind of versioning, which we refer to as snapshots. If we take a copy of the superblock before you edit the file and call it SNAP1, then we have a chain that is exactly the same as SB->DIR1->DIR2->FILE.BLOCK1->FILE.BLOCK2->FILE.BLOCK3 but is something like SB->SNAP1->DIR1->DIR2->FILE.BLOCK1->FILE.BLOCK2->FILE.BLOCK3.

Once you have done your edit as above the regular path takes you to your current edited file : SB->DIR1->DIR3->FILE.BLOCK1->FILE.BLOCK2->FILE.BLOCK4 but if you want to see your file prior to your edit you could reference it via the copied superblock:
SB->SNAP1->DIR1->DIR2->FILE.BLOCK1->FILE.BLOCK2->FILE.BLOCK3

Notice that FILE.BLOCK3 only exists in the snapshot version of your FILE, and FILE.BLOCK4 only exists in the current (edited) version of the file. Ditto for DIR2 and DIR3.

If bitrot happens to FILE.BLOCK3, then only the snapshot version of the file is affected. HOWEVER, and this is why snapshots ARE!NOT!BACKUPS!, if bitrot happens to FILE.BLOCK1 or FILE.BLOCK2, then BOTH versions of the file are negatively impacted, as there is only the ONE copy of those data blocks.
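
If it helps to see it run, here is the same example as a toy Python model. Hugely simplified, of course; real filesystems also track reference counts, checksums, and much more:

    # Toy copy-on-write "filesystem": every write allocates a fresh block,
    # and a snapshot is just a saved root that keeps old blocks reachable.
    blocks = {}    # block id -> contents
    next_id = 0

    def alloc(data):
        global next_id
        next_id += 1
        blocks[next_id] = data
        return next_id

    # FILE is BLOCK1 -> BLOCK2 -> BLOCK3, as above
    live = [alloc(b"Once"), alloc(b"upon a time"), alloc(b"The End")]

    snap1 = list(live)  # snapshot: copy the *pointers*, not the data

    # Copy-on-write edit of the last block: write a NEW block, repoint
    live[2] = alloc(b"Not quite The End")

    assert blocks[snap1[2]] == b"The End"           # snapshot intact
    assert blocks[live[2]] == b"Not quite The End"  # live file edited
    assert snap1[0] == live[0]                      # BLOCK1 is shared

    # And why snapshots ARE!NOT!BACKUPS!: corrupt a shared block...
    blocks[live[0]] = b"bitrot"
    assert blocks[snap1[0]] == b"bitrot"            # ...both versions damaged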

As an aside, IMO programs like SuperDuper and CCC using APFS for their target drives makes sense, because that lets them provide an enhancement over what they delivered with HFS+. Using the old filesystem, they had a single external backup copy of your drive. With APFS, they still provide a single external backup copy of your drive, but now they can offer you some version choice over what you recover.

Hope this manages to make sense :-)

Thank you @Michael and @Liam. (As an aside, wow Liam, what an amazing write-up!) This matches up with what I expected: a nice bit of versioning, but bitrot is still bitrot, and you need more than one backup, of course. Thanks!

I love these discussions, very stimulating and I always learn quite a bit.

Sören Nils Kuklau

bitrot is still bitrot

Yup. Critically, snapshots are not backups in the sense of redundancy (but they are in the sense that you can go back after making a mistake).
