Tuesday, March 3, 2020

APFS Snapshots and Large Files

Howard Oakley:

Snapshots are more efficient than regular backups. If a single byte changes in a file, the whole of that file has to be copied in the next backup. Snapshots keep only the parts of the file that change, so that the original can be reconstructed. But over time and use of that file, the amount of it which has to be retained to restore its original state inevitably rises up to the limit of the whole file size.

[…]

If you have sufficient free disk space to include VM and other large files in backups and snapshots, then you don’t need to change their location or policies.

To maintain better control of backup storage, you should move VMs and other large files to a separate volume, and add the whole volume to the Time Machine exclude list, or disable snapshots on that volume in Carbon Copy Cloner.
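If you prefer to script the Time Machine half of that, here is a minimal sketch using tmutil; the volume name /Volumes/VMs is only a hypothetical example, and adding the volume in the Time Machine preference pane accomplishes the same thing:

```sh
# Add a fixed-path exclusion (the -p form mirrors the preference-pane list).
# The volume name here is made up; substitute wherever the VMs actually live.
sudo tmutil addexclusion -p /Volumes/VMs

# Confirm the exclusion took effect.
tmutil isexcluded /Volumes/VMs
```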

7 Comments

See, as Mac users from the pre-APFS years, we already had this feature, no? I always thought Time Machine handled backups cleverly, when using sparse bundles anyway. Instead of a big disk image chunk with a predetermined size, sparse images introduced the concept of an image that grew as needed, up to the predetermined allocation. Sparse bundles went even further, slicing the image up into 8 MB bands, which meant you could use pretty much any file system for backups.
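For reference, a sketch of how such an image can be created from the command line; the size, filesystem, and band size below are purely illustrative, not a recommendation:

```sh
# Create a growable sparse bundle; sparse-band-size is given in 512-byte
# sectors, so 16384 sectors works out to roughly 8 MB bands.
hdiutil create -type SPARSEBUNDLE -size 500g -fs HFS+J \
    -volname "Backups" -imagekey sparse-band-size=16384 Backups.sparsebundle

# The individual band files live inside the bundle's bands/ directory.
ls Backups.sparsebundle/bands | head
```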

While Time Machine was unnecessarily flaky, I do not think the fault lay with the concept of sparse bundles, nor necessarily with "creaky," "old" HFS Extended. APFS may have brought many features to the Mac, but the concept of backups being smaller slices that write just the changes is certainly not new. Never mind certainly can transfer delta changes…

Part of the reason Time Machine is flaky *is* the creaky, old HFS+. Whenever you get that message "To improve reliability, Time Machine must create a new backup for you" it means that the HFS+ file system in the sparse image has trashed itself somehow.

You can mount the sparse image as a device and run a "diskutil repairVolume" which *will* find errors, and sometimes even fix them.
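In case it helps anyone, a sketch of that procedure; the bundle path and the disk identifiers are placeholders for whatever hdiutil actually reports:

```sh
# Attach the image without mounting the volume inside it, and note the
# /dev/diskNsM identifier printed for the HFS+ slice.
hdiutil attach -nomount /path/to/Backup.sparsebundle

# Run the repair against that identifier.
diskutil repairVolume /dev/disk5s2

# Detach when finished.
hdiutil detach /dev/disk5
```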

Nor is it the fault of the filesystem underneath the sparse image: you get that error on locally attached disks, Time Capsules, Synologys, Apple Server, and ZFS NAS devices.

One of the reasons I built a Nas4free box was to get ZFS snapshots, so that when the HFS+ filesystem ate itself I could just promote a snapshot, and the next time Time Machine ran to that device it would catch up. HFS+ is old, flaky, and more than creaky. It's not just in sparse images, either: it eats itself on your running OS disk as well. I have five different network-attached devices that are Time Machine targets and see that error every 3-4 months.
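A rough sketch of that kind of ZFS setup, with made-up pool and dataset names; "promoting" a snapshot here is rendered as rolling the dataset back to the last good one:

```sh
# Snapshot the Time Machine dataset on a schedule (e.g. nightly).
zfs snapshot tank/timemachine@nightly-2020-03-02

# See what is available to roll back to.
zfs list -t snapshot -r tank/timemachine

# When the HFS+ volume inside the sparse bundle trashes itself, roll back;
# -r discards any snapshots newer than the target.
zfs rollback -r tank/timemachine@nightly-2020-03-02
```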

"Never mind rsync certainly can transfer delta changes…"

Forgot a word there, sorry. To expand on the point: rsync's delta transfer certainly behaves that way for me, though I think you can force whole-file transfers, and you can even tune the block size of the transfer.
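For the curious, the relevant rsync switches; the source path and host below are made up. --whole-file disables the delta-transfer algorithm entirely, and --block-size tunes the checksum block size it uses.

```sh
# Default for remote transfers: delta-transfer sends only the changed
# portions of large files.
rsync -a --progress ~/VMs/ backuphost:/backups/VMs/

# Force whole-file transfers (often faster on a fast LAN or to local disks).
rsync -a --whole-file ~/VMs/ backuphost:/backups/VMs/

# Tune the block size used for the rolling checksums.
rsync -a --block-size=65536 ~/VMs/ backuphost:/backups/VMs/
```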

@Nathan Yes, putting a big file like that in a sparse bundle could make it more friendly to Time Machine backups. But the 8 MB bands were much larger than APFS's 4 KB blocks, so it was less space efficient than APFS snapshots.

@Liam
Then why were we not reinstalling Mac OS constantly? A boot drive would see far more thrashing than a backup drive; yet Time Machine constantly messed up while the boot drive stayed generally functional. What was weird was that I ran HFS Extended on my backup drives with Carbon Copy Cloner and never had the same problems I did with Time Machine. I had so many problems with Time Machine that I gave up on it fairly quickly. With each new OS release I would test it again, but it would fail basic tasks for me, so I would move on from it until the next release.

@Nathan I think it may be the fact that Time Machine volumes had so many more files. I don’t know exactly why, but that seemed to cause more reliability problems for HFS+.

@Michael
Normally I would agree, but my Carbon Copy Cloner drives were set to archive changed/deleted files and so had a comparable amount of extra data to when those drives were used for Time Machine. To be fair, I did not set CCC to back up every hour (though my main system backed up three times a day). Nor can I remember the last time the backup stuff built into Windows 10 or my rsync script simply ate a backup, no matter how many times I ran the backup or had backups interrupted by network outages.

Yes, yes, those drives no longer use HFS Extended, but for a while after my transition away from the Mac, my media/data drives continued to run HFS Extended, and Windows and Linux worked with them quite well (the former with the excellent Paragon HFS driver). This is but one anecdotal experience, but as old and creaky as HFS had become, I only had consistent problems with Time Machine itself: not with Carbon Copy Cloner, not with SuperDuper!, not with any other backup method, not even with the same HFS+-formatted drives. Not even, as @Liam mentioned, when I also ran sparse bundles on other network-attached file systems. I'm sure there was some data degradation, if only because it is possible with any dataset, but not to the extent of constantly needing either to begin new backups or to attempt repairs just to make them readable.

I am not saying HFS+ was trouble-free; I agree it was getting on in age and had some problems, but everything I read in @Liam's post mentions trouble originating from one other commonality besides HFS+: Time Machine itself. I would be truly curious to see someone separate out the Time Machine trouble from the HFS+ trouble and do a deep dive into the findings. Again, in my anecdotal case, my drives more or less just worked when formatted HFS+ but did not when I used Time Machine with them. Disk images of various sorts also had trouble when being accessed by Time Machine, but not from SuperDuper! or CCC.

Either way, thanks for the fantastic discussion! Really enjoying these tangents about backup and file systems.
