Wednesday, June 19, 2019

Backing Up VM Image Files to Internet Backup Services

Adam Engst:

Seriously? Code42 is actually admitting that CrashPlan may not back up large files correctly? Isn’t that Job #1 for any backup app?


After the initial upload, apps like Backblaze and CrashPlan do block-level data deduplication, which means that they analyze small blocks of each file, compare them to what’s already backed up, and copy only those blocks that are new or changed. It might seem as though large files wouldn’t present a problem after initial backup as long as they didn’t change all that much. However, as Yev pointed out, the resources necessary to analyze all the blocks in a multi-gigabyte file are significant—you need enough drive space to store a copy of the file, and then the backup app has to spend a lot more time and CPU power analyzing all those blocks.
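To make the cost described above concrete, here is a minimal sketch of fixed-size block deduplication. It is an illustration only, not how Backblaze or CrashPlan actually implement it: the block size, the use of SHA-256, and the names `changed_blocks` and `backup` are all assumptions.

```python
import hashlib
import io

BLOCK_SIZE = 4 * 1024 * 1024  # hypothetical block size; real apps choose their own


def changed_blocks(stream, known_hashes, block_size=BLOCK_SIZE):
    """Yield (index, digest, data) for each block whose hash is not yet stored."""
    index = 0
    while True:
        data = stream.read(block_size)
        if not data:
            break
        # Every block is read and hashed, even if it turns out to be unchanged.
        digest = hashlib.sha256(data).hexdigest()
        if digest not in known_hashes:
            known_hashes.add(digest)
            yield index, digest, data
        index += 1


def backup(stream, known_hashes, block_size=BLOCK_SIZE):
    """Return the indexes of the blocks that would need to be uploaded."""
    return [i for i, _, _ in changed_blocks(stream, known_hashes, block_size)]


if __name__ == "__main__":
    store = set()  # stands in for the hashes the backup service already holds
    print(backup(io.BytesIO(b"aaaabbbbcccc"), store, block_size=4))
    print(backup(io.BytesIO(b"aaaaXXXXcccc"), store, block_size=4))
```

Even in this toy version, the second run uploads only one block but still has to read and hash the entire file, which is where the time and CPU cost for a multi-gigabyte VM image comes from.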


For these reasons, Backblaze also excludes VM image files and other large file types (it also doesn’t back up system files or applications), as you can see in the app’s Exclusions screen.

In both cases, you can adjust the exclusion lists to include these files.


Update (2019-06-20): Kyle Howells:

You can remove the file type exclusions, but not the location exclusions (I tried and failed to get it to back up the applications directory).

See also: Boot SuperDuper! backup in VMware Fusion.

4 Comments

Former CrashPlan user here. One reason I picked it many years ago is that it does block-level deduplication. If it had stopped backing up VM images before dropping the home version, I'd have been done with it much sooner.

Currently I use a combination of backup utilities. I use Arq for file-level backup on Mac and Windows, but I don't find that it handles VM images very efficiently or quickly.

I use and recommend Duplicacy for VM backups (and also use it for backing up my servers). Its author has some recommended settings for configuring it for VM backups. I use the command-line version. There is now a Web UI, which is substantially easier to use than the older GUI, but I wouldn't really recommend Duplicacy unless you're at least somewhat comfortable on the command line.

Ghost Quartz

If your backup occurs while a VM is running (or any disk image-backed filesystem is mounted r/w), doesn’t that also carry the risk of a corrupted backup? The disk image or VM’s file system needs to be unmounted, or at the very least the files backing it need to somehow be copied atomically, otherwise your backup might result in a disk image or VM where the file system is in an inconsistent state.

Ghost Quartz

I suppose I should clarify that I think there are two issues:

(1) Backing up a file system that hasn’t first been cleanly unmounted (e.g. because the VM is still running), even if the disk image backing it can be atomically copied
(2) Backing up a disk-image split across multiple files, in which case even copying it atomically may be a challenge

Anyways, in either case, I think my point is that some things require a bit more care when backing up, even *if* your backup strategy of choice correctly handles large files.
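As a rough illustration of the second issue, the sketch below copies a set of files and reports whether any of them appeared to change while the copy was in progress. It is a sketch under assumptions, not a real snapshot mechanism: the names `file_signature` and `copy_set_if_stable` are hypothetical, and comparing size and modification time only catches writes that alter one of those values.

```python
import os
import shutil


def file_signature(path):
    """Return (size, mtime in nanoseconds) as a cheap change indicator."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def copy_set_if_stable(sources, dest_dir):
    """Copy every file in sources to dest_dir.

    Returns True only if no source file's size or mtime changed between
    the start and the end of the whole copy.
    """
    before = {p: file_signature(p) for p in sources}
    for p in sources:
        shutil.copyfile(p, os.path.join(dest_dir, os.path.basename(p)))
    after = {p: file_signature(p) for p in sources}
    return before == after
```

A `False` result means the copy should be discarded and retried. A `True` result still says nothing about the first issue: a guest file system that was never cleanly unmounted can be internally inconsistent even when the image files were copied without interference.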

They recently sent an email saying they were going to block backing up /Applications and ~/Applications. That's fair enough, I suppose. However, I have a folder ~/work/Xcode/iOS/Applications where I have all the source code for the apps I develop. Guess what? That's not being backed up anymore (as I found out today when trying to restore something). Aaaarrghhhh!
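The behavior described in this comment is what you would expect if an exclusion matched on the folder name rather than on a full path. How the backup app actually matches exclusions is not known here; the sketch below only contrasts the two approaches, and the home directory, function names, and example paths are hypothetical.

```python
from pathlib import PurePosixPath

HOME = PurePosixPath("/Users/example")  # hypothetical home directory
EXCLUDED_ROOTS = [PurePosixPath("/Applications"), HOME / "Applications"]


def excluded_by_name(path):
    """Over-broad rule: excludes anything inside any folder named Applications."""
    return "Applications" in PurePosixPath(path).parts


def excluded_by_root(path):
    """Anchored rule: excludes only paths at or under the listed roots."""
    p = PurePosixPath(path)
    return any(p == root or root in p.parents for root in EXCLUDED_ROOTS)


if __name__ == "__main__":
    source = "/Users/example/work/Xcode/iOS/Applications/MyApp/main.swift"
    print(excluded_by_name(source), excluded_by_root(source))
```

With the name-based rule the source file is skipped; with the anchored rule it is backed up, while `/Applications` and `~/Applications` are still excluded.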
