Monday, October 23, 2017

How Well Do Filesystems Handle Errors?

Dan Luu (tweet, Hacker News):

We’re going to reproduce some results from papers on filesystem robustness that were written up roughly a decade ago: Prabhakaran et al. SOSP 05 paper, which injected errors below the filesystem and Gunawi et al. FAST 08, which looked at how often filesystems failed to check return codes of functions that can return errors.

[…]

No tested filesystem other than btrfs handled silent failures correctly. The other filesystems tested neither duplicate nor checksum data, making it impossible for them to detect silent failures. zfs would probably also handle silent failures correctly but wasn’t tested. apfs, despite post-dating btrfs and zfs, made the explicit decision to not checksum data and silently fail on silent block device errors.

[…]

Relatedly, it appears that apfs doesn’t checksum data because “[apfs] engineers contend that Apple devices basically don’t return bogus data”. Publicly available studies on SSD reliability have not found that there’s a model that doesn’t sometimes return bad data. It’s a common conception that SSDs are less likely to return bad data than rotational disks[…]

Plus, APFS can be used on non-Apple SSDs as well as on hard drives, so there’s really no reason to believe that checksums wouldn’t detect errors.
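
To make the point about checksums concrete, here is a minimal sketch of how a per-block checksum turns a silent error into a detectable one. It is a toy model, not how btrfs, zfs, or APFS are implemented: the `ChecksummedStore` class, the 4096-byte block size, and the use of CRC32 are all assumptions chosen for illustration.

```python
import zlib

BLOCK_SIZE = 4096  # hypothetical block size for this sketch


class ChecksummedStore:
    """Toy block store that records a CRC32 for each block (illustrative only)."""

    def __init__(self):
        self.blocks = {}     # block number -> bytearray, standing in for the device
        self.checksums = {}  # block number -> CRC32 recorded at write time

    def write(self, n, data):
        self.blocks[n] = bytearray(data)
        self.checksums[n] = zlib.crc32(data)

    def read(self, n):
        data = bytes(self.blocks[n])
        # Without this comparison, the corrupted bytes would be returned as if valid.
        if zlib.crc32(data) != self.checksums[n]:
            raise IOError(f"checksum mismatch on block {n}")
        return data


store = ChecksummedStore()
store.write(0, b"important data".ljust(BLOCK_SIZE, b"\0"))

# Simulate a silent device error: one bit flips and the device reports nothing.
store.blocks[0][3] ^= 0x01

try:
    store.read(0)
    detected = False
except IOError:
    detected = True
```

A checksum alone only detects the bad block. Recovering the data additionally requires a second copy to read from, which is why the quoted passage mentions duplication alongside checksumming.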

Previously: Apple File System (APFS).

3 Comments

Sounds like Apple has cooked up a real winner with their Apple P File System (q.v. also the other article regarding 10 lbs 💩 in 5 lb bag). Although this decision does fit in line with their apparently increasing lack of regard for software QA.

Not checksumming data in APFS is a baffling design decision for a new file system on the part of Apple. I can't tell if they just didn't have it ready in time for 1.0 and it's on the back burner, or they truly think it is unnecessary.

In APFS there are checksums for metadata blocks, though (see https://mjtsai.com/blog/2016/06/17/apple-file-system-apfs/).
