Friday, March 31, 2017

APFS to Add Case-Insensitive Variant for Mac

Apple has updated its APFS Guide (via Thomas Zoechling):

APFS has case-sensitive and case-insensitive variants. The case-insensitive variant of APFS is normalization-preserving, but not normalization-sensitive. The case-sensitive variant of APFS is both normalization-preserving and normalization-sensitive. Filenames in APFS are encoded in UTF-8 and aren’t normalized.

HFS+, by comparison, is not normalization-preserving. Filenames in HFS+ are normalized according to Unicode 3.2 Normalization Form D, excluding substituting characters in the ranges U+2000–U+2FFF, U+F900–U+FAFF, and U+2F800–U+2FAFF.

The first developer preview of APFS, made available in macOS Sierra in June 2016, offered only the case-sensitive variant. In macOS 10.12.4, the APFS developer preview was updated to also include a case-insensitive variant. In iOS 10.3, the case-sensitive variant of APFS is used.

[…]

Directory hard links are not supported by Apple File System. All directory hard links are converted to symbolic links or aliases when you convert from HFS+ to APFS volume formats on macOS.

[…]

Apple plans to document and publish the APFS volume format specification when Apple File System is released for macOS in 2017.

Regarding the normalization issues that I raised last week:

Developers should be aware of behavior differences between normalization sensitivity and insensitivity which may arise when an iOS device upgrades to iOS 10.3 and migrates the filesystem from HFS+ to APFS. For example, attempting to create a file using one normalization behavior and opening that file using another normalization behavior may result in ENOENT, or “File Not Found” errors. Additionally, storing filenames externally, such as in the defaults database, CoreData, or iCloud storage may cause problems if the normalization scheme of the filename being stored is different from what exists on-disk.

But Apple doesn’t describe any solutions.
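One possible workaround, which is an assumption here and not something Apple's guide describes, is to stop trusting a stored filename byte-for-byte and instead match it against the directory listing after normalizing both sides. The sketch below uses Python's standard `unicodedata` module; the helper name `find_existing_name` is hypothetical. Note that Python normalizes with its own bundled Unicode version, not the Unicode 3.2 tables that HFS+ uses, so edge cases may differ.

```python
import os
import tempfile
import unicodedata


def find_existing_name(directory, wanted):
    """Return the on-disk name that equals `wanted` up to normalization, or None.

    Hypothetical helper: both names are converted to NFD before comparing,
    so a name stored in one form can be found using the other form.
    """
    key = unicodedata.normalize("NFD", wanted)
    for name in os.listdir(directory):
        if unicodedata.normalize("NFD", name) == key:
            return name
    return None


# Usage: create a file with a decomposed name, then look it up
# with the precomposed spelling of the same name.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "e\u0301.txt"), "w").close()
    found = find_existing_name(tmp, "\u00e9.txt")
    missing = find_existing_name(tmp, "missing.txt")
```

The returned name is whatever the directory listing reports, so it can be passed back to `open()` safely on a normalization-sensitive volume. The cost is a directory scan per lookup, which may be too slow for large directories.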

It’s also not documented how long APFS filenames can be. It would be nice to have an API for this.

Update (2017-03-31): I think “normalization-preserving, but not normalization-sensitive” means that (like HFS+ on the Mac, unlike APFS on iOS) you cannot have multiple files whose names differ only in normalization. And you can look up a file using the “wrong” normalization and still find it. Additionally, beyond what HFS+ offers, if you create a file and then read the directory contents, you’ll see the filename listed using the same normalization that you used.
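A small probe can check this interpretation on a given volume. The following is a minimal sketch, assuming a writable directory on the file system under test; the function name `probe_normalization` and the test filename are made up for illustration. It creates a file with an NFC name, then checks whether the listing returns that exact name and whether the NFD spelling also finds the file.

```python
import os
import tempfile
import unicodedata


def probe_normalization(directory):
    """Create a file with an NFC name and report how the file system treats it."""
    nfc_name = unicodedata.normalize("NFC", "caf\u00e9.txt")
    nfd_name = unicodedata.normalize("NFD", nfc_name)
    with open(os.path.join(directory, nfc_name), "w") as f:
        f.write("test")
    listed = os.listdir(directory)
    return {
        # True if the name comes back exactly as it was written.
        "preserving": nfc_name in listed,
        # True if the differently normalized spelling does NOT find the file.
        "sensitive": not os.path.exists(os.path.join(directory, nfd_name)),
    }


# The temporary directory lives wherever the system puts temp files;
# pass a directory on the volume you actually want to test instead.
with tempfile.TemporaryDirectory() as tmp:
    result = probe_normalization(tmp)
    print(result)
```

If the reading above is right, a case-insensitive APFS volume should report preserving but not sensitive, HFS+ should report neither, and APFS on iOS should report both. The output depends entirely on the volume being tested, so none is shown here.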

Update (2017-04-02): Here’s a thread with someone confused because Apple’s guide said that using NSURL would handle the normalization issues, but it didn’t.

Update (2017-04-07): Howard Oakley:

The TL;DR is that both variants of APFS will cause problems – they are just different problems requiring different solutions. Either way, many current apps, tools, and scripts will perform strangely when run on APFS, and many will therefore need to be revised and updated to cope with it.

Update (2017-04-14): DropDMG 3.4.6 adds support for creating blank case-insensitive APFS disk images to help developers test their Mac apps with the new file system.

4 Comments

For the benefit of those of us that are technically-minded, but not with file-systems, can you provide examples of each scenario?

@Ted Here’s an example:

A precomposed character (alternatively composite character or decomposable character) is a Unicode entity that can be defined as a sequence of one or more other characters. A precomposed character may typically represent a letter with a diacritical mark, such as é (Latin small letter e with acute accent). Technically, é (U+00E9) is a character that can be decomposed into an equivalent string of the base letter e (U+0065) and combining acute accent (U+0301). Similarly, ligatures are precompositions of their constituent letters or graphemes.

So with the “bag of bytes” you can have two filenames that look like “é” but are made up of different byte sequences. On iOS, if you try to read the file “e followed by acute accent” but it was saved as “Latin small letter e with acute accent”, you will not find the file. With APFS on the Mac, you will. With HFS+, no matter which name you use when saving the file, you’ll get “e followed by acute accent” when you list the directory. With APFS on the Mac, you’ll get the one that you used when creating the file.
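To make the byte sequences concrete, here is a short Python illustration using the standard `unicodedata` module. It only shows the two encodings of “é” and does not touch the file system.

```python
import unicodedata

composed = "\u00e9"     # Latin small letter e with acute accent
decomposed = "e\u0301"  # e followed by combining acute accent

# The two strings look the same on screen but encode to different UTF-8 bytes.
print(composed.encode("utf-8"))    # b'\xc3\xa9'
print(decomposed.encode("utf-8"))  # b'e\xcc\x81'
print(composed == decomposed)      # False

# Normalizing converts one form into the other.
to_nfd = unicodedata.normalize("NFD", composed)
to_nfc = unicodedata.normalize("NFC", decomposed)
print(to_nfd == decomposed, to_nfc == composed)  # True True
```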

> It’s also not documented how long APFS filenames can be. It would be nice to have an API for this.

This should be available via pathconf(2) with _PC_PATH_MAX or _PC_NAME_MAX.

@Mark Thanks, however I’m not sure I trust that API. HFS+ is supposed to allow 255 UTF-16 encoding units (kHFSPlusMaxFileNameChars) but pathconf(_PC_NAME_MAX) returns 255 for me for both HFS+ and APFS paths. Same with _PC_NAME_CHARS_MAX, although I’m not sure what that is supposed to mean. The man page says that _PC_NAME_MAX is bytes, so 255 is the wrong answer for HFS+.

Also, there doesn’t seem to be a pathconf() equivalent to kHFSMaxVolumeNameChars.
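For anyone who wants to compare the reported limit with actual behavior, here is a rough sketch in Python, whose `os.pathconf` wraps pathconf(2). The helper names `reported_name_max` and `can_create` are made up for this example, and the probe simply tries to create files. A name of 255 precomposed “é” characters is 255 UTF-16 units but 510 UTF-8 bytes, so it should distinguish a limit counted in bytes from one counted in UTF-16 units.

```python
import os
import tempfile


def reported_name_max(path):
    """Return what pathconf() reports as the longest filename for `path`."""
    return os.pathconf(path, "PC_NAME_MAX")


def can_create(directory, name):
    """Try to create (and then remove) a file called `name`; True on success."""
    target = os.path.join(directory, name)
    try:
        with open(target, "w"):
            pass
    except OSError:
        return False
    os.remove(target)
    return True


# The temporary directory lives wherever the system puts temp files;
# pass a directory on the volume you actually want to test instead.
with tempfile.TemporaryDirectory() as tmp:
    limit = reported_name_max(tmp)
    ascii_ok = can_create(tmp, "a" * limit)
    # 255 precomposed characters: 255 UTF-16 units, 510 UTF-8 bytes.
    accented_ok = can_create(tmp, "\u00e9" * 255)
    print(limit, ascii_ok, accented_ok)
```

If the accented name is accepted while pathconf reports 255, the reported value cannot be a byte count for that volume. Results will vary by file system, so no expected output is given.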
