Monday, April 22, 2019

Performance Considerations When Reading Directories on macOS

Thomas Tempelmann:

Unfortunately, with Apple’s new file system APFS, and the fact that any Mac running High Sierra or Mojave had its startup volume converted from HFS+ to APFS, search performance has decreased by a factor of 5 to 6!

[…]

I’ve tried to find out which of the various methods of reading directories, looking only for file names, is the fastest: I had to scan the same directory tree with every method separately.

[…]

contentsOfDirectoryAtURL and getattrlistbulk do indeed perform equally, just as predicted, with the latter usually being a bit faster once the data comes from the cache.

On APFS, NTFS and SMB, opendir() is significantly faster than the other methods, which is quite surprising to me.

[…]

When accessing a Mac via SMB, contentsOfDirectoryAtURL is faster than the other methods, but only on the first run (see red field). Once the caches have been filled, it’s slower. I can’t make sense of it, but it’s a very consistent effect in my tests.

Previously: APFS and Fast Catalog Search.

Update (2019-04-23): Thomas Tempelmann:

fts_open() / fts_read() are, in most cases, faster than readdir(), contentsOfDirectoryAtURL and getattrlistbulk. Exceptions are network protocols, where especially the retrieval of additional attributes makes it slower than the other methods.

4 Comments

Martin Wierschin

If I were doing this test I'd also benchmark NSDirectoryEnumerator. It doesn't exactly fit the narrow subtask of "reading directories", but it does seem entirely relevant to the overall goal of searching the entire file system. Maybe NSDirectoryEnumerator is just as slow as (or even implemented using) repeated calls to -contentsOfDirectoryAtURL. I wouldn't know since reading the file system has never been a bottleneck for me. But I'd want to give NSDirectoryEnumerator a fair shake, given that it theoretically could cache directory traversal information as the single enumerator tears through the hierarchy.

@Martin I believe NSDirectoryEnumerator is implemented with fts (which Tempelmann mentions in the updated post). Certainly, the Swift version is.

@Michael This code is only used on Linux. Foundation on Apple platforms is closed source, so we don't know how it is implemented.

@Nick Yeah, sometimes the Linux implementations mirror the Mac ones, and I thought I had read that the Mac version used fts. However, from disassembling the frameworks, it seems that NSDirectoryEnumerator calls into the CF version, which uses getattrlist().
