The Switch From File Paths to URLs
I don’t think we ever documented this officially, but to understand this choice you have to look at the history of macOS. Traditional Mac OS did not use paths a lot. Rather, files were identified by an FSSpec, which contains a volume identifier, a directory ID, and a name. The directory ID was an HFS [Plus] catalogue node ID (CNID), which is kinda like an inode number.
Additionally, starting with System 7 it was possible to track a file with a volume identifier and the file ID, that is, the CNID of the file itself.
This was quite tricky to support on a Unix-y platform like Mac OS X. At the lowest levels of the system you needed the ability to manipulate files based on CNIDs rather than paths. For an explanation of how this was done, see QA1113 The “/.vol” directory and “volfs” (note, however, that volfs is no longer a thing and the same functionality is now implemented in a very different way). […]
So far, so much obscure backward compatibility. However, since we made the decision to use file URLs we’ve exploited that to significant advantage[…]
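To make the path-versus-ID distinction concrete, here is a minimal Python sketch. It is not how Mac OS or volfs worked; it only assumes that a POSIX-style (device, inode) pair can stand in for the (volume identifier, CNID) pair described above. The helper names `file_identity` and `demo_rename` are made up for illustration.

```python
import os
import tempfile


def file_identity(path):
    """Return a (device, inode) pair for the file at `path`.

    Assumption: this pair plays the role of (volume identifier, CNID).
    """
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def demo_rename():
    """Rename a file and report whether its identity and old path survive."""
    with tempfile.TemporaryDirectory() as tmp:
        old_path = os.path.join(tmp, "draft.txt")
        new_path = os.path.join(tmp, "final.txt")
        with open(old_path, "w") as f:
            f.write("hello")

        before = file_identity(old_path)
        os.rename(old_path, new_path)  # same directory, same volume
        after = file_identity(new_path)

        same_identity = before == after
        old_path_still_valid = os.path.exists(old_path)
        return same_identity, old_path_still_valid
```

After the rename, the stored path no longer resolves, while the identity pair still names the same file. That is the property an ID-based reference gives you and a plain path does not.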
Via Matt Gallagher:
There’s a lesson about attaching data (like security attributes) to an opaque interface (like NSURL). Because my mental model of NSURL is as plain RFC-3986 storage, these attributes are easy to lose and the security behaviours are easy to forget, when moving data around an app (I wish we received a bookmark type that made this explicit).
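The failure mode described here can be shown with a toy model in Python. This is not NSURL's implementation; `AnnotatedURL` and its `security_token` field are hypothetical names, standing in for any object that looks like a string but carries extra state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AnnotatedURL:
    """Hypothetical URL-like value with an attached, invisible attribute."""
    text: str
    security_token: Optional[bytes] = None  # hypothetical attached data


def to_plain_string(url):
    """Flatten to the RFC 3986 text only; attached data is not included."""
    return url.text


def from_plain_string(text):
    """Rebuild from text alone; there is nothing to restore the token from."""
    return AnnotatedURL(text)


def demo_roundtrip():
    original = AnnotatedURL(
        "file:///Users/example/Documents/report.txt",
        security_token=b"granted",
    )
    copy = from_plain_string(to_plain_string(original))
    return original, copy
```

The copy has the same text as the original but no token, and nothing in the type signals that anything was dropped. A separate, explicit type for "URL plus access grant" would make that loss visible at the point where the string conversion happens.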
The original proposal was not to use a NS/CFString object encapsulating the path or a NS/CFURL object, and instead use a new object type to identify a file’s location, to cache properties, etc. That idea was vetoed in early API reviews because there were already API that took file locations as paths or URLs. We were told to pick path or URL. We chose URL objects over string objects.
I still think a new object type would have been cleaner and better in the long run. 🤷‍♂️
[…]
FSRefs were not objects so they didn’t fit into the Cocoa (or CoreFoundation) API memory model. They were also a fixed size glob of memory so expanding their functionality was very difficult. One of the things I did in my last year at Apple was to make the old Carbon File Manager work well with APFS and its 64-bit inode numbers. That meant shoehorning 64-bit file and folder ids into FSRefs and translating them to 32-bit ids for the old File Manager API. Fun hacking 😀
Previously:
- Swift URL Improvements at WWDC 2022
- Modern AppKit File Permissions
- The Weirdness of NSURL’s isDirectory Flag
- PATH_MAX Blackholing
Update (2024-11-25): Uli Kusterer:
I’m still kinda confused (NS)Stream was so late to arrive and has so little support for serializing objects. I guess PowerPlant spoiled me.
I was and remain surprised that the stream APIs are comparatively awkward and rarely used throughout Apple’s frameworks.
How bad was the URL performance? Using the new “efficient” Snow Leopard URL file system property API: getattrlist (the Apple extended lstat function that gets more file system data) was only around 5% of the time. Caching that data in newly created CF/Cocoa objects and returning those objects was another 25% of the time. The rest of the time was spent creating the URL from the file system path and getting the file system path back from the URL.
After fixing the URL performance issues, Apple found lots of other things to change that helped the performance of those API, and thus other parts of the system: like tagged pointers (file system properties are mostly numbers), and optimizing all-ASCII CF/NSStrings (like URL strings).
Better performance tools really helped us easily determine what was causing performance issues.
Until Lion or maybe Mountain Lion, the new “efficient” URL file system API was a lot slower than using the old Carbon File Manager API.
Those are the two primary reasons Apple did not add a URL property to get a directory’s valence (like was available in the deprecated Carbon File Manager API).
1 Comment
Interesting. In the early days of that switch, I never liked the URL as an abstraction of a path, because it basically just added extra bytes to a path and was just as fragile. Even worse, it had dismal performance in hashing containers due to collisions; I think this may have eventually been fixed. I always thought there was some grand idea of instantiating NSDocuments and so forth from any URL scheme, but in practice I never wanted to hand it a remote URL.
In BibDesk, we eventually ended up following the lead of Chris Hanson's BDAlias and making a class that represented a file. Internally (back when I was involved) it used an FSRef, but could give back Alias, path, URL, or -fileSystemRepresentation to hand off to Cocoa. We serialized aliases to disk, in the days before the more cumbersome (IMNSHO) bookmark stuff was added to NSURL. This was pretty lightweight, and we had good performance for the most part, at least in the days of spinning disks.
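As a rough illustration of the "one class that represents a file" idea, here is a minimal Python sketch. It is not BibDesk's or BDAlias's code and does not use FSRef or aliases; `FileReference` is a hypothetical name, and the (device, inode) pair is an assumed stand-in for the internal reference.

```python
import os
import tempfile
from pathlib import Path


class FileReference:
    """Hypothetical wrapper that hands out several views of one file."""

    def __init__(self, path):
        self._path = Path(path).resolve()
        st = os.stat(self._path)
        # Assumption: (device, inode) stands in for an FSRef-like identity.
        self._identity = (st.st_dev, st.st_ino)

    @property
    def identity(self):
        return self._identity

    @property
    def path(self):
        return str(self._path)

    @property
    def url(self):
        return self._path.as_uri()  # file:// URL for the absolute path

    @property
    def file_system_representation(self):
        return os.fsencode(self._path)  # bytes suitable for low-level calls


def demo_views():
    """Create a temporary file and collect each view of it."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "paper.pdf")
        with open(target, "wb") as f:
            f.write(b"%PDF")
        ref = FileReference(target)
        st = os.stat(target)
        identity_matches = ref.identity == (st.st_dev, st.st_ino)
        return ref.path, ref.url, ref.file_system_representation, identity_matches
```

Callers ask the object for whichever form the next API wants, so conversions between path, URL, and byte representation live in one place instead of being scattered through the app.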