Getting Ready for Dataless Files
In a modern file system, a file’s content may not be available locally on the device. A file that contains only metadata is known as a dataless file. The file’s content typically resides on a remote server and is available to people or apps, transparently, when they access the file.
[…]
The system, or a person using the device, can make dataless files whenever they determine it’s appropriate, and your app needs to be ready to handle them. Specifically, avoid unnecessarily materializing dataless files and, when your app requires access to a file’s contents, perform that work asynchronously off the main thread.
[…]
UIDocument and NSDocument automatically access the file system in a coordinated and asynchronous manner.[…]
If your app or framework uses low-level POSIX APIs to access the file system and you’re unable to migrate to the preferred methods, consider the following two options[…] Be aware that stat and getattrlist both trigger the materialization of any intermediate folders in the file’s path, if they themselves are dataless.
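As a rough illustration of the POSIX route, here is a minimal Swift sketch that asks stat for a file's flags and tests the SF_DATALESS bit from macOS's <sys/stat.h>. The helper name isDataless(atPath:) is hypothetical, the flag value is written out by hand instead of relying on the macro import, and this is not necessarily either of the two options the note goes on to list. As the note warns, the stat call can itself materialize dataless parent folders, and results may differ between macOS versions.

```swift
import Foundation

// Value of SF_DATALESS in <sys/stat.h> on macOS, spelled out here so the
// sketch does not depend on how the C macro is imported into Swift.
let datalessFlag: UInt32 = 0x4000_0000

/// Hypothetical helper: returns true if the item at `path` carries the
/// dataless flag, false if it does not, and nil if stat fails or the
/// platform has no st_flags field.
func isDataless(atPath path: String) -> Bool? {
    #if canImport(Darwin)
    var info = stat()
    // stat may materialize dataless intermediate folders in the path.
    guard stat(path, &info) == 0 else { return nil }
    return (info.st_flags & datalessFlag) != 0
    #else
    return nil  // st_flags is not available on this platform
    #endif
}
```

A nil result only means the query failed; it says nothing about whether the file exists remotely.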
I find this rather confusing. On macOS, it seems like nearly any file could potentially be dataless. It’s less likely for files in Library but probably possible via symlinking. Even an action as simple as checking whether a file exists can now take an unexpectedly long amount of time. This breaks many longstanding assumptions.
If your app deals with user-created files, I guess the best practice is to do everything asynchronously and with file coordination. Without coordination (at least on older systems) you can run into the opposite problem: instead of access to an evicted file being slow, the file might stay unmaterialized. So you need to use the special APIs even if your file code already runs on a background thread.
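To make "asynchronously and with file coordination" concrete, here is a minimal sketch of a coordinated read that runs its accessor on a background operation queue. The helper name readCoordinated(at:completion:) is hypothetical. It assumes an Apple platform, since NSFileCoordinator is not part of the open-source Foundation on other systems. The completion handler is called on the background queue, not the main thread.

```swift
import Foundation

/// Hypothetical helper: reads a file's contents under file coordination,
/// with the actual read performed on a background queue.
func readCoordinated(at url: URL,
                     completion: @escaping (Result<Data, Error>) -> Void) {
    let coordinator = NSFileCoordinator(filePresenter: nil)
    let intent = NSFileAccessIntent.readingIntent(with: url, options: [])
    let queue = OperationQueue()  // the accessor block runs on this queue

    coordinator.coordinate(with: [intent], queue: queue) { error in
        if let error = error {
            // Coordination itself failed; the file was never opened.
            completion(.failure(error))
            return
        }
        do {
            // Read through intent.url, which may differ from the URL passed in.
            completion(.success(try Data(contentsOf: intent.url)))
        } catch {
            completion(.failure(error))
        }
    }
}
```

Every call site ends up wrapped like this, which is part of why the approach spreads through a codebase.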
But the NSFileCoordinator APIs are awkward, error-prone, and slow, and they infect your entire codebase. Hopefully you aren’t relying on any cross-platform code that’s not aware of them. And even with Apple-specific code, they make it hard to reuse the same code for working with folders that may or may not contain dataless files.
It all feels shoehorned in, like with the security-scoped URL APIs. Most APIs don’t do the right thing automatically, so you have to wrap uses of them. (But then some other APIs may secretly use coordination so you have to not use it yourself in order to avoid deadlocks.) Any file-related code could potentially need special handling, but there’s no way to make sure that you didn’t miss a spot somewhere. But then, once you’ve done this, your code is much harder to read and much slower for the common case of regular locally stored files.
Previously:
- Update on Cloud File Provider Extensions
- Modern AppKit File Permissions
- Not Relying on NSFileCoordinator
- Sandbox Limitation on Number of Files That Can Be Opened
- NSFileCoordinator Improvement in iOS 8.2
- Document-Based iCloud Problems
Update (2023-05-12): Thomas Clement:
Out of curiosity I tried to stat() a non-local file as described in the tech note, but I get a “no such file” error. Same when trying to access it from Terminal. Not sure how we are supposed to test whether a file is dataless then.
Another thing that is not explained is what is the right way to monitor download progress in case the file is dataless.
Update (2023-08-10): Howard Oakley:
Over the last couple of weeks I have been exploring how macOS and its features handle dataless files. While apps that take advantage of AppKit’s NSDocument to read and write files should handle these problems seamlessly, there are some definite seams when it comes to macOS services. These result from three constraints:
- features reliant on the contents of file data can’t be used with dataless files;
- features reliant on file data stored outside the file aren’t available to other systems accessing that file from iCloud;
- limitations on the total size of extended attributes in iCloud storage may require some to be removed.