Exploring the New iWork File Formats
Nick Heer (via Oliver Taylor):
I think the new file format is a regression, though. I would love to know the justification for these obfuscated data files, and what advantages they bring over the previous XML-based format. I’d love to be able to tell you what advantages they bring, but they’re unreadable. This isn’t yet a problem for end users, aside from the lack of backwards compatibility, but it might be in the future.
No more documented XML format or included PDF version, which was much better for previewing than Quick Look. Note that Apple is not eating its own dog food here. The file format does not use property lists, NSKeyedArchiver, or Core Data.
Update (2013-10-27): Drew McCormack:
They have split that potentially large XML document into many small binary files. Each file can now be loaded in isolation, and this is much better for iOS. Effectively, they have built a partial-loading document format. Closer inspection shows that each slide is a separate file, so they can just load what is on the current slide, and leave the rest on disk.
This makes a lot of sense except that Core Data is already a reasonably compact, partial-loading document format. It has efficient syncing support as a built-in feature, and the underlying SQLite format is robust and open. On paper, Core Data is what an app should use in this case. Yet the iWork team apparently had so little confidence in Core Data (or perhaps the iCloud portion) that they invented a whole new binary file format.
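To make the partial-loading idea concrete, here is a minimal sketch in Python. It is not the iWork format: the member names (`Slide-1.bin` and so on) and both helper functions are hypothetical, and a zip archive stands in for the document bundle. The point is only that when each slide is its own file, a reader can pull out one slide without decoding the others.

```python
import io
import zipfile


def build_bundle(slides):
    """Write each slide to its own member of a zip archive (hypothetical layout)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for i, data in enumerate(slides, start=1):
            # One small file per slide instead of one large document file.
            zf.writestr(f"Slide-{i}.bin", data)
    return buf.getvalue()


def load_slide(bundle, index):
    """Read only the requested slide; other members are never decompressed."""
    with zipfile.ZipFile(io.BytesIO(bundle)) as zf:
        return zf.read(f"Slide-{index}.bin")


bundle = build_bundle([b"title slide", b"agenda", b"closing"])
print(load_slide(bundle, 2))
```

The bundle lives in memory here to keep the example self-contained. With an archive on disk, `zipfile` reads the central directory and then only the named member, which is the same trade-off McCormack describes for the current slide.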
Update (2013-10-29): Drew McCormack:
Apple is apparently using Google’s Protocol Buffers for iWork’s file format.
Update (2013-11-08): Sean Patrick O’Brien has an in-depth look at the new file format:
Components are serialized into .iwa (iWork Archive) files, a custom format consisting of a Protobuf stream wrapped in a Snappy stream.
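For a sense of what a Protobuf stream looks like at the byte level, here is a small sketch of the standard Protocol Buffers wire encoding: base-128 varints and field headers. It does not cover the Snappy wrapper or anything specific to .iwa files, and the function names are my own, not part of any library.

```python
def decode_varint(buf, pos=0):
    """Decode one base-128 varint from buf starting at pos.

    Returns (value, next_pos). Each byte carries 7 payload bits, least
    significant group first; the high bit means another byte follows.
    """
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_field_header(buf, pos=0):
    """Split a Protobuf field key into (field_number, wire_type, next_pos)."""
    key, pos = decode_varint(buf, pos)
    return key >> 3, key & 0x07, pos


# Field 1, wire type 0 (varint), value 150: the bytes 08 96 01.
message = b"\x08\x96\x01"
field, wire_type, pos = read_field_header(message)
value, pos = decode_varint(message, pos)
print(field, wire_type, value)
```

Without the .proto schema a reader gets field numbers and wire types but no field names, which goes some way toward explaining why the new files look opaque next to the old XML.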