Thursday, December 18, 2014

Mac Document Model: Don’t Lose My Data

Glenn Reid:

I was editing an important file, but left it open (commonplace, and usually not destructive). Meanwhile, I edited the same file on another computer, with different changes, and saved it to my shared (Dropbox) location, so it sync’ed out from under TextEdit.

This has happened countless times in the past, and TextEdit was smart enough to notice it, and tell me not to save over the other file. This dialog box purports to do the same thing, BUT WITH A CRITICAL DIFFERENCE. It does not allow me to Save As… to preserve my changes (because Save As… is not a feature any more!).

My two choices are [LOSE CHANGES] and [LOSE OTHER CHANGES]. How is that a good choice?

I still find the new document model confusing. If I open a file in TextEdit and start typing, the indicator in the window’s close box shows that there are changes. This used to mean unsaved changes, but now it means something like changes since opening the document. Viewing the file with Quick Look or BBEdit shows that TextEdit has already saved the changes to disk. The file on disk does not match my last explicit save point, which is the way Macs worked for nearly 30 years. Instead, if I close the TextEdit window and elect not to save changes, TextEdit then fetches the last explicitly saved version of the file and writes it on top of the newer version on disk.

In short, the contents on disk always match the contents in the window, but you have the opportunity to revert if you want. This is inconsistent with history and with applications that don’t or can’t use the new document model. And it causes confusion in situations like the one Reid describes.
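To make the behavior concrete, here is a toy sketch of the autosave-in-place model as I understand it from the outside. It is not Apple's implementation: the class `AutosavingDocument` and all of its attribute names are made up for illustration, and I am assuming the close-box indicator compares the window against the last explicit save point.

```python
class AutosavingDocument:
    """Toy model of autosave-in-place (hypothetical names, not Apple's API)."""

    def __init__(self, saved_text):
        self.last_explicit_save = saved_text  # the user's last save point
        self.on_disk = saved_text             # what Quick Look or BBEdit would see
        self.window = saved_text              # what the editing window shows

    def type_text(self, text):
        self.window += text
        # Autosave: the file on disk follows the window without any prompt.
        self.on_disk = self.window

    @property
    def has_changes_indicator(self):
        # Assumption: the dot means "differs from the last save point",
        # not "differs from what is on disk".
        return self.window != self.last_explicit_save

    def save(self):
        # An explicit save only moves the save point; the disk is already current.
        self.last_explicit_save = self.window

    def close(self, keep_changes):
        if keep_changes:
            self.last_explicit_save = self.window
        else:
            # Reverting writes the older save point over the newer file on disk.
            self.window = self.last_explicit_save
            self.on_disk = self.last_explicit_save
        return self.on_disk
```

Typing into a fresh `AutosavingDocument("hello")` changes `on_disk` immediately, and `close(keep_changes=False)` puts `"hello"` back on disk, which is the reverse of what the old model did.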

I’m glad to see Reid blogging again. His new post about Safari is also good.

Update (2014-12-21): Michel Fortin:

A problem I see with the new model is that version management isn’t that well done. Currently you have to open the document and navigate through a Time Machine-like UI to revert to an older version. If you just want to make a copy of the old version of a document somewhere, the way to accomplish that is terrible. It’d be much better if the Finder could let you browse previous save points for a document (perhaps in the Get Info window?). It should also let you open those previous versions, copy them elsewhere, and delete the ones you no longer want. The side-by-side view within the app can be convenient at times but is also terrible at other times (such as when the app crashes when reading your corrupted document!).

Perhaps another thing that is confusing is that there are actually three modes right now: the old model (save/don’t save/cancel), the autosaving model with no prompt, and the new autosaving model with a prompt (save/revert changes/cancel).

Update (2014-12-27): Brian Webster:

This is the point where the aforementioned confusing language and UI comes in. If we take a closer look at this message, there is a key phrase that’s easy to miss:

Click Save Anyway to keep your changes and save the changes made by the other application as a version, or click Revert to keep the changes from the other application and save your changes as a version.

This is the “Versions” part of Versions & Autosave kicking in. No matter which button we choose, both versions of the document will be saved: our version from TextEdit, and the version written behind our backs by Dropbox/nano. The only difference between the two choices is which version we will see in the TextEdit window immediately afterwards.

After clicking “Save Anyway”, if we go to File > Revert > Browse All Versions…, we can see that both versions of the document are still available.
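The rule Webster describes can be written down in a few lines. This is only a sketch of the behavior as quoted above, not of how Versions is implemented: the function `resolve_conflict`, its parameters, and the `"save_anyway"` / `"revert"` strings are hypothetical names chosen for illustration.

```python
def resolve_conflict(ours, theirs, versions, choice):
    """Return the text shown in the window after the dialog is dismissed.

    Whichever side is not shown is appended to `versions`, so neither
    set of changes is discarded. All names here are hypothetical.
    """
    if choice == "save_anyway":
        versions.append(theirs)  # the other application's changes become a version
        return ours
    if choice == "revert":
        versions.append(ours)    # our changes become a version
        return theirs
    raise ValueError(f"unknown choice: {choice!r}")
```

Either way, the window text plus the `versions` list still contains both documents; the button only decides which one ends up in the window.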

10 Comments

It turns out that I like the new document model. I think it makes more sense than the older one. But if you're accustomed to the old one, I agree that it can be hard to adjust. For my part I'm well accustomed to it now.

The conflict resolution model is rather crude and lacks options (it'd be nice to have something like the Versions navigator to see conflicting documents side by side). But note that the new document model is pretty much required for sync to work at all. That's because you'll have a lot more conflicts if you wait for the user to explicitly save a document before syncing it to other devices. You need to have a single true version of the document; you can't have various unsaved states on different devices that'll conflict with each other when you try to save.

I cannot stand the new document model--it takes control of document saves away from me. I'm so glad Microsoft and Adobe have not adopted this model. Can you imagine instant overwrites, without prompting, in something like Photoshop?

@Michel Another problem with the new document model: I modified a PDF in Preview and didn’t save the changes. Of course, Preview saved them to disk, anyway. When I close the window and try to revert, it crashes. Reverting using the Versions browser beachballs. So there seems to be no way to get back the original saved copy of the PDF.

Add me to the camp of disgruntled users.

I will often open a document from my email application which will open a locally cached version from its attachments folder.

If I make changes and need to save a copy, I used to be able to just Save As. Now I need to duplicate instead, and I have to give the duplicate a new name that differs from the old one, since the duplicate goes into the same folder (which is my mail application's cache folder). After that, I can use Rename (which has no key equivalent) to place it in the proper folder and remove the “copy” suffix from the name.

…and if I duplicate after I made a change, I then also need to revert changes in the original document.

So instead of the well understood Save As, we now need to duplicate, rename, and revert, to emulate that same function.

"In short, the contents on disk always match the contents in the window, but you have the opportunity to revert if you want. This is inconsistent with history and with applications that don’t or can’t use the new document model."

More importantly, it's inconsistent with the way it works in Hollywood movies. Which means that someone in North Korea will not be pleased.

The old document model is primitive garbage; the new one an absolute bombsite. It's Apple trying to do everything by halves, glomming new functionality onto what's already there instead of starting over from scratch. Which is understandable: look how badly they got burned in the early 90s with Grand Plans like Taligent, and how successfully they incrementally evolved NeXTStep into OS X a few years later. But evolution has its own limitations: it can't make radical jumps, which is what they're trying to do here. It can only creep along, accumulating kludges that are "good enough".

First problem: hard disks. Modern geeks don't know their history: if they did, they'd know that using hard disks - aka secondary memory - is just an ugly, painful workaround for the physical shortcomings (high cost and volatility) of main memory. In an ideal world computers would contain enough fast, cheap, persistent main memory to hold all of the user's programs and data at once. This not being an ideal world, we have to hack it by endlessly shuffling chunks of our data between a small amount of fast, expensive, volatile main memory (when we're using it) and a large amount of slow, cheap, persistent secondary memory (when we're not). Soon as you do that, of course, you've got two copies of your data in play, which means you've got synchronization to consider.

Back when PCs were non-networked, single-user (and even single-program) devices the software could get away with ignoring the risk of races because the hardware's own limitations made those unlikely. But now our hardware has evolved into ubiquitously networked systems spanning multiple devices, programs, and users; and the software has done almost nothing to keep up. Pretending it knows how to replicate is not the same thing as doing it for real: the slightest twitch, and the whole fake edifice collapses to dust. And vendors claiming it's the fault of users/networks/cosmic radiation/Act of God is NO excuse for dropping data on the floor: it's entirely their fault for building these fault-intolerant systems in the first place.

Second problem: Open and Save. These two dialogs infest every document-based application ever devised, and they are the Devil's own work. The *only* reason they exist is because the original Mac was so heinously underpowered it was impossible to run more than one program at a time. Thus, users could run either Finder OR SimpleText, but not both at the same time. The correct architecture was for Finder to supply all document management services while SimpleText, MacPaint, etc. focused solely on document editing services, which is what the Apple Lisa (and, I'm guessing, Xerox Star) did. But since Mac hardware was too crap to keep Finder running at all times, they had to kludge a sort of bare-bones mini-Finder into every single document editing application instead: just enough to minimally manage the documents being worked on by that app, though not enough to eliminate the need for Finder for managing those documents more generally.

What an absolute mess. iOS tried to rein in that madness by handing over all document management responsibilities to the individual applications. This was good in principle as it eliminated all of the old ambiguity as to who is in charge of what. However, it fell down somewhat in practice because without any sort of standard IPC architecture there was no way for apps to share data with each other, thus restricting the user's ability to interact with any given document, since only the app that owned it could edit it. Things are slowly creeping forward with glommed-on IPC services such as Extensions, but it's all still very painful and there's no real sense of a guiding intelligence directing it all with a clear, logical vision as to how it should all ultimately work.

Third problem: the conflation of "document" and "file". Again, this is a consequence of history: early computers didn't have the power to provide more than a minimal abstraction between serialized bytes on disk and working bytes in RAM. But it's pathological nonsense, a lie that has long since outgrown any usefulness, and it needs to die in order that we can move on at last.

Consider: what is a "document"? If I give you something named `ReadMe.rtfd`, that's a document, yes? Is it a file? No. (It's a collection of associated files, packaged into a directory or compressed format for easy handling.) What about the document's own evolution? If I make some edits to that document, is not that history part of the document too? (Some apps such as Word kludge their own change tracking into the document structure itself, but that is hardly an efficient or reusable approach.) Likewise, what about documents shared and worked on by groups of people? Who can do what (permissions) and who did what where (multi-user change tracking) are part of that document's make-up too.

And where should a document live when it's not in main memory? A local hard drive is hardly an ideal choice: tied to a single hardware device, the document is vulnerable to loss and painful or impossible to access from any other device. Really, you want the document's data replicated across some subset of the larger network, partly for protection (redundancy), partly for efficiency (caching). That in turn demands replication, synchronization, and - as much as possible - conflict resolution be automated, because users have far better things to do than be the machine's own personal slave 24/7.

In other words, we can't fix any of these problems until everyone forgets what they think they know (because it's cargo cult nonsense), and stops waxing lyrical for the good old days (which are not only long gone, but never truly existed anyway). No, it's not a trivial problem to solve, but once you start thinking this way, suddenly a lot of ubiquitously horrific nightmares (e.g. "backups", "file sharing") completely vanish - and if that alone doesn't justify doing it then I don't know what does.


Anyway, back to Apple in particular... If Steve Jobs made one huge strategic mistake in rebuilding Apple, it was in failing to re-establish the Advanced Technology Group once the company was back on its financial feet and able to afford it again. While beancounters might consider such blue-sky brain trusts to be wasteful frivolities that contribute nothing themselves to the bottom line (e.g. look at Xerox PARC), freed from the need to deliver product today they can dedicate themselves to discovering the problems that future products will face, and come up with robust, elegant, unified technological solutions to those problems before they become critical, as well as plan the strategies that will migrate today's users and data to these new systems once the time is right.

Instead, we have this penny-ante amateur hour crap, where the clueless lead the blind into pitfall after pratfall. Layers of crap pile atop layers of crap, because those responsible fail to grasp that it's impossible to build good work on top of bad work: you have to rip out that bad work entirely and rebuild it as good before even trying to proceed further. Everyone is pissed, and rightly so because this stuff keeps promising to make our lives better and consistently lets us down instead.

The Old Ways need to go: that should be apparent to anyone. (Well, everyone except OCD make-work nerds who think diddling with one's file systems all day is a productive use of time. Needless to say, they're also the problem, not the solution.)

The question is: how can anyone create a New Way when nobody even knows what they're doing any more?

"My two choices are [LOSE CHANGES] and [LOSE OTHER CHANGES]. How is that a good choice?"

As the line goes in War Games: "The only winning move is to run Snowy. How about a nice game of chess?"

First problem: hard disks. Modern geeks don't know their history: if they did, they'd know that using hard disks - aka secondary memory - is just an ugly, painful workaround for the physical shortcomings (high cost and volatility) of main memory.

...I'll come in again. From []:

The main difference between The Machine and conventional computers is that HP’s design will use a single kind of memory for both temporary and long-term data storage. Existing computers store their operating systems, programs, and files on either a hard disk drive or a flash drive. To run a program or load a document, data must be retrieved from the hard drive and loaded into a form of memory, called RAM, that is much faster but can’t store data very densely or keep hold of it when the power is turned off.

HP plans to use a single kind of memory—in the form of memristors—for both long- and short-term data storage in The Machine. Not having to move data back and forth should deliver major power and time savings. Memristor memory also can retain data when powered off, should be faster than RAM, and promises to store more data than comparably sized hard drives today.

Huh, HP. Not as dead as we were led to believe...
