Wednesday, April 16, 2014

WinFS, Integrated/Unified Storage, and Microsoft

Hal Berenson:

People have been bugging me to write about Integrated Storage for some time, and with Bill Gates having just disclosed that failure to ship WinFS was his biggest product regret, now seemed like a good time. In Part 1 I’ll give a little introduction and talk about scenarios and why you’d want an Integrated (also referred to as unified) Store.

[…]

You can solve many of the problems I described for photos by putting an external metadata layer on top of the file system and using an application or library to interact with the photos instead of interacting directly with the file system. And that is exactly how it is done without integrated storage. This causes problems of its own as applications typically won’t understand the layer and operate just on the filesystem underneath it. That can make functionality that the layer purports to provide unreliable (e.g., when the application changes something about the photo which is not accurately propagated back into the external metadata store). And with photos now stored in a data type-specific layer it is ever more difficult to implement scenarios or applications in which photos are but one data type.

Hal Berenson:

So from the earliest discussions I recall Integrated Storage was always a new, Win32-compatible, file system. Accessing new functionality would be done by a new API, but you always had to be able to expose traditional file artifacts in a way that a legacy Win32 app could manipulate them. Double-click on a photo in an Integrated Storage-based Windows Explorer and it had to be able to launch a copy of Photoshop that didn’t know about Integrated Storage. And since that version of Photoshop didn’t know about Integrated Storage it also couldn’t update metadata in the store; it could only make changes to the properties inside the JPEG file. So when it closed the file Integrated Storage had to look inside the file and promote any JPEG properties that had been changed into the external metadata it maintained about the object.

Much of the complexity of Microsoft’s attempts at delivering Integrated Storage is owed to all this legacy support. Property promotion and demotion (e.g., if you changed something in the external metadata it might have to be pushed down into the legacy file format) was one nightmare that wasn’t a conceptual requirement of Integrated Storage but was a practical one. Dealing with Win32 file access details was another.
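To make promotion and demotion concrete, here is a minimal sketch in Python. It is not how WinFS worked internally; the `LegacyFile` and `StoreItem` classes and their methods are hypothetical names invented for illustration, and a plain dictionary stands in for the properties embedded in a JPEG.

```python
from dataclasses import dataclass, field


@dataclass
class LegacyFile:
    """Hypothetical stand-in for a JPEG whose properties live inside the file."""
    embedded: dict = field(default_factory=dict)


@dataclass
class StoreItem:
    """Hypothetical stand-in for the store's external record about one file."""
    file: LegacyFile
    metadata: dict = field(default_factory=dict)

    def promote(self):
        """Copy embedded properties that differ from the store up into the metadata."""
        changed = {k: v for k, v in self.file.embedded.items()
                   if self.metadata.get(k) != v}
        self.metadata.update(changed)
        return changed

    def demote(self, key, value):
        """Apply a store-side change and push it down into the legacy file."""
        self.metadata[key] = value
        self.file.embedded[key] = value


photo = LegacyFile({"title": "Beach", "rating": 3})
item = StoreItem(photo)
item.promote()                              # initial import into the store

photo.embedded["title"] = "Beach at dusk"   # a legacy app edits the file directly
promoted = item.promote()                   # run when the legacy app closes the file

item.demote("rating", 5)                    # a store-aware app changes the rating
```

Even in this toy version the store has to compare every embedded property on close, and every store-side edit has to be written twice. A real implementation would also have to parse each legacy file format and cope with the two sides disagreeing.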

Hal Berenson:

At Microsoft you can see numerous ways that the File System team tried to accommodate greater richness in the file system without perverting the core file system concepts. For example, the need for making metadata dynamic or adding some of the things that the Semi-Structured Storage world needs was met by adding a secondary stream capability to files.

[…]

The notion of a Property Bag seems easy enough and painless enough to understand, but it clashes with the world of Structured Storage. How does arbitrary definition of metadata clash with a world in which schema evolution is (mostly) tightly controlled? Do you add a column to a table every time someone specifies a new property? If two people create properties with the same name are they the same property? If a table with thousands of columns, all of which are Null 99.99% of the time, seems unwieldy then what is an alternate storage structure? And can you make it perform?
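One commonly used alternative to the very wide, mostly-Null table is an entity-attribute-value layout, with one row per property that is actually set. The sketch below uses Python's built-in `sqlite3` module purely for illustration; the `item_property` table, its columns, and the helper functions are assumptions, not anything taken from WinFS or SQL Server. Qualifying each property name with a namespace is one possible answer to the "same name, same property?" question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per (item, namespace, name): no column is added when a new
# property appears, and unset properties simply have no row.
conn.execute("""
    CREATE TABLE item_property (
        item_id   INTEGER NOT NULL,
        namespace TEXT    NOT NULL,
        name      TEXT    NOT NULL,
        value     TEXT,
        PRIMARY KEY (item_id, namespace, name)
    )
""")


def set_property(item_id, namespace, name, value):
    """Create or overwrite a single property on an item."""
    conn.execute(
        "INSERT OR REPLACE INTO item_property VALUES (?, ?, ?, ?)",
        (item_id, namespace, name, value),
    )


def get_properties(item_id):
    """Return all properties of an item keyed by (namespace, name)."""
    rows = conn.execute(
        "SELECT namespace, name, value FROM item_property"
        " WHERE item_id = ? ORDER BY namespace, name",
        (item_id,),
    )
    return {(ns, n): v for ns, n, v in rows}


# Two people define a property called "rating"; the namespace keeps them apart.
set_property(1, "alice", "rating", "5")
set_property(1, "bob", "rating", "good")
set_property(2, "alice", "location", "Seattle")

props = get_properties(1)
```

The trade-off is that values lose their column types and any query that filters on several properties at once needs a join or pivot per property, which is where the "can you make it perform?" question comes back.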

What was different about WinFS is that most of these barriers, including the organization structure, were addressed. And the failure to deliver an Integrated Storage File System when the conditions were as close to ideal as they’ll ever be is why the concept will probably never be realized. Meanwhile the world of storage has moved on in interesting ways.

Hal Berenson:

Because I was new to Microsoft (and thus could be objective) I was asked to intervene in a spat between the Exchange team (working on the first version of Exchange Server, nee Exchange 4.0) and the JET-Blue database engine over the performance of the Mailbox Store. What I learned along the way was that the intent was for Exchange Server to be built on OFS, but since OFS wasn’t ready Exchange was doing its own interim store for Exchange 4.0. The plan of record was for the second version of Exchange to move to OFS. However, in an email discussing the performance of the existing mailbox store the Exchange General Manager mentioned that he didn’t think Exchange would ever move to OFS. While the OFS project was still alive, it was clear to me that everyone in the company had already written it off.

[…]

Longhorn itself turned out to be too aggressive an effort and have too many dependencies. For example, if the new Windows Shell was built on WinFS and the .NET CLR, and WinFS itself was built on the CLR, and the CLR was a new technology itself that needed a lot of work to function “inside” Windows, then how could you develop all three concurrently? One story I heard was that when it became clear that Longhorn was failing and they initiated the reset it started with removing CLR. Then everyone was told to take a look at the impact of that move and what they could deliver without CLR by a specified date. WinFS had bet so heavily on CLR that it couldn’t rewrite around its removal in time and so WinFS was dropped from Longhorn as well.

The WinFS project continued with the thought that it would initially ship asynchronous to a Windows release before being incorporated into a future one. But now it had two problems. First, it was back to the problem of having no Microsoft internal client that was committed to use it. And second, they eventually concluded that there was no chance in the foreseeable future of shipping WinFS in a release of Windows. With the move of Steven Sinofsky, who had been a critic of WinFS, to run Windows that conclusion was confirmed. WinFS was dead.

1 Comment

If WinFS ever does get resurrected, it should probably be made at least partially open-source, with the closed-source components being heavily documented. With decent documentation and/or advertising, developers would be more inclined to actually adopt and utilize it.

Similarly, Microsoft shouldn't try to deprecate Transactional NTFS, but instead try to work it into every single one of its products (like Office, Notepad, Windows Explorer, Microsoft Update, etc.) and provide as much documentation as possible.

In the case of modern versions of Windows, you could probably integrate WinFS with UWP or the Windows Store, make a .NET Core version, etc. There has never been a better time to try bringing back WinFS, especially if it made use of Transactional NTFS.
