Tuesday, March 18, 2014

How Was the PDF Format Created?

Alan Tracey Wootton:

John Warnock had the idea that every document that was ever printed, or ever would be printed, could be represented in a document. This was not an unreasonable idea since PostScript was designed for this purpose and Adobe also had some code from Illustrator that would handle the fonts and graphics and code from Photoshop to display images. So, Warnock started a project (the Carousel project) on his own initiative to pursue his idea that eventually the whole Library of Congress could be represented in an archival electronic format.

[…]

Peter Hibberd had written a demo of an ‘object oriented file format’ so Richard Cohn and Alan Wootton went to work trying to adapt his work for use on the Carousel project. After many weeks of struggle it was decided that adapting his work was going to be more work than writing new code and that some of the ‘object oriented’ concepts were not applicable since it was finally becoming obvious that a key-value format was going to be part of the solution. This was the third file format.

[…]

The name ‘Acrobat’ was created by a market research team from back east.

Leonard Rosenthol:

Concurrent with the release of Adobe Acrobat & Reader 1.0, the specification was published. So while it was proprietary, it was also published and open to all to use (even the patents were made available on a free basis!) This is how open source tools such as Ghostscript and PDFlib have been able to support PDF for most of those 20 years.

John Warnock (PDF):

This document describes the base technology and ideas behind the project named “Camelot.” This project’s goal is to solve a fundamental problem that confronts today’s companies. The problem is concerned with our ability to communicate visual material between different computer applications and systems. The specific problem is that most programs print to a wide range of printers, but there is no universal way to communicate and view this printed information electronically.

[…]

In this example the new redefined “moveto” and “lineto” definitions don’t build a path. Instead they write out the coordinates they have been given and then write out the names of their own operations. The resulting file that is written by these new definitions draws the same polygon as the original file but only uses the “moveto” and “lineto” operators. Here, the execution of the PostScript file has allowed a derivative file to be generated. In some sense this derivative file is simpler and uses fewer operators than the original PostScript file but has the same net effect. We will call this operation of processing one PostScript file into another form of PostScript file “rebinding.”
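The rebinding idea can be sketched in a few lines of Python. This is only an illustration, not Warnock's code: the tiny stack interpreter, the `run` and `rebind` helpers, and the sample program are all hypothetical stand-ins for a real PostScript interpreter. The point is that "moveto" and "lineto" are redefined to write out their operands and their own names, so executing the original program produces a simpler derivative program.

```python
def run(source, operators):
    """Execute a whitespace-separated, PostScript-like token stream."""
    stack = []
    for token in source.split():
        if token in operators:
            operators[token](stack)      # known operator: execute it
        else:
            stack.append(float(token))   # anything else is a number
    return stack


def base_operators():
    """A few ordinary operators (a hypothetical, tiny subset)."""
    return {
        "add": lambda s: s.append(s.pop() + s.pop()),
        "mul": lambda s: s.append(s.pop() * s.pop()),
        "dup": lambda s: s.append(s[-1]),
    }


def rebind(source):
    """Run `source` with moveto/lineto redefined to emit text, not paths."""
    lines = []

    def emitting(name):
        def operator(stack):
            y = stack.pop()
            x = stack.pop()
            # Write the coordinates, then the operator's own name.
            lines.append(f"{x:g} {y:g} {name}")
        return operator

    operators = base_operators()
    operators["moveto"] = emitting("moveto")
    operators["lineto"] = emitting("lineto")
    run(source, operators)
    return "\n".join(lines)


# The original computes its coordinates; the derivative only states them.
original = "10 dup moveto 10 5 mul 10 lineto 25 25 add dup lineto"
derived = rebind(original)
print(derived)
```

Feeding `derived` back through `rebind` returns it unchanged, since it already uses nothing but literal coordinates and the two path operators.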

And, speaking of file format longevity, Kendall Whitehouse writes:

From its inception, PDF was, at least in part, a self-describing format. It specifies the filters used to encode its own data stream and, from the outset, Adobe’s Acrobat viewers were designed to interpret a PDF file through these filters. By changing the filter used to decode its own data, Acrobat was able to switch from a pure ASCII file to binary-encoded format. Acrobat Reader 1.0 could read the binary files created by the forthcoming Acrobat 2.0 products.
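As a rough sketch of what "self-describing" means here (not Acrobat's actual implementation), a reader can look up each filter named by the stream and apply the matching decoder in order. `ASCIIHexDecode` and `ASCII85Decode` are real PDF filter names; the `DECODERS` table, the `decode_stream` helper, and the plain Python dict standing in for a stream dictionary are assumptions made for illustration, and the hex decoder skips edge cases such as an odd number of digits.

```python
import base64


def ascii_hex_decode(data):
    """Decode hex digits terminated by '>' (simplified)."""
    text = data.decode("ascii").strip()
    if text.endswith(">"):
        text = text[:-1]
    return bytes.fromhex(text)


def ascii85_decode(data):
    """Decode Ascii85 data terminated by '~>'."""
    body = data.strip()
    if body.endswith(b"~>"):
        body = body[:-2]
    return base64.a85decode(body)


# Hypothetical lookup table: filter name -> decoder function.
DECODERS = {
    "ASCIIHexDecode": ascii_hex_decode,
    "ASCII85Decode": ascii85_decode,
}


def decode_stream(stream_dict, raw):
    """Apply the filters the stream names for itself, in listed order."""
    names = stream_dict.get("Filter", [])
    if isinstance(names, str):
        names = [names]
    for name in names:
        raw = DECODERS[name](raw)
    return raw


# Build a doubly encoded stream, then let its own dictionary decode it.
payload = b"BT /F1 12 Tf (Hello) Tj ET"
ascii85 = base64.a85encode(payload) + b"~>"
encoded = ascii85.hex().upper().encode("ascii") + b">"
stream_dict = {"Filter": ["ASCIIHexDecode", "ASCII85Decode"]}
decoded = decode_stream(stream_dict, encoded)
print(decoded)
```

Because the decoding steps are named in the file rather than assumed by the reader, a file can change its encoding without the reader changing how it finds the data.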

Jim King:

I have written a paper attempting to describe how Adobe managed the evolution of the PDF file format for over 15 years before turning its management over to ISO. […] This paper was derived from an internal Adobe technical note written by me and a task force of employees who studied the whole issue of versions and compatibility in 2006.

3 Comments

"And, speaking of file format longevity, Kendall Whitehouse writes:"

I remember back in the early 1990s, working on a Macintosh IIfx with ColorStudio, and seeing the vaporware promotional brochures from Adobe about what they intended with .pdf. I thought, if they could actually pull it off, it would be world dominance for them. And while I don't know how much they've been able to directly profit off .pdf, and I don't know how much it's helped the rest of the Adobe product line, I do know two things: the company is still around, and they really did deliver on what those seemingly vaporware brochures promised...

(Nice roundup, BTW, Michael.)

There sure were a lot of PDF competitors early on. I found a mention of a bunch of them in the still-online Internet Starter Kit for Macintosh here:

http://tidbits.com/iskm/iskm3html/pt4/ch25/ch25d.html

I recall using Farallon's Replica, whose icon was a bold italic sans serif "R" on a diamond, but I can barely find any information on it. WordPerfect Envoy at least has a Wikipedia page and some downloadable viewers. Even Apple wasn't immune — they had DocViewer they used for documentation, as well as the QuickDraw GX metafile format, PDD.

More recently, Microsoft came out with XPS, but I notice that in Office 2013 the button says "Create PDF/XPS" yet defaults to PDF. There's still DjVu, too, which occupies more of a niche.

Great summary, thanks!

@Nicholas Yeah, I remember preferring Common Ground in the early days, I think because the reader app worked better/faster. There’s also the DVI format.
