Monday, August 14, 2017

A Brief History of the UUID

Rick Branson (via Hacker News):

[Apollo’s] NCS introduced the concept of the UID (Universal IDentifier), which served as the unique primary identity for entities. UIDs are 64-bit numbers that combine a monotonic clock with a unique host ID permanently embedded in the hardware of all of their workstations. Under this scheme, identifiers could be generated thousands of times per second at each host and remain globally unique for all time with no scaling bottleneck. The only point of coordination was at Apollo’s factories — where the machines were permanently branded with their respective identifiers.


NCA introduced UUIDs, which built on the UID design, but accommodated a broader range of vendors by extending the number space to 128 bits. Thus the UUID was born. This concept was so useful that even after NCA became a distant memory and RPC fell out of fashion, the UUID remained popular, eventually being standardized by ISO, IETF, and ITU.


As the untrustworthy Internet became the dominant networking platform, UUID generation which depended on trust became obsolete. All of these concerns have led most to abandon leveraging hardware identifiers in UUIDs.


Later on Paul went on to Microsoft, and I’m fairly certain that it was due to Paul that Microsoft adopted the OSF DCE RPC layer for its internal use, and UUID’s started being used extensively inside Microsoft. UUID’s also got used in Intel’s EFI specification for the GPT partition table, although somewhere along the way they got renamed “Globally Unique ID’s” --- it’s the same spec, though.


As far as uuidd is concerned, the reason why it exists is because a certain very large Enterprise Resource Planning system was using libuuid to generate uuid’s for its objects, and it needed to create them very, very quickly so they could initialize the customer’s ERP database in finite time. They were also using the time-based UUID’s, with the UUID stored in the database with the bytes cleverly rearranged so the time bits would be stored in the LSB, and the Ethernet MAC address would be in the MSB, so that a database using a B-tree (plus prefix key compression) for its indexing would be able to very efficiently index the UUID’s. This is similar to the k-ordering trick that Flake was using, but this very large enterprise planning company was doing it in 2007, five years before the team at Boundary came up with Flake, and they were doing it using standard UUID’s, but simply storing the Time-based UUID bytes in a different order. (I believe they were also simply storing the ID in binary form, instead of base-62 encoding, since if you’re going to have jillions of objects in your ERP database, you want them to be as efficient as possible.)
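
The byte rearrangement described above can be sketched in a few lines of Python. This is only an illustration: the quote does not give the exact layout the ERP vendor used, so the field order below (node, clock sequence, then the timestamp from most to least significant) is an assumption, and the function names `to_index_key` and `from_index_key` are hypothetical.

```python
import uuid


def to_index_key(u: uuid.UUID) -> bytes:
    """Reorder a time-based UUID so the MAC comes first and time bits come last.

    Standard layout: time_low(4) time_mid(2) time_hi_and_version(2)
                     clock_seq(2) node(6)
    Assumed key layout: node, clock_seq, time_hi_and_version, time_mid, time_low
    """
    b = u.bytes
    time_low, time_mid, time_hi_version = b[0:4], b[4:6], b[6:8]
    clock_seq, node = b[8:10], b[10:16]
    return node + clock_seq + time_hi_version + time_mid + time_low


def from_index_key(key: bytes) -> uuid.UUID:
    """Invert to_index_key and recover the standard UUID."""
    node, clock_seq = key[0:6], key[6:8]
    time_hi_version, time_mid, time_low = key[8:10], key[10:12], key[12:16]
    return uuid.UUID(bytes=time_low + time_mid + time_hi_version + clock_seq + node)
```

With this ordering, keys generated on one host share a long common prefix (good for prefix compression) and sort by creation time within that host, which the standard byte order does not do because the fastest-changing bits of the timestamp come first.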

Anyway, a certain Linux distribution company contacted me on behalf of this very large Enterprise Resource Planning company, and we came up with a scheme where the uuidd daemon could issue blocks of time-based UUID’s to clients, so we could amortize the UUID generation over blocks of 50 or 100 UUID’s at a time. (This ERP was generating a huge number of UUID’s.) I did it as a freebie, because I was tickled pink that libuuid was such a critical part of a large ERP system, and it wasn’t that hard to implement the uuidd extension to libuuid.
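
The block-issuing idea can be sketched as follows. This is not uuidd's actual protocol or code; it is a minimal in-process model, assuming a hypothetical `BlockAllocator` class that reserves a run of consecutive 100-nanosecond timestamp ticks under a single lock and then builds one time-based UUID per tick. The fixed clock sequence and the caller-supplied node value are also assumptions made for the example.

```python
import threading
import time
import uuid

# Number of 100-ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01.
GREGORIAN_OFFSET = 0x01B21DD213814000


class BlockAllocator:
    """Hands out blocks of time-based UUIDs with one lock acquisition per block."""

    def __init__(self, node: int, clock_seq: int = 0):
        self._node = node
        self._clock_seq = clock_seq & 0x3FFF
        self._next_tick = 0
        self._lock = threading.Lock()

    def reserve(self, count: int) -> int:
        """Reserve `count` consecutive ticks and return the first one."""
        if count < 1:
            raise ValueError("count must be at least 1")
        with self._lock:
            now = time.time_ns() // 100 + GREGORIAN_OFFSET
            # Never reuse a tick, even if the clock has not advanced far enough.
            first = max(now, self._next_tick)
            self._next_tick = first + count
        return first

    def uuid_for_tick(self, tick: int) -> uuid.UUID:
        """Build a version 1 UUID from a 60-bit timestamp tick."""
        return uuid.UUID(fields=(
            tick & 0xFFFFFFFF,                 # time_low
            (tick >> 32) & 0xFFFF,             # time_mid
            ((tick >> 48) & 0x0FFF) | 0x1000,  # time_hi with version 1
            (self._clock_seq >> 8) | 0x80,     # clock_seq_hi with RFC 4122 variant
            self._clock_seq & 0xFF,            # clock_seq_low
            self._node,
        ))

    def issue_block(self, count: int) -> list:
        first = self.reserve(count)
        return [self.uuid_for_tick(first + i) for i in range(count)]
```

The point of the design is that the expensive, serialized step (reading the clock and updating shared state) happens once per block of 50 or 100 identifiers rather than once per identifier; the caller fills in the rest of the block with plain arithmetic.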
