Robin Harris argues, unconvincingly in my opinion:
Therefore, by a process of elimination, Glacier must be using optical disks. Not just any optical discs, but 3 layer Blu-ray discs.
Not single discs either, but something like the otherwise inexplicable Panasonic 12 disc cartridge shown at this year’s Creative Storage conference. That’s 1.2TB in a small, stable cartridge with RAID so a disc can fail and the data can still be read. And since the discs weigh ≈16 grams, 12 weigh 192g.
For several years I didn’t see how optical disk technology could survive without consumer support. But its use by major cloud services explains its continued existence.
sintaks (August 22, 2012):
Former S3 employee here. I was on my way out of the company just after the storage engineering work was completed, before they had finalized the API design and pricing structure, so my POV may be slightly out of date, but I will say this: they’re out to replace tape. No more custom build-outs with temperature-controlled rooms of tapes and robots and costly tech support.
I’m not sure how much detail I can go into, but I will say that they’ve contracted a major hardware manufacturer to create custom low-RPM (and therefore low-power) hard drives that can programmatically be spun down. These custom HDs are put in custom racks with custom logic boards all designed to be very low-power. The upper limit of how much I/O they can perform is surprisingly low - only so many drives can be spun up to full speed on a given rack. I’m not sure how they stripe their data, so the perceived throughput may be higher based on parallel retrievals across racks, but if they’re using the same erasure coding strategy that S3 uses, and writing those fragments sequentially, it doesn’t matter - you’ll still have to wait for the last usable fragment to be read.
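The last point, that a retrieval finishes only when the last usable fragment has been read, can be sketched in a few lines. This is an illustrative model only, not Amazon's implementation: it assumes a generic k-of-n erasure code where any k of the n fragments are enough to rebuild the object, and the function name and the example timings are hypothetical.

```python
def retrieval_time(fragment_ready_hours, k):
    """Return the time at which an object becomes readable.

    Assumption: a k-of-n erasure code, so the object can be rebuilt
    once any k fragments have been read. The retrieval therefore
    completes at the k-th smallest fragment ready time.
    """
    n = len(fragment_ready_hours)
    if k < 1 or k > n:
        raise ValueError("k must be between 1 and the number of fragments")
    return sorted(fragment_ready_hours)[k - 1]


# Hypothetical ready times (in hours) for six fragments on six racks,
# each delayed by how long its drive waits for a spin-up slot.
ready = [0.5, 3.0, 1.0, 2.5, 4.0, 1.5]
print(retrieval_time(ready, k=4))  # the 4th fragment to arrive sets the latency
```

Spreading fragments across more racks raises the chance that some arrive early, but the request as a whole is still gated by the slowest of the k fragments it needs.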
The author quickly dismisses hard drives because at the time of the Glacier launch SMR drives were too expensive because of the Thai flood. But after a few years of running S3 and EC2, Amazon must have tons of left-over hard drives which are now simply too old for a 24/7 service.
So what do you do with those three-year-old 1 TB hard drives where the power-consumption-to-space ratio is not good enough anymore? You can of course destroy them. Or you actually do build a disk drive robot, fill the disk with Glacier data, simply spin it down and store it away. Zero cost to buy the drives, zero cost for power consumption. Then add a 3-4 hour retrieval delay to ensure that those old disks don’t have to spin up more than 6-8 times a day, even in the worst case.
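The arithmetic behind the 6-8 figure is simply 24 hours divided by the retrieval delay, assuming that all requests for a drive arriving within one delay window are served by a single spin-up. A minimal sketch of that batching idea follows; the function names and the request data are hypothetical.

```python
import math


def max_spin_ups_per_day(window_hours):
    """Worst-case spin-ups per drive per day with one spin-up per window."""
    return math.ceil(24 / window_hours)


def spin_ups(requests, window_hours):
    """Count spin-ups per drive when requests are batched into windows.

    requests: iterable of (time_in_hours, drive_id) pairs.
    All requests for the same drive in the same window share one spin-up.
    """
    windows = {(drive, int(t // window_hours)) for t, drive in requests}
    counts = {}
    for drive, _ in windows:
        counts[drive] = counts.get(drive, 0) + 1
    return counts


print(max_spin_ups_per_day(3))  # 8
print(max_spin_ups_per_day(4))  # 6

# Four requests for drive "a" and one for drive "b", batched in 3-hour windows.
reqs = [(0.1, "a"), (0.5, "a"), (2.9, "a"), (3.1, "a"), (3.2, "b")]
print(spin_ups(reqs, 3))  # drive "a" spins up twice, drive "b" once
```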
I worked at AWS. OP flatters AWS by arguing that they take care to make money and assuming that they are developing advanced technologies. That’s not how Amazon works. Glacier is S3, with code added to S3 that waits. That is all they needed to do. A second or third iteration could be something else, but this is what Glacier is now.
I am an AWS engineer, but note that I am not affiliated with Glacier. However, James Hamilton did an absolutely amazing Principals of Amazon talk a couple of years ago going into some detail on this topic. Highly recommended viewing for Amazonians.
From what I remember of it, it’s custom HDs, custom racks, custom logic boards with custom power supplies. The system trades performance for durability and energy efficiency.
Having a robot juggling the hard drives would not make that much sense. The reason why we have optical disc and tape robots is that the tapes and discs need a separate device that reads and writes them. With hard drives there’s no such need.
With hard drives it would make more sense to do some development on the electronics side and build a system where lots of drives can be simultaneously connected to a small controller computer. Not all of the HDs need to be powered on or accessible all the time; the controller could turn on only a few of them at a time. And of course some of the controllers could also normally be powered off, once all the hard drives connected to them are filled.
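As a rough sketch of what such a controller might do, the following keeps at most a fixed number of attached drives powered at once and powers down the least recently used drive when the limit is reached. Everything here is an assumption made for illustration: the class name, the eviction policy and the power limit are hypothetical and do not describe any real Glacier hardware.

```python
from collections import OrderedDict


class DriveController:
    """Toy controller that keeps at most max_powered drives spun up."""

    def __init__(self, drive_ids, max_powered):
        if max_powered < 1:
            raise ValueError("max_powered must be at least 1")
        self.drive_ids = set(drive_ids)
        self.max_powered = max_powered
        self.powered = OrderedDict()  # powered drives, least recently used first

    def access(self, drive_id):
        """Power on drive_id if needed; return the drive powered off, if any."""
        if drive_id not in self.drive_ids:
            raise KeyError(drive_id)
        if drive_id in self.powered:
            self.powered.move_to_end(drive_id)  # already spinning, mark as recent
            return None
        evicted = None
        if len(self.powered) >= self.max_powered:
            # Power budget reached: spin down the least recently used drive.
            evicted, _ = self.powered.popitem(last=False)
        self.powered[drive_id] = None
        return evicted


controller = DriveController(drive_ids=range(8), max_powered=2)
controller.access(0)
controller.access(1)
controller.access(0)
print(controller.access(2))      # 1 is spun down to make room
print(list(controller.powered))  # [0, 2]
```

A real design would also have to account for spin-up time and wear per start, which is why batching requests before powering a drive on matters as much as the power limit itself.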