Friday, February 3, 2012

Git at Facebook Scale

Joshua Redstone (via Hacker News, which has some interesting comments):

The test repo has 4 million commits, linear history and about 1.3 million files. The size of the .git directory is about 15GB, and has been repacked with 'git repack -a -d -f --max-pack-size=10g --depth=100 --window=250'. This repack took about 2 days on a beefy machine (i.e., lots of RAM and flash). The size of the index file is 191 MB.

Sam Vilain:

With the hard-working part of git on the other end of a network service, you could back it by a re-implementation of git which is written to be distributed in Hadoop. There are at least two similar implementations of git that are like this: one for Cassandra which was written by GitHub as a research project, and Google's implementation on top of their BigTable/GFS/whatever. As the git object storage model is write-only and content-addressed, it should fit this kind of scaling well.
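To make the content-addressing point concrete, here is a minimal Python sketch. The `git_blob_id` function follows Git's documented blob hashing (SHA-1 over a `blob <size>\0` header plus the content). The `ShardedObjectStore` class, its name, and its modulo-based shard choice are hypothetical illustrations of why such a store partitions easily; they are not how the Cassandra or Google implementations mentioned above actually work.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the ID Git assigns to a blob with this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


class ShardedObjectStore:
    """Hypothetical toy store: objects are spread over shards by their ID."""

    def __init__(self, num_shards: int = 4):
        # Each dict stands in for a separate storage node (an assumption).
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_for(self, object_id: str) -> dict:
        # The ID alone decides the shard, so no central lookup is needed.
        return self.shards[int(object_id[:8], 16) % len(self.shards)]

    def put(self, content: bytes) -> str:
        object_id = git_blob_id(content)
        # Objects are never modified: identical content maps to the same
        # key, so a repeated write is a no-op.
        self._shard_for(object_id).setdefault(object_id, content)
        return object_id

    def get(self, object_id: str) -> bytes:
        return self._shard_for(object_id)[object_id]


store = ShardedObjectStore()
first = store.put(b"hello world\n")
second = store.put(b"hello world\n")  # same content, same ID, stored once
print(first, first == second)
```

Because the key is derived purely from the content and existing objects never change, any node can compute where an object lives, and writers never need to coordinate updates to the same key.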
