Archive for June 17, 2010

Thursday, June 17, 2010

Interpolation Search

Jay from LinkedIn’s SNA team describes interpolation search, which is like binary search except that it looks at the values of the elements so that it can do better than guessing the middle each time (via Alan Odgaard):

The data structure for these uses a large sorted index file to do lookups, what is stored in this file is an MD5 of the key. Since the MD5s are used for the sort, the file is guaranteed to be uniformly distributed over the key space, and can often be many GBs in size. These files are memory mapped to help reduce the cost of a read, but the improved search algorithm can help to greatly reduce the number of seeks when the index is not fully memory resident. A disk seek comes at a price of around 10ms, so saving even one or two is a huge performance win.

WWDC 2010 Videos

Apple has posted the session videos for the 2010 Worldwide Developers Conference (via Michael Jurewitz). Incredibly, they are available less than a week after the end of the conference, and they’re free for registered developers. In 2009, it was more than a month before the videos were available (the fastest to that point, as I recall), and Apple charged $500.