Tuesday, April 26, 2016

Dropbox’s Project Infinite

Yesterday, I wrote about the superiority of BitTorrent Sync’s Selective Sync over Dropbox’s. Today, Dropbox gave a technology preview (via Mitchel Broussard, Hacker News):

Project Infinite will enable users to seamlessly and securely access all their Dropbox files from the desktop, regardless of how much space they have available on their hard drives. Everything in the company’s Dropbox that you’re given access to, whether it’s stored locally or in the cloud, will show up in Dropbox on your desktop. If it’s synced locally, you’ll see the familiar green checkmark, while everything else will have a new cloud icon.

It’s not clear whether this will be limited to business customers.


This feature is implemented at a low level, and works on the command line.

For example, if you have a directory that is stored entirely in the cloud, you can cd to it without any network delay, you can do ls -lh and see a list with real sizes without a delay (e.g., see that an ISO is 650 MB), and you can do du -sh and see that all the files are taking up zero space.

If you open a file in that directory, it will open, even from the command line. Then you can do du -sh and see that that file is now taking up space, while all the others in the directory are not.
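The difference between what ls -lh and du -sh report comes down to a file's logical size versus the disk blocks actually allocated to it. The preview does not say how Dropbox implements its placeholders, so the sketch below uses an ordinary sparse file as a stand-in (an assumption, not Dropbox's mechanism) to show the two numbers diverging. The function names are hypothetical, and st_blocks is only available on Unix-like systems.

```python
import os
import tempfile


def apparent_and_allocated(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    apparent_bytes is the logical size that `ls -l` reports.
    allocated_bytes is the on-disk usage that `du` reports,
    derived from st_blocks (counted in 512-byte units on Unix).
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def make_sparse_file(path, size):
    """Create a file with logical length `size` without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        iso = os.path.join(tmp, "example.iso")  # hypothetical file name
        make_sparse_file(iso, 650 * 1024 * 1024)
        apparent, allocated = apparent_and_allocated(iso)
        print(f"apparent:  {apparent} bytes")
        print(f"allocated: {allocated} bytes")
```

On a filesystem that supports sparse files, the allocated figure should stay far below the apparent one until data is written. On one that does not, the two will be roughly equal.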

Dan Luu (in 2015):

Something that’s occasionally overlooked is that hardware performance also has profound implications for system design and architecture. […] Consider the latency of a disk seek (10ms) vs. the latency of a round-trip within the same datacenter (.5ms). The round-trip latency is so much lower than the seek time of a disk that we can disaggregate storage and distribute it anywhere in the datacenter without noticeable performance degradation, giving applications the appearance of having infinite disk space without any appreciable change in performance. This fact was behind the rise of distributed filesystems within the datacenter ten years ago, and various networked attached storage schemes long before.


However, while it’s easy to say that we should use disaggregated disk because the ratio of network latency to disk latency has changed, it’s not as easy as just taking any old system and throwing it on a fast network. If we take a 2005-era distributed filesystem or distributed database and throw it on top of a fast network, it won’t really take advantage of the network. That 2005 system is going to have assumptions like the idea that it’s fine for an operation to take 500ns, because how much can 500ns matter? But it matters a lot when your round-trip network latency is only a few times more than that. The caching and other obvious wins if you have 1ms latency may not buy you much at 10us latency, and it may even cost you something.

Latency hasn’t just gone down in the datacenter. Today, I get about 2ms to 3ms latency to Youtube. Youtube, Netflix, and a lot of other services put a very large number of boxes close to consumers to provide high-bandwidth low-latency connections. A side effect of this is that any company that owns one of these services has the capability of providing consumers with infinite disk that’s only slightly slower than normal disk.
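The arithmetic behind the quoted figures can be worked through as a rough sketch. It uses only the numbers given in the passage (10ms seek, 0.5ms round trip, a 500ns operation, and 1ms versus 10us network latency). The helper name is hypothetical, and the model is deliberately simplistic: it just expresses one delay as a fraction of another.

```python
# Figures quoted in the passage, expressed in nanoseconds.
DISK_SEEK_NS = 10_000_000      # 10 ms disk seek
DATACENTER_RTT_NS = 500_000    # 0.5 ms round trip within a datacenter
FIXED_OP_NS = 500              # the "how much can 500ns matter?" operation
SLOW_NETWORK_NS = 1_000_000    # 1 ms network latency
FAST_NETWORK_NS = 10_000       # 10 us network latency


def relative_overhead(extra_ns, baseline_ns):
    """Express an extra delay as a fraction of a baseline delay."""
    return extra_ns / baseline_ns


if __name__ == "__main__":
    # Cost of putting the disk across the datacenter network.
    print(f"round trip vs. seek:    {relative_overhead(DATACENTER_RTT_NS, DISK_SEEK_NS):.2%}")
    # Cost of a fixed 500ns operation on a slow and on a fast network.
    print(f"500ns vs. 1ms latency:  {relative_overhead(FIXED_OP_NS, SLOW_NETWORK_NS):.2%}")
    print(f"500ns vs. 10us latency: {relative_overhead(FIXED_OP_NS, FAST_NETWORK_NS):.2%}")
```

Under these assumptions the round trip adds about 5% to a disk seek, which is why remote disk can look local. The same 500ns operation is negligible next to 1ms of latency but becomes a visible share of a 10us round trip.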

Dan Luu:

We’ll see how well the implementation works, but when Dropbox was first released people said, “who cares, it’s just rsync”.

4 Comments

> We’ll see how well the implementation works, but when Dropbox was first released people said, “who cares, it’s just rsync”.

and we're still saying that in 2016

"Today, I get about 2ms to 3ms latency to Youtube"

I always wonder where people are finding this type of stable/reliable/low-latency internet connection. Even if you're fortunate enough to have one, don't assume all your customers have the same thing. Latency to YouTube and Apple is between 20 and 30ms for me, using a fiber-to-the-home service (AT&T).

Also, let's not forget ISP interference. A few years back when I had Time Warner Cable, one of their local transit nodes would just happen to fail every Saturday night, rendering Netflix unusable. My current AT&T gigabit connection still encounters buffering on YouTube, and downloads from Apple start at 300+Mbps, then often slow down to ~10Mbps a few seconds later for the duration of the download (sometimes even <1Mbps!). Again, this is AT&T's gigabit, fiber-to-the-home connection, and I regularly get sustained downloads below 10Mbps. Last time I had to update Xcode, it was faster to download the DMG through Tor than download it directly.

So all of this is to say, if Project Infinite ever becomes popular, I wouldn't be surprised if customers start seeing 'mysterious' slowdowns from certain ISPs. Oh, and don't forget the latest push for wireline data caps...

Just for fun, I did an experiment to test this with Dropbox as well:

I uploaded a 3MB video file to Dropbox via the web interface on my AT&T Gigapower fiber connection, rated 1 Gbps for upload. Took over 5 minutes, upload speed was around 10-50KB/s the whole time as measured via Activity Monitor. Then I logged into the web interface via Tor, and it uploaded at 3MB/s, so essentially within a second or two. I repeated this test multiple times, and the result was the same: Uploading files to Dropbox via Tor was about two orders of magnitude faster than uploading through a direct connection. Make of that what you will, but I think it's time to cancel my service...

[…] Previously: Dropbox’s Project Infinite. […]
