Thursday, January 30, 2020

Dropbox Ignore Feature in Beta

Dropbox (via Hacker News):

You can set a file or folder to be “ignored” by Dropbox. This allows you to organize files and folders in the Dropbox folder on your computer without storing them on dropbox.com or on the Dropbox server at all.

[…]

  1. Open the Terminal application on your computer.
  2. Type
    xattr -w com.dropbox.ignored 1
    
  3. Type the location of the file next to that.
    • You can also drag and drop the file or folder that you want to ignore from your file browser into the Terminal and it will populate with the location of the file.
    • It should look something like this:
      xattr -w com.dropbox.ignored 1 /Users/yourname/Dropbox\ \(Personal\)/YourFileName.pdf
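The steps above boil down to one command per file. What follows is a minimal shell sketch, assuming the stock macOS `xattr` tool. Only the `-w` form comes from the quoted instructions; the helper names (`dropbox_ignore`, `dropbox_unignore`, `dropbox_is_ignored`), the `DRY_RUN` switch, and the use of `xattr -d` and `xattr -p` to clear and inspect the flag are illustrative assumptions.

```shell
# Attribute name taken from the quoted Dropbox instructions.
ATTR="com.dropbox.ignored"

# Run a command, or just print it when DRY_RUN=1 (hypothetical preview switch).
# The printed form is for reading only; it does not re-quote paths with spaces.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Mark a file or folder as ignored (same command as in the steps above).
dropbox_ignore() { run xattr -w "$ATTR" 1 "$1"; }

# Assumption: deleting the attribute clears the ignored mark again.
dropbox_unignore() { run xattr -d "$ATTR" "$1"; }

# Succeed if the attribute is present on the given path.
dropbox_is_ignored() { xattr -p "$ATTR" "$1" >/dev/null 2>&1; }

# Example: preview the command without touching any file.
DRY_RUN=1 dropbox_ignore "$HOME/Dropbox/YourFileName.pdf"
```

Quoting the path inside the function avoids the manual backslash escaping shown in the example path above.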

It’s hard to believe that Dropbox has lacked this feature for 12 years.

I think it would have been better to use a .dropboxignore file rather than an xattr.

Update (2020-02-04): Dave Wood:

Also, this would make the process the same regardless of platform.

14 Comments

Former Dropbox engineer here. Excluding files by pattern seems like a nice idea, but it really causes problems in shared folders and even in non-shared folders. If I exclude a file by pattern and then the file is renamed, does it suddenly start syncing? What happens if I have a pattern and someone (not you) adds a file to a shared folder that matches the pattern? Does it sync to your computer? If so and you edit it, are changes synced back? Pattern exclusion is fine for a source control repository, where the path structure is really rigid and all the users are programmers, but it doesn't work so well for a lot of Dropbox customers who just want a folder that syncs and "does the right thing" when you reorganize your files.

With this new feature, I'm curious what happens if you sync-ignore a file at path ~/Dropbox/foo.txt and then add a file foo.txt to your Dropbox on the web or via another client. I'm guessing that your sync-ignored file will get renamed to something like ~/Dropbox/foo (ignored conflict).txt

@Nick I guess I don’t see this as being that different from version control, so long as the pattern exclusion only applies at the level of the current folder. So everyone with the shared folder would also have the shared ignore file. The server could understand the exclusions and report an error if you upload something that’s ignored. If there’s a race with adding a file and updating the ignore file, you would get a conflict file of some kind. Yes, there are edge cases, but I don’t think the xattr design “just does the right thing,” either.

The difference between Dropbox and a VCS repository is the use case. In a VCS, the path structure is very rigid and durable changes are typically manually reviewed. Dropbox is for documents, photos, and other files. The structure is very fluid and people rename and reorganize files all the time.

Setting all that aside, though, I think the engineering cost of path-based exclusion would be very high relative to the customer benefit. It would be used by relatively few customers but would require a lot of effort and testing to get right. Basically, the behavior would require that the syncing behavior of some files depend on the contents of other files. This logic would have to be handled on both client and server and would have to work appropriately across the desktop client and API clients (web, mobile, and API). By comparison, the "dropbox ignore" feature described in the help article only needs to be implemented in the desktop client and the logic is very straightforward: if the file has this attribute, remove it from the changelist before you sync from the server (plus dealing with naming conflicts, which is the more difficult part).
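To make the "remove it from the changelist" rule concrete, here is a toy shell sketch. It is not Dropbox's code: the function names are hypothetical, the changelist is modeled as one path per line on stdin, and the naming-conflict handling mentioned above is left out entirely.

```shell
# Assumption: on macOS, the flag can be detected by reading the attribute.
is_ignored() { xattr -p com.dropbox.ignored "$1" >/dev/null 2>&1; }

# filter_changelist [PREDICATE]: read paths on stdin and print only those
# for which PREDICATE fails, i.e. the paths that should still be synced.
# PREDICATE defaults to is_ignored; passing another function makes it testable.
filter_changelist() {
  pred=${1:-is_ignored}
  while IFS= read -r path; do
    "$pred" "$path" || printf '%s\n' "$path"
  done
}

# Usage (one path per line in, surviving paths out):
#   printf '%s\n' notes.txt build/output.bin | filter_changelist
```

The whole decision is local to one file, which is the contrast being drawn with pattern files whose contents change the fate of other files.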

I was saying that Dropbox customers want a product that "does the right thing", not making any statement about this particular feature implementation. With file sync, you will never have a product that "does the right thing" (really "does what I mean") 100% of the time. There are just too many edge cases. Even if you think them all through and make a very good decision about what to do, your users will be surprised sometime. Dropbox's product decisions err on the side of data preservation at all costs—better to get a "conflicted copy" than silently overwrite a revision of a file.

@Nick As described, I think you’re right that it would be a lot simpler to implement. However, if the server truly has no knowledge of the attribute, that means it won’t sync to your other devices. You would have to manually add it to each file/folder on each device. I wonder how that works, because in order to add the xattr you have to create the file first. When you create the file, it’s going to get uploaded to the server because the server doesn’t store ignored files. Then the client that does have it ignored is going to see a conflict between the new file on the server and its ignored local one. If the client renames one of the files (to preserve the data and allow both to exist), that could cause problems. Or maybe it allows the server to have a newer copy of the ignored file, which is also kind of weird.

> I wonder how that works, because in order to add the xattr you have to create the file first. When you create the file, it’s going to get uploaded to the server because the server doesn’t store ignored files.

From the help article, it sounds like the semantics of the feature depend on whether the file is one that is previously synced or not.

Case 1) the file is created outside the Dropbox folder and gets the xattr set. When it's added to the folder it is just ignored and the server never hears about it. It's always only local.

Case 2) the file was one that Dropbox was already tracking and then the xattr gets set. In this case, the addition of the xattr is sent up to the server as a delete. It is unsynced from the rest of your devices but your local copy stays local-only. In this case, Dropbox would still have the data of your file for a time, in order to support the "restore" function. It would be retained on the Dropbox server until either it was permanently deleted or timed out of your version history (30 days for free and plus users, 180 days for pro and business).

Customers have been asking to be able to put stuff in their Dropbox folder that doesn't sync literally since the very beginning, but the cost-benefit of implementing it was never above the line. I think a .dropbox-ignore file with path-based ignores could meet some use cases, but I suspect that this implementation was seen as a relatively cheap way to meet the majority of use cases for this request. I suspect the engineering cost to do the path-based thing would be greater by a factor of more than 3x.

@Nick I was thinking of Case 3, which is that the xattr is set on one device, and then the “same” file gets created on another device.

> I was thinking of Case 3, which is that the xattr is set on one device, and then the “same” file gets created on another device.

So the help article is pretty clear that when you set the xattr on one device, the file is removed from all other devices: "Once ignored, the file or folder remains where it is in your Dropbox folder and is synced to your computer’s hard drive, but it’s deleted from the Dropbox server and your other devices, can’t be accessed on dropbox.com, and won’t sync to your Dropbox account."

Now, suppose the file is then added back without the xattr on another device at the same path. I suspect what will happen in that case is that the ignored file on the first device will get renamed with a deconflicted name and the file at that path will be the synced version. I think in the case that you want a different ignored file at the same path on multiple devices, the best bet will be to set the ignore flag before adding it to the Dropbox folder (case 1 above) on every device.
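As a rough illustration of the case 1 ordering (flag first, then move into the Dropbox folder), the following shell sketch could be run on each device. The function name `add_ignored` and the `DRY_RUN` switch are hypothetical, and it assumes macOS `xattr` plus an `mv` that keeps extended attributes.

```shell
ATTR="com.dropbox.ignored"

# add_ignored SRC DEST_DIR: flag SRC while it is still outside the Dropbox
# folder, then move it in. With DRY_RUN=1 the two commands are only printed.
add_ignored() {
  src=$1
  dest_dir=$2
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'xattr -w %s 1 %s\n' "$ATTR" "$src"
    printf 'mv %s %s/\n' "$src" "$dest_dir"
    return 0
  fi
  # Flag first: if this fails, the file never enters the synced folder.
  xattr -w "$ATTR" 1 "$src" || return 1
  # Assumption: mv preserves the attribute (a rename on the same volume does).
  mv "$src" "$dest_dir"/
}

# Example preview:
#   DRY_RUN=1 add_ignored /tmp/build "$HOME/Dropbox/project"
```

Whether the client then treats the item as never-synced is the behavior inferred from the help article above, not something this sketch can guarantee.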

@Nick I think that’s what it will likely do as well, which is really not what I want for a build/derived file scenario.

The `.dropboxignore` file is necessary.

I'm not sure why Nick overcomplicated things.

If we have a .dropboxignore file, it's a simple switch that says "Dropbox does not see this folder or file". If a folder or file pattern is listed in it, Dropbox simply disregards the matching items completely. If a pattern is removed, Dropbox starts syncing those items just as if somebody had copied the files in. If a folder or file pattern is added to .dropboxignore, Dropbox simply acts as if somebody deleted those files and removes them from the Dropbox server. I don't see an issue here at all.

It's a simple problem. I don't want the node_modules and vendor directories in my projects to be synced. This is why most people actually ask for this feature.
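With the xattr mechanism as it exists, the node_modules and vendor case can be approximated by a one-shot script. This is a sketch under assumptions: the function names are made up for illustration, it relies on macOS `xattr`, and paths containing newlines are not handled.

```shell
# list_ignore_candidates ROOT: print every node_modules or vendor directory
# under ROOT, without descending into the ones it finds.
list_ignore_candidates() {
  find "$1" -type d \( -name node_modules -o -name vendor \) -prune -print
}

# ignore_candidates ROOT: flag each candidate with the attribute from the
# Dropbox instructions quoted in the post.
ignore_candidates() {
  list_ignore_candidates "$1" | while IFS= read -r dir; do
    xattr -w com.dropbox.ignored 1 "$dir"
  done
}

# Usage:
#   ignore_candidates "$HOME/Dropbox/projects"
```

Unlike a pattern file, this marks only the directories that exist when it runs, so a freshly created node_modules needs another pass.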

I'm doing exactly the same thing, but I have to use two apps on the Mac (Good Sync and Dropbox) to achieve this simple thing, not to mention that I'm replicating files for no reason whatsoever.

Nick is definitely overcomplicating.

Also, talking about normal users... xattr is just impossible to use.
They don't know what a terminal is. Even I have to google it every time. It just seems to be designed to be hard to use.

My understanding is that this feature is simply not there because Dropbox is scared users can do some space optimisation and avoid buying the pro version. Well... because of it, users are choosing other options...

Long-time user of Dropbox here, and software developer of 20+ years.

The use case is, as many have suggested, quite simple: A local user (not the server) excludes what they don't want Dropbox to touch on their local machine. The fact that OTHER users may put files or folders in that same synced folder up on the server is irrelevant, and conflating it with this use case just generates confusion.

The "ignore these things" list is a bi-directional filter on the Dropbox sync mechanism.

LOCAL FILE/FOLDER: If something new/changed appears in a synced folder on the local machine, that matches the "ignore these things" list, then Dropbox sync does nothing. It ignores it.

SERVER FILE/FOLDER: If something new/changed appears in a synced folder on the Dropbox server in the cloud, when the Dropbox sync on that user's local machine receives the notification and it matches the "ignore these things" list, then Dropbox sync does nothing. It ignores it.
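The bi-directional filter proposed in this comment can be sketched in a few lines of shell. This models the commenter's proposal, not any Dropbox behavior. The function names, the one-glob-per-line list format, and the `#` comment convention are all assumptions made for the example.

```shell
# matches_ignore_list PATH LIST_FILE: succeed if PATH matches any glob pattern
# in LIST_FILE (one per line; blank lines and lines starting with # are skipped).
matches_ignore_list() {
  path=$1
  list=$2
  while IFS= read -r pattern; do
    case "$pattern" in
      ''|'#'*) continue ;;
    esac
    # Unquoted $pattern is treated as a glob; here * also matches "/".
    case "$path" in
      $pattern) return 0 ;;
    esac
  done < "$list"
  return 1
}

# sync_decision DIRECTION PATH LIST_FILE: the same check runs for "local" and
# "server" events, which is what makes the filter bi-directional.
sync_decision() {
  if matches_ignore_list "$2" "$3"; then
    printf '%s: ignore %s\n' "$1" "$2"
  else
    printf '%s: sync %s\n' "$1" "$2"
  fi
}
```

Because the list in this sketch lives only on the local machine, nothing here addresses what other users of a shared folder see.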

When Dropbox rolled out the "Local" and "Online Only" smart sync feature, I was hoping they would add "Local Only" too.

In the end, the Dropbox business model doesn't permit this kind of solution. As Miro (and others) have noted: Dropbox makes money from storage subscription. Being able to exclude huge numbers of temporary or extraneous files would put unreasonable power in the hands of users to optimize their storage needs, thus reducing the need to move to "Pro."

In other words, I understand the need for Dropbox to complicate this simple feature request by conflating it with other bizarre and esoteric use cases, in order to mask their true desire to never deliver such storage-optimizing power into the hands of its paying customers.

Hello everyone, I've recently implemented dropboxignore (https://github.com/sp1thas/dropboxignore), a simple shell script that helps you generate .dropboxignore files based on your file patterns, or even on existing .gitignore files, and ignore the matched files from Dropbox.

I hope you find it useful. Any feedback is more than welcome.

Revisiting this thread I see that a lot of people who have not worked on file syncing products, or Dropbox specifically, believe I've "overcomplicated" the problem.

I'll repeat that having the state of which files sync depend on another file that is also syncing is extremely complex to implement. If you just want each client to independently specify a set of paths that don't sync, that is simpler, and Dropbox previously had a backdoor feature that allowed you to do that, by using Selective Sync to not sync content at a path, allowing you to put local content there. This "feature" was intentionally *removed* when the new Sync Engine (Nucleus) was built because of the problems it caused (I can't remember the details).

There are also various product behaviors for Dropbox Business and Home products that need to be respected to ensure that all users with shared content have access to it at the same path.

Trust me, file sync is complex and Dropbox is not a VCS. There are many edge cases and behaviors that you are likely not considering.
