Sunday, January 25, 2015

Deleting Folders Using the “find” Command

I’ve seen lots of Web pages about how to use the Unix find command to delete files or directories with a certain name. I do this pretty frequently. For example, the Makefile for deploying my apps deletes the Headers folders inside of the embedded frameworks. It also has legacy lines for deleting CVS and .svn folders. The standard advice seems to be to write something like this:

find . -name Headers -type d -delete

This looks inside the current folder for items named Headers that are directories and deletes them. However, the -delete option only removes files and empty directories. GNU find prints a “Directory not empty” error when it hits a non-empty directory. Confusingly, the BSD version of find (which ships with Macs) won’t report failure for a non-empty directory; it will just leave it there. The common solution is to use a command like this:
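A quick way to see how -delete treats non-empty directories with the find on a given system is to try it on a scratch tree. This is only a sketch: the directory layout and the variable names (root, leftover) are made up for illustration, and stderr is discarded because some find implementations print an error here while others stay silent.

```shell
# Build a throwaway tree: one empty and one non-empty Headers directory.
root=$(mktemp -d "${TMPDIR:-/tmp}/find-demo.XXXXXX")
mkdir -p "$root/empty/Headers" "$root/full/Headers"
touch "$root/full/Headers/Foo.h"

# -delete can only remove files and empty directories.
# Ignore the exit status and any message so the check below always runs.
find "$root" -name Headers -type d -delete 2>/dev/null || true

# List whatever Headers directories survived.
leftover=$(find "$root" -name Headers -type d)
echo "left behind: $leftover"

rm -rf "$root"   # clean up the scratch tree
```

If the non-empty directory shows up in the listing, -delete alone is not enough for this job.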

find . -name Headers -type d -exec rm -rf "{}" \;

This executes the rm -rf {} command for each found item, with the {} replaced by the item’s path. The \; marks the end of the command and is escaped so that the shell sends the ; to find instead of interpreting it as a command separator. The problem with this version is that it can result in “No such file or directory” errors. find will execute the command for a folder (thus deleting it), then try to recurse into that folder and complain that it doesn’t exist. find does succeed in the sense that all the directories get deleted (it doesn’t stop when it encounters the first missing directory), but it reports a failing exit status that can halt your shell script or Makefile.
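The behavior described above can be reproduced on a scratch tree. The sketch below is illustrative only: the framework names and the variables (root, status, remaining) are hypothetical, and the exact error text and exit status depend on the find implementation.

```shell
# Scratch tree with two Headers directories to delete.
root=$(mktemp -d "${TMPDIR:-/tmp}/find-demo.XXXXXX")
mkdir -p "$root/A.framework/Headers" "$root/B.framework/Headers"
touch "$root/A.framework/Headers/A.h"

# Pre-order traversal: find runs rm on each match, then tries to descend
# into the directory it has just removed.
status=0
find "$root" -name Headers -type d -exec rm -rf "{}" \; || status=$?

# Report the exit status and count the Headers directories that are left.
echo "find exit status: $status"
remaining=$(find "$root" -name Headers | wc -l | tr -d ' ')
echo "Headers directories left: $remaining"

rm -rf "$root"   # clean up the scratch tree
```

The interesting part is the combination: nothing is left behind, yet the exit status can still signal failure.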

A common way to work around this is to ignore the failure. You can use:

find . -name Headers -type d -exec rm -rf "{}" \; || true

to make sure that the combined command exits with success. Or, in a Makefile, you can use:

-find . -name Headers -type d -exec rm -rf "{}" \;

to tell make to ignore errors. In both cases, the command will succeed, but it will still print “No such file or directory”. So sometimes people also discard the error output:

-find . -name Headers -type d -exec rm -rf "{}" \; 2>/dev/null
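Outside of a Makefile, the two workarounds can be combined in plain shell. A minimal sketch, assuming a made-up scratch tree and variable names; note that it hides every error from find and rm, not just the harmless “No such file or directory” ones.

```shell
# Scratch tree with one Headers directory.
root=$(mktemp -d "${TMPDIR:-/tmp}/find-demo.XXXXXX")
mkdir -p "$root/A.framework/Headers"

# Same pre-order command as before: stderr is discarded and any failing
# exit status is replaced by the success of `true`.
find "$root" -name Headers -type d -exec rm -rf "{}" \; 2>/dev/null || true
status=$?

echo "combined exit status: $status"
remaining=$(find "$root" -name Headers | wc -l | tr -d ' ')

rm -rf "$root"   # clean up the scratch tree
```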

My preferred solution is to prevent the error in the first place. You can tell find to do a post-order traversal instead of a pre-order one. In other words, it will recurse first and delete later. This is done by specifying the -d option:

find -d . -name Headers -type d -exec rm -rf "{}" \;

You can also optimize this somewhat by using + instead of \;. This causes find to send multiple items to a single invocation of the rm command. I’ve also found that, at least with Bash, it is not necessary to quote the braces. This command looks cleaner and works well:

find -d . -name Headers -type d -exec rm -rf {} +
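Here is a small sketch of the post-order approach on a scratch tree, including a Headers directory nested inside another one. It makes two substitutions that are my assumptions rather than part of the commands above: it spells the option as -depth, which both GNU and BSD find accept as a primary after the path, and it uses a hypothetical make_tree helper and made-up variable names.

```shell
make_tree() {
    # Hypothetical layout: one Headers directory nested inside another.
    mkdir -p "$1/A.framework/Headers/Nested/Headers" "$1/B.framework/Headers"
    touch "$1/A.framework/Headers/A.h"
}

root=$(mktemp -d "${TMPDIR:-/tmp}/find-demo.XXXXXX")
make_tree "$root/one"
make_tree "$root/two"

# Post-order, one rm per match: children are visited before their parents,
# so find never tries to enter a directory that is already gone.
status_each=0
find "$root/one" -depth -name Headers -type d -exec rm -rf {} \; || status_each=$?

# Post-order, batched: matches are collected and handed to rm together.
status_batch=0
find "$root/two" -depth -name Headers -type d -exec rm -rf {} + || status_batch=$?

echo "exit statuses: $status_each $status_batch"
remaining=$(find "$root" -name Headers | wc -l | tr -d ' ')

rm -rf "$root"   # clean up the scratch tree
```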

Another way to do essentially the same thing is to use xargs:

find -d . -name Headers -type d -print0 | xargs -0 rm -rf

In practice, it seems that if you use xargs you don’t need -d. Perhaps this is because a certain amount of find output is buffered before xargs begins deleting. However, I think it is more correct to leave it in.
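The reason for pairing -print0 with xargs -0 is that paths are passed NUL-separated, so spaces in names are harmless. A minimal sketch, assuming a made-up app bundle name with a space in it and using -depth as a portable spelling of -d:

```shell
# Scratch tree; one path deliberately contains a space.
root=$(mktemp -d "${TMPDIR:-/tmp}/find-demo.XXXXXX")
mkdir -p "$root/My App.app/Frameworks/A.framework/Headers" "$root/B.framework/Headers"
touch "$root/My App.app/Frameworks/A.framework/Headers/A.h"

# NUL-separated paths survive the trip through the pipe intact.
find "$root" -depth -name Headers -type d -print0 | xargs -0 rm -rf

# Check that the Headers directories are gone and their parent is untouched.
remaining=$(find "$root" -name Headers | wc -l | tr -d ' ')
if [ -d "$root/My App.app/Frameworks/A.framework" ]; then kept=yes; else kept=no; fi
echo "Headers left: $remaining, framework kept: $kept"

rm -rf "$root"   # clean up the scratch tree
```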

Update (2015-01-26): As noted in the comments, you can use -prune (instead of -d) to tell find not to recurse into the matching folders that will be deleted anyway:

find . -name Headers -type d -prune -exec rm -rf {} +
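To see what -prune changes, it helps to print the matches before deleting them. In this sketch (scratch tree and variable names are made up), a Headers directory is nested inside another one, and find should report only the outer one because it never looks inside a match.

```shell
# Scratch tree with a Headers directory nested inside another Headers.
root=$(mktemp -d "${TMPDIR:-/tmp}/find-demo.XXXXXX")
mkdir -p "$root/A.framework/Headers/Nested/Headers" "$root/B.framework/Headers"

# With -prune, find does not descend into a matching directory, so the
# nested Headers is never visited.
matched=$(find "$root" -name Headers -type d -prune -print | wc -l | tr -d ' ')
echo "directories matched: $matched"

# Delete the matches; rm -rf takes the nested directory along with its parent.
status=0
find "$root" -name Headers -type d -prune -exec rm -rf {} + || status=$?
remaining=$(find "$root" -name Headers | wc -l | tr -d ' ')
echo "exit status: $status, Headers left: $remaining"

rm -rf "$root"   # clean up the scratch tree
```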

7 Comments

A simpler option is zsh's extended glob functionality:

rm -rf **/Headers(/)

The ** matches 0 or more path components; the (/) filters for directories, like -type d.

% mkdir -p foo/{Contents/,}{1..500}/Headers foo/Headers
% rm -rf foo/**/Headers(/)
% find foo -name Headers

This only works up to the maximum argument list length, which is usually OK for simple scripts; to be fully general, you can use `zargs`, which works like xargs but still lets you use extended globbing. So, for example:

% mkdir -p foo/{Contents/,}{1..2000}/Headers foo/Headers
% mkdir -p foo/{Contents/,}{2001..5000}/Headers foo/Headers
% rm -rf foo/**/Headers(/)
zsh: argument list too long: rm
% autoload zargs
% zargs foo/**/Headers(/) -- rm -rf
% find foo -name Headers

One more nice thing about zsh is that because it's BSD licensed, it's actually kept reasonably current in OS X, rather than being stuck in 2006 like the bundled version of bash.

Why don't you just add -prune, i.e.
find . -name Headers -type d -exec rm -rf "{}" \; -prune

From the man page:
This primary always evaluates to true. It causes find to not descend into the current file. Note, the -prune primary has no effect if the -d option was specified

@Ingmar It looks like that would work, too. Thanks.

You don't need the -d in the last command because the entire find command runs first, piping all its results to xargs, which then packages them all up into a single command line, and then runs a single rm -rf with all the results. Because of the -f, rm will not complain about the errors, and because the rm does not run until after the find has already completed, the find won't complain about missing directories.

This is assuming the total size of the deleted paths fits in a single command line (ARG_MAX; 4096 bytes is only the minimum POSIX guarantees, and real limits are usually much larger). If you have a large number of files to be deleted or your paths are long, then you might hit this limit.

Regardless, -prune would be a much better solution since you let rm -rf recurse down that deleted section, and find does not have to.

@Peter Yes, the entire find command will run if its output is small enough to fit in a single xargs chunk. Otherwise, you could get unlucky. It’s not clear to me that there’s much difference between which tool recurses down the to-be-deleted hierarchy.

"For example, the Makefile for deploying my apps deletes the Headers folders inside of the embedded frameworks."

You should be canonized for doing that.

I see too many headers in released applications from within embedded frameworks (e.g. Sparkle frameworks). Even some that state that the contents of the file should not be disclosed to the public.

You can use shopt -s globstar ; rm -rf **/Headers/ on bash 4.0 or newer.

I was also going to say the same thing as Peter: a find-based solution should use -prune because it doesn’t matter if there’s a Headers directory inside another Headers directory – rm is going to delete the entire tree regardless of whether you tell it to delete both of them (in either order) or only the outer one. So it’s a waste of find’s time to descend into a Headers directory if it spots one, and -prune is its way of being told to not bother.
