Monday, December 5, 2022

Swift Set Intersection Bug

Dave DeLong:

It turns out, there was a bug in Set.intersection(_:), but it had only been discovered this past June, and the fix only applies to macOS Ventura and later (my machine is running Monterey still). The scope of the bug is fairly limited: it only showed up if you were using the general intersection method, and the sequence had “exactly as many duplicate items as items missing from self”. As it turned out, Advent of Code happened to provide me with exactly the right input to hit this multiple times.

4 Comments RSS · Twitter

That's incredible. Writing test cases for fundamental datatypes is trivial to do... a task for interns, but is key since so much software relies on them. Given that NSSet and the STL both provide the same functionality, one even has an oracle that specifies the truth. It is shocking how little Apple bothers to test things these days. If they don't have the bandwidth to do due diligence, perhaps they should reconsider creating new languages and stick to building a functional OS?

To be clear, this bug is in Swift’s Set, not in NSSet. They do have tests for this, but apparently they weren’t extensive enough. I do wonder whether NSSet has a larger test suite and whether this could have been caught by porting that.

@Michael: yes I understood it was in Swift. My point is that one can write a variety of fuzzing algorithms that use the known truth (provided by NSSet or STL's set, or whatever) and look for divergences between the result computed by the random stream and the new implementation. That's for instance what I do when I reimplement set-like functionality for algorithms I optimize.

Perusing that code, it is a very lightweight test-suite, putting it mildly. Looks like the product of someone completing a check-box item, rather than the product of someone actually trying to break it (different mindset).

I’m a big fan of this sort of divergence testing. I found all sorts of unexpected corner cases comparing the results of my old and new e-mail parsers, operating over a large number of messages.

Yes, it does not look like they were trying to break it or even do white box coverage.

Leave a Comment