Tuesday, March 1, 2016

Fast Ordered Collections for Swift Using In-memory B-trees

Károly Lőrentey (via Nick Lockwood):

The benchmark above demonstrates this really well: insertion into a sorted array is O(n^2) when there are many items, but it is still much faster than a red-black tree with its attractive-looking O(n * log(n)) solution. At the beginning of the curve, up to about eighteen thousand items, a sorted array implementation imported from an external module is very consistently about 6-7 faster than a red-black tree, with a slope that is indistinguishable from O(n * log(n)). Even after it catches up to quadratic complexity, it takes about a hundred thousand items for the sorted array to become slower than the red-black tree! This remarkable result is due in large part to the vast number of (to a CPU, random-looking) memory references that are needed to operate on red-black trees.

[…]

Note that the big gap between collections imported from stdlib and those imported from external modules is caused by a limitation in the current Swift compiler/ABI: when this limitation is lifted, the gap will narrow considerably, which will reduce the element count at which you’ll be able to reap the benefits of lower asymptotic complexity.

[…]

The all-or-nothing copy-on-write behavior of Array, Dictionary and Set can lead to performance problems that are hard to detect and fix. If the underlying storage buffer is being shared by multiple collection instances, the modification of a single element in any of the instances requires creating a full copy of every element.

It is not at all obvious from the code when this happens, and it is even harder to reliably check for. You can’t (easily) write unit tests to check against accidental copying of items with value semantics!

With standard collection types, you often need to think about memory management.

1 Comment RSS · Twitter

[…] Fast Ordered Collections for Swift Using In-memory B-trees […]

Leave a Comment