Monday, May 9, 2016

Locking in WebKit

Compared to OS-provided locks like pthread_mutex, WTF::Lock is 64 times smaller and up to 180 times faster. Compared to OS-provided condition variables like pthread_cond, WTF::Condition is 64 times smaller. Using WTF::Lock instead of pthread_mutex means that WebKit is 10% faster on JetStream, 5% faster on Speedometer, and 5% faster on our page loading test.

Making WTF::Lock and WTF::Condition fit in one byte is not easy and the technique behind this has multiple layers. Lock and Condition offload all thread queueing and suspending functionality to WTF::ParkingLot, which manages thread queues keyed by the addresses of locks. The design of ParkingLot is inspired by futexes. ParkingLot is a portable user-level implementation of the core futex API. ParkingLot also needs fast locks, so we built it its own lock called WTF::WordLock.

[…]

Adaptive locks combine parking and spinning. Spinning is great because it protects microcontention scenarios from doing parking. Microcontention is when a thread fails the fast path lock acquisition because the lock is not available right now, but that lock will become available in less time than what it would take to park and then unpark. Before WTF::Lock::lock() parks a thread, it will spin 40 times, calling yield between spins.

[…]

One advantage of OS mutexes is that they guarantee fairness: All threads waiting for a lock form a queue, and, when the lock is released, the thread at the head of the queue acquires it. It’s 100% deterministic. While this kind of behavior makes mutexes easier to reason about, it reduces throughput because it prevents a thread from reacquiring a mutex it just released. It’s common for WebKit threads to repeatedly acquire the same lock.

[…]

It’s clear that the OS mutex is doing exactly what it set out to do: no thread gets to do any more work than any other thread. On the other hand, WTF::Lock does not guarantee fairness. Analysis of the algorithm shows that a thread that has been waiting the longest to get the lock may fail to get the lock and be forced to go to the end of the queue. But this chart shows that even without having a fairness guarantee, the unluckiest thread using WTF::Lock got better treatment than any thread using the guaranteed-fair mutex. It’s almost as if the OS mutex is not actually fair because while thread 1 is served equally to thread 2, all 10 threads are underserved relative to a hypothetical thread 11, which is using a different algorithm. Indeed, we can think of thread 11 as being the OS context switch handler.

Concurrency iOS iOS 9 Mac Mac OS X 10.11 El Capitan Optimization Programming Web WebKit

2 Comments RSS · Twitter

Pierre Habouzit

May 10, 2016 3:14 PM

Be very careful, WTF::Lock is meaningful to WebKit where all the lock holders will run at the same priority (because contributing to foreground UI). These locks are prone to priority inversion as these are essentially fancy spinlocks. These are NOT general purpose locks AT ALL.

Michael Tsai

May 10, 2016 3:44 PM

@Pierre Thanks. See also: OSSpinLock Is Unsafe.

Locking in WebKit

2 Comments RSS · Twitter

Leave a Comment