Locking in WebKit
Compared to OS-provided locks like
pthread_mutex
,WTF::Lock
is 64 times smaller and up to 180 times faster. Compared to OS-provided condition variables likepthread_cond
,WTF::Condition
is 64 times smaller. UsingWTF::Lock
instead ofpthread_mutex
means that WebKit is 10% faster on JetStream, 5% faster on Speedometer, and 5% faster on our page loading test.Making
WTF::Lock
andWTF::Condition
fit in one byte is not easy and the technique behind this has multiple layers.Lock
andCondition
offload all thread queueing and suspending functionality toWTF::ParkingLot
, which manages thread queues keyed by the addresses of locks. The design ofParkingLot
is inspired by futexes.ParkingLot
is a portable user-level implementation of the core futex API.ParkingLot
also needs fast locks, so we built it its own lock calledWTF::WordLock
.[…]
Adaptive locks combine parking and spinning. Spinning is great because it protects microcontention scenarios from doing parking. Microcontention is when a thread fails the fast path lock acquisition because the lock is not available right now, but that lock will become available in less time than what it would take to park and then unpark. Before
WTF::Lock::lock()
parks a thread, it will spin 40 times, calling yield between spins.[…]
One advantage of OS mutexes is that they guarantee fairness: All threads waiting for a lock form a queue, and, when the lock is released, the thread at the head of the queue acquires it. It’s 100% deterministic. While this kind of behavior makes mutexes easier to reason about, it reduces throughput because it prevents a thread from reacquiring a mutex it just released. It’s common for WebKit threads to repeatedly acquire the same lock.
[…]
It’s clear that the OS mutex is doing exactly what it set out to do: no thread gets to do any more work than any other thread. On the other hand,
WTF::Lock
does not guarantee fairness. Analysis of the algorithm shows that a thread that has been waiting the longest to get the lock may fail to get the lock and be forced to go to the end of the queue. But this chart shows that even without having a fairness guarantee, the unluckiest thread usingWTF::Lock
got better treatment than any thread using the guaranteed-fair mutex. It’s almost as if the OS mutex is not actually fair because while thread 1 is served equally to thread 2, all 10 threads are underserved relative to a hypothetical thread 11, which is using a different algorithm. Indeed, we can think of thread 11 as being the OS context switch handler.
2 Comments RSS · Twitter
Be very careful, WTF::Lock is meaningful to WebKit where all the lock holders will run at the same priority (because contributing to foreground UI). These locks are prone to priority inversion as these are essentially fancy spinlocks. These are NOT general purpose locks AT ALL.