The Surprising Cost of Checking Protocol Conformances in Swift
The entry point to our investigation is Mike Ash’s PR which implements a 13x faster cache that was released in Swift 5.4.
[…]
We now see that the speed of protocol conformance lookups is dependent on the number of conformances in your app. This will be influenced by how many Swift libraries you link to, and how many conformances you include in your own code.
otool -l Helix.app/Helix | grep _swift5_proto -A 4
tells us Uber’s app has a 411200 byte protocol conformance section. Each 4 bytes is a relative pointer so 411200 / 4 = 102,800 conformances.[…]
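The arithmetic in that estimate can be checked with a short Swift sketch. The variable names are mine, and the 4-byte entry size is taken from the quote rather than measured:

```swift
// Size of the __swift5_proto section in bytes, as reported by otool in the quote.
let sectionSizeInBytes = 411_200

// Each entry is described as a 4-byte relative pointer; Int32 is used here
// only as a stand-in with that width.
let bytesPerEntry = MemoryLayout<Int32>.size

// One entry per conformance, so the count is a plain division.
let conformanceCount = sectionSizeInBytes / bytesPerEntry
print(conformanceCount) // 102800
```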
One source of low hanging fruit that might be in your app is removing protocols that are used only for providing stub implementations in unit tests. These can be compiled out of release builds of the app to avoid them being included in runtime metadata.
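One way to do this is with a conditional compilation block. The sketch below is only an illustration: PaymentService, Stubbable and makeStub() are hypothetical names, and it assumes the DEBUG condition is defined for debug builds (Xcode's default templates set it, but a given project may not).

```swift
// Production type that ships in every build configuration.
struct PaymentService {
    func charge(cents: Int) -> String { "charged \(cents)" }
}

#if DEBUG
// Hypothetical test-only protocol: it exists solely so unit tests can build stubs.
protocol Stubbable {
    static func makeStub() -> Self
}

// This conformance is compiled only when DEBUG is defined, so release builds
// should not carry a conformance record for it.
extension PaymentService: Stubbable {
    static func makeStub() -> PaymentService { PaymentService() }
}
#endif
```

Unit tests usually run against a debug build, so they would still see the conformance, while the release binary omits it.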
[…]
Profiling your app using tools like Instruments or the Emerge startup time visualization can help you identify where conformance checks are most often used in your app. Then you can refactor code to avoid them entirely.
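As a rough illustration of such a refactor, here is a sketch with hypothetical names (Trackable, Purchase). It assumes that an as? cast to a protocol type is what triggers the runtime lookup, and that a generic constraint lets the compiler supply the conformance instead. Profile before and after rather than taking that on faith.

```swift
protocol Trackable {
    var eventName: String { get }
}

struct Purchase: Trackable {
    let eventName = "purchase"
}

// Dynamic check: casting an `Any` to a protocol type asks the runtime
// whether the value's type conforms.
func eventNameDynamic(_ value: Any) -> String? {
    (value as? Trackable)?.eventName
}

// Static alternative: the constraint is checked at compile time, so the
// call site should not need to search for the conformance at runtime.
func eventNameStatic<T: Trackable>(_ value: T) -> String {
    value.eventName
}
```

The static version only works when the call site already knows the concrete type, so it is not a drop-in replacement for every cast.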
[…]
The concept behind zconform is to eagerly load all possible protocol conformances and store them in a map keyed by the protocol’s address in memory.
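The idea can be modeled in a few lines of Swift. This is a conceptual sketch, not zconform's actual code: ConformanceRecord and ConformanceIndex are hypothetical types, and plain integers and strings stand in for the protocol addresses and type metadata that a real implementation would read from the binary.

```swift
// Hypothetical, simplified record standing in for one entry of the
// conformance section.
struct ConformanceRecord {
    let protocolAddress: UInt  // stand-in for the protocol's address in memory
    let typeName: String       // stand-in for the conforming type
}

struct ConformanceIndex {
    // Map from protocol address to the set of types that conform to it.
    private var typesByProtocol: [UInt: Set<String>] = [:]

    // Eagerly walk every record once, up front.
    init(records: [ConformanceRecord]) {
        for record in records {
            typesByProtocol[record.protocolAddress, default: []].insert(record.typeName)
        }
    }

    // Later checks are hash lookups rather than a scan over all records.
    func conforms(_ typeName: String, toProtocolAt address: UInt) -> Bool {
        typesByProtocol[address]?.contains(typeName) ?? false
    }
}

let index = ConformanceIndex(records: [
    ConformanceRecord(protocolAddress: 0x1000, typeName: "Purchase"),
    ConformanceRecord(protocolAddress: 0x1000, typeName: "Refund"),
    ConformanceRecord(protocolAddress: 0x2000, typeName: "Purchase"),
])
print(index.conforms("Refund", toProtocolAt: 0x1000)) // true
print(index.conforms("Refund", toProtocolAt: 0x2000)) // false
```

The trade-off is paying the full scan once in exchange for cheap lookups afterwards.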
Previously:
- Faster App Launching in iOS 15 and Monterey
- How Uber Deals With Large iOS App Size
- Swift 5.4 Released
Update (2022-01-03): Checking protocol conformances can also be a bottleneck in Objective-C.
Update (2022-02-08): Saagar Jha:
This appears to have been fixed in Xcode Version 13.3 beta (13E5086k). There’s a new DVTCachedConformsToProtocol method that is now used extensively throughout the app for conformsToProtocol: checks, including for the specific block I originally identified as being problematic.
Protocol conformance is one of those things that is never quite high enough priority to optimize in the OS. But some apps really do suffer, so they’re forced to work around it. Then the presence of those workarounds makes it less important to improve it in the OS.
Update (2023-02-16): Noah Martin:
The big change [in iOS 16] comes in the “dyld closure”, which is a per-app cache used to accelerate various dyld operations during app launch. The closure now contains pre-computed conformances, allowing each lookup to be much faster. Note that the dyld closure is not always used, e.g. because it’s out-of-date or because it’s being launched from Xcode, which complicates things.
Beyond the issues mentioned above, the Salesforce Service Cloud SDK spends 67ms running class_conformsToProtocol and objc_copyClassList (perhaps iterating over all classes to determine which ones conform to some protocol) in non-initializer setup. All of this setup can likely be moved out of startup.
Although this improvement is in iOS 16, it’s difficult to measure in practice because this dyld behavior is disabled when running the app from Xcode or Instruments. Emerge has a local performance debugging tool that works around this and can be used to profile apps that do have access to the dyld closure.