Friday, December 31, 2021

The Surprising Cost of Checking Protocol Conformances in Swift

Noah Martin (Hacker News):

The entry point to our investigation is Mike Ash’s PR which implements a 13x faster cache that was released in Swift 5.4.

[…]

We now see that the speed of protocol conformance lookups is dependent on the number of conformances in your app. This will be influenced by how many Swift libraries you link to, and how many conformances you include in your own code. otool -l Helix.app/Helix | grep _swift5_proto -A 4 tells us Uber’s app has a 411200 byte protocol conformance section. Each 4 bytes is a relative pointer so 411200 / 4 = 102,800 conformances.
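The arithmetic in that estimate can be checked with a small sketch. The section size and the 4-byte entry width are the figures quoted above; the function and variable names are illustrative only.

```swift
// Estimate the number of conformance records from the size of the
// protocol conformance section, using the figures quoted above.
let sectionSizeInBytes = 411_200   // size reported for the section
let bytesPerEntry = 4              // each entry is a 4-byte relative pointer

// Hypothetical helper: one conformance per fixed-size entry.
func conformanceCount(sectionSize: Int, entrySize: Int) -> Int {
    sectionSize / entrySize
}

let count = conformanceCount(sectionSize: sectionSizeInBytes, entrySize: bytesPerEntry)
print(count)  // 102800
```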

[…]

One source of low hanging fruit that might be in your app is removing protocols that are used only for providing stub implementations in unit tests. These can be compiled out of release builds of the app to avoid them being included in runtime metadata.
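One way this might look, as a minimal sketch: it assumes the build defines a DEBUG compilation condition (Xcode's default Debug configuration does) and that production code can depend on the concrete type directly. All type names here are hypothetical.

```swift
// Hypothetical types for illustration only.
struct User { let name: String }

// Production code depends on the concrete class directly.
final class UserService {
    func fetchUser() -> User { User(name: "real") }
}

#if DEBUG
// The protocol exists only so tests can substitute a stub. Wrapping it
// (and its conformances) in #if DEBUG keeps it out of builds that do not
// define DEBUG, so it adds no conformance metadata there.
protocol UserFetching {
    func fetchUser() -> User
}

extension UserService: UserFetching {}

struct StubUserFetcher: UserFetching {
    func fetchUser() -> User { User(name: "stub") }
}
#endif
```

The trade-off is that any code referring to the protocol must also sit behind the same condition, so this only suits protocols that production code never names.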

[…]

Profiling your app using tools like Instruments or the Emerge startup time visualization can help you identify where conformance checks are most often used in your app. Then you can refactor code to avoid them entirely.

[…]

The concept behind zconform is to eagerly load all possible protocol conformances and store them in a map keyed by the protocol’s address in memory.
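A toy model of that idea, and not zconform’s actual implementation: the sketch below registers conformances by hand rather than discovering them from the binary, and it uses ObjectIdentifier of the protocol’s metatype as a stand-in for “the protocol’s address in memory”. ConformanceMap and the example types are hypothetical names.

```swift
// Example protocols and types, for illustration only.
protocol Drawable {}
protocol Persistable {}

struct Circle: Drawable {}
struct Document: Drawable, Persistable {}

// A map built eagerly, keyed by protocol identity, so that a later
// "does T conform to P?" question is a dictionary lookup instead of a
// scan over every conformance record.
struct ConformanceMap {
    private var conformers: [ObjectIdentifier: Set<ObjectIdentifier>] = [:]

    mutating func register(_ type: Any.Type, conformsTo proto: Any.Type) {
        conformers[ObjectIdentifier(proto), default: []].insert(ObjectIdentifier(type))
    }

    func conforms(_ type: Any.Type, to proto: Any.Type) -> Bool {
        conformers[ObjectIdentifier(proto)]?.contains(ObjectIdentifier(type)) ?? false
    }
}

// Assumption: in this sketch the full list is supplied up front by hand.
var map = ConformanceMap()
map.register(Circle.self, conformsTo: Drawable.self)
map.register(Document.self, conformsTo: Drawable.self)
map.register(Document.self, conformsTo: Persistable.self)

let circleIsDrawable = map.conforms(Circle.self, to: Drawable.self)
let circleIsPersistable = map.conforms(Circle.self, to: Persistable.self)
```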

Update (2022-01-03): Checking protocol conformances can also be a bottleneck in Objective-C.

Update (2022-02-08): Saagar Jha:

This appears to have been fixed in Xcode Version 13.3 beta (13E5086k). There’s a new DVTCachedConformsToProtocol method that is now used extensively throughout the app for conformsToProtocol: checks, including for the specific block I originally identified as being problematic.
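The general technique of caching conformance checks can be sketched in Swift as follows. This is only an illustration of memoizing the answer per dynamic type; it makes no claim about how DVTCachedConformsToProtocol is implemented, and ConformanceCache, Plugin, and the example types are hypothetical. The cache is not thread-safe as written.

```swift
// Example protocol and types, for illustration only.
protocol Plugin {}
struct FormatterPlugin: Plugin {}
struct PlainValue {}

// Remembers, per dynamic type, whether values of that type conform to
// Plugin, so the runtime conformance check runs once per type.
final class ConformanceCache {
    private var results: [ObjectIdentifier: Bool] = [:]
    private(set) var misses = 0   // how many times the real check ran

    func isPlugin(_ value: Any) -> Bool {
        let key = ObjectIdentifier(type(of: value))
        if let cached = results[key] {
            return cached
        }
        misses += 1
        let result = value is Plugin   // the potentially slow runtime check
        results[key] = result
        return result
    }
}

let cache = ConformanceCache()
let first = cache.isPlugin(FormatterPlugin())
let second = cache.isPlugin(FormatterPlugin())  // served from the cache
let third = cache.isPlugin(PlainValue())
```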

Greg Parker:

Protocol conformance is one of those things that is never quite high enough priority to optimize in the OS. But some apps really do suffer, so they’re forced to work around it. Then the presence of those workarounds makes it less important to improve it in the OS.

Update (2023-02-16): Noah Martin:

The big change [in iOS 16] comes in the “dyld closure”, which is a per-app cache used to accelerate various dyld operations during app launch. The closure now contains pre-computed conformances, allowing each lookup to be much faster. Note that the dyld closure is not always used, e.g. because it’s out-of-date or because it’s being launched from Xcode, which complicates things.

Michael Eisel:

Beyond the issues mentioned above, the Salesforce Service Cloud SDK spends 67ms running class_conformsToProtocol and objc_copyClassList (perhaps iterating over all classes to determine which ones conform to some protocol) in non-initializer setup. All of this setup can likely be moved out of startup.

Noah Martin:

Although this improvement is in iOS 16, it’s difficult to measure in practice because this dyld behavior is disabled when running the app from Xcode or Instruments. Emerge has a local performance debugging tool that works around this and can be used to profile apps that do have access to the dyld closure.
