Tuesday, February 20, 2018

Optimizing Global Constant Data Structures Using Relative References

Joe Groff (tweet):

Building a native compiler for a programming language with rich reflection? Runtime type information, method dispatch tables, and other metadata require complex data structures full of cross references between related language entities. To reduce the size, memory usage, and launch time cost of these pointer-heavy constant data structures, you can try building them out of relative references instead of pointers. Pointers are one of C’s defining features, and seemingly the simplest mechanism for building data structures, but they carry hidden costs when used in global constants, which we’ll explore in this post and look at how we can avoid them. Even if you’re not writing a compiler, understanding this optimization is a fun chance to peel back some of the mystique of C, explore a bit of the runtime machinery that makes C programs work in contemporary operating systems, and see how a compiler can make different tradeoffs when not constrained by the abstractions C provides.


The object_vtable structure here is a global constant, full of pointers to other global constants—it should be “free”, right? In reality, operating systems provide a fairly elaborate runtime environment in order to make C programs work. Data structures that contain global pointers will in fact allocate memory at program launch before even entering main(). To understand why, we need to peek behind the scenes and look at how the dynamic linker works.


One tradeoff to using relative references is that they do require slightly more generated code on average to dereference than absolute pointers, leading to small performance and code size costs.

Update (2018-02-20): McCloud recommends DYLD_PRINT_STATISTICS.

Comments RSS · Twitter

Leave a Comment