Monday, January 25, 2016

7 Scandalous Weird Old Things About the C Preprocessor

Robert Elder (via Peter Steinberger):

Bjarne Stroustrup even points out that the standard isn’t clear about what should happen in function macro recursion[1]. With reference to a specific example he says “The question is whether the use of NIL in the last line of this sequence qualifies for non-replacement under the cited text. If it does, the result will be NIL(42). If it does not, the result will be simply 42.”. In 2004, a decision was made to leave the standard in its ambiguous state: “The committee’s decision was that no realistic programs “in the wild” would venture into this area, and trying to reduce the uncertainties is not worth the risk of changing conformance status of implementations or programs.”
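
The excerpt does not show the macros being discussed. A minimal sketch, assuming the three definitions usually quoted for this ambiguity (`NIL`, `G_0`, and `G_1` are reproduced from memory, not from the excerpt), lets you check which reading your own compiler takes. The `STR`/`XSTR` helpers and the function names are hypothetical, added only to capture the expansion as a string.

```c
#include <stdio.h>

/* Macros as usually quoted for this ambiguity (assumed; not in the excerpt). */
#define NIL(xxx) xxx
#define G_0(arg) NIL(G_1)(arg)
#define G_1(arg) NIL(arg)

/* Two-level stringification: the argument is fully expanded before # applies. */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Returns the text this compiler's preprocessor produced for G_0(42). */
const char *nil_expansion(void)
{
    return XSTR(G_0(42));
}

/* Prints the result; which of the two readings appears depends on the implementation. */
void print_nil_expansion(void)
{
    printf("G_0(42) expands to: %s\n", nil_expansion());
}
```

Calling `print_nil_expansion()` from `main` should report either `42` or `NIL(42)`, the two outcomes named in the quote. The sketch makes no claim about which one a given compiler picks.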

[…]

Many languages are first tokenized, and then the list of tokens doesn’t change throughout further processing of the program. In the C preprocessor, new tokens can be created at run time! This makes it impossible to build a parse tree ahead of time because you don’t know what tokens would be included in the final tree.
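
As a small illustration of tokens appearing during preprocessing, the `##` operator can glue two tokens into one that was never written in the source. The macro `PASTE` and the function names below are hypothetical, chosen only for this sketch.

```c
/* Token pasting: a##b joins two preprocessing tokens into a single new one. */
#define PASTE(a, b) a##b

/* Declares a variable named `foobar`; that identifier never appears literally here. */
static int PASTE(foo, bar) = 7;

/* Reads the variable through its pasted name. */
int read_foobar(void)
{
    return foobar;
}

/* 12 and 34 are pasted into the single integer constant 1234. */
int pasted_number(void)
{
    return PASTE(12, 34);
}

/* Two `+` tokens are pasted into the `++` operator, so this increments x. */
int pasted_increment(int x)
{
    x PASTE(+, +);
    return x;
}
```

None of `foobar`, `1234`, or `++` exists as a token until the macro is expanded, which is why a tokenizer that runs once up front cannot see them.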

[…]

In general, when evaluating a function macro body, you need to consider both the pre-expanded version of the arguments, and the untouched tokens that were passed for that argument. This behaviour is unlike how C arguments and functions are evaluated, because in C you can always replace an argument that’s described by an expression with the result of that expression and have the same meaning (ignoring any side effects).
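
A short sketch of why both forms of an argument matter: the `#` operator sees the tokens exactly as written, while an ordinary use of the same parameter sees the fully expanded version. `VALUE`, `STR`, `XSTR`, `BOTH`, and `both_forms` are hypothetical names used only for illustration.

```c
#define VALUE 42

/* #x stringifies the argument's original, unexpanded tokens. */
#define STR(x) #x

/* Here x is an ordinary parameter, so it is expanded before STR sees it. */
#define XSTR(x) STR(x)

/* Uses the same argument twice: once untouched (#x), once pre-expanded (XSTR(x)). */
#define BOTH(x) #x " = " XSTR(x)

/* Adjacent string literals are concatenated by the compiler. */
const char *both_forms(void)
{
    return BOTH(VALUE); /* "VALUE" " = " "42" */
}
```

`both_forms()` returns `"VALUE = 42"`: one macro invocation needed both the spelling `VALUE` and its expansion `42`. A C function, by contrast, only ever receives the evaluated value of its argument.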

He also discusses the preprocessor’s handling of whitespace, which is confusing and, he reports, buggy in Clang.
