constant bikeshedding


Whenever something can be done in two ways, someone will be confused.
Whenever something is a matter of taste, discussions can drag on forever.
— Bjarne Stroustrup

A Foolish Consistency 1 has finally compelled me to write down some of my thoughts on this sterile and dreadful subject, which keeps trending over and over. Where will this evangelism strike force lead us?

Take a deep breath: this is a brain dump of everything I've accumulated over the years about a mostly irrelevant topic.

There are two common forms of learning things: through comprehension or by rote.

When it comes to teaching how to declare variables in C++, it is often done by rote, completely dismissing the original idea, which is crucial for proper comprehension. So let's first dive into the original idea, and then look at the fabricated teaching tricks that appeared later, and their flaws.

Learning through comprehension

In 1974, Dennis M. Ritchie published in his C Reference Manual 2, 3, Section 8.4:

MEANING OF DECLARATORS

Each declarator is taken to be an assertion that when a construction of the same form as the declarator appears in an expression, it yields an object of the indicated type and storage class. Each declarator contains exactly one identifier; it is this identifier that is declared.

This also appears verbatim in The C Programming Language, First Edition 4 and in The C++ Programming Language Reference Manual 5, 6. The reason it's also in The C++ Programming Language Reference Manual, in the same section, is given in its abstract:

ABSTRACT

This manual was derived from the Unix System V C reference manual, and the general organization and section numbering have been preserved wherever possible.

In The C Programming Language, Second Edition 7, it's slightly reworded to:

MEANING OF DECLARATORS

A list of declarators appears after a sequence of type and storage class specifiers. Each declarator declares a unique main identifier, the one that appears as the first alternative of the production for direct-declarator. The storage class specifiers apply directly to this identifier, but its type depends on the form of its declarator. A declarator is read as an assertion that when its identifier appears in an expression of the same form as the declarator, it yields an object of the specified type.

Simply put, what this means is that "declaration reflects use" 8. Some people don't grasp it at first because the connection between expressions and declarations isn't made; others admire it for its economy of resources in reusing the rationale of expressions for declarations.

In C it means that for the expression *p, it can be inferred that p is a pointer to something, because dereference is being applied, and its result is that something. While still just reasoning about expressions, what if you put some parentheses around *p? Would (*p) change the result? Or *(p)? Or (*(p))? No, not in this case. Curiously enough, just as you can play with parentheses and precedence at will in expressions, you can do the same in declarations. All of these are valid and the result is the same (declaring a pointer to int):

int *p;
int (*p);
int *(p);
int (*(p));
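The claim that all four forms declare the same thing can be checked by the compiler itself. The following is a minimal sketch, not part of the original manuals: the names p1 to p4 and sum_all_forms are made up for illustration, so that the four declarations can coexist in one scope.

```cpp
#include <cassert>
#include <type_traits>

// Distinct (hypothetical) names so that all four forms can live in one scope.
int *p1;
int (*p2);
int *(p3);
int (*(p4));

// Redundant parentheses do not change the declared type.
static_assert(std::is_same<decltype(p1), int *>::value, "p1 is a pointer to int");
static_assert(std::is_same<decltype(p2), int *>::value, "p2 is a pointer to int");
static_assert(std::is_same<decltype(p3), int *>::value, "p3 is a pointer to int");
static_assert(std::is_same<decltype(p4), int *>::value, "p4 is a pointer to int");

// The matching expressions tolerate the same parentheses.
int sum_all_forms(int *p) {
    assert(p != nullptr);
    return *p + (*p) + *(p) + (*(p));
}
```

If the translation unit compiles, every static_assert held, which is the whole point of the exercise.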

This is just how Ritchie continues in his original explanation:

If an unadorned identifier appears as a declarator, then it has the type indicated by the specifier heading the declaration.

If a declarator has the form

*D

for D a declarator, then the contained identifier has the type "pointer to ...", where "..." is the type which the identifier would have had if the declarator had been simply D.

At this point, in his manual, Bjarne diverges to also explain how const and & (for reference declarations) fit into this picture. These don't explicitly "reflect use" and appear solely in declarations, but still, the original rationale is mostly preserved (more on that later).

In The Design and Evolution of C++ 9, Bjarne tells how he disliked this aspect of C and how he considered an alternative left-to-right declaration syntax. It was abandoned and, in my opinion, looked much worse.

I still remember when I first learned how to declare function pointers at school:

int (*p)();

The sole explanation was "parentheses are needed, otherwise it just becomes a prototype of a function returning pointer to int". No connection was made with expressions and, hence, with simple precedence rules; once that connection is made, it automatically makes sense. In an expression, due to precedence rules, if you write *p(), p is called first and then the result of p() is dereferenced, so p has to be a function that returns a pointer. To have the dereference applied first and the function call afterwards, you have to use parentheses to tailor precedence: (*p) first and the function call next, (*p)(), which reads as: p is dereferenced (so it's a pointer) and the result of that is called (so the result of the dereference is a function, and so p is a pointer to function).

A little-known fact is that this reasoning also applies when declaring functions. Say f is a function that returns an int when called: would the use of parentheses change the result? f() vs (f)() vs (f()) vs ((f)())? No, it won't. As before, just as you can play with parentheses and precedence at will in expressions, you can do the same in declarations, even function declarations. All of these are valid and the result is the same (declaring a function that returns an int):
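The precedence argument can be made concrete with a small sketch. The names value, returns_pointer, returns_int and use_both are hypothetical and exist only for this illustration; the two static_asserts spell out what each declarator form declares.

```cpp
#include <cassert>
#include <type_traits>

int value = 42;

// Declared the way it is used: *returns_pointer() is an int.
int *returns_pointer() { return &value; }
// Declared the way it is used: returns_int() is an int.
int returns_int() { return value; }

// (*p)() is an int: dereference first, then call.
int (*p)() = returns_int;

static_assert(std::is_same<decltype(returns_pointer), int *()>::value,
              "function returning pointer to int");
static_assert(std::is_same<decltype(p), int (*)()>::value,
              "pointer to function returning int");

int use_both() {
    assert(p != nullptr);
    // Call then dereference on the left, dereference then call on the right.
    return *returns_pointer() + (*p)();
}
```

Both expressions in use_both mirror their declarations character by character, which is all that "declaration reflects use" asks for.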

int f() {
    return 42;
}
int (f)() {
    return 42;
}
int (f()) {
    return 42;
}
int ((f)()) {
    return 42;
}

Did you ever see this elsewhere? I didn't. This demonstrates the predictive power of actually comprehending things: I had never seen this in my life, but I was able to realize it could be valid. The other methods lack prediction, as will be shown.

Even though "declaration reflects use" (reusing the rationale for expressions and their precedence rules), "declaration expressions" and "use expressions" have distinct flavors, with some non-overlapping elements worth noticing, so such "reflection" is not in the strict sense 10.

When you use an array, as in int x = a[7];, 7 is an index; when you declare an array, as in int a[42];, 42 is its size, not an index. const and & (for reference declarations) are also elements that visibly show up in "declaration expressions" only, while you will see & (the address-of operator), unary +, unary -, etc. solely in "use expressions". You can see that the role of an operator can even change (reference vs. address-of), but precedence doesn't. One interesting case is the ellipsis (...) of parameter packs and pack expansions: at declaration it binds to the left of the identifier (e.g. template<void (...Fs)()>), while at expansion it binds to the right (e.g. (Fs(),...)).
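Since a function may be declared any number of times, the four spellings can also be put side by side as redeclarations of one and the same function. This is a sketch under that assumption; sum_call_forms is a hypothetical helper, and the single definition uses the (f)() form only because that spelling is also familiar from C library implementations.

```cpp
#include <cassert>

// Four spellings of one declaration: each line redeclares the same function f.
int f();
int (f)();
int (f());
int ((f)());

// A single definition satisfies all of them.
int (f)() { return 42; }

int sum_call_forms() {
    // The call expression tolerates the same redundant parentheses.
    assert((f)() == f());
    return f() + (f)() + (f()) + ((f)());
}
```

Had any of the four lines declared something different, the compiler would report a conflicting declaration instead of accepting them silently.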
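Two of these asymmetries can be sketched in a few lines, assuming a C++17 compiler for the fold expression. All names here (counter, bump_one, bump_ten, call_all, size_vs_index, run_pack) are invented for the illustration, and the pack is written with an explicit pointer, void (*...Fs)(), which names the same parameter type once a function-typed template parameter is adjusted to a pointer.

```cpp
#include <cassert>

int counter = 0;
void bump_one() { counter += 1; }
void bump_ten() { counter += 10; }

// Declaration: the ellipsis sits to the left of the pack name Fs.
template <void (*...Fs)()>
void call_all() {
    // Expansion: the ellipsis sits to the right of the pattern Fs().
    (Fs(), ...);  // C++17 fold over the comma operator
}

int size_vs_index() {
    int a[42] = {};  // declaration: 42 is the number of elements
    a[7] = 5;        // use: 7 is an index
    assert(sizeof a / sizeof a[0] == 42);
    return static_cast<int>(sizeof a / sizeof a[0]) + a[7];  // 42 + 5
}

int run_pack() {
    counter = 0;
    call_all<bump_one, bump_ten, bump_one>();
    return counter;  // 1 + 10 + 1
}
```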

Learning by rote

There's a chicken, plays tic-tac-toe, never loses. He's famous.
— Milton (The Devil's Advocate, 1997)

So, that was how declarations were originally designed to be read and, most importantly, constructed: by drawing on how expressions are built with the usual precedence rules.

But for the C++ audience today, that's rarely how it's taught. Instead, when declarations get just slightly non-trivial, you often find concocted recipes for how they should be read, and then you start to behave like a chicken playing tic-tac-toe without comprehending it.

Oversimplification

For example, one often repeated statement is "declarations are read right-to-left". While this can help you read int * const p as "constant pointer to int", it can easily lead to confusion when one wants to write (not just read) a declaration. See for example this mistake taken from the first printing of C++ Primer, 5th Edition 11:

void foo(const char[3]&);

While the parameter type is completely wrong and doesn't follow the principles of "declaration reflects use" to derive the correct form of the declaration, it does follow what the statement says (reading it right-to-left, one gets a reference to an array of 3 const chars). I've found this mistake many times in the field, even in well-known books such as, repeatedly, the C++ Primer, where it was fixed in later printings after my complaints. Even after correction, const char (&)[3] can't be read right-to-left at all.

Along the same lines comes another gem regarding const placement: "const applies to the thing before it", which holds only if "east const" is adopted. In fact, the statement is often used above all as a dogma for adopting "east const" and for reinforcing the notion of reading declarations right-to-left.

These superficial techniques for reading and writing declarations only serve the most basic forms of declaration and crack as soon as anything slightly non-trivial is involved. Some people find in this yet another reason for abolishing C arrays, pointers, etc. The issue is that you simply have no such option in the field, when reading other people's code, so I assume that argument is just for those wishing to live in denial.
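For comparison, here is how the corrected parameter is derived from use: the parameter is a reference, and indexing it yields a const char. This is only a sketch; count_a and demo_count are hypothetical names, not taken from the book.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Unnamed, as in a prototype: the identifier is dropped, the parentheses stay.
std::size_t count_a(const char (&)[3]);

// Named: arr is a reference, and arr[i] is a const char.
std::size_t count_a(const char (&arr)[3]) {
    std::size_t n = 0;
    for (char c : arr) n += (c == 'a');
    return n;
}

static_assert(std::is_same<decltype(count_a),
                           std::size_t(const char (&)[3])>::value,
              "takes a reference to an array of 3 const char");

std::size_t demo_count() {
    const char abc[3] = {'a', 'b', 'a'};
    assert(count_a("aa") == 2);  // "aa" is a const char[3]: two letters plus '\0'
    return count_a(abc);
}
```

Arrays of any other length are rejected at compile time, since the size is part of the referenced type.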
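What the language itself says about const placement fits in three static_asserts. The sketch below assumes nothing beyond the standard library; reseat_and_write is a hypothetical function used only to show the two qualifications in action.

```cpp
#include <cassert>
#include <type_traits>

// Left of the *, const may sit on either side of the type specifier.
static_assert(std::is_same<const int, int const>::value, "same type");
static_assert(std::is_same<const int *, int const *>::value, "same type");
// Right of the *, const belongs to the pointer declarator and has no west spelling.
static_assert(!std::is_same<const int *, int *const>::value, "different types");

int reseat_and_write() {
    int a = 1, b = 2;
    const int *p = &a;  // const precedes the int it qualifies; p can be reseated
    p = &b;
    int *const q = &a;  // const follows the * it qualifies; *q can be written
    *q = 40;
    assert(*p == 2);
    return *p + *q;     // 2 + 40
}
```

In const int *p the qualifier applies to the thing after it, which is exactly the case the slogan cannot describe.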

Intentionality

After oversimplification, intentional style is another driving force in how declarations are taught. Let's consider what Bjarne says about operator binding in declarations 12:

The choice between int* p; and int *p; is not about right and wrong, but about style and emphasis. C emphasized expressions; declarations were often considered little more than a necessary evil. C++, on the other hand, has a heavy emphasis on types.

A "typical C programmer" writes int *p; and explains it "*p is what is the int" emphasizing syntax, and may point to the C (and C++) declaration grammar to argue for the correctness of the style. Indeed, the * binds to the name p in the grammar.

A "typical C++ programmer" writes int* p; and explains it "p is a pointer to an int" emphasizing type. Indeed, the type of p is int*. I clearly prefer that emphasis and see it as important for using the more advanced parts of C++ well.

Here Bjarne personally prefers to put style over grammar, and indeed, one often repeated quote is that "C++ enforces types". But is this "heavy enforcement" real or just utopia? The thing is that this "type enforcement" manifests in the realm of type semantics (the abstract) and style (the convention), but not in the language's grammar (the practice). I once met a person who tried to "enforce types" even where the attempt was obviously doomed:

void (* foo)();

Putting whitespace before foo here doesn't make it less "syntax-like". The parentheses in (* foo) make the (required) expression character of the declaration stand out, and the whitespace is completely overruled; it isn't even clear why it's there, when compared to int* p. When Bjarne argues "indeed the type of p is int*", does that reasoning carry any meaning for an ordinary function pointer too? No.

Hidden symmetries

Last, but not least, are the many technically backed arguments that claim to demonstrate a hidden symmetry, pattern or something else in the language that one should care about, but that in the end are limited, superficial and/or flawed.

Let's pick one example from C++ Templates - The Complete Guide, Second Edition 13 which (sadly) adopts "east const" based on what it calls the "syntactical substitution principle":

Our second reason has to do with a syntactical substitution principle that is very common when dealing with templates. Consider the following two type declarations using the typedef keyword:

typedef char* CHARS;
typedef CHARS const CPTR;        // constant pointer to chars

or using the using keyword:

using CHARS = char*;
using CPTR = CHARS const;        // constant pointer to chars

The meaning of the second declaration is preserved when we textually replace CHARS with what it stands for:

typedef char* const CPTR;        // constant pointer to chars

or:

using CPTR = char* const;        // constant pointer to chars

However, if we write const before the type it qualifies, this principle doesn't apply. Consider the alternative to our first two type definitions presented earlier:

typedef char* CHARS;
typedef const CHARS CPTR;        // constant pointer to chars

Textually replacing CHARS results in a type with a different meaning:

typedef const char* CPTR;        // pointer to constant chars

The same observation applies to the volatile specifier, of course.
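The quoted observation can be verified mechanically. In this sketch the two aliases are renamed CPTR_EAST and CPTR_WEST (hypothetical names) so that both spellings can be compared in one translation unit; first_after_write is likewise made up for the illustration.

```cpp
#include <cassert>
#include <type_traits>

typedef char *CHARS;
typedef CHARS const CPTR_EAST;  // east spelling from the quote
typedef const CHARS CPTR_WEST;  // west spelling from the quote

// Both spellings qualify the alias as a whole: constant pointer to char.
static_assert(std::is_same<CPTR_EAST, char *const>::value, "constant pointer");
static_assert(std::is_same<CPTR_WEST, char *const>::value, "constant pointer");
// Only the textual paste of the west form means something else.
static_assert(!std::is_same<CPTR_WEST, const char *>::value, "not pointer to const");

char first_after_write() {
    char buf[] = "xyz";
    CPTR_WEST p = buf;  // p itself is const, the chars are not
    assert(p == buf);
    *p = 'a';
    return buf[0];
}
```

So the two typedef spellings agree with each other; what differs is only the result of treating the alias as a macro.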

The "principle" is presented, but at this stage it's not shown how it's so common when dealing with templates... It's just shown how it works, again, for a trivial example. Of course, it's easy to verify that the "principle", as is, doesn't extend to ordinary function pointers or anything that involves parentheses:

typedef int (*CALLBACK_PTR)();
typedef CALLBACK_PTR const CALLBACK_CPTR;   // constant pointer to callback

There's no way to take the first typedef and simply paste it into the second.

It does work though (for typedefs) if the reverse of what the book teaches is done: replace CALLBACK_PTR in the first typedef with const CALLBACK_CPTR from the second one!

typedef int (*const CALLBACK_CPTR)();       // constant pointer to callback

So, that would be an amended, more general principle, because it actually mimics the inside-out expansion that happens in the grammar, instead of the former bare textual substitution, which is less elaborate. This inside-out text substitution, as is, would still break though, once a spurious pair of parentheses is added 😈:

typedef int (*(CALLBACK_PTR))();
typedef CALLBACK_PTR const CALLBACK_CPTR;   // constant pointer to callback
typedef int (*(const CALLBACK_CPTR))(); // constant pointer to callback? Try it

Spurious parentheses would also cause havoc in the more trivial "CHARS" example.
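The inside-out expansion of the callback typedefs (without the spurious parentheses) can be checked the same way. This is a sketch: CALLBACK_CPTR_EXPANDED, answer and call_through_const_pointer are hypothetical names added so that both forms can be compared and exercised.

```cpp
#include <cassert>
#include <type_traits>

typedef int (*CALLBACK_PTR)();
typedef CALLBACK_PTR const CALLBACK_CPTR;      // alias-based form
typedef int (*const CALLBACK_CPTR_EXPANDED)(); // inside-out expansion

// Both name the same type: constant pointer to function returning int.
static_assert(std::is_same<CALLBACK_CPTR, CALLBACK_CPTR_EXPANDED>::value,
              "same type");

int answer() { return 42; }

int call_through_const_pointer() {
    CALLBACK_CPTR cb = answer;  // must be initialized: cb cannot be reseated
    assert(cb != nullptr);
    return cb();
}
```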

Denial

Beyond all that was said already, there can be additional technical drawbacks. On operator binding style, for example, both Bjarne Stroustrup's C++ Style and Technique FAQ and C++ Templates - The Complete Guide, Second Edition admit that binding operators to the type can be harmful when reading multiple declarations:

The critical confusion comes (only) when people try to declare several pointers with a single declaration:

int* p, p1;   // probable error: p1 is not an int*

Placing the * closer to the name does not make this kind of error significantly less likely.

int *p, p1;   // probable error?

— Bjarne Stroustrup's C++ Style and Technique FAQ

Regarding whitespaces, we decided to put the space between the ampersand and the parameter name:

void foo (int const& x);

By doing this, we emphasize the separation between the parameter type and the parameter name. This is admittedly more confusing for declarations such as

char* a, b;

where, according to the rules inherited from C, a is a pointer but b is an ordinary char.
— C++ Templates - The Complete Guide, Second Edition

const placement is also affected by this:

int *const p = nullptr, p1 = 0;     // p1 is just an int, not an int *const
int const c = 4, c1 = 2;            // now c1 is an int const, not just an int
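The two comments above can be turned into compile-time checks. The declarations are copied as they are; only modify_p1 is a hypothetical helper, added to show that p1 really is a plain, writable int.

```cpp
#include <cassert>
#include <type_traits>

int *const p = nullptr, p1 = 0;
int const c = 4, c1 = 2;

// The * const part belongs to the first declarator only.
static_assert(std::is_same<decltype(p), int *const>::value, "constant pointer");
static_assert(std::is_same<decltype(p1), int>::value, "plain int");
// The const in the specifier part applies to every declarator.
static_assert(std::is_same<decltype(c), const int>::value, "const int");
static_assert(std::is_same<decltype(c1), const int>::value, "const int");

int modify_p1() {
    assert(p == nullptr);
    p1 = 7;              // allowed: p1 is neither a pointer nor const
    return p1 + c + c1;  // 7 + 4 + 2
}
```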

The solution for this in both references is denial. Multiple declarations don't fit the convention, so let's avoid them altogether:

Declaring one name per declaration minimizes the problem - in particular when we initialize the variables.
[...]
Stick to one pointer per declaration and always initialize variables and the source of confusion disappears.
— Bjarne Stroustrup's C++ Style and Technique FAQ

To avoid such confusion, we simply avoid declaring multiple entities this way.
— C++ Templates - The Complete Guide, Second Edition

Even though I agree that one shouldn't abuse multiple declarations, I disagree with simply removing them from one's vocabulary. There are situations where they're simply the most idiomatic feature at hand. In the initialization of for loops, for example, it's sometimes handy to be able to initialize more than one variable. More than that, C++17 brings even more places where they can be handy, which I've already made use of. See the following illustrative example:

if (const auto dice1 = dice(), dice2 = dice();
    dice1 == dice2) {
    cout << "Matching dices on 1st play: " << dice1 << endl;
} else {
    cout << "Non matching dices on 1st play: " << dice1 << " and " << dice2
         << endl;
}

if (const auto dice1 = dice(), dice2 = dice();
    dice1 == dice2) {
    cout << "Matching dices on 2nd play: " << dice1 << endl;
} else {
    cout << "Non matching dices on 2nd play: " << dice1 << " and " << dice2
         << endl;
}

Notice how I'm able to make use of immutability while being able to reuse variable names. I can't do that if the variables are declared outside of the if statement: I'd have to make them mutable for reuse, or always use new variable names to preserve immutability, because they would be in the same outer scope, unnecessarily polluting it (to evade that I'd need to employ extra scope blocks). Also, notice how using "west const" avoids the possible confusion of constness being associated with the first variable alone.

What about, instead of avoiding myriad useful programming constructs because they don't fit the style, doing the reverse and considering a style that doesn't exclude any construct?

Anything that can be put on the left of the type (const, volatile, static, etc.) applies to all variables, and putting it on the left makes that completely clear. Anything that can't be put on the left (&, * const, etc.) applies to each variable in particular, so binding it to the variable avoids confusion too.
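A for loop is one place where this convention reads naturally. The sketch below is an illustration only, with is_palindrome as a hypothetical function: const char on the left is shared by both declarators, while each * is bound to its own name.

```cpp
#include <cassert>
#include <cstring>

bool is_palindrome(const char *s) {
    assert(s != nullptr);
    const std::size_t len = std::strlen(s);
    if (len < 2) return true;
    // `const char` applies to both variables; each * belongs to one name.
    for (const char *first = s, *last = s + len - 1; first < last;
         ++first, --last) {
        if (*first != *last) return false;
    }
    return true;
}
```

Writing the loop header as const char* first, last would silently declare last as a const char rather than a pointer, so binding the * to each name is what keeps the single declaration honest.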

Conclusion

The fundamental thesis of A Foolish Consistency is that the status quo is preventing us from evolving to a much more "logical" world. This brain dump is to demonstrate that not all consistency is necessarily foolish and that I don't see much logic in this new world. In fact, C and C++ are more in the realm of insanity than logic.

There are many other arguments to weigh in this battle. I've already heard that some people prefer const on the left to make it stand out in a code base where immutability struggles to survive, or because it reads better, or because most, if not all, compilers display error messages with "west const" and operators bound to the variable, and on and on. If compilers ever change that, I hope it comes as an option, because "east const" and friends make my eyes bleed!

As Bjarne states on const placement: "I put it before, but that's a matter of taste" 14. I agree: taste, not logic in the absolute sense. I'm pragmatic, so I prefer to honor the original grammar's rationale, not preclude useful language constructs, be predictive and speak the same language as the compiler's error messages. Other people may prefer the oversimplifications (because it's easier to move on when teaching), or to express intentions that weren't materialized in the grammar's rationale, or a form of textual replacement (or other symmetry) that works to a limited extent and can be helpful in some coding niche.

That's the fact: there's no absolutely better route, it's just a choice. What is bad, really bad, though, is to teach solely by rote and never introduce comprehension, or even bury it.

Bonus

When you comprehend things, their predictive power can sometimes provide some surprises.

There were times I wished "forwarding references" (a.k.a. "universal references") got disabled, so that I could have generic rvalue-only references as parameters, instead of accepting all value categories:

auto f = [](auto &&r) {};

f(42); // OK
int x;
f(x);  // Also OK...

Though there should be a correct and non-trivial way of achieving that, I thought: what if I parenthesized the declarator, would that stop the reference forwarding mechanism?

auto f = [](auto (&&r)) {};

f(42); // OK
int x;
f(x);  // Error?

It was an obscure intuition about the inner workings, which I probably can't verbalize. The thing is, the standard states that spurious parentheses in the declarator shouldn't make any difference, but the fact is that up to Clang 4.0.1 this "worked"! Coincidentally, it was the compiler I was using at the time of the test. I'm not sure about the full implications if it were ever made part of the language. I find it neat, but sadly it was just a (predicted) bug.
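For completeness, one possible conforming way to get an rvalue-only generic parameter is to constrain the deduced type. This is a sketch of a common technique, not what I had in mind at the time; take_rvalue, accepts and call_with_rvalues are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical helper: viable only when T is not deduced as an lvalue
// reference, that is, only for rvalue arguments.
template <typename T,
          typename = typename std::enable_if<
              !std::is_lvalue_reference<T>::value>::type>
int take_rvalue(T &&) { return 1; }

// Compile-time probe: is take_rvalue(arg) well-formed for an Arg argument?
template <typename Arg, typename = void>
struct accepts : std::false_type {};

template <typename Arg>
struct accepts<Arg, decltype(void(take_rvalue(std::declval<Arg>())))>
    : std::true_type {};

static_assert(accepts<int>::value, "rvalue int is accepted");
static_assert(accepts<int &&>::value, "rvalue reference is accepted");
static_assert(!accepts<int &>::value, "lvalue is rejected");

int call_with_rvalues() {
    int x = 7;
    const int moved = take_rvalue(std::move(x));  // explicit rvalue from x
    assert(moved == 1);
    return moved + take_rvalue(42);
}
```

Calling take_rvalue(x) with a plain lvalue x fails to compile, which is the behavior the parenthesized declarator only appeared to provide.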


  1. Jon Kalb (February 2018). A Foolish Consistency.
  2. Dennis M. Ritchie (January 1974). C Reference Manual, Section 8.4 "Meaning of declarators".
  3. Dennis M. Ritchie. C Reference Manual, Section 8.4 "Meaning of declarators".
  4. Brian W. Kernighan; Dennis M. Ritchie (February 1978). The C Programming Language, First Edition, Appendix A, Section 8.4 "Meaning of declarators". ISBN 0-13-110163-3.
  5. Bjarne Stroustrup (January 1984). The C++ Programming Language Reference Manual, Section 8.4 "Meaning of declarators".
  6. Bjarne Stroustrup. The C++ Programming Language Reference Manual, Section 8.4 "Meaning of declarators".
  7. Brian W. Kernighan; Dennis M. Ritchie (April 1988). The C Programming Language, Second Edition, Appendix A, Section 8.6 "Meaning of declarators". ISBN 0-13-110362-8.
  8. C types are inside-out.
  9. Bjarne Stroustrup (April 1994). The Design and Evolution of C++, First Edition, Section 2.8.1 "The C Declaration Syntax". ISBN 0-201-54330-3.
  10. C IAQ, Section 1.11.
  11. Stanley B. Lippman; Josée Lajoie; Barbara E. Moo (August 2012). C++ Primer, 5th Edition (First Printing), Section 16.4 "Variadic Templates". ISBN 0-321-71411-3.
  12. Bjarne Stroustrup's C++ Style and Technique FAQ, "Is 'int* p;' right or is 'int *p;' right?"
  13. David Vandevoorde; Nicolai M. Josuttis; Douglas Gregor (September 2017). C++ Templates - The Complete Guide, Second Edition, "Some Remarks About Programming Style". ISBN 0-321-71412-1.
  14. Bjarne Stroustrup's C++ Style and Technique FAQ, "Should I put 'const' before or after the type?"