A Tour of Rare C++ Features 6 of 7: Declaring Variables in If Statements, Function Style Casts and Digit Grouping

In the previous installment of this series we dealt with the function declaration from hell. This post will focus on the if/else that follows it in the example that guides us through this series. The only parts of the example that remain completely undiscussed as of now are the if and the else on lines 7 and 8:

/* This is valid C++ */  
auto main() -> decltype('O.o') try  
<%[O_O = 0b0]<%
https://gha.st/a-tour-of-rare-cpp-features/  
typedef struct o O;  
o*(*((&&o(o*o))<:'o':>))(o*o);  
if(O*O = decltype(0'0[o(0)](0))(0)) 1,000.00;  
else return 0==O==0 ? throw O_O : O_O;  
%>();%>
catch(...) { throw; }  

We are about to inspect line 7, and at the moment we have the following names in scope:

  • main is a function that returns int and takes no arguments (see footnote 1).
  • O_O was created by the generalized lambda capture in line 3, has type int and the value 0.
  • https is a label defined in line 4.
  • struct o is an incomplete class type declared in line 5.
  • O is a typedef for struct o declared in line 5 as well.
  • o is a function declared in line 6, and the reason we cannot refer to struct o simply as o anymore. The type of o is that of a function that:
    • takes one argument of type struct o* and
    • returns an (r-value) reference to an array of 'o' pointers to a function that:
      • takes one argument of type struct o* and
      • returns a struct o*.

Similar to for, an if can declare a new variable that is available in its scope, including the else branch. Unlike for, which has a dedicated slot for a variable declaration in its syntax, if reuses its condition: the freshly declared variable is converted to bool to decide which path to take. For example, if(bool no = false) { } would not enter the body of the if.

Obviously, this feature only makes sense for variables of some type that is (contextually) convertible to bool, but not bool itself, as the variable would always be true in the one branch and false in the other. Its primary usage in practice is with pointers, as in:

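// requires <cstdlib> for std::malloc and std::free
if(int* p = static_cast<int*>(std::malloc(sizeof(int)))) {
    // p is non-null and points to uninitialized storage for one int
    std::free(p);
} else {
    // p is always nullptr here
    // do some error handling like logging an error and terminating
}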
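As a runnable variation on the same idea, here is a minimal sketch that uses a lookup instead of an allocation. The helper names find_age and age_or_default are made up for this illustration and do not appear in the example that guides this series.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: pointer to the stored value, or nullptr if the key is missing.
const int* find_age(const std::map<std::string, int>& ages, const std::string& name) {
    auto it = ages.find(name);
    return it != ages.end() ? &it->second : nullptr;
}

// Returns the age stored for `name`, or -1 when the lookup fails.
int age_or_default(const std::map<std::string, int>& ages, const std::string& name) {
    if (const int* age = find_age(ages, name)) {
        return *age;   // age is non-null in this branch
    } else {
        return -1;     // age is still in scope here, but it is nullptr
    }
}

void demo_if_declaration() {
    std::map<std::string, int> ages{{"ada", 36}};
    assert(age_or_default(ages, "ada") == 36);
    assert(age_or_default(ages, "bob") == -1);
}
```

Calling demo_if_declaration() from main exercises both branches. The variable age never leaks into the surrounding scope, which is the main practical benefit of declaring it in the condition.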

Function Style Casts

Looking back at line 7 of the example, we can see that it creates a variable of type O* (see footnote 2) that gets named O and therefore will shadow the type O:

if(O*O = decltype(0'0[o(0)](0))(0)) 1,000.00;  
else return 0==O==0 ? throw O_O : O_O;  

To understand how the expression on the right hand side can possibly initialize O, note two important sets of parentheses: the pair that belongs to decltype, and the (0) that immediately follows it:

if(O*O = decltype(0'0[o(0)](0))(0)) 1,000.00;  

As we have discussed previously, decltype gives the type of the expression it is applied to. Therefore, the right hand side is something along the lines of T(0), with T being some type that is convertible to O*.

This kind of expression is called a function style cast due to its similarity to a function call. While it looks very much like an explicit constructor call, that is not really the case. In reality, with a single argument it does exactly the same thing as a C style cast, which would look like (T)(0).

Why is that bad? Because C style casts (and, by extension, function style casts) really want to convert to the target type. In fact they will try all of the following in order until one conversion does the trick:

  1. Do a const_cast
  2. Do a static_cast
  3. Do a static_cast followed by a const_cast
  4. Do a reinterpret_cast
  5. Do a reinterpret_cast followed by a const_cast
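To make the list above concrete, the following sketch shows a function style cast quietly performing a reinterpret_cast and a const_cast. The alias IntPtr and the function name are assumptions made for this illustration; the alias is only needed because the T(x) syntax requires a single-word type name.

```cpp
#include <cassert>
#include <cstdint>

using IntPtr = int*;

void demo_function_style_cast() {
    int value = 42;
    std::uintptr_t address = reinterpret_cast<std::uintptr_t>(&value);

    // Looks like a constructor call, but there is no static_cast from an
    // integer to a pointer, so this falls through to reinterpret_cast.
    IntPtr a = IntPtr(address);
    IntPtr b = (IntPtr)address;                    // the equivalent C style cast
    IntPtr c = reinterpret_cast<IntPtr>(address);  // what both of them end up doing
    assert(a == &value && b == &value && c == &value);

    const int locked = 7;
    // The first entry of the list: the cast silently strips const.
    IntPtr d = IntPtr(&locked);
    assert(d == const_cast<int*>(&locked));
}
```

Nothing in the spelling IntPtr(address) hints at how drastic the conversion is, which is exactly the problem described above.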

Even worse, one can also write similar expressions with different arity (e.g. T() or T(1, 2)) which do pretty much what one would expect them to do.

However, there is a silver lining provided by C++11: Using curlies, as in T{0}, always "creates a temporary object of the specified type direct-list-initialized (8.5.4) with the specified braced-init-list, and its value is that temporary object as a prvalue." [ISO/IEC 14882:2011] - which is very close to what one would have expected the function-style cast to always do.
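The difference between T(x) and T{x} can be checked at compile time. The two detection helpers in this sketch, paren_castable and brace_castable, are hypothetical names written for this illustration and are not standard library facilities; each one is true exactly when the corresponding expression compiles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <utility>

// True if the function style cast T(u) is well-formed.
template <typename T, typename U, typename = void>
struct paren_castable : std::false_type {};
template <typename T, typename U>
struct paren_castable<T, U, decltype(void(T(std::declval<U>())))> : std::true_type {};

// True if the braced form T{u} is well-formed.
template <typename T, typename U, typename = void>
struct brace_castable : std::false_type {};
template <typename T, typename U>
struct brace_castable<T, U, decltype(void(T{std::declval<U>()}))> : std::true_type {};

static_assert(paren_castable<int*, std::uintptr_t>::value, "T(x) falls back to reinterpret_cast");
static_assert(!brace_castable<int*, std::uintptr_t>::value, "T{x} rejects integer to pointer");
static_assert(paren_castable<int*, const int*>::value, "T(x) strips const");
static_assert(!brace_castable<int*, const int*>::value, "T{x} keeps const intact");
static_assert(brace_castable<int*, std::nullptr_t>::value, "a null pointer is still accepted");

void demo_brace_cast() {
    using IntPtr = int*;
    int value = 5;
    assert(IntPtr{&value} == &value);
    assert(IntPtr{0} == nullptr);  // the literal 0 is a null pointer constant
    assert(IntPtr{} == nullptr);   // empty braces value-initialize
}
```

The braced form only accepts what an ordinary initialization would accept, so the dangerous fallbacks of the parenthesized form disappear.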

Inverted Array Indexing

The next thing that needs to be dissected is the target type of the function style cast: decltype(0'0[o(0)](0)). Peeking inside the decltype we see that it is an index operation followed by a call with the argument 0.

To understand the indexing, we have to know that array indexing in C++ is defined the same way it has been for a long time, all the way back to the C programming language: Unless operator[] is overloaded for the left hand side, an expression a[b] is equivalent to *(a+b).

Two sneaky facts are of importance here: The addition operator is commutative (i.e. x+y is the same as y+x) and it has not been stated anywhere that the array or pointer must be a! This means that one can write array indexing like 42[array], since it is by definition equivalent to *(42+array), which in turn is the same as *(array+42). The ordinary array[42] is by definition equivalent to *(array+42) as well, meaning that for ordinary arrays or pointers and built-in integers, it does not matter which part is in the square brackets and which stands before them!
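A short sketch of this equivalence, using a small array and a string literal chosen for this illustration:

```cpp
#include <cassert>

void demo_inverted_indexing() {
    int array[] = {10, 20, 30};
    const char* text = "abc";

    assert(array[2] == 30);              // the ordinary spelling
    assert(2[array] == 30);              // the inverted spelling
    assert(*(array + 2) == 30);          // what both are defined to mean
    assert(*(2 + array) == 30);
    assert(&array[1] == &1[array]);      // both name the very same element
    assert(1[text] == 'b');              // works for pointers as well
    assert(2["abc"] == 'c');             // and even for string literals
}
```

Note that this only applies to the built-in operator: a class with an overloaded operator[], such as std::vector, does not accept the inverted form.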

Digit Grouping

Having gathered this knowledge, we can now rewrite the expression under scrutiny: 0'0[o(0)](0) is the same as o(0)[0'0](0). But how is that any better than the previous version? After all, 0'0 is not exactly something that we might expect inside an array index either!

By using a rule of thumb that was helpful earlier, we can already see that 0'0 might be simpler than it looks: Anything that starts with a digit is likely a number itself. In fact, it is a lot simpler than it appears at first glance. What we are looking at here is a C++14 digit separator.

Since C++14, a ' may be placed between the digits of a numeric literal to group them in whatever way the developer finds useful. For example, you could separate the bytes in the binary literal 0b00000000000000001000000000000000, giving something a little more readable: 0b00000000'00000000'10000000'00000000. Alternatively, one could use traditional groups of three as in 100'000.

However, there was something about leading zeros in number literals... Removing the digit grouping, the literal reads 00, which is in fact an octal literal - but that does not matter, as 0 is the same in any base.
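The claims about digit separators and leading zeros can be verified directly by the compiler. The following sketch assumes a C++14 (or later) compiler; the function name is made up for this illustration.

```cpp
#include <cassert>

// Separators do not change the value of a literal.
static_assert(100'000 == 100000, "groups of three");
static_assert(0b00000000'00000000'10000000'00000000 == 32768, "bytes of a binary literal");
static_assert(1'0'0 == 100, "any grouping is allowed");

// A leading zero makes a literal octal, with or without separators.
static_assert(010 == 8, "octal 10 is decimal 8");
static_assert(0'10 == 8, "still octal");
static_assert(0'0 == 0, "zero is zero in every base");

void demo_digit_grouping() {
    int table[] = {7, 8, 9};
    assert(table[0'0] == 7);             // 0'0 is just the index 0
    assert(0'0[table] == 7);             // combined with inverted indexing
    assert(1'000'000 / 1'000 == 1'000);
}
```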

Putting it together

So far we have learned that the branch that is chosen depends on a new variable named O of type O*, which is initialized by casting 0 to the same type as the expression o(0)[0](0).

Going by what we know about o (see the top of this page for a reminder), its argument must be a pointer to struct o. Using 0 in a place where any pointer is expected is perfectly legal and results in the null pointer (see footnote 3). While, in general, we cannot call o without first defining it, it is still legal to use it in a decltype expression, as it is not actually called.

The return type is a reference to an array, which perfectly fits with the indexing that happens next. The elements of that array are pointers to functions that again take a single struct o*. Although function pointers are pointers, they can be directly called without needing to be dereferenced, leading to a case we have seen before, as the 0 is used as a null pointer constant.
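Indexing an array of function pointers and calling the result without a dereference can be sketched as follows. The function twice and the array table are made up for this illustration.

```cpp
#include <cassert>

int twice(int x) { return 2 * x; }

void demo_function_pointer_call() {
    int (*table[2])(int) = {twice, twice};  // array of pointers to functions
    int (*fp)(int) = table[0];

    assert(fp(21) == 42);          // call through the pointer directly
    assert((*fp)(21) == 42);       // explicit dereference, same effect
    assert(table[0](21) == 42);    // index first, then call
    assert(0[table](21) == 42);    // the inverted form used in the example
}
```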

Finally, the return type of that function pointer is struct o*, which is the type of the whole decltype(0'0[o(0)](0)) expression. For the third time in this article, we use 0 as a null pointer constant for a value of type struct o*, and are thus able to resolve the function style cast.

Remembering that the type of the variable, O* is the same as the type struct o* (due to the typedef in line 5), we can see that the whole thing collapses down to:
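The whole chain of reasoning can be checked by the compiler. The sketch below is a simplified stand-in for the example, not a copy of it: the aliases Callback and CallbackArray and the function name f are assumptions introduced here so that the declaration stays readable. As in the example, f is declared but never defined, which is fine because decltype does not evaluate its operand.

```cpp
#include <cassert>
#include <type_traits>

struct o;                              // incomplete, as in the example
using Callback = o* (*)(o*);           // pointer to a function from o* to o*
using CallbackArray = Callback['o'];   // 'o' is 111 in ASCII

CallbackArray&& f(o*);                 // hypothetical stand-in for the example's o; no definition

// Index the returned array with 0'0, then "call" the element with a null pointer.
using Result = decltype(0'0[f(0)](0));
static_assert(std::is_same<Result, o*>::value, "the decltype collapses to struct o*");

// The function style cast of 0 to that type yields a null pointer.
o* demo_cast() { return Result(0); }
```

Calling demo_cast() returns nullptr, matching the collapsed form of the condition.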

if(O*O = nullptr) 1,000.00;  
else return 0==O==0 ? throw O_O : O_O;  

While we can now see that the if will always take the else branch, we leave the analysis of what happens on either branch for the next installment of this series.

Gimme more!

This post is part 6 of 7 of a series on rarely used C++ features. To find the other parts of the series around this example of how to not write C++, see this overview page.

Directly continue to the next part, Part 7.

Footnotes

  1. While we may not use the main function the same way we could use an ordinary function, its name and type are still in scope.

  2. Remember that O is a typedef for the incomplete type struct o at this point.

  3. In fact, in C++ the macro NULL is commonly defined as 0. In C its definition is usually somewhat different (this is due to differences in the implicit conversion rules concerning void*): ((void*)0). Of course modern C++ programmers will prefer to use nullptr over any other option.

Daniel Schemmel

is currently employed at the Chair of Communication and Distributed Systems at RWTH Aachen University, where he researches the testability of distributed systems. He can be reached at blog(at)gha.st.

Aachen, Germany, Terra, Sol, Milky Way, Laniakea SC https://gha.st/about/