A Tour of Rare C++ Features 7 of 7: The Comma and Ternary Operators

In the previous installment of this series we dealt with if statements, function-style casts and grouping digits of numbers. This time, our focus will lie on the two remaining statements of our guiding example.

By now, most of the guiding example of this series should be pretty well known. The only parts that may still be somewhat enigmatic are the two statements that form the if and else branches in the listing below:

/* This is valid C++ */  
auto main() -> decltype('O.o') try  
<%[O_O = 0b0]<%
https://gha.st/a-tour-of-rare-cpp-features/  
typedef struct o O;  
o*(*((&&o(o*o))<:'o':>))(o*o);  
if(O*O = decltype(0'0[o(0)](0))(0)) 1,000.00;  
else return 0==O==0 ? throw O_O : O_O;  
%>();%>
catch(...) { throw; }  

Having already analyzed the if previously, we know that it will always take the else branch. Still, who knows what kind of havoc the innocent-looking 1,000.00; might wreak if it ever were to be executed anyway?

It is, however, almost as innocent as it looks. In fact, if it were to be executed, nobody would ever know, as it has no effect at all. This should not stop us from trying to understand what it really is, though. As anyone reading this post can be expected to be somewhat fluent in English, the deception that this innocent little expression tries to pull off might even work on some of you. It is not just a number, even though it looks just like $1,000.00.

Looking more closely at the definitions for integer and floating point literals, we can see that it cannot possibly be one number, since neither kind of literal may contain a comma. Instead, there are two numbers separated by a comma. Were we to encounter this in a different context (e.g., f(1, 000.00)), the comma might make more sense.

The Comma Operator

Googling "C++ comma" results[1] in a whole bunch of links to articles about the comma operator, which is exactly what is at work here: an int literal with value 1 and a double literal with value 0.0 are the two operands of the comma operator.

To clarify: The commas in a function call are not comma operators: f(1, 000.00) will call f with two arguments, while f((1, 000.00)) will call f with a single argument, namely the result of the comma operator applied to 1 and 0.0. Similarly, the commas in a variable declaration like int x, y; merely separate the declared names, and are not some kind of weird invocation of the comma operator.
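The difference between the two calls can be checked at compile time. The following is only a minimal sketch: count_args is a hypothetical pair of overloads invented for this illustration and is not part of the guiding example.

```cpp
#include <type_traits>

// Hypothetical overloads, named only for this sketch: each one reports
// how many arguments it received.
constexpr int count_args(int, double) { return 2; }
constexpr int count_args(double) { return 1; }

// Here the comma separates two arguments.
static_assert(count_args(1, 000.00) == 2, "two arguments");

// The extra parentheses turn the comma into the comma operator,
// so only a single double is passed.
static_assert(count_args((1, 000.00)) == 1, "one argument");

// A comma expression has the type of its right operand.
static_assert(std::is_same<decltype((1, 000.00)), double>::value,
              "the result is a double");
```

A compiler may warn that the left operand of the comma operator has no effect, which is rather the point.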

As it turns out, the comma operator has a far more interesting syntax than semantics: It simply evaluates the left hand side, discards it and then evaluates and returns the right hand side[2].
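To see both the discarded left operand and the sequencing guarantee in action, consider this small sketch. It assumes C++14 (for relaxed constexpr functions), and the function names are made up for the illustration.

```cpp
// Returns the value of a comma expression whose left operand has a side effect.
constexpr int comma_result() {
    int trace = 0;
    // The left operand runs first and its value is discarded. The whole
    // expression yields the right operand, which already sees trace == 1.
    return (trace = 1, trace + 41);
}

// An xor swap written with the built-in comma operator: each assignment
// is sequenced before the next one.
constexpr bool xor_swap_works() {
    int x = 3, y = 5;
    x ^= y, y ^= x, x ^= y;
    return x == 5 && y == 3;
}

static_assert(comma_result() == 42, "right operand is the result");
static_assert(xor_swap_works(), "the values were swapped");
```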

Concluding the analysis of this branch, it is never executed, and if it were, it would first ignore an unused value, 1, followed by ignoring the other unused value, 0.0.

The Ternary Operator

The else branch consists of a single statement as well: return 0==O==0 ? throw O_O : O_O;. The first thing that we have to notice is that this is where the return type of the surrounding lambda function is finally determined. Since the lambda function did not explicitly declare a return type, the type of the expression given to its sole return statement becomes the return type of the whole lambda function.

The ternary operator or conditional operator a ? b : c is fairly simple: It works like an if for expressions. If a is true, it evaluates to b, otherwise it evaluates to c. Most of the time, its type is determined as a common type that fits both b and c by attempting to convert b to the type of c and vice versa[3].

Alright, that means we need to figure out the types of the middle and right operands if we want to know the type of the whole expression. The right operand is easy, since we already know that O_O has type int (and value 0). The middle operand however is a throw expression, which has type void.

When first learning that a throw actually has a type at all, I was pretty surprised - after all its whole purpose is to not act as a nice little expression that has a tidy little result, but rather to escape the constraints of ordinary control flow. However, seeing it in this light, it does make sense that its type is void, even if it is just to allow it to be classified as an expression, not a statement[4].

As it turns out, there is a special rule for the case in which a throw expression is the middle or right operand of the conditional operator: since the throw never produces a value, its type does not really matter, and the conditional expression simply takes the type of the other operand. For our example, this means that *takes a deep breath*: The return type of the lambda expression is the type of the argument to the return statement, which is the type of the conditional expression, which is the type of O_O, which is the type of 0b0, which is int. *Phew*.
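This chain can be verified with a few static assertions. The sketch below assumes C++14; make_conditional and throws_int are hypothetical helpers, with value standing in for O_O and the parameter fail replacing the condition of the guiding example.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the lambda of the guiding example:
// `value` plays the role of O_O and `fail` replaces the condition.
inline auto make_conditional() {
    return [value = 0b0](bool fail) {
        // One operand is a throw expression, so the conditional takes
        // its type from the other operand.
        return fail ? throw value : value;
    };
}

// A throw expression on its own has type void ...
static_assert(std::is_same<decltype(throw 0), void>::value,
              "throw has type void");

// ... yet the deduced return type of the lambda is int.
using Conditional = decltype(make_conditional());
static_assert(
    std::is_same<decltype(std::declval<Conditional>()(false)), int>::value,
    "the lambda returns int");

// Reports whether calling the lambda with `fail` ends in a thrown int.
inline bool throws_int(bool fail) {
    try {
        int result = make_conditional()(fail);
        assert(result == 0);  // only reached when nothing was thrown
        return false;
    } catch (int) {
        return true;
    }
}
```

Calling throws_int(true) takes the throwing operand, while throws_int(false) takes the right operand and gets the int 0 back.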

Until now I have evaded the question of what the condition does. The answer is rather simple, but may be counter-intuitive to someone used to mathematical notation. Recalling the condition (0==O==0), we see two equality comparisons. Unlike in mathematical notation, this does not mean that the whole expression is true if all three operands are equal. Instead, == is left-associative, meaning the condition could have been parenthesized as (0==O) == 0 without changing its meaning.

Since we are still inside the if statement (although in its else branch), O still refers to the variable that was declared in line 7 of the example and analyzed in the previous installment. All that matters for now is that it has type struct o* and evaluated to false (we are in the else branch after all). Since it is a pointer, this means that it must be a null pointer. Comparing the literal 0 to a value of pointer type causes the 0 to be treated as a null pointer constant[5], so the inner equality test is true.

The outer comparison is therefore between true and 0 which means the outer equality test evaluates to false in turn, as true is converted to 1 which is obviously not zero.
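The two steps of this evaluation can be reproduced in isolation. In the following sketch, chained is a hypothetical helper that mirrors the condition, and the incomplete struct o merely stands in for the type used in the guiding example.

```cpp
struct o;  // an incomplete type suffices for declaring pointers to it

// Mirrors the condition of the guiding example; parsed as (0 == O) == 0.
constexpr bool chained(o* O) { return 0 == O == 0; }

// For a null pointer, 0 == O is true, and true == 0 is false.
static_assert(!chained(nullptr), "a null pointer makes the condition false");

// Even three zeros are not "all equal" in the mathematical sense:
// (0 == 0) is true, and true == 0 is false.
static_assert(!(0 == 0 == 0), "chained equality is not transitive equality");

// Adding the parentheses explicitly changes nothing.
static_assert(((0 == 0) == 0) == (0 == 0 == 0), "same parse either way");
```

Compilers tend to warn about such chained comparisons, as they rarely mean what the author intended.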

Summary or TL;DR

if(O*O = decltype(0'0[o(0)](0))(0)) 1,000.00;  
else return 0==O==0 ? throw O_O : O_O;  

The if chooses the else branch, which returns the right hand operand of the conditional, which is an int with value 0. The result of the lambda is then discarded in the main function[6].

Although, for me, this compiles with a warning that there are paths for this lambda to exit without returning a value, this is not quite true, as it always does the above, without depending on any kind of input.

While it has been a wild ride, the example has now been fully analyzed and should hold no further surprises for anyone who actually bothered to read the full series. As a final comment: Yes, although it may look intimidating, this program does absolutely nothing.

Gimme more!

This post is part 7 of 7 of a series on rarely used C++ features. To find the other parts of the series around this example of how to not write C++, see this overview page.

This is the last part of this series, as the example has now been fully analyzed.

Footnotes

  1. At the time of writing, with my browser preferences, IP address, Google's mood and so forth.

  2. Note that this means that all effects of the left hand side are sequenced before those of the right hand side. Therefore, the comma operator is really similar to a semicolon, only that it does not separate two statements, but two expressions. To get an inkling of what this means, consider the following snippet trying to do an xor swap: x^=y^=x^=y; Before C++17, this snippet leads to undefined behavior, basically because it modifies a variable twice without sequencing one modification before the other. The built-in (non-overloaded) comma operator allows us to easily write a correct version of this snippet: x^=y, y^=x, x^=y;, as it creates a sequence to which the evaluation must adhere. Note that, before C++17, this guarantee does not extend to a user-provided operator, (that is, an overloaded comma operator), which is just treated as an ordinary function call.

  3. While that is mostly correct and easy to reason about, it is not the full truth, even considering the exception for throw expressions that the example abuses. The full rules can be found in the standard or at cppreference.com.

  4. This allows a variety of syntactic contortions - most of them about as pretty as an Elder God abusing an extra-fluffy kitten:

    #include <iostream>

    void stage_death(bool die) {
        // look ma, no curlies:
        if(die) ::std::cout << "[groans]\n", throw "blood";
    }

    void stage_death() {
        // my teacher told me to always use a return statement
        // if I want to leave a function:
        return throw "blood";
    }

    void is_this_function_pure() {
        if(throw "no", "something that can be ctxt converted to bool") {
            // unreachable
        }
    }

  5. Yes, there is a difference between the null pointer constant and an integer with value zero. Which meaning of 0 is chosen greatly depends on the context.

  6. A final kink: The main function is the only non-void function in C++ that may legally not return a value when executed. Its result then implicitly indicates success (0 on all relevant platforms).

Daniel Schemmel

is currently employed at the Chair of Communication and Distributed Systems at RWTH Aachen University, where he researches the testability of distributed systems. He can be reached at blog(at)gha.st.

Aachen, Germany, Terra, Sol, Milky Way, Laniakea SC https://gha.st/about/