A Tour of Rare C++ Features 3 of 7: Digraphs, Trigraphs and Lambda Functions

In the previous installment we explored function try blocks, catch-all exception handlers and the correct way to rethrow exceptions. This time, we are going to talk about one ancient and one fairly modern feature: Digraphs have been around for so long that the reason for their existence might be surprising, while lambda functions are one of the big new features in C++11 and were further refined in C++14.

The example for this post is the same as in the previous installments, and it should now start to make a tiny bit of sense if you have read the earlier posts in this series:

/* This is valid C++ */  
auto main() -> decltype('O.o') try  
<%[O_O = 0b0]<%
https://gha.st/a-tour-of-rare-cpp-features/  
typedef struct o O;  
o*(*((&&o(o*o))<:'o':>))(o*o);  
if(O*O = decltype(0'0[o(0)](0))(0)) 1,000.00;  
else return 0==O==0 ? throw O_O : O_O;  
%>();%>
catch(...) { throw; }  

Digraphs and Trigraphs

Once upon a time, when CPUs were still hand-etched by the most skilled artisans of the whole empire, keyboards were still crafted in the traditional manner. Sadly, the acolytes painting the letters had vastly differing skill levels. For example, there was a shortage of those able to properly curl their braces...

Well, at least there were keyboards around that lacked some of the keys commonly needed for the C programming language, such as curly braces. To deal with that, workarounds were added to the language. Digraphs are two-character sequences that the compiler accepts as alternative spellings of a single token, so they have no special meaning inside string literals or comments. Trigraphs are three-character sequences that are replaced with a different (single) character at the very first stage of translation, even inside string literals and comments.

The example uses four different Digraphs, which are replaced as follows:

  • <% and %> become { and }
  • <: and :> become [ and ]
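
As a small illustration of the four digraphs listed above, here is a sketch of the same function written twice, once with digraphs and once with the usual punctuation. The function names are made up for this example and do not appear in the big example.

```cpp
#include <cassert>

// Hypothetical helper written with digraphs only:
// <% and %> stand in for the braces, <: and :> for the brackets.
int sum_with_digraphs()
<%
    int values<:3:> = <% 1, 2, 3 %>;
    int sum = 0;
    for (int i = 0; i < 3; ++i)
    <%
        sum += values<:i:>;
    %>
    return sum;
%>

// The same function with the usual spelling, for comparison.
int sum_with_braces()
{
    int values[3] = { 1, 2, 3 };
    int sum = 0;
    for (int i = 0; i < 3; ++i)
    {
        sum += values[i];
    }
    return sum;
}

// Usage example (also hypothetical): both spellings behave identically.
void example_digraphs()
{
    assert(sum_with_digraphs() == sum_with_braces());
    assert(sum_with_digraphs() == 6);

    // Inside a string literal, a digraph is just two ordinary characters.
    const char* text = "<%";
    assert(text[0] == '<' && text[1] == '%');
}
```

Calling example_digraphs() from main should pass silently, since the compiler sees the same tokens in both functions.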

In practice, two replacements are of importance even today:

  • <: because it can be accidentally triggered when a fully qualified name is used as the first template argument, as in: ::std::vector<::std::uint8_t>. Since C++11 this is no longer a problem, as the language rules were adapted for this corner case.
  • ??/ because it can create total confusion, especially for those using a US keyboard layout. First, notice that on that keyboard layout, the question mark is reached by pressing [shift]+[forward slash], so creating a sequence like ?????????/ can easily happen by accident, when writing a lot of question marks. So, what happens when someone writes a comment that ends in a bunch of question marks? Well, have a look at this code snippet (and ignore the faulty syntax highlighting, if you please):
// why is the line below never executed??????/  
::std::abort();
// because ??/ at the end of a line becomes \
which is the line continuation character, meaning it is still a comment!  

Trigraphs were removed from the language with C++17.
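
Coming back to the <: corner case from the first bullet above, the following sketch shows both the spelling that relies on the C++11 rule and the older workaround with a space. The function names are invented for this example, and a C++11 (or later) compiler is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: the template argument starts with "::", so the
// characters "<::" appear directly after "vector".
std::size_t count_bytes()
{
    // Since C++11, "<::" that is not followed by ':' or '>' is read as
    // '<' and "::" rather than as the digraph "<:" followed by ':'.
    ::std::vector<::std::uint8_t> bytes = { 1, 2, 3 };

    // Before C++11, a space was needed to keep '<' and "::" apart.
    ::std::vector< ::std::uint8_t> more_bytes(2);

    return bytes.size() + more_bytes.size();
}

// Usage example (also hypothetical): three elements plus two elements.
void example_qualified_template_argument()
{
    assert(count_bytes() == 5);
}
```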

Lambda Functions

After mentally replacing <% and %> with { and }, we can finally spy the next construct:

int main() try  
{[/*stuff*/]{
// more stuff
}();}
catch(...) { throw; }  

Inside the try block, there is something that looks vaguely like [/*stuff*/]{ /* more stuff */}();. As we are now at block scope, it has to be a statement, but which statement starts with something in brackets?

The answer is a lambda function. In general, lambda functions are anonymous functions that are usually able to capture values from the scope in which they are created. What does that really mean? It means that it is possible to create a function that has access to the variables around it.

A lambda expression (i.e. an expression that creates a lambda function) begins with what is called its capture list, which is a bracketed list of the variables that it captures. In this installment of the series, we will not look at the actual capture list used in the example, but at boring "old" C++11 captures. Those simply list the variables that should be captured, each of which can be captured by value or by reference. A few examples:

int x, y; // stuff for lambdas to capture
[] {}; // a lambda that captures nothing and does nothing
[&] { x = 42; }; // a lambda that captures everything by reference
[x, &y] { y = x; }; // capture x by value and y by reference

Note that all the expressions in the snippet above only create lambdas; they do not call them. Calling lambda functions is done as you would call any other function: []{}() creates a lambda that captures nothing, does nothing, and is immediately called. The big example from the beginning does something fairly similar: It creates a lambda function that takes no parameters, and immediately calls it. The next parts of this series will further explore its capture list and what it actually does when called.
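
To make the difference between the two capture modes and the immediate call a bit more tangible, here is a minimal sketch. The function names are hypothetical and only serve this example.

```cpp
#include <cassert>

// Hypothetical helper showing by-value versus by-reference capture.
int capture_demo()
{
    int x = 1;
    int y = 0;

    // x is copied when the lambda is created; y is captured by reference.
    auto copy_x_into_y = [x, &y] { y = x; };

    x = 42;          // does not affect the copy stored inside the lambda
    copy_x_into_y(); // writes the captured copy (still 1) into y

    return y;
}

// Hypothetical helper: create a lambda and call it right away.
int immediately_invoked()
{
    int x = 20;
    return [&] { return (x + 1) * 2; }();
}

// Usage example (also hypothetical).
void example_captures()
{
    assert(capture_demo() == 1);
    assert(immediately_invoked() == 42);
}
```

The by-value capture is a snapshot taken when the lambda expression is evaluated, which is why the later assignment to x is not visible inside the lambda.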

Until now, we have only created lambda functions with no parameters. This is pretty boring in the general case, but it is also pretty simple to create a lambda expression that takes a parameter:

[](){}; // a lambda that captures nothing, has no parameters and does nothing
[](int& x){ x = 42; }; // a lambda that takes an int by reference when called
[](auto& x){ }; // a lambda that takes any object by reference when called

With lambda functions it is important to decide whether a value should be captured once, when the lambda function is created, or passed as a parameter at each call.
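
One way to picture that decision is a predicate for a standard algorithm: the value that stays the same for the whole operation is captured, while the value that changes on every call is a parameter. The sketch below assumes a made-up helper named count_above; only std::count_if and std::vector come from the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: counts the elements greater than `threshold`.
int count_above(const std::vector<int>& values, int threshold)
{
    // threshold is fixed when the lambda is created, so it is captured;
    // the element differs on every call, so it is a parameter.
    auto above = [threshold](int value) { return value > threshold; };
    return static_cast<int>(std::count_if(values.begin(), values.end(), above));
}

// Usage example (also hypothetical).
void example_count_above()
{
    const std::vector<int> values = { 1, 5, 10, 15 };
    assert(count_above(values, 5) == 2); // 10 and 15
    assert(count_above(values, 0) == 4); // every element
}
```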

Of course, lambda functions can also return stuff:

[]() -> void {}; // a really boring lambda
[x](int y) -> int { return x + y; }; // captures x, and can add it to different ys.
[x](auto y) { return x + y; }; // same, but with everything deduced by the compiler

Note how the syntax for explicitly specifying the return type of a lambda expression follows the trailing return type syntax for normal functions explored in part 1 of this series, while auto parameters [1] are only available for lambda expressions at this point [2].

For a more comprehensive survey of the lambda expression syntax, please check the overview at cppreference.com. Lambda expressions were added in C++11, and auto parameters for them were added in C++14.
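
The difference between an explicit return type and full deduction can be checked with a short sketch. It assumes a C++14 compiler because of the auto parameter, and the function name is invented for this example.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical example comparing an explicit return type with deduction.
void example_return_types()
{
    int x = 2;

    // Explicit trailing return type: the result is always an int.
    auto add_int = [x](int y) -> int { return x + y; };

    // Generic lambda (C++14): parameter and return type are deduced per call.
    auto add_any = [x](auto y) { return x + y; };

    static_assert(std::is_same<decltype(add_int(1)), int>::value,
                  "explicit return type is int");
    static_assert(std::is_same<decltype(add_any(1)), int>::value,
                  "int + int is deduced as int");
    static_assert(std::is_same<decltype(add_any(0.5)), double>::value,
                  "int + double is deduced as double");

    assert(add_int(3) == 5);
    assert(add_any(3) == 5);
    assert(add_any(0.5) == 2.5);
}
```

The same captured x is reused for every call, while the auto parameter lets a single lambda accept both int and double arguments.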

Gimme more!

This post is part 3 of 7 of a series on rarely used C++ features. To find the other parts of the series around this example of how not to write C++, see this overview page.

Directly continue to the next part, Part 4.

Footnotes

  1. An auto parameter is basically an implicit template, which would be syntactically really awkward for lambda expressions. Of course, it is also pretty awkward for normal functions...

  2. I am still hoping that they will become available for normal functions with C++1z.

Daniel Schemmel

is currently employed at the Chair of Communication and Distributed Systems at RWTH Aachen University, where he researches the testability of distributed systems. He can be reached at blog(at)gha.st.

Aachen, Germany, Terra, Sol, Milky Way, Laniakea SC https://gha.st/about/