In the previous installment we discovered the ancient digraphs and trigraphs, as well as the more modern lambda expressions. We did, however, leave out what exactly the capture list in the guiding example of this series means, which is the first topic of this post. Afterwards we will spend a little time on how a URL can appear directly in the code.

The guiding example of this series is still the same and should now start to make a small bit of sense if you have read the previous posts in this series:

/* This is valid C++ */
auto main() -> decltype('O.o') try
<%[O_O = 0b0]<%
https://gha.st/a-tour-of-rare-cpp-features/
typedef struct o O;
o*(*((&&o(o*o))<:'o':>))(o*o);
if(O*O = decltype(0'0[o(0)](0))(0)) 1,000.00;
else return 0==O==0 ? throw O_O : O_O;
%>();%>
catch(...) { throw; }


The capture list in this example uses a C++14 feature that generalizes how captures are done. Instead of simply listing variables to capture, it is possible to capture the result of an expression. The most common case where one would want to do that is explored below:

struct S {
    int x;

    // this is disabled, as it won't compile
#if 0
    auto f() { // the type of lambdas needs to be deduced automatically
        return [&x]{ return x; };
    }
#endif

    auto g() {
        return [this]{ return x; };
    }

    auto h() {
        return [&x = this->x]{ return x; };
    }
};


The first function, f, would not even compile, because you cannot capture x directly. This is due to the fact that in member functions of S, the usage of x is only a shorthand for this->x, meaning there really is no variable called x in that scope.

The second function, g, instead simply captures this, which allows the usage of the members inside the lambda. Interestingly enough, the normal ease of syntax is preserved and we can simply use x directly inside the lambda. This does however capture more than necessary: Instead of just capturing x, we capture the whole object around it.

Finally, the third function, h, makes a reference variable x available inside the lambda that is initialized with the expression this->x.
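To make the difference between the capture forms more tangible, here is a minimal sketch. The names Counter, by_reference, by_copy and make_reader are made up for this illustration and do not appear in the guiding example. It shows a by-reference init capture, a by-copy init capture, and an init capture whose initializer is a move, something a plain capture list cannot express.

```cpp
#include <memory>
#include <utility>

// Hypothetical type, used only for this illustration.
struct Counter {
    int value = 0;

    // Init capture by reference: the lambda observes later changes to the
    // member, but must not outlive the Counter object.
    auto by_reference() {
        return [&value = this->value] { return value; };
    }

    // Init capture by copy: the lambda keeps its own snapshot of the member.
    auto by_copy() {
        return [value = this->value] { return value; };
    }
};

// The initializer can be any expression, for example a move of a
// move-only object into the closure.
inline auto make_reader(std::unique_ptr<int> p) {
    return [p = std::move(p)] { return *p; };
}
```

If value is changed after both lambdas have been created, the one from by_reference returns the new value, while the one from by_copy still returns the old one.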

All three functions use a feature that is really interesting: Their return type is deduced automatically by the compiler. Note how similar this is to the trailing return type syntax introduced in part 1 of this series: It simply leaves the trailing return type empty.
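As a small sketch of that similarity, the following compares a function with an explicit trailing return type to one where it is left out. The function names are invented for this example, and a compiler in C++14 mode or later is assumed.

```cpp
#include <type_traits>

// Trailing return type spelled out explicitly (C++11 style).
auto explicit_sum(int a, int b) -> int { return a + b; }

// Trailing return type left out: the compiler deduces int from the
// return statement (C++14).
auto deduced_sum(int a, int b) { return a + b; }

static_assert(std::is_same<decltype(explicit_sum(1, 2)),
                           decltype(deduced_sum(1, 2))>::value,
              "both functions return int");

// A closure type has no name that could be written down, so deduction
// is the natural way to return a lambda directly.
auto make_adder(int n) {
    return [n](int x) { return x + n; };
}
```

Calling make_adder(2)(3) adds the captured 2 to the argument 3.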

Binary Literals

Going back to the guiding example, we can now see that the capture list O_O = 0b0 makes a variable O_O available inside the lambda that is initialized with the expression 0b0. As a rule of thumb, anything that starts with a digit is likely going to end up being a number of some kind[1]. For example, 0xaa is the hexadecimal number aa, which is 170 in decimal.

You may have noticed that the zero in hexadecimal is therefore written as 0x0, which is eerily similar to 0b0. There is a very good reason for this similarity: The same way that something starting with 0x is a hexadecimal number, anything starting with 0b is a binary number, with 0b0 simply being the binary number 0.

assert(0x0 == 0); // zero is zero
assert(0b0 == 0); // no matter which base it is in
assert(0b1 == 1 && 0x1 == 1); // same goes for 1
assert(0b10 == 2); // 0b10 is (1 * 2**1) + (0 * 2**0)
assert(0b10000 == 0x10); // four digits of binary make up one digit of hex
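Where binary literals tend to pay off in practice is with bit masks, since each set bit is directly visible. The following is only a sketch: the flag names and the helper has_flag are made up for this example.

```cpp
#include <cstdint>

// Hypothetical permission flags; each literal has exactly one bit set.
constexpr std::uint8_t kRead    = 0b001;
constexpr std::uint8_t kWrite   = 0b010;
constexpr std::uint8_t kExecute = 0b100;

// Returns true if any bit of `flag` is set in `flags`.
constexpr bool has_flag(std::uint8_t flags, std::uint8_t flag) {
    return (flags & flag) != 0;
}

static_assert((kRead | kWrite) == 0b011, "combining flags sets both bits");
static_assert(0b10101010 == 0xaa, "the same value as the hexadecimal example");
```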


"URL Literals"

Finally, it is time to talk about the inside of the lambda function that is created and executed inside the function try block of our main function. (These three are listed in the reverse of the order in which this series explored them.) The first line that we are going to explore is line 4 of the guiding example, which contains what looks like a literal URL, directly inside the source code!

https://gha.st/a-tour-of-rare-cpp-features/


To understand this, we really have to think more like a compiler and less like a human being. The line begins with what looks like the identifier https. This is followed by a colon. The rest is pretty easy now: Something that starts with two forward slashes is a line-comment!

An identifier followed by a colon is something that already has meaning in the C programming language: A label, to be used as the target of a goto. In fact, we could now write goto https; as the next statement to cause an infinite loop.

As it turns out, there is no such thing as a URL literal[2]; the line really is nothing more than a label followed by a comment. This also restricts the number of fake "URL literals": labels must be unique within each function, so we could not add another link to an https-enabled address inside the same function. Additionally, it may cause your compiler to give a warning; for example, Clang reports unused labels when -Wunused-label is used.
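To show that the label is a real jump target, here is a minimal sketch. The function count_visits and the placeholder address are invented for this example; the loop is bounded by a counter so that it terminates.

```cpp
// Counts how often execution passes the label before `limit` is reached.
inline int count_visits(int limit) {
    int visits = 0;
https://example.com/everything-after-the-slashes-is-a-comment
    ++visits;                            // the statement the label is attached to
    if (visits < limit) goto https;      // jump back to the "URL"
    return visits;
}
```

Since the label is the target of a goto here, the unused-label warning mentioned above should not apply to this function.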

Since this construction does not even give a value that could be assigned to a variable, I would suggest that this construction, while interesting intellectually, should be banned on pain of, well, pain. A normal comment does the same job without any of the drawbacks associated with the label and only requires two additional characters.

Gimme more!

This post is part 4 of 7 of a series on rarely used C++ features. To find the other parts of the series around this example of how to not write C++, see this overview page.

Footnotes

1. There is a sort-of-exception to this rule. The introduction of user-defined literals allows the creation of an operator that will immediately consume the literal, and that will usually return a class type. For example, the standard library comes with literals for hours and minutes, allowing one to write code such as assert(1h + 30min == 90min);.

2. Since C++11, it is possible to add URL literals as user-defined literals, but they would look like something along the lines of "https://gha.st/a-tour-of-rare-cpp-features"_url.
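A minimal sketch of what such a literal could look like follows. The type Url and the suffix _url are assumptions made for this illustration, not an existing library facility, and a real implementation would presumably parse and validate its input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical URL wrapper that simply stores the text it was given.
struct Url {
    std::string text;
};

// Literal operator: turns "..."_url into a Url object.
inline Url operator""_url(const char* s, std::size_t n) {
    return Url{std::string(s, n)};
}
```

With this in scope, "https://gha.st/a-tour-of-rare-cpp-features"_url is an expression of type Url, unlike the label-plus-comment trick, which yields no value at all.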