Proving one's identity is hard business. Part of that hardness comes from philosophical questions like "If I stop liking computers, does that future me constitute a different person?". In a way, some of these problems are repeated in the biometrics area, which have to distinguish between "real identities" and things that may look the same to their sensors, or even people that lose the things that provided their identity in accidents.
The most common method to deal with any of this is to rely on a Proof of Knowledge instead of a Proof of Identity, as this is not only easier to quantify, but also does not require special sensors and can be exchanged at will — after all if you accidentally left your fingerprints anywhere your biometric identity could be compromised for life1...
The most basic variant of such a Proof of Knowledge is to simply have a shared secret, that you tell your communication partner to establish that you know it. While this is delightfully simple, it suffers from several problems:
- If anyone can hear/see/etc you perform the proof, they can then impersonate your identity.
- You need to use a different secret for every communication partner, or the same thing can happen. Are you using different passwords for every single logon you ever create?
- It is a common misconception that passwords can be memorized by the average person. It is hard enough memorizing even a few actually secure passwords, doing so for every single logon is pretty much impossible, especially since you only need to logon to some services very rarely.
- The secrecy of the shared secret also depends on the other communication partner. Are you sure that none of your passwords has been compromised in some way?
Pretty much all of these problems are due to one simple, basic truth: If you tell a secret to anybody, it is not a secret anymore!
Zero Knowledge Proofs of Knowledge
Most of those problems would simply go away if it were somehow possible to prove knowledge of a secret without giving away anything else. Surprisingly, such methods do exist, and are called Zero Knowledge Proof of Knowledge, or short Zero Knowledge Proof or ZKP, because they perform a proof of knowledge, but give away zero knowledge beyond that.
In spite of the very lofty terminology, ZKPs are really easy to understand. For example, Peter and Victor live next door to one another. One day Peter informs Victor that he knows of a secret door that connects both houses. Of course, Victor is skeptical and wants to see that door – but Peter wants to keep the secret.
To solve the dilemma, they meet up in front of the houses, where Victor has set up video surveillance of the gardens to prevent Peter from running around the back, and perform the following simple test:
- Victor will close his eyes, and Peter will enter one of the houses through the front door.
- Victor opens his eyes again, and rings one of the door bells.
- Peter must now come out of the house where Victor rung the bell.
After performing the protocol, Victor, understandably, is still not convinced, as Peter may have simply gotten lucky. To convince Victor, Peter simply proposes to repeat the test a few times — after all, how often can one get lucky in a row?2
After performing the test for a few times, Victor starts to become convinced that Peter must be telling the truth. Thus, Peter proved his knowledge of a secret passage without giving away anything3 about the secret door beyond the fact that it must exist!
But wait, what if Peter really did just get lucky? After all, people do win the lottery! To show the power of a protocol that can increase confidence by repetition, let us do two comparisons: Winning the Powerball (1 in 292 201 338) in a single try is more likely than successfully cheating 29 times. Guessing a random 12 character (printable) ASCII password (1 in 540 360 087 662 636 962 890 625) in a single try is more likely than successfully cheating 79 times. As you can see, depending on the expense for a single guess (hundreds of dollars for some lotteries down to almost nothing for a single password attempt against an encrypted file), it is easy to ensure that the ZKP is more secure than the other components4.
In the previous example, Peter and Victor performed three steps in turn. Many more realistic ZKP schemes have a similar structure. These Σ-Protocols perform one round of the ZKP in the following way:
- The prover (Peter) commits to something (which door he walks into)
- The verifier (Victor) gives out a challenge (by ringing a door bell)
- The prover (Peter) must now correctly respond to that challenge (by coming out of the right house)
Usually, the commitment and the response are mathematically connected in such a way that the prover either has to correctly divine the challenge or be in possession of the secret whose knowledge is to be proven. This way, a probability $0<p<1$ (in the example $p=0.5$) can be given that gives the chances of successfully cheating during one round. The protocol is then repeated (possibly in an interleaved fashion) $k$ times to give the total cheating probability $p^k$ which can be tuned to perfection by choosing $k$.
While it seems intuitively clear that Victor cannot garner any knowledge from the protocol, we are still lacking a mathematical definition of what "zero knowledge" really means and how the property can be proven. The basic idea how to do so is very simple: If anyone can perform the protocol without knowing the secret, then they can gain nothing by observing someone else doing the protocol!
To implement this concept, we need the idea of a simulator, which is an algorithm that can be used to (efficiently) create a communication trace that cannot (efficiently5) be distinguished from a "real" communication trace. Giving such a simulator is possible since it plays both sides of the communication, and can thus predetermine what challenge will be given.
A simulator for the secret passage between Peter's and Victor's can be easily given: Simone can perform the protocol by simply deciding beforehand which door bell will be rung — and then simply committing to the right house. This way the cheating possibility is used to create a simulated communication that cannot be distinguished from a recorded transmission. It also shows quite well that such ZKPs need to be interactive; after all it is impossible to distinguish a recorded logon by someone knowing the secret from a simulated one by someone not knowing it.
Proves of Knowledge are a cornerstone of the internet, where they are often used to prove identity. Common password authentication is one example that is particularly bad, as it requires the user to give the secret away to someone else — and a secret told is not a secret anymore.
It is possible to perform a so called Zero Knowledge Proof of Knowledge that proves the knowledge of a secret with arbitrary security. A common scheme to construct such protocols is that of Σ-Protocols, which consist of a commitment by the prover, followed by a challenge by the verifier and a response from the prover.
Mathematically the whole concept is supported by the idea of giving an efficient simulator algorithm that can generate something that looks identically to a recording of a successful Zero Knowledge Proof of Knowledge. The existence of a simulator not only proves the "Zero Knowledge" property, but also ensures that a recorded logon attempt cannot be used afterwards to prove anything.
In the next post a ZKP is given that is built on a more interesting mathematical problem.
I am sorry to say that I made not only that mistake, but have also shown my face in public (potentially compromising face detection) and even touched things without properly bleaching everything immediately afterwards (potentially compromising DNA based systems). Since neither fingerprints nor DNA will probably change significantly soon, I can only hope that face detection prevails — and that I can find a good cosmetic surgeon... ↩
Assuming Victor chooses the bell to ring at random with equal probability, Peter has a 50% chance of guessing correctly each turn. Since only one failure to come out of the correct door will cause him to fail the test, he has to guess correctly each time to fool Victor. The chance to guess correctly $k$ times in a row is simply $0.5^k$, giving a chance of only about one in a thousand for only ten repetitions. ↩
Yeah, yeah, the model example breaks down when Victor brings in CSI, or installs video surveillance not just in the garden, but also in his house, peeks during the first step, etc. pp. ↩
Note that in practice, the security of the ZKP also relies on how hard the secret can be found. If Victor where to bring in some kind of Ground Penetrating Radar, he might be able to find the door on his own. Similarly for real ZKPs, advances in other areas may very well allow an attacker to simply compute the secret themselves or to break cryptographic primitives (mainly commitment schemes) that are used by the ZKP. ↩
Most ZKP schemes require a way to create a commitment, i.e. a message that is both binding, meaning that the person that created the commitment cannot change what they committed to afterwards, and hidden, which means that nobody can divine the value that they committed to. This is usually done by assuming that some problem is so hard, as to be impossible in the real world. One possibility is to use a cryptographic hash function with a large per commitment salt; it is binding, as the issuer of the commitment would need to find multiple preimages mapping to the same value and hiding, as the attacker would have to guess the salt. An attacker with ridiculously magically double-infinite resources could nevertheless break both properties. ↩