Saying A Lot While Saying Nothing At All

by Jonathan Kujawa

In 1969, the mathematician Hans Freudenthal proposed a seemingly unsolvable puzzle:

Abbas says to Sally and Penelope: I have two whole numbers between 1 and 100 which are not equal and whose sum is not more than 100. Abbas tells the sum of these numbers to Sally only, and their product to Penelope only. These communications remain secret. You must figure out the pair of numbers. The only other thing you know is that once Abbas spoke to Sally and Penelope, you overheard the following conversation:

1. Penelope says, “I know them not.”
2. Sally says, “I already knew that.”
3. Penelope says, “Now I know them.”
4. Sally says, “Now I know them, too.”

Amazingly, even though it seems you know hardly anything and the conversation between Sally and Penelope seems useless, it is in fact possible to determine the two numbers! Possible, but not easy! When Martin Gardner wrote about this puzzle in Scientific American, he called it the Impossible Puzzle. For a discussion of the Impossible Puzzle (including a solution!) and several similar brain-teasers, see John Berkhardt’s webpage.

The invasion of Ukraine by Russia seems to put us in a new Cold War. Certainly, it’s destroyed several decades of uneasy trust between NATO and Russia. This brought to mind a clever bit of mathematics that can be helpful when you don’t trust someone: zero-knowledge proofs. They are in the same spirit as the Impossible Puzzle, but are rather different creatures.

The general scenario is that I want to convince you that I know something or have done something, but besides being convinced, you don’t learn anything new from our conversation. Even if I trust you, I could be worried that our communications will be intercepted by a baddie and still want our conversation to exchange no actual information.

For example, maybe I solved one of the million-dollar Millenium Prize problems and I would like to convince you that I actually have a solution. The easy thing to do would be to lay out all the details. That would convince you. But what if I fear that you’ll claim the prize for yourself? Or fear the NSA is listening in and will steal my solution to the P vs. NP problem for their own nefarious purposes?

You can imagine many real-world scenarios where this might come up:

• Two nuclear powers might want to be able to convince each other that they are upholding their end of a treaty without giving their opponent any information about their weapon systems.
• You might want to be able to verify your identity on your bank’s website without giving up any revealing information to hackers who might be listening in.
• Online voting systems might want to let you verify that they have recorded your vote without revealing any information about your vote.
• A company might want to convince investors they have successfully built a quantum computer, but don’t want to risk revealing any information about how it works to potential corporate spies.
• An informant might want to convince the FBI that they have the goods on the Icelandic Mafia but not reveal any information until after a plea deal is finalized.
• I might want to convince you that I solved today’s Sudoku puzzle, but not give any spoilers in case you haven’t yet tried it.

In 1982, three theoretical computer scientists Goldwasser, Micali, and Rackof considered this problem and showed that, as paradoxical as it sounds, such scenarios can actually be created. This is what they called zero-knowledge proofs.

To solve this problem, we need to be a bit more precise about what we are trying to achieve. The setup is that Abi knows how to solve some problem (or has some information), and she wants to convince Lilly that she has the solution. Furthermore, even after Lilly is convinced, she should know nothing more than she could have figured out without interacting with Abi. So no information was transmitted from Abi to Lilly. This is the “zero knowledge” aspect of a zero-knowledge proof.

The “proof” part of a zero-knowledge proof is the procedure or algorithm Abi uses to convince Lilly. Here “proof” should really be in quotes. In pure mathematics, a proof is meant to be 100% convincing. Euclid’s proof that there are infinitely many prime numbers is still unassailable by all skeptics even after 2,300 years. But in applied mathematics and in the real world, we are willing to allow for the possibility of error.

In a zero-knowledge proof, while 100% is not usually possible, Lilly does get to decide how certain she would like to be. If it is something like a Sudoku, maybe Lilly only needs to be 90% certain. If it is thermonuclear warfare, then maybe Lilly would like 99.999999999999999999999% certainty. The required level of certainty typically determines how many times the proof is rerun. For example, it could be that each time it is run the chances of an error goes down by half: 50%, 25%, 12.5%, 6.25%,…

It is important to point out that Abi is assumed to be acting honestly here. She isn’t allowed to cheat or lie. We are assuming that she actually has the claimed information and is sincere about trying to convince Lilly.

Let’s think about the scenario of a Sudoku puzzle. Remember, in Sudoku you play on a 9 x 9 grid [1]. You are given some of the numbers and must fill in the rest with the rule that each row, column, and 3 x 3 subgrids must contain exactly the numbers 1, …, 9. Abi has solved today’s Sudoku and would like to convince Lilly that she has the solution.

Abi’s first try at a “proof’ is simple. She will let Lilly ask her about any of the rows, columns, and subgrids, and Abi will reveal her solution to that part of the puzzle so that Lilly can verify it for herself.

Lilly says she will be satisfied with 90% certainty. Since there are 9 rows, 9 columns, and 9 subgrids, this means she needs to be convinced that at least 25 out of the 27 rows/columns/subgrids are correctly filled out. This means that Abi will need to run her procedure of revealing a row/column/subgrid at least 25 times. If all 25 checks work out, then Lilly will be satisfied.

There is an obvious flaw in Abi’s procedure. It’s not zero-knowledge. In fact, if she is a little clever, Lilly only needs nine checks to see all the boxes of the Sudoku and, hence, to know Abi’s solution. In the very first check, Lilly will already learn some of the entries of Abi’s solution to the puzzle. This is far from zero knowledge! Abi is rightly worried that Lilly (or eavesdroppers) will learn information about her solution.

Here is the clever bit. To make a procedure zero-knowledge, a useful trick is to inject randomness. With Abi’s one solution to the Sudoku puzzle, she is able to make a library of potentially thousands of other solutions. Namely, if the 9 x 9 grid is filled with a valid solution, then she can do a substitution (say replacing all 1’s with 3’s, all 3’s with 2’s, and all 2’s with 1’s). Since all rows, columns, and subgrids have exactly the numbers 1, …, 9, a substitution will just rearrange the entries of every row, column, and subgrid, so result in a new valid solution to Sudoku. There are 362,880 possible substitutions [2].

Abi’s modified procedure includes this randomness. When Lilly asks about a particular row, column, or subgrid, Abi will randomly choose one of the thousands of valid solutions related to her’s by a substitution, and show the row from that solution. Since this randomly chosen Sudoku puzzle is still solved, the row still checks out and Lilly is that much more convinced that Abi’s solution is valid. However, if Lilly later asks about that same row, she is likely to see a row from a completely different puzzle. It will also check out, but has no meaningful relation to the one she saw previously, nor Abi’s original solution.

So, while each time Lilly asks to see a part of the puzzle she becomes more and more convinced that Abi has a solution, Lilly is as ignorant as ever about Abi’s solution!

Of course, in the case of Sudoku, there are only 27 checks needed to verify the entire puzzle. It is entirely practical to achieve 100% certainty by checking them all. So this would be a zero-knowledge proof that can easily achieve 100% verification. In more elaborate situations one can easily need billions of checks (or even infinitely many!) and 100% may be forever out of reach.

At first glance, it seems that situations which allow for a zero-knowledge proof must be awfully specialized. It doesn’t help world peace if this method can only be applied to Sudoku. In 1991, Goldreich, Micali, and Wigderson showed that the solution to any NP problem has a zero-knowledge proof. NP problems are problems that are hard to solve, but if you have a solution it is easy to check correctness.

For example, it is hard to factor a number into its prime factors, but if you give me the factors, it is easy for me to multiply them together on a computer to see if the result is your original unfactored number.

What does this mean? Well, if a problem is easy, then probably nobody is going to care if you have a solution. So hard problems with checkable solutions are exactly the sort for which you might want to be able to provide a zero-knowledge proof. That is, NP problems are just the sort you’d like to use with zero-knowledge proofs. If you have a quantum computer that can easily factor large numbers, you’d sure like to convince investors it works without revealing the inner workings of your computer.

It is worth mentioning that while the Goldreich, Micali, and Wigderson result tells us that every interesting problem has a zero-knowledge proof, it does so in an indirect way. It is an active area of research to develop zero-knowledge proofs for the solutions of interesting problems (including Sudoku!).

There is a great Numberphile video starring Avi Wigderson where he explains zero-knowledge proofs, a particular zero-knowledge proof problem involving map colorings, and how that plus an amazing result from the 1970s yields the theorem of Goldreich, Micali, and Wigderson.

[1] You can learn the rules of Sudoku here on Wikipedia. To play Sudoku, you can find countless games online.

[2] Sudoku purists will point out that at the beginning of a puzzle you are given some of the numbers, so many of these substitutions will not be valid solutions for the puzzle of the day. After all, if there is a 2 in the upper left corner, you aren’t allowed to replace that 2 with a 3 and still have today’s puzzle. Experts will know that it is customary for Sudoku puzzle designers to make it so that a given puzzle has exactly one solution, so this substitution trick won’t work on real-world Sudoku puzzles. Nevertheless, it gives you the flavor of how to make things zero knowledge.