by Jonathan Kujawa
In September the New York Times reported on an interesting tidbit in the annual budget request by the National Security Agency (NSA). You can read the full NYT article here, but the relevant lines are highlighted in this image from the NYT's article:
The NSA claims to be able to insert hidden weaknesses into the cryptographic technologies which are used to keep our data safe on the internet. Admittedly, in recent weeks alone the NSA has been accused of interdicting internet routers during shipment and inserting bugging devices, targeting human rights groups, and recording all audio from phone calls to/from the Bahamas. By now it's hard to keep track of all the various outrages committed by the US government in the name of security.
So why do I want to talk about something which is, comparatively speaking, old news? First, because it is an ongoing issue and inserting weaknesses into widely used cryptographic schemes has the potential to affect everyone — not just those targeted by the NSA. If there are weaknesses, then other nefarious characters may be able to exploit them. So much for the “They aren't interested in me” and the “I have nothing to hide” arguments.
Second, this isn't a case of bureaucrats over-promising and under-delivering. If we dig into the mathematics we find plausible independent evidence that the US government did indeed insert a backdoor into a widely used cryptographic system. It seems there is at least one branch of the federal government who can deliver on their IT promises.
First, let me tell you the good news:
- If the NSA put the effort into installing a backdoor it is reasonable to infer that it is still hard (or impossible) for them to breach properly implemented cryptographic schemes.
- The NSA unarguably far exceeds anyone else in code-breaking. If they can't breach it without a backdoor, nobody can.
- The backdoor I'll discuss below gives no advantage to anyone who doesn't know the backdoor. Run-of-the-mill hackers gain no advantage from its existence.
- The NIST recently dropped the algorithm in question from its list of accepted standards.
- The generally feeble NSA law currently under consideration by Congress does have an amendment forbidding the NSA's involvement setting future cryptography standards.
So how is it the NSA is involved in setting standards for internet cryptography? The National Institute for Standards and Technology (NIST) is the federal agency charged with setting the standards for commerce within the US. Everything from weights and measures to cryptography standards. Under federal law NIST is required to consult with the NSA before setting cryptography standards. Whatever standards NIST sets, they becomes the standard for the US government, US commerce, and, by extension, the world.
On the one hand this gives the NSA the opportunity to suggest ways to beef up the security of proposed cryptography standards. And, indeed, they've done just this in the past. But now it seems the NSA has gone in the other direction.
If you are the NSA, how would you go about inserting a weakness while appearing to favor increased security?
Let's first talk about an alternative scenario. Imagine you are head of security for your local bank — let's call it the National Savings Association — and you're in cahoots with your felonious cousin Louie. Every person has an ATM card and is required to have a four digit PIN number. If a customer can choose from any of the ten possible digits for each of the four numbers, then there 10*10*10*10 = 10,000 possible PINs. That is, Louie would have 10,000 different PINs to try when doing a trial-and-error search for a customer's PIN.
But now imagine that in the name of “security” you prohibit customers from using a number more than once in their PIN. So 7926 would be okay, but 1242 would be outlawed. That seems reasonable and you could probably even sell it to your boss since it would prevent “easy to guess” PINs like 1111 or 1212. If the National Savings Association implemented your suggestion, then you can have any digit for the first number, any number for the second digit except the one chosen for the first digit, any number for the third digit except the two used for the first two digits, etc. In short, the number of possible PINs is now 10*9*8*7 = 5,040.
This “sensible” precaution to “enhance” security actually makes it dramatically weaker. Louie now needs to search through only half the number of PINs. Quite the favor you've handed your cousin Louie! [1]
This is the worst sort of weakening since it makes it easier for everyone to breach the security. The better sort is a backdoor. Like its name suggests, this is something which allows those in the know to get through while keeping everyone else out. Not only a tired movie trope, a backdoor is exactly what the NSA claims to have.
The fundamental principle behind all cryptographic systems is to have a procedure which is easy for computers to do but extraordinarily hard to undo. The widely used RSA encryption scheme depends upon the fact that it is easy to multiple two extremely large prime numbers, but if I keep those numbers secret and just give you their product, then it is extremely hard to factor what I gave you and discover my secret numbers.
As computers and factoring algorithms get more powerful we are forced to use ever larger numbers to maintain the security of systems based on the multiplication of numbers. In the mid-80's Neal Koblitz and Victor Miller suggested that we could instead use multiplication in alternative number systems. Since these number systems arise from gadgets called elliptic curves, this is known as elliptic curve cryptography (ECC).
An elliptic curve is just a special sort of curve drawn on the xy-plane. The “numbers” of an elliptic curve number system are the points on the curve. The key fact is that each straight line which intersects the elliptic curve in two points always also intersects it in a third point (we introduce one extra point called infinity to count as the third intersection point of vertical lines). To “add” two points, you draw the line through them. The third point hit by the line is said to be their sum (or, more precisely, the negative of their sum). It can be checked that this rule has all the nice properties you expect from addition. Once you have addition you also have multiplication and are good to go. Here's a nice picture from Wikipedia showing how the addition rule works to give you P + Q = R:
One advantage of an ECC is that each elliptic curve gives a new number system. More importantly, it is significantly harder to factor in a properly chosen elliptic curve number system than in the usual number system. According to Wikipedia, at the moment the largest RSA key which has been broken has 768 bits while the largest broken ECC key has 112 bits.
In short, ECC gives you better security for less effort. And indeed ECC is widely used. If you go to google.com and click on the little padlock in the bar of your browser, you can bring up information about their encryption scheme. When you do you'll see ECDHE. This stands for Elliptic Curve Diffie-Hellman Ephemeral.
The claim is that the NIST set standards for ECC which allow for a backdoor. Remember, these are the same standards which were created with the advice of good ol' NSA!
How do we know a backdoor is possible? A wee bit of math: The NIST standard gives an elliptic curve and two points on that curve. Let's call them P and Q. For some strange reason, the elliptic curve chosen by NIST has a prime number of points. This doesn't have to be the case, but since it is we know that if we add P to itself some number of times, then it must equal Q [2]. This means that the two points are not completely independent. This small interdependency is enough to crack the code. Specifically, you just need to know exactly how many times you must add P to itself to equal Q. Since NIST picked P and Q, it's not much of a stretch to imagine they also know this number. Fortunately, knowing there is a number is not enough and finding it after the fact is nearly impossible. This is why hackers can't use the backdoor.
It is fairly easy to prevent this backdoor even existing by letting P and Q vary and not always be the same fixed points. Or by choosing an elliptic curve with a non-prime number of points. This would then let them fix a P and Q which are truly unrelated. For some reason they didn't do this.
We can't prove they actually made a backdoor. What we can say is that the standards chosen by NIST allows for a backdoor and whomever knows this backdoor can easily bypass the security of anyone who uses their standards. Either the clever code-breakers at the NSA failed to notice this possibility or hoped we would.
The fact that the NIST standards allowed for a backdoor was noticed and publicized by Microsoft security researchers way back in 2007. This was the good old days before Snowden when the fact that the NSA could insert a backdoor was only an idle curiosity.
If you'd like a readable overview of the backdoor in question, I suggest this article by Slate. If you'd like a rather more technical discussion of how ECC works, I suggest this article by Arstechnica. And if you'd like to know the actual, undergraduate math major level mathematics which shows how a backdoor is made possible by the NIST ECC standards, I recommend this article by Thomas Hales which appeared in a recent issue of the Notices of the American Mathematical Society. It was this last article which impressed upon me that the backdoor was not just a theoretic possibility but a likely reality.
[1] Shockingly, despite having 10,000 PINs to choose from, over 20% of people choose one of the following five: 1234, 1111, 0000, 1212, 7777. Don't be one of those people!
[2] Why does having a prime number of points on the elliptic curve make P and Q interdependent?
Imagine you have broken clock where the hour hand jumps by two at every tick. If it starts at 3, then it will go to 5, 7, 8, 11, 1, and finally back to 3. You will only get the odd numbers. If instead it jumps by four at every tick, then you will get 3, 7, 11, and back to 3. On the other hand, if you have a clock which jumps by 5 at every tick, then you get 3, 8, 1, 6, 11, 4, 9, 2, 7, 12, 5, 10, and finally back to 3. In the first two cases, you miss some numbers as you cycle around, but in the last case you hit them all before you get back to the start. The difference is that 2 and 4 share divisors with 12, but the only number which divides both 5 and 12 is 1.
Now say you have a clock-face with 13 numbers on it. There is no number which shares divisors with 13 (because 13 is prime!) so any size jump will traverse through all the numbers. In just the same way, if your elliptic curve has a prime number of points (like 13), then if you keep doing jumps of size P you must eventually land on Q.
[3] The Electronic Frontier Foundation.