Unbreakable Cryptography in 5 Minutes

What if I told you that unbreakable cryptography exists? What if I told you that this article has content which is illegal in certain countries, and may be under export control in the US? Well, that’s precisely what I’m about to tell you. Unbreakable cryptography exists. Cryptographic technology is illegal, or heavily restricted, in certain countries [0], and some forms remain under export control within the vagaries of US law [1].

This article will teach you: (1) a bit of the history of cryptography, and (2) how to implement unbreakable cryptography on your own in 5 minutes or less.

There are three steps to consider:

  1. How will you generate a key?
  2. How will you encrypt a message?
  3. How will you decrypt a message?

To ensure we hit the 5 minute mark, I will begin by showing you an example in Python which we will discuss for the rest of the article. First, generating a key (Gist):

1
2
3
4
from os import urandom

def vernam_genkey(length):
    return bytearray(urandom(length))

second, encrypting a message (Gist):

1
2
3
4
5
def vernam_encrypt(plaintext, key):
   return bytearray(
           [ plaintext[i] ^ key[i]
               for i in xrange(len(plaintext))
           ])

finally, decrypting a message (Gist):

1
2
3
4
5
def vernam_decrypt(ciphertext, key):
   return bytearray(
           [ ciphertext[i] ^ key[i]
               for i in xrange(len(ciphertext))
           ])

The Python code listed above implements a Vernam Cipher [2]. Coupled with the Mauborgne Constraint [3]—you may never reuse a key, it must be the same size as the plaintext, and it must be from a true random bit source—you have unbreakable cryptography. Vernam was granted a patent in 1919 for this idea, and the US NSA considers this patent, “…one of the most important in the history of cryptography.” [4]

Vernam XOR Stream Cipher

Well, that was quite a mouthful. Don’t give up, we’ll go through things one at a time until you have a good understanding of what’s going on! The good news: although provably unbreakable [5], meaning it has been shown and can be shown through rigorous mathematical proof that no information from the original message may be gleaned from the encrypted form, the algorithm I am presenting is incredibly simple and easy to understand.

The entire technique hinges on the logical XOR function and its symmetry. XOR is a logical function which is equivalent to true precisely when the number of true inputs is an odd number.

Here is a truth table showing XOR’s value when provided two inputs (B1 and B2):

B1 B2 XOR
0 0 0
0 1 1
1 0 1
1 1 0

Note: Here we assume two possible bits of input—B1 and B2. We then show the XOR result for all combinations of true (1) and false (0).

I claimed that XOR is symmetric, because if you XOR an original bit sequence
with a second bit sequence twice, you will get the original again. Here is an
example, as another truth table:

A B XOR1 B XOR2
1 1 0 1 1
0 0 0 0 0
0 1 1 1 0
0 1 1 1 0
1 1 0 1 1
1 0 1 0 1
0 1 1 1 0

Note: When we XOR A—our ‘original message’—with B—our ‘key’—and we XOR the result—our ‘encrypted message’—with the B—‘key’—again, we are left with A—the ‘original message.’

It is because of this property that XOR works as an encryption and decryption function—there is nothing else to it. Once you choose a key, you simply XOR the message you want to encrypt and that gives you the encrypted form. On the other end, when you want to decrypt, you simply XOR the encrypted form with the key again and you will be left with the original message.

Encryption: Explained

Let’s revisit the Python code and make sure we’re all on the same page. First, the encryption method:

1
2
3
4
5
def vernam_encrypt(plaintext, key):
   return bytearray(
           [ plaintext[i] ^ key[i]
               for i in xrange(len(plaintext))
           ])

Okay, so this Python code takes in two arguments:

  1. plaintext – this is the original message (assumed to be of type bytearray)
  2. key – this is the key to use for encrypting (assumed to be of type bytearray)

and it returns something called a bytearray. bytearray‘s in Python are what you think they are, just an array of raw bytes with no further interpretation. By operating on bytearray‘s we maintain the highest level of abstraction and can encrypt or decrypt any message with any content.

The bytearray it returns is constructed by iterating over the elements within the plaintext, and using the XOR ^ operator to XOR each byte bit-by-bit; thus, 8 XOR’s are accomplished per byte. We iterate through elements by using an index variable i which goes from 0 to the length of plaintext via the xrange Python built-in function.

Decryption: Explained

I stated earlier that XOR is symmetric; thus, encryption and decryption are fundamentally the same function:

1
2
3
4
5
def vernam_decrypt(ciphertext, key):
   return bytearray(
           [ ciphertext[i] ^ key[i]
               for i in xrange(len(ciphertext))
           ])

we’ve only changed the name of an input here from plaintext to ciphertext, but everything remains the same as before.

Key Generation: Explained

We still need to generate a key and for that we need a source of randomness. Because finding sources of randomness is hard, we turn to a Python library function to generate random bytes for us. In this case, we use the urandom function from the os library distributed with Python:

1
2
3
4
from os import urandom

def vernam_genkey(length):
    return bytearray(urandom(length))

Our key generation function takes in one argument:

  1. length – the length, in bytes, of the key to be generated

and returns one result, a bytearray filled with random bytes of the length specified.

What’s Left? Millenia of Cryptographic Frustration!

Simple right? Well, mostly. If you’re thinking ahead, you might wonder how will we distribute these keys? You’ve inadvertently stumbled upon a problem [6] that has bothered cryptographers for millenia.

The nasty business with cryptography is how to securely share a secret such as a key without having a secure channel in the first place. It’s a manifestation of a chicken-and-egg or catch-22 problem: to create a secure channel, we must first have a secure channel. For millenia, this remained a central problem in cryptography. Until Diffie-Hellman-Merkle [7] (often cited as just Diffie-Hellman) published a secure key exchange method using mathematical tricks which nullified the need for a secure channel to exchange a secret key, the best way to exchange keys was physically in person. Their method is thought to have been the earliest, but secret government organizations may have invented this technology before them. I’d love to go through Diffie-Hellman-Merkle as one of the simplest, yet secure, key exchange algorithms, but that is a discussion best left for a future article to tackle.

Let me know if you’d like Diffie-Hellman-Merkle explained!

Errata:

  1. Fixed first table XOR values were incorrect (B1=1,B2=1,XOR=1 -> B1=1,B2=1,XOR=0). Thanks to Kenton Murray for pointing this out.

23 thoughts on “Unbreakable Cryptography in 5 Minutes

  1. Another way is to have the key based on a formula that combines of the date / location of the message thus negating the requirement of secure key distribution.

    • Yes, that’s true; however, I think this would be vulnerable to a patient attacker who could synchronize his clock to yours and also know your location (or just know exactly when the message was sent—or able to try several keys within a few hundred seconds of you sending).

  2. Well explained, thanks.

    A question comes to mind: if it’s so easy, why do so many people and companies get it wrong so often ?

    Also, what are the potential risks of using urandom as a source of randomness for keys (instead of a true random source) ?

    • Well, the fact is that it isn’t easy. Generating and distributing these keys is hard. Cryptosystems always have weaknesses that can be attacked—meaning you have to “update” your crypto practices to remain safe.

      In addition, most errors creep in from operator error: using weak keys/passwords, reusing keys when they shouldn’t, etc.

      Using a pseudo-random number generator (PRNG) like urandom means that your “randomness” has a period—it repeats after a certain time point. Attackers would benefit from you then reusing keys and in addition they only need to find the “seed” of your PRNG to regenerate the key stream. Fundamentally, a PRNG cannot provide true randomness.

  3. As a matter of style,
    I suggest that the author actually footnote the references (like item [6] above, for example), rather than make them links off of your article. Allow your article to stand on its own, without having to leave page, and in the footnote, make the link explicit. I found it annoying to click on the reference and be sent off of the page.

  4. Of course the “Mauborgne Constraint ” is what makes this form of encryption impractical for most modern communication since you would keep having to send sufficiently long keys to encrypt large streams of data.

    • Yes, although I hear militaries still distribute codebooks for OTP cryptography today (physically). But in general, impractical.

      If I had to do this myself I’d choose some reference text or infinite number to sample from and create keys. Yes, it’s less secure than true random (the key is out there in the wild), but, it would allow participating parties access to the key without needing to necessarily distribute it. For example, one could use positions of digits of pi as the key in some fashion.

  5. Simple and direct. This is really the best “show don`t tell” on cryptography i have read!

    ps: really like to read the “Diffie-Hellman-Merkle” thingy. 🙂

  6. Diffie-Hellman-Merkle doesn’t solve the MITM (man-in-the-middle) problem. As far as I know this cannot be solved mathematically or through code which is why SSL is fundamentaly centralized and broken (if you don’t trust the central authorities). So what is there for solutions to this?

Leave a Reply

Your email address will not be published. Required fields are marked *