Chapter 4 of Real World Algorithms.
Panos Louridas
Athens University of Economics and Business
The simplest encryption and decryption scheme is probably a substitution cipher, also known as Caesar cipher.
In Python, we takeeach character's encoding, via ord()
.
We calculate its position in the alphabet by subtracting the encoding for A
.
We add a shift
to it, taking care to wrap around if we get past 26 (the number of letters in our alphabet).
The result is the position in the alphabet of the encrypted character, so we add it to the position of A
to get its actual encoding.
We convert the encoding to a character with chr()
.
In our example we assume that we encrypt texts containing only upper case latin characters.
In [1]:
import io
def caesar_cipher(plaintext, shift):
ciphertext = [ chr((ord(c) - ord('A') + shift) % 26 + ord('A'))
for c in plaintext ]
return ''.join(ciphertext)
ciphertext = caesar_cipher('IAMSEATEDINANOFFICE', 5)
print(ciphertext)
Decryption is really the same think as encryption, we just use the opposite shift:
In [2]:
print(caesar_cipher(ciphertext, -5))
Needless to say, this is not a good cipher.
As we have not taken out the spaces between the words, one can even make out the structure of the text.
To break the decryption, one only needs to try different shifts.
In [3]:
def ord0(c):
return ord(c) - ord('A')
This is then the one-time pad encryption function:
In [4]:
def otpad_encrypt(plaintext, randomtext):
encrypted = [ chr((ord0(c) + ord0(r)) % 26 + ord('A'))
for c, r in zip(plaintext, randomtext) ]
return ''.join(encrypted)
To see it in action:
In [5]:
plaintext = "IFYOUREALLYWANTTO"
randomtext = "RDUUWJEMCJJXZDOWJ"
ciphertext = otpad_encrypt(plaintext, randomtext)
print(ciphertext)
Decryption is similar, but we subtract instead of adding:
In [6]:
def otpad_decrypt(ciphertext, randomtext):
decrypted = [ chr((ord0(c) - ord0(r)) % 26 + ord('A'))
for c, r in zip(ciphertext, randomtext) ]
return ''.join(decrypted)
So we can now decrypt the message we encrypted:
In [7]:
plaintext = otpad_decrypt(ciphertext, randomtext)
print(plaintext)
A one-time pad is impossible to decrypt unless you know the random text that has been used as the encryption key.
In fact, you can always invent a random text that decrypts to something sensible, but completely irrelevant to the original plaintext.
Let us write a function that does that, for the purpose of confusing our opponents.
In [8]:
def otpad_red_herring(plaintext, ciphertext):
red_herring = [ chr((ord0(c) - ord0(p)) % 26 + ord('A'))
for p, c in zip(plaintext, ciphertext) ]
return ''.join(red_herring)
red_herring = otpad_red_herring("IAMSEATEDINANOFFI", ciphertext)
print(red_herring)
So now we can leak GPNVLLUVZLTCBJJLK
as the key, and the opponent will then decrypt:
In [9]:
print(otpad_decrypt(ciphertext, red_herring))
Or anything else we like, for instance:
In [10]:
red_herring = otpad_red_herring("RENDEZVOUSATNIGHT", ciphertext)
print(red_herring)
That will happily decrypt our initial ciphertext again:
In [11]:
print(otpad_decrypt(ciphertext, red_herring))
When working with a one-time pad, we don't really need to do addition and subtraction on the positions of the letters in the alphabet.
It is more convenient and faster to use the XOR operation.
For encryption, we just XOR the plaintext and the randomtext:
In [12]:
def otpad_xencrypt(plaintext, randomtext):
encrypted = [ chr(ord(p) ^ ord(r)) for p, r in zip(plaintext, randomtext) ]
return ''.join(encrypted)
So we can now see how it works:
In [13]:
ciphertext = otpad_xencrypt(plaintext, randomtext)
print(ciphertext)
In fact we see nothing!
That is because the encrypted string is composed entirely of non-printable ASCII characters or whitespace.
If we really want to, we can print its hexadecimal representation.
In [14]:
print(ciphertext.encode('utf8'))
Decryption now is exactly the same with encryption, but we use the ciphertext and the random text:
In [15]:
decrypted = otpad_xencrypt(ciphertext, randomtext)
print(decrypted)
For example, see what is happening for $g = 2$ and $p = 13$.
$g^x$ is predictable, but $g^x \bmod p$ does not show any pattern.
Note that, in what follows, we are going to use Python's function pow()
.
Although Python has an operator for exponentiation (**
), pow()
is useful because it can either be called as pow(x, y)
to return $x^y$, or pow(x, y, p)
to return $x^y \bmod p$.
In [16]:
g = 2
p = 13
for x in range(1, 13):
print(pow(g, x), pow(g, x, p))
To see how Diffie-Hellman works, let us take Alice and Bob.
Alice and Bob agree on $g$ and $p$.
They choose $p = 2^{4096} - 2^{4032} - 1 + 2^{64} \times [ (2^{3966} \pi) + 240904 ]$.
Yes, this is a terrifying number. It is 4096 bits long.
No, Alice and Bob did not got it out of thin air. They chose it from a list of published recommendations.
The hexadecimal value of $p$ is equal to:
FFFFFFFF FFFFFFFF C90FDAA2 2168C234 C4C6628B 80DC1CD1
29024E08 8A67CC74 020BBEA6 3B139B22 514A0879 8E3404DD
EF9519B3 CD3A431B 302B0A6D F25F1437 4FE1356D 6D51C245
E485B576 625E7EC6 F44C42E9 A637ED6B 0BFF5CB6 F406B7ED
EE386BFB 5A899FA5 AE9F2411 7C4B1FE6 49286651 ECE45B3D
C2007CB8 A163BF05 98DA4836 1C55D39A 69163FA8 FD24CF5F
83655D23 DCA3AD96 1C62F356 208552BB 9ED52907 7096966D
670C354E 4ABC9804 F1746C08 CA18217C 32905E46 2E36CE3B
E39E772C 180E8603 9B2783A2 EC07A28F B5C55DF0 6F4C52C9
DE2BCBF6 95581718 3995497C EA956AE5 15D22618 98FA0510
15728E5A 8AAAC42D AD33170D 04507A33 A85521AB DF1CBA64
ECFB8504 58DBEF0A 8AEA7157 5D060C7D B3970F85 A6E1E4C7
ABF5AE8C DB0933D7 1E8C94E0 4A25619D CEE3D226 1AD2EE6B
F12FFA06 D98A0864 D8760273 3EC86A64 521F2B18 177B200C
BBE11757 7A615D6C 770988C0 BAD946E2 08E24FA0 74E5AB31
43DB5BFC E0FD108E 4B82D120 A9210801 1A723C12 A787E6D7
88719A10 BDBA5B26 99C32718 6AF4E23C 1A946834 B6150BDA
2583E9CA 2AD44CE8 DBBBC2DB 04DE8EF9 2E8EFC14 1FBECAA6
287C5947 4E6BC05D 99B2964F A090C3A2 233BA186 515BE7ED
1F612970 CEE2D7AF B81BDD76 2170481C D0069127 D5B05AA9
93B4EA98 8D8FDDC1 86FFB7DC 90A6C08F 4DF435C9 34063199
FFFFFFFF FFFFFFFF
In [17]:
p = 0xFFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1\
29024E088A67CC74020BBEA63B139B22514A08798E3404DD\
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245\
E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED\
EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3D\
C2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F\
83655D23DCA3AD961C62F356208552BB9ED529077096966D\
670C354E4ABC9804F1746C08CA18217C32905E462E36CE3B\
E39E772C180E86039B2783A2EC07A28FB5C55DF06F4C52C9\
DE2BCBF6955817183995497CEA956AE515D2261898FA0510\
15728E5A8AAAC42DAD33170D04507A33A85521ABDF1CBA64\
ECFB850458DBEF0A8AEA71575D060C7DB3970F85A6E1E4C7\
ABF5AE8CDB0933D71E8C94E04A25619DCEE3D2261AD2EE6B\
F12FFA06D98A0864D87602733EC86A64521F2B18177B200C\
BBE117577A615D6C770988C0BAD946E208E24FA074E5AB31\
43DB5BFCE0FD108E4B82D120A92108011A723C12A787E6D7\
88719A10BDBA5B2699C327186AF4E23C1A946834B6150BDA\
2583E9CA2AD44CE8DBBBC2DB04DE8EF92E8EFC141FBECAA6\
287C59474E6BC05D99B2964FA090C3A2233BA186515BE7ED\
1F612970CEE2D7AFB81BDD762170481CD0069127D5B05AA9\
93B4EA988D8FDDC186FFB7DC90A6C08F4DF435C934063199\
FFFFFFFFFFFFFFFF
Then, again following the recommendations, they choose $g = 2$.
In [18]:
g = 2
After settling for $p$ and $g$ with Bob, she picks a secret number $a$.
We want $a$ to be big, 256 bits long.
Alice can pick $a$ in random.
In [19]:
import random
a = random.getrandbits(256)
print(a)
She calculates $A = g^a \bmod p$.
She sends $A$ to Bob.
In [20]:
alice_to_bob = pow(g, a, p)
print(alice_to_bob)
Bob uses the same values for $p$ and $g$.
He picks a secret number $b$.
He calculates $B = g^b \bmod p$.
He sends $B$ to Alice.
In [21]:
p = 0xFFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1\
29024E088A67CC74020BBEA63B139B22514A08798E3404DD\
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245\
E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED\
EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3D\
C2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F\
83655D23DCA3AD961C62F356208552BB9ED529077096966D\
670C354E4ABC9804F1746C08CA18217C32905E462E36CE3B\
E39E772C180E86039B2783A2EC07A28FB5C55DF06F4C52C9\
DE2BCBF6955817183995497CEA956AE515D2261898FA0510\
15728E5A8AAAC42DAD33170D04507A33A85521ABDF1CBA64\
ECFB850458DBEF0A8AEA71575D060C7DB3970F85A6E1E4C7\
ABF5AE8CDB0933D71E8C94E04A25619DCEE3D2261AD2EE6B\
F12FFA06D98A0864D87602733EC86A64521F2B18177B200C\
BBE117577A615D6C770988C0BAD946E208E24FA074E5AB31\
43DB5BFCE0FD108E4B82D120A92108011A723C12A787E6D7\
88719A10BDBA5B2699C327186AF4E23C1A946834B6150BDA\
2583E9CA2AD44CE8DBBBC2DB04DE8EF92E8EFC141FBECAA6\
287C59474E6BC05D99B2964FA090C3A2233BA186515BE7ED\
1F612970CEE2D7AFB81BDD762170481CD0069127D5B05AA9\
93B4EA988D8FDDC186FFB7DC90A6C08F4DF435C934063199\
FFFFFFFFFFFFFFFF
g = 2
b = random.getrandbits(256)
print('b =', b)
bob_to_alice = pow(g, b, p)
print(bob_to_alice)
Then Alice calculates $B^a \bmod b$:
In [22]:
shared_secret_alice = pow(bob_to_alice, a, p)
print(shared_secret_alice)
And Bob calculates $A^b \bmod p$:
In [23]:
shared_secret_bob = pow(alice_to_bob, b, p)
print(shared_secret_bob)
They arrived at the same result.
That is their secret secret.
In [24]:
shared_secret_alice == shared_secret_bob
Out[24]:
We saw that we are dealing with huge numbers.
We need efficient implementations of the arithmetic operations.
In particular, we want fast exponentiation and fast modular exponentiation.
Python already implements fast operations---otherwise we would not be able to run this workbook.
But it is worth seeing how such algorithms look like.
Exponentiation by repeated squaring is a direct application of the algorithm in the book.
Instead of calculating $d \bmod 2$, we calculate simply d & 0b1
.
That performs a bitwise AND operation between d
and the single bit 1
. It therefore checks whether the last digit of d
is one, which is equivalent to whether it is an odd number.
Then, to perform integer division by 2, that is, $\lfloor d / 2\rfloor$, we only need to shift right d by one bit, d >> 1
.
We also add a statement to print the progress of the calculation, so that we get the values of $c$, $r$, and $d$ of Table 4.9 in the book.
In [25]:
def exponentiation(g, x):
print('{0:>10} {1:>10} {2:>10}'.format('c', 'r', 'd'))
c = g
d = x
r = 1
while d > 0:
print(f'{c:-10d} {r:-10d} {bin(d):>10}')
if d & 0b1 == 1:
r = r * c
d = d >> 1
c = c * c
return r
exponentiation(13, 13)
Out[25]:
The modular exponentiation by repeated squaring is almost the same function.
The only change is that the calculations of r
and c
inside the loop are done modulo p
.
In [26]:
def mod_exponentiation(g, x, p):
print('{0:>10} {1:>10} {2:>10}'.format('c', 'r', 'd'))
c = g % p
d = x
r = 1
while d > 0:
print(f'{c:-10d} {r:-10d} {bin(d):>10}')
if d & 0b1 == 1:
r = (r * c) % p
d = d >> 1
c = (c * c) % p
return r
mod_exponentiation(155, 235, 391)
Out[26]:
You should never, ever, use any of the code in this notepad for production use if you care a bit about security.
The code we have shown serves for demonstration purposes only.
However, the development of secure code is a full-blown art. There are many details that one has to take into account when writing code for a robust, secure system.
You should always use libraries that have a good security record, that are actively supported, and are well documented.
You should never use your own cryptographic code unless you are an expert programmer, you have shared your code with the programming community, and the consensus is that the implementation is sound.