Secrets

Chapter 4 of Real World Algorithms.


Panos Louridas
Athens University of Economics and Business

Substitution Cipher

The simplest encryption and decryption scheme is probably a substitution cipher, also known as Caesar cipher.

In Python, we takeeach character's encoding, via ord().

We calculate its position in the alphabet by subtracting the encoding for A.

We add a shift to it, taking care to wrap around if we get past 26 (the number of letters in our alphabet).

The result is the position in the alphabet of the encrypted character, so we add it to the position of A to get its actual encoding.

We convert the encoding to a character with chr().

In our example we assume that we encrypt texts containing only upper case latin characters.


In [1]:
import io

def caesar_cipher(plaintext, shift):
    ciphertext = [ chr((ord(c) - ord('A') + shift) % 26 + ord('A')) 
                  for c in plaintext ]
    return ''.join(ciphertext)
    
ciphertext = caesar_cipher('IAMSEATEDINANOFFICE', 5)
print(ciphertext)


NFRXJFYJINSFSTKKNHJ

Decryption is really the same think as encryption, we just use the opposite shift:


In [2]:
print(caesar_cipher(ciphertext, -5))


IAMSEATEDINANOFFICE

Needless to say, this is not a good cipher.

As we have not taken out the spaces between the words, one can even make out the structure of the text.

To break the decryption, one only needs to try different shifts.

One-time Pad

The only truly unbreakable cipher is the one-time pad.

To implement a simple version of it, we will use a helper function that returns the position of a character in the alphabet.

Again, we assume that we work with messages composed only of upper case latin characters.


In [3]:
def ord0(c):
    return ord(c) - ord('A')

This is then the one-time pad encryption function:


In [4]:
def otpad_encrypt(plaintext, randomtext):
    encrypted = [ chr((ord0(c) + ord0(r)) % 26 + ord('A')) 
                 for c, r in zip(plaintext, randomtext) ]
    return ''.join(encrypted)

To see it in action:


In [5]:
plaintext = "IFYOUREALLYWANTTO"
randomtext = "RDUUWJEMCJJXZDOWJ"

ciphertext = otpad_encrypt(plaintext, randomtext)
print(ciphertext)


ZISIQAIMNUHTZQHPX

Decryption is similar, but we subtract instead of adding:


In [6]:
def otpad_decrypt(ciphertext, randomtext):
    decrypted = [ chr((ord0(c) - ord0(r)) % 26 + ord('A')) 
                 for c, r in zip(ciphertext, randomtext) ]
    return ''.join(decrypted)

So we can now decrypt the message we encrypted:


In [7]:
plaintext = otpad_decrypt(ciphertext, randomtext)
print(plaintext)


IFYOUREALLYWANTTO

A one-time pad is impossible to decrypt unless you know the random text that has been used as the encryption key.

In fact, you can always invent a random text that decrypts to something sensible, but completely irrelevant to the original plaintext.

Let us write a function that does that, for the purpose of confusing our opponents.


In [8]:
def otpad_red_herring(plaintext, ciphertext):
    red_herring = [ chr((ord0(c) - ord0(p)) % 26 + ord('A'))
                   for p, c in zip(plaintext, ciphertext) ]
    return ''.join(red_herring)

red_herring = otpad_red_herring("IAMSEATEDINANOFFI", ciphertext)
print(red_herring)


RIGQMAPIKMUTMCCKP

So now we can leak GPNVLLUVZLTCBJJLK as the key, and the opponent will then decrypt:


In [9]:
print(otpad_decrypt(ciphertext, red_herring))


IAMSEATEDINANOFFI

Or anything else we like, for instance:


In [10]:
red_herring = otpad_red_herring("RENDEZVOUSATNIGHT", ciphertext)
print(red_herring)


IEFFMBNYTCHAMIBIE

That will happily decrypt our initial ciphertext again:


In [11]:
print(otpad_decrypt(ciphertext, red_herring))


RENDEZVOUSATNIGHT

When working with a one-time pad, we don't really need to do addition and subtraction on the positions of the letters in the alphabet.

It is more convenient and faster to use the XOR operation.

For encryption, we just XOR the plaintext and the randomtext:


In [12]:
def otpad_xencrypt(plaintext, randomtext):
    encrypted = [ chr(ord(p) ^ ord(r)) for p, r in zip(plaintext, randomtext) ]
    return ''.join(encrypted)

So we can now see how it works:


In [13]:
ciphertext = otpad_xencrypt(plaintext, randomtext)
print(ciphertext)





In fact we see nothing!

That is because the encrypted string is composed entirely of non-printable ASCII characters or whitespace.

If we really want to, we can print its hexadecimal representation.


In [14]:
print(ciphertext.encode('utf8'))


b'\x1b\x02\x0c\x1a\x02\x18\x00\x0c\x0f\x06\x13\x0f\x1b\n\x1b\x03\x05'

Decryption now is exactly the same with encryption, but we use the ciphertext and the random text:


In [15]:
decrypted = otpad_xencrypt(ciphertext, randomtext)

print(decrypted)


IFYOUREALLYWANTTO

AES

No, we are not going to give an implementation of AES here.

The algorithm is very complicated, and a real implementation would go well beyond the scope of this notebook.

Diffie-Hellman Key Exchange

The Diffie-Hellman key exchange scheme is based on the discrete logarithm problem.

That is, it is very difficult to find $x$ from the value of $g^x \bmod p$, if $g$ and $p$ are chosen carefully.

For example, see what is happening for $g = 2$ and $p = 13$.

$g^x$ is predictable, but $g^x \bmod p$ does not show any pattern.

Note that, in what follows, we are going to use Python's function pow().

Although Python has an operator for exponentiation (**), pow() is useful because it can either be called as pow(x, y) to return $x^y$, or pow(x, y, p) to return $x^y \bmod p$.


In [16]:
g = 2
p = 13

for x in range(1, 13):
    print(pow(g, x), pow(g, x, p))


2 2
4 4
8 8
16 3
32 6
64 12
128 11
256 9
512 5
1024 10
2048 7
4096 1

To see how Diffie-Hellman works, let us take Alice and Bob.

Alice and Bob agree on $g$ and $p$.

They choose $p = 2^{4096} - 2^{4032} - 1 + 2^{64} \times [ (2^{3966} \pi) + 240904 ]$.

Yes, this is a terrifying number. It is 4096 bits long.

No, Alice and Bob did not got it out of thin air. They chose it from a list of published recommendations.

The hexadecimal value of $p$ is equal to:

FFFFFFFF FFFFFFFF C90FDAA2 2168C234 C4C6628B 80DC1CD1
29024E08 8A67CC74 020BBEA6 3B139B22 514A0879 8E3404DD
EF9519B3 CD3A431B 302B0A6D F25F1437 4FE1356D 6D51C245
E485B576 625E7EC6 F44C42E9 A637ED6B 0BFF5CB6 F406B7ED
EE386BFB 5A899FA5 AE9F2411 7C4B1FE6 49286651 ECE45B3D
C2007CB8 A163BF05 98DA4836 1C55D39A 69163FA8 FD24CF5F
83655D23 DCA3AD96 1C62F356 208552BB 9ED52907 7096966D
670C354E 4ABC9804 F1746C08 CA18217C 32905E46 2E36CE3B
E39E772C 180E8603 9B2783A2 EC07A28F B5C55DF0 6F4C52C9
DE2BCBF6 95581718 3995497C EA956AE5 15D22618 98FA0510
15728E5A 8AAAC42D AD33170D 04507A33 A85521AB DF1CBA64
ECFB8504 58DBEF0A 8AEA7157 5D060C7D B3970F85 A6E1E4C7
ABF5AE8C DB0933D7 1E8C94E0 4A25619D CEE3D226 1AD2EE6B
F12FFA06 D98A0864 D8760273 3EC86A64 521F2B18 177B200C
BBE11757 7A615D6C 770988C0 BAD946E2 08E24FA0 74E5AB31
43DB5BFC E0FD108E 4B82D120 A9210801 1A723C12 A787E6D7
88719A10 BDBA5B26 99C32718 6AF4E23C 1A946834 B6150BDA
2583E9CA 2AD44CE8 DBBBC2DB 04DE8EF9 2E8EFC14 1FBECAA6
287C5947 4E6BC05D 99B2964F A090C3A2 233BA186 515BE7ED
1F612970 CEE2D7AF B81BDD76 2170481C D0069127 D5B05AA9
93B4EA98 8D8FDDC1 86FFB7DC 90A6C08F 4DF435C9 34063199
FFFFFFFF FFFFFFFF

In [17]:
p = 0xFFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1\
29024E088A67CC74020BBEA63B139B22514A08798E3404DD\
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245\
E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED\
EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3D\
C2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F\
83655D23DCA3AD961C62F356208552BB9ED529077096966D\
670C354E4ABC9804F1746C08CA18217C32905E462E36CE3B\
E39E772C180E86039B2783A2EC07A28FB5C55DF06F4C52C9\
DE2BCBF6955817183995497CEA956AE515D2261898FA0510\
15728E5A8AAAC42DAD33170D04507A33A85521ABDF1CBA64\
ECFB850458DBEF0A8AEA71575D060C7DB3970F85A6E1E4C7\
ABF5AE8CDB0933D71E8C94E04A25619DCEE3D2261AD2EE6B\
F12FFA06D98A0864D87602733EC86A64521F2B18177B200C\
BBE117577A615D6C770988C0BAD946E208E24FA074E5AB31\
43DB5BFCE0FD108E4B82D120A92108011A723C12A787E6D7\
88719A10BDBA5B2699C327186AF4E23C1A946834B6150BDA\
2583E9CA2AD44CE8DBBBC2DB04DE8EF92E8EFC141FBECAA6\
287C59474E6BC05D99B2964FA090C3A2233BA186515BE7ED\
1F612970CEE2D7AFB81BDD762170481CD0069127D5B05AA9\
93B4EA988D8FDDC186FFB7DC90A6C08F4DF435C934063199\
FFFFFFFFFFFFFFFF

Then, again following the recommendations, they choose $g = 2$.


In [18]:
g = 2

After settling for $p$ and $g$ with Bob, she picks a secret number $a$.

We want $a$ to be big, 256 bits long.

Alice can pick $a$ in random.


In [19]:
import random

a = random.getrandbits(256)
print(a)


10785370814603608441518857787858058637153974855022796500916983771692574257075

She calculates $A = g^a \bmod p$.

She sends $A$ to Bob.


In [20]:
alice_to_bob = pow(g, a, p)
print(alice_to_bob)


671425984618948302448611142441008551416532291933434802887217965806396170749282091764614443905154961056786612845568999702298089816522559867643901486149790183141777271675394667219790906279776275644015340967387588793666313216956414046508130275881384288723249038447851902146021492393772254755191754605460967641356477274167471162790960172190649832379042821512969846350719700176042851042478430991289768704355955392067269334455047660909115526102467862450268147432374112537090677780304758320808697440747965815101357859669307964779793357854615057251109614957565322899067793950365020186642161154593164266273963289091973654164086364856053685755736797479257479177485681600850861779167434403845324322191599275398479574150594400602903620562332156167518761322362607541463395568007240166994998324979829327938158665546957801457711927848911013542155741781019670127257852412081321260736229561532250922455167024349225604843742462913282963893423133239202511890857058027713803042288323363163687067811599099895612457039381006375006703059794746490736865156372114755722814612125996124874349514897726192022967819768118194579720958301186207606473777073742164031285397390162728614751087172929448950135845042097965832814213056490183997384055743658918709393126983

Bob uses the same values for $p$ and $g$.

He picks a secret number $b$.

He calculates $B = g^b \bmod p$.

He sends $B$ to Alice.


In [21]:
p = 0xFFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1\
29024E088A67CC74020BBEA63B139B22514A08798E3404DD\
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245\
E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED\
EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3D\
C2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F\
83655D23DCA3AD961C62F356208552BB9ED529077096966D\
670C354E4ABC9804F1746C08CA18217C32905E462E36CE3B\
E39E772C180E86039B2783A2EC07A28FB5C55DF06F4C52C9\
DE2BCBF6955817183995497CEA956AE515D2261898FA0510\
15728E5A8AAAC42DAD33170D04507A33A85521ABDF1CBA64\
ECFB850458DBEF0A8AEA71575D060C7DB3970F85A6E1E4C7\
ABF5AE8CDB0933D71E8C94E04A25619DCEE3D2261AD2EE6B\
F12FFA06D98A0864D87602733EC86A64521F2B18177B200C\
BBE117577A615D6C770988C0BAD946E208E24FA074E5AB31\
43DB5BFCE0FD108E4B82D120A92108011A723C12A787E6D7\
88719A10BDBA5B2699C327186AF4E23C1A946834B6150BDA\
2583E9CA2AD44CE8DBBBC2DB04DE8EF92E8EFC141FBECAA6\
287C59474E6BC05D99B2964FA090C3A2233BA186515BE7ED\
1F612970CEE2D7AFB81BDD762170481CD0069127D5B05AA9\
93B4EA988D8FDDC186FFB7DC90A6C08F4DF435C934063199\
FFFFFFFFFFFFFFFF 

g = 2

b = random.getrandbits(256)
print('b =', b)

bob_to_alice = pow(g, b, p)
print(bob_to_alice)


b = 85683632716075647490468894737054020435795289996303502038870365640053543961208
342679346139328404094596667028843589010078116349706026625336869729589261898982386212425318536065212538984412314931253380181511065051931846665290927340474823927059095619189735482288701017197437808964409337207224538540230055701012417549321775430552677549122701047948609062484733541584331831271665750946668362564099771684237862823255156992140337828642757722886727567489837212325958608671423412930950742574585110513877045425948511566732644635196931765116696613589653403013148159853726825373317822764288866251127990135361424205576478280796587385206388113044638664298584443458812113956039868961425464512008904553677677409072610672186323390195708683546326121952969386432024406716672043105218900478431077297639369563630449321808160756315077496453340767811070829411843375756392563019621973643623609913406555324098584884376838192083810961644766385375514861417314852259617932981332889170180606236001600602915879489470231450194109481247796001852721814374640990593928342860534428969873726810207777622983444796595308932047088367353976965725048879549520876544197906907666288010119138775784495500540945432136610456036013451346785996318310208168637258056516639471294810556482765189116706508384437260064846588919540880792122966024037047008774574051629

Then Alice calculates $B^a \bmod b$:


In [22]:
shared_secret_alice = pow(bob_to_alice, a, p)
print(shared_secret_alice)


245554008258270390136997903471167057137399892189659288408448811849949849522305235403412809270093600822949032210217722577999095872030495848064677287658184206593439364676379811403390285260174167555068661358006285941157956064520423268828479855629061113063301682041889008131816851110602469683014765983653523947355398182882435870567110752945351370078133471999705003950441029147956015841364526109334273769591645637662670554009436847477275550527375287961323767983606012561865903043587193251480159940649867249935748290907421094358040263137348646922286933564311828016858846700596379858765274469786891064399939220343495735056886835686996235531496543028654689689529503746724287814219831602377130596432662640346482390999604542734925592017528213705136869152272511308888556254485076388942436725761883377233002926434424937778934392696175318120499911283004959017109977537438230082901794493958532380694887712943337533674140253021746152136666964656506760223165855640227592032809602200843644007600567745999809841265827493294156627598335686262975065809801214736668818042012051925662065422910496907515513144227134497076377037759834725416682199368495307422551764846536898646774273250758968107114177559826603493173687033801743842493086342030778450072378683

And Bob calculates $A^b \bmod p$:


In [23]:
shared_secret_bob = pow(alice_to_bob, b, p)
print(shared_secret_bob)


245554008258270390136997903471167057137399892189659288408448811849949849522305235403412809270093600822949032210217722577999095872030495848064677287658184206593439364676379811403390285260174167555068661358006285941157956064520423268828479855629061113063301682041889008131816851110602469683014765983653523947355398182882435870567110752945351370078133471999705003950441029147956015841364526109334273769591645637662670554009436847477275550527375287961323767983606012561865903043587193251480159940649867249935748290907421094358040263137348646922286933564311828016858846700596379858765274469786891064399939220343495735056886835686996235531496543028654689689529503746724287814219831602377130596432662640346482390999604542734925592017528213705136869152272511308888556254485076388942436725761883377233002926434424937778934392696175318120499911283004959017109977537438230082901794493958532380694887712943337533674140253021746152136666964656506760223165855640227592032809602200843644007600567745999809841265827493294156627598335686262975065809801214736668818042012051925662065422910496907515513144227134497076377037759834725416682199368495307422551764846536898646774273250758968107114177559826603493173687033801743842493086342030778450072378683

They arrived at the same result.

That is their secret secret.


In [24]:
shared_secret_alice == shared_secret_bob


Out[24]:
True

Modular Exponentiation

We saw that we are dealing with huge numbers.

We need efficient implementations of the arithmetic operations.

In particular, we want fast exponentiation and fast modular exponentiation.

Python already implements fast operations---otherwise we would not be able to run this workbook.

But it is worth seeing how such algorithms look like.

Exponentiation by repeated squaring is a direct application of the algorithm in the book.

Instead of calculating $d \bmod 2$, we calculate simply d & 0b1.

That performs a bitwise AND operation between d and the single bit 1. It therefore checks whether the last digit of d is one, which is equivalent to whether it is an odd number.

Then, to perform integer division by 2, that is, $\lfloor d / 2\rfloor$, we only need to shift right d by one bit, d >> 1.

We also add a statement to print the progress of the calculation, so that we get the values of $c$, $r$, and $d$ of Table 4.9 in the book.


In [25]:
def exponentiation(g, x):
    print('{0:>10} {1:>10} {2:>10}'.format('c', 'r', 'd'))
    c = g
    d = x
    r = 1
    while d > 0:
        print(f'{c:-10d} {r:-10d} {bin(d):>10}')
        if d & 0b1 == 1:
            r = r * c
        d = d >> 1
        c = c * c
    return r

exponentiation(13, 13)


         c          r          d
        13          1     0b1101
       169         13      0b110
     28561         13       0b11
 815730721     371293        0b1
Out[25]:
302875106592253

The modular exponentiation by repeated squaring is almost the same function.

The only change is that the calculations of r and c inside the loop are done modulo p.


In [26]:
def mod_exponentiation(g, x, p):
    print('{0:>10} {1:>10} {2:>10}'.format('c', 'r', 'd'))
    c = g % p
    d = x
    r = 1
    while d > 0:
        print(f'{c:-10d} {r:-10d} {bin(d):>10}')
        if d & 0b1 == 1:
            r = (r * c) % p
        d = d >> 1
        c = (c * c) % p
    return r

mod_exponentiation(155, 235, 391)


         c          r          d
       155          1 0b11101011
       174        155  0b1110101
       169        382   0b111010
        18        382    0b11101
       324        229     0b1110
       188        229      0b111
       154         42       0b11
       256        212        0b1
Out[26]:
314

Secrets in Practice

You should never, ever, use any of the code in this notepad for production use if you care a bit about security.

The code we have shown serves for demonstration purposes only.

However, the development of secure code is a full-blown art. There are many details that one has to take into account when writing code for a robust, secure system.

You should always use libraries that have a good security record, that are actively supported, and are well documented.

You should never use your own cryptographic code unless you are an expert programmer, you have shared your code with the programming community, and the consensus is that the implementation is sound.