RSA tutorial

Caution: Fidelio is an educational program. Do not use it as a serious encryption tool!

It has known weaknesses, including:


In [1]:
from fidelio_functions import *

Public-key encryption

Alice wants to send Bob a message. She does not want anyone else to read it.

Bob buys a padlock and keeps the only key. He mails the padlock to Alice.
Alice locks the message in a sturdy box and mails it to Bob.
If the lock and box are very difficult to break, then nobody but Bob can read the message.

To save shipping costs, Alice and Bob decide not to use a physical padlock.
Instead, Bob sends Alice instructions for creating a mathematical puzzle.
The puzzle is easy to create but hard for anyone (including Alice) to solve.
Bob keeps a secret hint which helps him solve the puzzle.

In RSA encryption, the puzzle is the RSA problem.
The instructions are Bob's public key $k$ and RSA number $n$. The hint is Bob's private key $x$.


In [2]:
message = "Most of the department chiefs are already mine."
print(message)


Most of the department chiefs are already mine.

Packing and padding integers

Fidelio's other schemes represent a message as a list of 2-digit integers. For RSA, we'd prefer bigger numbers.

The packetize() function converts a list of 2-digit integers into a list of larger integers. It pads the last packet (or creates a new packet) with random digits. The last digit of the last packet is how many random digits (including itself) were added.


In [3]:
ints = text_to_ints(message)
print(ints,'\n')

packets = packetize(ints)
print(packets,'\n')

test_ints = unpacketize(packets)
print(test_ints,'\n')

test_text = ints_to_text(test_ints)
print(test_text)


[45, 79, 83, 84, 0, 79, 70, 0, 84, 72, 69, 0, 68, 69, 80, 65, 82, 84, 77, 69, 78, 84, 0, 67, 72, 73, 69, 70, 83, 0, 65, 82, 69, 0, 65, 76, 82, 69, 65, 68, 89, 0, 77, 73, 78, 69, 14] 

[45798384, 797000, 84726900, 68698065, 82847769, 78840067, 72736970, 83006582, 69006576, 82696568, 89007773, 78691401] 

[45, 79, 83, 84, 0, 79, 70, 0, 84, 72, 69, 0, 68, 69, 80, 65, 82, 84, 77, 69, 78, 84, 0, 67, 72, 73, 69, 70, 83, 0, 65, 82, 69, 0, 65, 76, 82, 69, 65, 68, 89, 0, 77, 73, 78, 69, 14] 

Most of the department chiefs are already mine.

Key generation

The RSA number $n$ is generated by multiplying two randomly-chosen primes.
Finding the public and private keys is more complicated. (Scroll down for details.)

Fidelio uses prime numbers from this list compiled by Peter Alfeld of the University of Utah Mathematics department.


In [4]:
rsa_number, public_key, private_key = generate_keys(verbose=True)


Loading prime numbers from Primes.txt
RSA number is 79691 * 31121 = 2480063611
Euler totient is 79690 * 31120 = 2479952800
Public key is 16091
Private key is 1123846611

RSA encryption

Alice converts the message to a sequence of integers $M_1, M_2, \ldots$ For each integer $M$, she creates a cipher $C$ by modular exponentiation of each $M$ to the $k$th power (mod $n$), where $k$ is Bob's public key:

$$ C = M^k \ \% \ n $$

Bob decrypts $C_1, C_2, \ldots$ by raising each $C$ to to the $x$th power (mod $n$), where $x$ is his secret private key:

$$ C^x \ \% \ n = M^{kx} \ \% \ n = M $$

It takes some effort to find a public key $k$ and private key $x$ which make this scheme work. The details are explained below.

Python's built-in pow() function can do modular exponentiation. Undoing this operation requires solving the discrete logarithm problem, which is extremely difficult if you don't know the prime factorization of $n$.

It is important that $M < n$ for all $M$'s in the message. By default, Fidelio chooses primes between 10K and 100K, which ensures 100,000,000 < $n$ < 10,000,000,000. The packetize() function always generates a list of 8-digit (or less) decimal numbers. The largest possible packet is 99,999,999, so Fidelio guarantees that $M < n$.


In [5]:
ciphertext = rsa_encrypt(message,rsa_number,public_key)
print(ciphertext)


1681403511 1425504807 214061875 2327159134 2199619577 820184527 1235600105 1063190149 614534062 1129021466 1577762428 2007211462

In [6]:
plaintext = rsa_decrypt(ciphertext,rsa_number,private_key)
print(plaintext)


Most of the department chiefs are already mine.

In [7]:
# Close doesn't count in RSA encryption. You need the exact private key.
rsa_decrypt(ciphertext,rsa_number,private_key-1)


Out[7]:
"97KTC:i<∀&g'aJK8q¬F"

In [8]:
# Good luck guessing the private key. There are many possibilities.
# (But most guesses decrypt to invalid characters, which is a potential weakness of Fidelio!)
for j in range(10):
    badkey = random.randint(0,rsa_number)
    print( rsa_decrypt(ciphertext,rsa_number,badkey) )



~!#,2dt3S




@:@$∃.XH~'$r~ft∞F%4gFD
W1:zsjj\∑NW∃v
Z<&soB=$l`L∃Z,!!~OZa1$<I5r{jKk

How decryption works

Bob decrypts each $C$ by exponentiating it to the power $x$ mod $n$:

$$ C^x \ \% \ n = (M^k)^x \ \% \ n = M^{kx} \ \% \ n $$

The prime factorization of $n$ is $pq$. The Chinese remainder theorem says that $M^{kx} \ \% \ n = M$ if and only if

$$ M^{kx} \ \% \ p = M \qquad \textrm{AND} \qquad M^{kx} \ \% \ q = M $$

Let's do the mod $p$ test first. In the unlikely event that $M$ is a multiple of $p$, we know $M \ \% \ p = 0$ and it's easy:

$$ M^{kx} \ \% \ p = 0^{kx} \ \% \ p = 0 $$

What if $M$ is not a multiple of $p$? The trick is to choose a private key $x$ such that

$$ kx \ \% \ (p-1)(q-1) = 1 $$

which means $kx-1 = h(p-1)(q-1)$ for some $h$. We don't know what $h$ is, but we do know that

$$ M^{kx} = M \cdot M^{kx-1} = M \cdot M^{h(p-1)(q-1)} $$

Since $p$ is prime and $M$ is not a multiple of $p$, we can quote Fermat's Little Theorem:

$$ M^{p-1} \ \% \ p = 1 $$

which means

$$ M \cdot M^{h(p-1)(q-1)} \ \% \ p = M \cdot 1^{h(q-1)} \ \% \ p = M \ \% \ p $$

To prove that $M^{kx} \ \% \ q = M$, repeat the same logic with $p$ and $q$ trading places.


In [9]:
packets = packetize(text_to_ints("Hello, world!"))
print(packets,'\n')

cipher = [ pow(m,public_key,rsa_number) for m in packets ]
print(cipher,'\n')

decipher = [ pow(c,private_key,rsa_number) for c in cipher ]
print(decipher,'\n')

plaintext = ints_to_text(unpacketize(decipher))
print(plaintext)


[40697676, 79120087, 79827668, 1250703] 

[677878470, 563705075, 2435297304, 2045180102] 

[40697676, 79120087, 79827668, 1250703] 

Hello, world!

How Fidelio generates the RSA number and key pair

Generating $n$ is easy: just choose two primes, multiply them together, and don't tell anyone what the two primes are.
Choosing a good public key $k$ and private key $x$ is more complicated.

When Fidelio generates $n$, it uses the Euler totient method to calculate the keypair $(k,x)$. This is only possible because Fidelio knows the $p$ and $q$ it chose to generate $n = pq$. The method is a bit complicated, but can be computed quickly:

  1. Calculate Euler's totient of $n$: $\phi(n) = (p-1)(q-1)$.

  2. Choose a public key $k$ such that:
    a. $k$ is prime
    b. $k < \phi(n)$
    c. $k$ is not a factor of $\phi(n)$.

  3. Find $x$ such that $kx \ \% \ \phi(n) = 1$.

This $x$ is the multiplicative inverse of $k$ using modular arithmetic (mod $\phi(n)$).

In the decryption proof above, we used Steps 1 and 3 when we assumed

$$ kx \ \% \ (p-1)(q-1) = 1 $$

Step 2 guarantees $x$ exists and is unique. There is a unique positive $x < \phi(n)$ such that $xk \ \% \ \phi(n) = 1$ if and only if $x$ and $\phi(n)$ are relatively prime. Since $k$ is chosen from a list of primes, it's enough to check that $k < \phi(n)$ and $k$ is not a factor of $\phi(n)$.

Note that anyone can replicate this method if they can factor $n$ into its prime factors $p*q$. This is why it's important to use large prime numbers. If $n$ is large enough, then figuring out $p$ and $q$ is extremely slow - unless you have a reliable quantum computer, in which case you can use Shor's algorithm.


In [10]:
# Let's use tiny primes for this example
small_primes = load_primes(too_large=50)
print(small_primes)


Loading prime numbers from Primes.txt
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

In [11]:
# Choose p and q at random from our list of primes
small_n, small_totient = choose_rsa_number(small_primes,verbose=True)


RSA number is 41 * 7 = 287
Euler totient is 40 * 6 = 240

In [12]:
# Choose a public key which meets the criteria
small_public_key = choose_public_key(small_primes,small_totient,verbose=True)


Public key is 19

In [13]:
# Check that the public key and totient are relatively prime
check_gcd = gcd(small_public_key,small_totient)
show_numbers = (small_public_key,small_totient)
if check_gcd == 1:
    print( "%s and %s are relatively prime" % show_numbers )
else:
    raise ValueError( "%s and %s are not relatively prime!" % show_numbers )


19 and 240 are relatively prime

Finding the private key

Finding the inverse of $k$ mod $\phi(n)$ takes some work.
Fidelio's gcd_and_inverse() function uses the extended Euclidean algorithm to find $x$.
It also checks that $\gcd(k,\phi(n)) = 1$. This guarantees that $k$ is not a factor of $\phi(n)$.


In [14]:
# Find the private key x such that kx % n = 1
check_gcd, small_private_key = gcd_and_inverse(small_public_key,small_totient)
print( "Private key is %s" % small_private_key )


Private key is 139

In [15]:
# Is the private key really the inverse of the public key (mod totient)?
check_inverse = (small_public_key * small_private_key) % small_totient
show_numbers = (small_private_key,small_public_key,small_totient)
if check_inverse == 1:
    print( "%s is the multiplicative inverse of %s (mod %s)" % show_numbers )
else:
    raise ValueError( "%s is not the multiplicative inverse of %s (mod %s)" % show_numbers )


139 is the multiplicative inverse of 19 (mod 240)