https://projecteuler.net/problem=59
Each character on a computer is assigned a unique code and the preferred standard is ASCII (American Standard Code for Information Interchange). For example, uppercase A = 65, asterisk (*) = 42, and lowercase k = 107.
A modern encryption method is to take a text file, convert the bytes to ASCII, then XOR each byte with a given value, taken from a secret key. The advantage with the XOR function is that using the same encryption key on the cipher text, restores the plain text; for example, 65 XOR 42 = 107, then 107 XOR 42 = 65.
For unbreakable encryption, the key is the same length as the plain text message, and the key is made up of random bytes. The user would keep the encrypted message and the encryption key in different locations, and without both "halves", it is impossible to decrypt the message.
Unfortunately, this method is impractical for most users, so the modified method is to use a password as a key. If the password is shorter than the message, which is likely, the key is repeated cyclically throughout the message. The balance for this method is using a sufficiently long password key for security, but short enough to be memorable.
Your task has been made easy, as the encryption key consists of three lower case characters. Using cipher.txt (in this directory), a file containing the encrypted ASCII codes, and the knowledge that the plain text must contain common English words, decrypt the message and find the sum of the ASCII values in the original text.
The following cell shows examples of how to perform XOR in Python and how to go back and forth between characters and integers:
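A minimal sketch of such a cell, assuming nothing beyond Python's built-in `^` operator, `ord` and `chr`; the variable names are only illustrative:

```python
# XOR two integers with the ^ operator.
a = 65 ^ 42   # encrypt: the code for 'A' XORed with the code for '*'
b = a ^ 42    # XOR with the same value again restores the original

# ord() maps a character to its integer code; chr() maps the code back.
code_A = ord('A')
char_k = chr(107)

# Encrypt one character with a one-character key, then decrypt it.
encrypted = ord('A') ^ ord('*')
decrypted = chr(encrypted ^ ord('*'))

print(a, b, code_A, char_k, encrypted, decrypted)
```

This mirrors the example in the problem statement: 65 XOR 42 = 107, and 107 XOR 42 = 65.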
The first step is to read in cipher.txt and create a list containing each number as a single entry. I used a tip from Stack Exchange to split the contents of the text file at every comma.
In [1]:
with open('cipher.txt', 'r') as ciphertxt:
    cipher = ciphertxt.read().split(',')  # Split the file contents into a list at every comma
cipher = [int(i) for i in cipher]  # Convert each string entry to an integer
Below are the lists I created to narrow my search. The list called search holds the 26 lowercase letters, because the key consists of exactly three lowercase letters. The list called english holds the characters I expect to find in plain English text, which helps me filter out unwanted messages.
In [2]:
search = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']
english = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z',',','?',"'",'!',';','"','.','(',')','-','1','2','3','4','5','6','7','8','9','0',' ']
Next I create a function that checks whether a piece of text is plain English by comparing each of its characters with my english list above.
In [3]:
def is_plain_text(text):
    result = True
    for letter in text:
        if letter not in english:
            result = False
            break
    return result
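The same check can be written more compactly with a set and `all()`. This is only a sketch of an alternative, not the function used in the rest of the notebook; the names `ALLOWED` and `is_plain_text_fast` are my own, and the set is assumed to hold the same characters as the english list above:

```python
import string

# Assumed to match the english list: letters, digits, space, and the
# punctuation characters , ? ' ! ; " . ( ) -
ALLOWED = set(string.ascii_letters + string.digits + " ,?'!;\".()-")

def is_plain_text_fast(text):
    """Return True if every character of text is in ALLOWED."""
    return all(ch in ALLOWED for ch in text)

print(is_plain_text_fast("Hello, world!"))
print(is_plain_text_fast("a+b"))
```

Membership tests on a set take roughly constant time, while a list is scanned entry by entry, so this version should be noticeably quicker when it is called thousands of times.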
Now I begin the search for the key. This computation takes a minute or two because I am searching through all 26^3 = 17,576 key possibilities and applying each of them to all of the roughly 1200 cipher entries, for a total of about 21 million XOR operations. I print every key that returns a plain text message. In the end I get very lucky, because the program only prints one key.
In [4]:
for x in search:
    for y in search:
        for z in search:
            message = ""
            i = 0  # Counter i allows me to apply the components of key at every third entry of the message
            for entry in cipher:
                if i % 3 == 0:
                    message = message + chr(entry ^ ord(x))
                elif i % 3 == 1:
                    message = message + chr(entry ^ ord(y))
                else:
                    message = message + chr(entry ^ ord(z))
                i = i + 1
            if is_plain_text(message):
                print("A potential key is: " + x + y + z)
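Because the key repeats every three entries, each key letter only ever touches every third cipher entry. That suggests the three letters could be searched for independently, which would mean 3 × 26 = 78 trials instead of 26^3. The following is a sketch of that idea rather than the method I actually ran; `candidate_keys`, `ALLOWED`, and the demo message and key are all made up for illustration:

```python
import string

# Assumed allowed characters, matching the english list used earlier.
ALLOWED = set(string.ascii_letters + string.digits + " ,?'!;\".()-")

def candidate_keys(cipher, key_length=3):
    """For each key position, list the lowercase letters that decode
    every cipher entry at that position to an allowed character."""
    candidates = []
    for pos in range(key_length):
        column = cipher[pos::key_length]  # entries encrypted with this key letter
        letters = [k for k in string.ascii_lowercase
                   if all(chr(c ^ ord(k)) in ALLOWED for c in column)]
        candidates.append(letters)
    return candidates

# Self-check on a made-up message encrypted with a made-up key.
demo_key = "abc"
demo_text = "An example (with punctuation); it's short, but fine!"
demo_cipher = [ord(ch) ^ ord(demo_key[i % 3]) for i, ch in enumerate(demo_text)]
demo_candidates = candidate_keys(demo_cipher)
print(demo_candidates)
```

On a short message a position may keep several candidate letters; with a longer cipher text I would expect most positions to narrow down to one or very few letters, and any remaining combinations could then be checked with the full-message test.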
Now I know the key is 'god'. I will use that key to decipher the message below. I use the same method to print the message as I did to find the key.
In [5]:
message = ""
i = 0
for entry in cipher:
    if i % 3 == 0:
        message = message + chr(entry ^ ord('g'))
    elif i % 3 == 1:
        message = message + chr(entry ^ ord('o'))
    else:
        message = message + chr(entry ^ ord('d'))
    i = i + 1
print(message)
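The counter-and-branches pattern can also be expressed with `itertools.cycle`, which repeats the key for as long as there are cipher entries. This is a sketch of an equivalent helper, not the code used above; the name `xor_decrypt` and the sample text are my own:

```python
from itertools import cycle

def xor_decrypt(codes, key):
    """XOR each integer in codes with the characters of key, repeated cyclically."""
    return "".join(chr(c ^ ord(k)) for c, k in zip(codes, cycle(key)))

# Round trip on a made-up example: encrypt, then decrypt with the same key.
sample = "Hello, world"
sample_codes = [ord(ch) ^ ord(k) for ch, k in zip(sample, cycle("god"))]
recovered = xor_decrypt(sample_codes, "god")
print(recovered)
```

With this helper the cell above would reduce to something like `message = xor_decrypt(cipher, 'god')`, and the same function would work for a key of any length.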
Now that I know the message, I can compute the sum of its ASCII values.
In [6]:
total = 0  # Named total so the built-in sum() is not shadowed
for char in message:
    total = total + ord(char)
print("The ASCII sum is: " + str(total))
In [7]:
# This cell will be used for grading, leave it at the end of the notebook.