Day 14: One-Time Pad

author: Harshvardhan Pandit

license: MIT

link to problem statement

In order to communicate securely with Santa while you're on this mission, you've been using a one-time pad that you generate using a pre-agreed algorithm. Unfortunately, you've run out of keys in your one-time pad, and so you need to generate some more.

To generate keys, you first get a stream of random data by taking the MD5 of a pre-arranged salt (your puzzle input) and an increasing integer index (starting with 0, and represented in decimal); the resulting MD5 hash should be represented as a string of lowercase hexadecimal digits.

However, not all of these MD5 hashes are keys, and you need 64 new keys for your one-time pad. A hash is a key only if:

  • It contains three of the same character in a row, like 777. Only consider the first such triplet in a hash.
  • One of the next 1000 hashes in the stream contains that same character five times in a row, like 77777.

Considering future hashes for five-of-a-kind sequences does not cause those hashes to be skipped; instead, regardless of whether the current hash is a key, always resume testing for keys starting with the very next hash.

For example, if the pre-arranged salt is abc:

  • The first index which produces a triple is 18, because the MD5 hash of abc18 contains ...cc38887a5.... However, index 18 does not count as a key for your one-time pad, because none of the next thousand hashes (index 19 through index 1018) contain 88888.
  • The next index which produces a triple is 39; the hash of abc39 contains eee. It is also the first key: one of the next thousand hashes (the one at index 816) contains eeeee.
  • None of the next six triples are keys, but the one after that, at index 92, is: it contains 999 and index 200 contains 99999.
  • Eventually, index 22728 meets all of the criteria to generate the 64th key.

So, using our example salt of abc, index 22728 produces the 64th key.

Given the actual salt in your puzzle input, what index produces your 64th one-time pad key?

Solution logic

Our salt is out input, we apply it increasingly to integers, take their MD5 hash, and if it contains a character repeating three times, we check if there is a hash in the next 1000 integers that features the same character 7 times, and if it does, then the integer we had is our key. Find 64 such keys, with the index of the 64th key being the answer. Seems simple and straightforward.


In [1]:
import re
three_repeating_characters = re.compile(r'(.)\1{2}')

In [2]:
with open('../inputs/day14.txt', 'r') as f:
    salt = f.readline().strip()
    # TEST DATA
    # salt = 'abc'
print(salt)


cuanljph

Hash index

To prevent the same hash from being counted upon again and again, we maintain a hash index to store the hashes of indexes. We also trim the index to remove any index for keys lower than the current key since we do not require those.


In [3]:
import hashlib

hash_index= {}

def get_hash_string(key):
    if key in hash_index:
        return hash_index[key]
    string = '{salt}{key}'.format(salt=salt, key=key)
    md5 = hashlib.md5()
    md5.update(string.encode('ascii'))
    hashstring = md5.hexdigest()
    hash_index[key] = hashstring
    return hashstring

def run():
    keys = []
    current_key = 0
    while(len(keys) < 64):
        for i in range(0, current_key):
            hash_index.pop(i, None)
        hashstring = get_hash_string(current_key)
        repeating_chacter = three_repeating_characters.findall(hashstring)
        if not repeating_chacter:
            current_key += 1
            continue
        repeating_chacter = repeating_chacter[0]
        repeating_character_five = ''.join(repeating_chacter for i in range(0, 5))
        for qualifying_index in range(current_key + 1, current_key + 1001):
            hashstring = get_hash_string(qualifying_index)
            if repeating_character_five in hashstring:
                break
        else:
            current_key += 1
            continue
        keys.append(current_key)
        print(len(keys), current_key)
        current_key += 1
    return keys

In [4]:
print('answer', run()[63])


1 310
2 399
3 442
4 479
5 734
6 847
7 927
8 1210
9 1390
10 1822
11 1855
12 2080
13 2112
14 2147
15 2153
16 2174
17 6226
18 6317
19 6415
20 6502
21 6660
22 6689
23 6697
24 6732
25 6822
26 7020
27 7060
28 7096
29 7293
30 7350
31 7500
32 7612
33 7672
34 7699
35 7770
36 7887
37 7986
38 8063
39 8105
40 8117
41 8167
42 8175
43 8199
44 8288
45 8309
46 8342
47 8386
48 8412
49 8423
50 8477
51 11425
52 11494
53 11607
54 11812
55 12012
56 19730
57 19860
58 19903
59 20188
60 20307
61 20351
62 23670
63 23695
64 23769
answer 23769

Part Two

Of course, in order to make this process even more secure, you've also implemented key stretching.

Key stretching forces attackers to spend more time generating hashes. Unfortunately, it forces everyone else to spend more time, too.

To implement key stretching, whenever you generate a hash, before you use it, you first find the MD5 hash of that hash, then the MD5 hash of that hash, and so on, a total of 2016 additional hashings. Always use lowercase hexadecimal representations of hashes.

For example, to find the stretched hash for index 0 and salt abc:

  • Find the MD5 hash of abc0: 577571be4de9dcce85a041ba0410f29f.
  • Then, find the MD5 hash of that hash: eec80a0c92dc8a0777c619d9bb51e910.
  • Then, find the MD5 hash of that hash: 16062ce768787384c81fe17a7a60c7e3.
  • ...repeat many times...
  • Then, find the MD5 hash of that hash: a107ff634856bb300138cac6568c0f24.

So, the stretched hash for index 0 in this situation is a107ff.... In the end, you find the original hash (one use of MD5), then find the hash-of-the-previous-hash 2016 times, for a total of 2017 uses of MD5.

The rest of the process remains the same, but now the keys are entirely different. Again for salt abc:

  • The first triple (222, at index 5) has no matching 22222 in the next thousand hashes.
  • The second triple (eee, at index 10) hash a matching eeeee at index 89, and so it is the first key.
  • Eventually, index 22551 produces the 64th key (triple fff with matching fffff at index 22859.

Given the actual salt in your puzzle input and using 2016 extra MD5 calls of key stretching, what index now produces your 64th one-time pad key?

Solution logic

We only need to change the definition of get_hash_string to calculate the hash 2016 times more. And then simply run the algorithm again.

To prevent computationally intensive operations from repeating themselves, we maintain an index of hashes so that we can easily lookup indexes without needing to calculate their hashes.


In [5]:
hash_index = {}

def get_hash_string(key):
    if key in hash_index:
        return hash_index[key]
    string = '{salt}{key}'.format(salt=salt, key=key)
    md5 = hashlib.md5()
    md5.update(string.encode('ascii'))
    hashstring = md5.hexdigest()
    # PART TWO
    for i in range(0, 2016):
        md5 = hashlib.md5()
        md5.update(hashstring.encode('ascii'))
        hashstring = md5.hexdigest()
    hash_index[key] = hashstring
    return hashstring

In [6]:
print('answer', run()[63])


1 26
2 51
3 128
4 179
5 289
6 451
7 516
8 5164
9 5228
10 5688
11 5739
12 5820
13 5835
14 5944
15 6571
16 6610
17 6717
18 6951
19 7154
20 7206
21 7208
22 7249
23 7252
24 7256
25 7302
26 7320
27 7519
28 14227
29 14669
30 14707
31 14795
32 14809
33 14859
34 14890
35 14923
36 14979
37 15046
38 15163
39 15170
40 15268
41 15304
42 15334
43 15415
44 15444
45 15496
46 15498
47 15594
48 15658
49 15828
50 15936
51 15939
52 16025
53 16105
54 16771
55 16851
56 16980
57 17323
58 20072
59 20141
60 20331
61 20357
62 20410
63 20498
64 20606
answer 20606

== END ==