Day 7: Internet Protocol Version 7

author: Harshvardhan Pandit

license: MIT

link to problem statement

While snooping around the local network of EBHQ, you compile a list of IP addresses (they're IPv7, of course; IPv6 is much too limited). You'd like to figure out which IPs support TLS (transport-layer snooping).

An IP supports TLS if it has an Autonomous Bridge Bypass Annotation, or ABBA. An ABBA is any four-character sequence which consists of a pair of two different characters followed by the reverse of that pair, such as xyyx or abba. However, the IP also must not have an ABBA within any hypernet sequences, which are contained by square brackets.

For example:

- `abba[mnop]qrst` supports TLS (`abba` outside square brackets).
- `abcd[bddb]xyyx` does not support TLS (`bddb` is within square brackets, even though `xyyx` is outside square brackets).
- `aaaa[qwer]tyui` does not support TLS (`aaaa` is invalid; the interior characters must be different).
- `ioxxoj[asdfgh]zxcvbn` supports TLS (`oxxo` is outside square brackets, even though it's within a larger string).

How many IPs in your puzzle input support TLS?

Solution logic

My first instinct was to use regular expressions to check for four-character palindromes in each of the three sections: the one before the brackets, the one within the brackets, and the one after the brackets. I would then check that at least one of the non-bracketed sections contains a palindrome and the bracketed one doesn't. But then I realized that it would be far too complicated (not complex, but complicated), and it's not the Pythonic way to do things.
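
For reference, the abandoned regex idea could be sketched with a backreference and a negative lookahead. This is only an illustration of that first instinct, not the approach used in the rest of this notebook; the names `ABBA_RE` and `has_abba_regex` are made up for this sketch.

```python
import re

# Hypothetical pattern: a character, a different character,
# then the same two characters in reverse order.
ABBA_RE = re.compile(r'([a-z])(?!\1)([a-z])\2\1')

def has_abba_regex(text):
    """Return True if text contains an ABBA such as 'xyyx'."""
    return ABBA_RE.search(text) is not None

print(has_abba_regex('ioxxoj'))  # True
print(has_abba_regex('aaaa'))    # False
```

The lookahead `(?!\1)` is what rejects sequences like `aaaa`, where both characters of the pair are the same.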

After using regex to separate the sections (bracketed and non-bracketed), we need to check whether any sequence of four characters is a palindrome made of two different characters. At least one non-bracketed section must contain such a palindrome, and no bracketed section may. We write a function called check_abba that checks for palindromes of length four in a given string and returns True or False depending on whether it satisfies the required condition.

pattern = r'(\[?[a-z]+\]?)'

After the first try, I realized that the input contains multiple brackets in a single IP. It always pays to look at the input, folks!
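
To see how the pattern copes with several bracketed sections, here is a small sketch. The sample address and the names `TOKEN_RE` and `split_sections` are invented for illustration and do not come from the puzzle input.

```python
import re

# Same pattern as above: an optional '[', some letters, an optional ']'.
TOKEN_RE = re.compile(r'(\[?[a-z]+\]?)')

def split_sections(ip):
    """Split an IP into (non-bracketed, bracketed) token lists."""
    tokens = TOKEN_RE.findall(ip)
    bracketed = [t for t in tokens if t.startswith('[') and t.endswith(']')]
    non_bracketed = [t for t in tokens if t not in bracketed]
    return non_bracketed, bracketed

# A made-up address with two bracketed sections.
print(split_sections('abc[def]ghi[jkl]mno'))
# (['abc', 'ghi', 'mno'], ['[def]', '[jkl]'])
```

Bracketed tokens keep their brackets, which is harmless for the later checks because `[` and `]` never match a letter.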

Algorithm

main:

- set counter = 0
- For every line of input:
    - tokenize the string using regex
    - if string starts and ends with brackets, put it in the bracketed list
    - else put it in the non-bracketed list
    - if any string in bracketed list returns True for check_abba, skip this IP
    - else if any string in non-bracketed list returns True for check_abba, increment counter

check_abba:

- accepts a string
- returns boolean
- for each set of four consecutive characters in the string:
    - if
        first and fourth character are equal AND 
        second and third character are equal AND 
        first and second character are not equal:
        - return True
- otherwise return False

In [1]:
import re

def tokenize(string):
    # Split an IP into its sections; bracketed tokens keep their brackets.
    pattern = re.compile(r'(\[?[a-z]+\]?)')
    return pattern.findall(string)

In [2]:
def check_abba(string):
    if len(string) < 4:
        return False
    for i in range(0, len(string) - 3):
        if string[i] == string[i + 3] and string[i + 1] == string[i + 2] and string[i] != string[i + 1]:
            return True
    return False

A little testing using the test data from the problem.


In [3]:
test_data = [
    'abba[mnop]qrst',
    'abcd[bddb]xyyx',
    'aaaa[qwer]tyui',
    'ioxxoj[asdfgh]zxcvbn',
]
for line in test_data:
    tokens = tokenize(line)
    bracketed = []
    non_bracketed = []
    for token in tokens:
        if token.startswith('[') and token.endswith(']'):
            bracketed.append(token)
        else:
            non_bracketed.append(token)
    if any(check_abba(token) for token in bracketed):
        continue
    if any(check_abba(token) for token in non_bracketed):
        print(line, True)


abba[mnop]qrst True
ioxxoj[asdfgh]zxcvbn True

Since the above works with test data, we go ahead and load the problem inputs.


In [4]:
with open('../inputs/day07.txt', 'r') as f:
    data = [line.strip() for line in f.readlines()]

In [5]:
counter = 0
for line in data:
    tokens = tokenize(line)
    bracketed = []
    non_bracketed = []
    for token in tokens:
        if token.startswith('[') and token.endswith(']'):
            bracketed.append(token)
        else:
            non_bracketed.append(token)
    if any(check_abba(token) for token in bracketed):
        continue
    if any(check_abba(token) for token in non_bracketed):
        counter += 1
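
Since the same loop body appears for both the test data and the real input, it could be folded into one function. The following is a minimal self-contained sketch of that refactoring; `supports_tls` and `has_abba` are hypothetical names, not part of the cells above.

```python
import re

TOKEN_RE = re.compile(r'\[?[a-z]+\]?')

def has_abba(s):
    # True if any four-character window looks like 'xyyx' with x != y.
    return any(
        s[i] == s[i + 3] and s[i + 1] == s[i + 2] and s[i] != s[i + 1]
        for i in range(len(s) - 3)
    )

def supports_tls(ip):
    tokens = TOKEN_RE.findall(ip)
    bracketed = [t for t in tokens if t.startswith('[')]
    non_bracketed = [t for t in tokens if not t.startswith('[')]
    # An ABBA inside any bracketed section disqualifies the IP.
    if any(has_abba(t) for t in bracketed):
        return False
    return any(has_abba(t) for t in non_bracketed)

samples = ['abba[mnop]qrst', 'abcd[bddb]xyyx',
           'aaaa[qwer]tyui', 'ioxxoj[asdfgh]zxcvbn']
print(sum(supports_tls(ip) for ip in samples))  # 2
```

With a helper like this, counting the puzzle input reduces to summing `supports_tls(line)` over the lines.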

Part Two

You would also like to know which IPs support SSL (super-secret listening).

An IP supports SSL if it has an Area-Broadcast Accessor, or ABA, anywhere in the supernet sequences (outside any square bracketed sections), and a corresponding Byte Allocation Block, or BAB, anywhere in the hypernet sequences. An ABA is any three-character sequence which consists of the same character twice with a different character between them, such as xyx or aba. A corresponding BAB is the same characters but in reversed positions: yxy and bab, respectively.

For example:

- `aba[bab]xyz` supports SSL (`aba` outside square brackets with corresponding `bab` within square brackets).
- `xyx[xyx]xyx` does not support SSL (`xyx`, but no corresponding `yxy`).
- `aaa[kek]eke` supports SSL (`eke` in supernet with corresponding `kek` in hypernet; the `aaa` sequence is not related, because the interior character must be different).
- `zazbz[bzb]cdb` supports SSL (`zaz` has no corresponding `aza`, but `zbz` has a corresponding `bzb`, even though `zaz` and `zbz` overlap).

How many IPs in your puzzle input support SSL?

Solution logic

Instead of having a four-character palindrome, we now have two three-character palindromes whose characters are interchanged (such as aba and bab). For any three-letter palindrome in a non-bracketed string, we have to check whether the corresponding inverted palindrome exists in any bracketed string. If it exists, the IP supports SSL.
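
The two ideas in this paragraph, finding overlapping ABAs and inverting one into its BAB, can be tried in isolation. The helper names `find_abas` and `to_bab` are assumptions made for this sketch.

```python
def find_abas(text):
    """Return every three-character ABA in text, overlaps included."""
    return [
        text[i:i + 3]
        for i in range(len(text) - 2)
        if text[i] == text[i + 2] and text[i] != text[i + 1]
    ]

def to_bab(aba):
    """Swap the roles of the two characters: 'aba' becomes 'bab'."""
    return aba[1] + aba[0] + aba[1]

abas = find_abas('zazbz')
print(abas)                       # ['zaz', 'zbz']
print([to_bab(a) for a in abas])  # ['aza', 'bzb']
```

Scanning every window of three characters is what lets `zaz` and `zbz` both be found even though they overlap.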

Algorithm

main:

- set counter = 0
- for each line of input:
    - tokenize line into bracketed and non-bracketed tokens
    - for each token in non-bracketed tokens:
        - get ABA sequence
        - for each token in bracketed tokens:
            - check for BAB sequence
            - if present, increment counter, move to next line

check_aba:

- accepts string input
- returns a generator that yields every ABA sequence
- for i in range 0 to length of string - 2:
    - if character i is the same as character i + 2 but different from character i + 1:
        - yield the string of those three characters

check_bab:

- accepts an ABA string of length 3 and a list of tokens
- returns True or False based on whether the corresponding BAB is present in any of the tokens
- build the BAB by swapping the two characters of the ABA
- for each token in list:
    - if the BAB is present in token:
        - return True
- otherwise return False

In [6]:
def check_aba(string):
    # Yield every three-character ABA sequence, including overlapping ones.
    # range() is empty for strings shorter than three characters.
    for i in range(len(string) - 2):
        if string[i] == string[i + 2] and string[i] != string[i + 1]:
            yield string[i:i + 3]

In [7]:
def check_bab(string, token_list):
    # Turn the ABA (e.g. 'aba') into its corresponding BAB ('bab').
    bab = string[1] + string[0] + string[1]
    return any(bab in token for token in token_list)

Test the approach using the provided test data


In [8]:
counter = 0
test_data = [
    'aba[bab]xyz',
    'xyx[xyx]xyx',
    'aaa[kek]eke',
    'zazbz[bzb]cdb',
]
for line in test_data:
    tokens = tokenize(line)
    bracketed = []
    non_bracketed = []
    for token in tokens:
        if token.startswith('[') and token.endswith(']'):
            bracketed.append(token)
        else:
            non_bracketed.append(token)
    for token in non_bracketed:
        ip_has_ssl = False
        for string in check_aba(token):
            if check_bab(string, bracketed):
                print('SSL', line)
                ip_has_ssl = True
                break
        if ip_has_ssl:
            counter += 1
            break


SSL aba[bab]xyz
SSL aaa[kek]eke
SSL zazbz[bzb]cdb

Since it works with test data, let's use it on the problem input.


In [9]:
counter = 0

for line in data:
    tokens = tokenize(line)
    bracketed = []
    non_bracketed = []
    for token in tokens:
        if token.startswith('[') and token.endswith(']'):
            bracketed.append(token)
        else:
            non_bracketed.append(token)
    for token in non_bracketed:
        ip_has_ssl = False
        for string in check_aba(token):
            if check_bab(string, bracketed):
                ip_has_ssl = True
                break
        if ip_has_ssl:
            counter += 1
            break

answer = counter
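
The nested loops with the `ip_has_ssl` flag could also be written more compactly by first collecting all ABAs into a set. This is a sketch of one possible alternative, not the code that produced the answer above; `supports_ssl` is a hypothetical name.

```python
import re

def supports_ssl(ip):
    tokens = re.findall(r'\[?[a-z]+\]?', ip)
    bracketed = [t for t in tokens if t.startswith('[')]
    non_bracketed = [t for t in tokens if not t.startswith('[')]
    # Collect every ABA found outside the brackets.
    abas = {
        t[i:i + 3]
        for t in non_bracketed
        for i in range(len(t) - 2)
        if t[i] == t[i + 2] and t[i] != t[i + 1]
    }
    # The IP supports SSL if any inverted ABA appears inside brackets.
    return any(
        aba[1] + aba[0] + aba[1] in token
        for aba in abas
        for token in bracketed
    )

samples = ['aba[bab]xyz', 'xyx[xyx]xyx', 'aaa[kek]eke', 'zazbz[bzb]cdb']
print([supports_ssl(ip) for ip in samples])  # [True, False, True, True]
```

Returning a boolean per IP removes the need for the flag and the two `break` statements.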

== END ==