Data Exploration

Setup


In [1]:
import sys
import os

import re
import collections
import itertools
import bcolz
import pickle

import numpy as np
import pandas as pd
import gc
import random
import smart_open
import h5py
import csv

import tensorflow as tf
import gensim
import string

import datetime as dt
from tqdm import tqdm_notebook as tqdm

import matplotlib.pyplot as plt
import seaborn as sns

random_state_number = 967898



Day 1: No Time for a Taxicab

Santa's sleigh uses a very high-precision clock to guide its movements, and the clock's oscillator is regulated by stars. Unfortunately, the stars have been stolen... by the Easter Bunny. To save Christmas, Santa needs you to retrieve all fifty stars by December 25th.

Collect stars by solving puzzles. Two puzzles will be made available on each day in the advent calendar; the second puzzle is unlocked when you complete the first. Each puzzle grants one star. Good luck!

You're airdropped near Easter Bunny Headquarters in a city somewhere. "Near", unfortunately, is as close as you can get - the instructions on the Easter Bunny Recruiting Document the Elves intercepted start here, and nobody had time to work them out further.

The Document indicates that you should start at the given coordinates (where you just landed) and face North. Then, follow the provided sequence: either turn left (L) or right (R) 90 degrees, then walk forward the given number of blocks, ending at a new intersection.

There's no time to follow such ridiculous instructions on foot, though, so you take a moment and work out the destination. Given that you can only walk on the street grid of the city, how far is the shortest path to the destination?

For example:

- Following R2, L3 leaves you 2 blocks East and 3 blocks North, or 5 blocks away.
- R2, R2, R2 leaves you 2 blocks due South of your starting position, which is 2 blocks away.
- R5, L5, R5, R3 leaves you 12 blocks away.

How many blocks away is Easter Bunny HQ?


In [2]:
! cat day1_input.txt


R4, R5, L5, L5, L3, R2, R1, R1, L5, R5, R2, L1, L3, L4, R3, L1, L1, R2, R3, R3, R1, L3, L5, R3, R1, L1, R1, R2, L1, L4, L5, R4, R2, L192, R5, L2, R53, R1, L5, R73, R5, L5, R186, L3, L2, R1, R3, L3, L3, R1, L4, L2, R3, L5, R4, R3, R1, L1, R5, R2, R1, R1, R1, R3, R2, L1, R5, R1, L5, R2, L2, L4, R3, L1, R4, L5, R4, R3, L5, L3, R4, R2, L5, L5, R2, R3, R5, R4, R2, R1, L1, L5, L2, L3, L4, L5, L4, L5, L1, R3, R4, R5, R3, L5, L4, L3, L1, L4, R2, R5, R5, R4, L2, L4, R3, R1, L2, R5, L5, R1, R1, L1, L5, L5, L2, L1, R5, R2, L4, L1, R4, R3, L3, R1, R5, L1, L4, R2, L3, R5, R3, R1, L3

In [3]:
input_data = None
with open("day1_input.txt") as f:
    input_data = f.read().strip().split()
    input_data = [w.strip(",") for w in input_data ]

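There are only eight (heading, turn) combinations, so we can enumerate them in a lookup table. Each entry maps the current heading and the turn to the new heading, the index of the coordinate that changes (0 for east-west, 1 for north-south), and the sign of the step along that axis.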


In [4]:
directions = {
    ("N","R") : ("E",0,1),
    ("N","L") : ("W",0,-1),
    
    ("W","R") : ("N",1,1),
    ("W","L") : ("S",1,-1),
    
    ("E","R") : ("S",1,-1),
    ("E","L") : ("N",1,1),
    
    ("S","R") : ("W",0,-1),
    ("S","L") : ("E",0,1)
}

In [5]:
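def get_distance(data):
    # d is the current heading; pos holds the [east-west, north-south] offsets.
    d, pos = "N", [0, 0]
    for code in data:
        turn, steps = code[0], int(code[1:])
        # Look up the new heading, the axis to update and the sign of the move.
        d, axis, sign = directions[(d, turn)]
        pos[axis] += sign * steps
    # Manhattan distance from the starting point.
    return sum(abs(n) for n in pos)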

In [6]:
data = ["R2", "R2", "R2"]
get_distance(data)


Out[6]:
2

In [7]:
data = ["R5", "L5", "R5", "R3"]
get_distance(data)


Out[7]:
12

In [8]:
get_distance(input_data)


Out[8]:
250
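As a cross-check on the lookup table, the same walk can be sketched with complex numbers, where a 90 degree turn is a multiplication by 1j or -1j. This is an alternative illustration, not the approach used in the cells above; the function name `blocks_away` is hypothetical.

```python
def blocks_away(instructions):
    """Manhattan distance after following turn-and-walk instructions such as 'R2'."""
    position = 0 + 0j  # real part: east-west offset, imaginary part: north-south offset
    heading = 1j       # start facing north
    for step in instructions:
        turn, blocks = step[0], int(step[1:])
        # Multiplying by -1j rotates 90 degrees clockwise (right); by 1j, counter-clockwise (left).
        heading *= -1j if turn == "R" else 1j
        position += heading * blocks
    return int(abs(position.real) + abs(position.imag))


print(blocks_away(["R2", "L3"]))              # 5
print(blocks_away(["R2", "R2", "R2"]))        # 2
print(blocks_away(["R5", "L5", "R5", "R3"]))  # 12
```

The three printed values correspond to the three examples in the puzzle statement.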

Day 2: Bathroom Security

part1

You arrive at Easter Bunny Headquarters under cover of darkness. However, you left in such a rush that you forgot to use the bathroom! Fancy office buildings like this one usually have keypad locks on their bathrooms, so you search the front desk for the code.

"In order to improve security," the document you find says, "bathroom codes will no longer be written down. Instead, please memorize and follow the procedure below to access the bathrooms."

The document goes on to explain that each button to be pressed can be found by starting on the previous button and moving to adjacent buttons on the keypad: U moves up, D moves down, L moves left, and R moves right. Each line of instructions corresponds to one button, starting at the previous button (or, for the first line, the "5" button); press whatever button you're on at the end of each line. If a move doesn't lead to a button, ignore it.

You can't hold it much longer, so you decide to figure out the code as you walk to the bathroom. You picture a keypad like this:

1 2 3
4 5 6
7 8 9

Suppose your instructions are:

ULL
RRDDD
LURDL
UUUUD

- You start at "5" and move up (to "2"), left (to "1"), and left (you can't, and stay on "1"), so the first button is 1.
- Starting from the previous button ("1"), you move right twice (to "3") and then down three times (stopping at "9" after two moves and ignoring the third), ending up with 9.
- Continuing from "9", you move left, up, right, down, and left, ending with 8.
- Finally, you move up four times (stopping at "2"), then down once, ending with 5.

So, in this example, the bathroom code is 1985.

Your puzzle input is the instructions from the document you found at the front desk. What is the bathroom code?


In [1]:
input_data = None
with open("day2_input.txt") as f:
    input_data = f.read().strip().split()

In [33]:
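def get_codes(data, keypad, keypad_max_size, start_index=(1,1), verbose=False):
    """Return the code obtained by following each line of moves on a square keypad.

    Cells holding None are not buttons; moves onto them or off the grid are ignored.
    """
    r,c = start_index
    digit = ""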
    for codes in data:
        if verbose: print("  ",codes)
        for code in codes:
            if verbose: print("  before",r,c,keypad[r][c])
            if code == 'R' and c+1 < keypad_max_size and keypad[r][c+1] is not None:
                c += 1
            elif code == 'L' and c-1 >= 0 and keypad[r][c-1] is not None:
                c -= 1
            elif code == 'U' and r-1 >= 0 and keypad[r-1][c] is not None:
                r -= 1
            elif code == 'D' and r+1 < keypad_max_size and keypad[r+1][c] is not None:
                r += 1
            if verbose: print("  after",code,r,c,keypad[r][c])
        digit += str(keypad[r][c])
    return digit

In [34]:
sample = ["ULL",
"RRDDD",
"LURDL",
"UUUUD"]

In [35]:
keypad = [[1,2,3],[4,5,6],[7,8,9]]
get_codes(sample, keypad, keypad_max_size=3)


Out[35]:
'1985'

In [36]:
keypad = [[1,2,3],[4,5,6],[7,8,9]]
get_codes(input_data, keypad, keypad_max_size=3)


Out[36]:
'12578'

part2

You finally arrive at the bathroom (it's a several minute walk from the lobby so visitors can behold the many fancy conference rooms and water coolers on this floor) and go to punch in the code. Much to your bladder's dismay, the keypad is not at all like you imagined it. Instead, you are confronted with the result of hundreds of man-hours of bathroom-keypad-design meetings:

    1
  2 3 4
5 6 7 8 9
  A B C
    D

You still start at "5" and stop when you're at an edge, but given the same instructions as above, the outcome is very different:

- You start at "5" and don't move at all (up and left are both edges), ending at 5.
- Continuing from "5", you move right twice and down three times (through "6", "7", "B", "D", "D"), ending at D.
- Then, from "D", you move five more times (through "D", "B", "C", "C", "B"), ending at B.
- Finally, after five more moves, you end at 3.

So, given the actual keypad layout, the code would be 5DB3.

Using the same instructions in your puzzle input, what is the correct bathroom code?


In [ ]:
input_data = None
with open("day21_input.txt") as f:
    input_data = f.read().strip().split()

In [37]:
# Diamond keypad; None marks positions that are not buttons.
keypad = [[None, None,  1, None, None],
          [None,    2,  3,    4, None],
          [   5,    6,  7,    8,    9],
          [None,  'A', 'B', 'C', None],
          [None, None, 'D', None, None]]

In [39]:
sample = ["ULL",
"RRDDD",
"LURDL",
"UUUUD"]
get_codes(sample, keypad, keypad_max_size=5, start_index=(2,0), verbose=False)


Out[39]:
'5DB3'

In [40]:
get_codes(input_data, keypad, keypad_max_size=5, start_index=(2,0), verbose=False)


Out[40]:
'516DD'
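Both keypads can also be described by a layout string, which avoids the explicit size argument and the None padding. The following is a minimal sketch under that assumption; `parse_keypad`, `follow`, `MOVES`, `SQUARE` and `DIAMOND` are hypothetical names that do not appear in the cells above.

```python
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def parse_keypad(layout):
    """Map (row, col) -> button label for every non-space character in the layout."""
    buttons = {}
    for r, line in enumerate(layout.splitlines()):
        for c, ch in enumerate(line):
            if ch != " ":
                buttons[(r, c)] = ch
    return buttons


def follow(lines, buttons, start="5"):
    """Follow each line of U/D/L/R moves and collect the button reached at its end."""
    pos = next(p for p, label in buttons.items() if label == start)
    code = ""
    for line in lines:
        for move in line:
            dr, dc = MOVES[move]
            candidate = (pos[0] + dr, pos[1] + dc)
            # A move that does not land on a button is ignored.
            if candidate in buttons:
                pos = candidate
        code += buttons[pos]
    return code


SQUARE = "123\n456\n789"
DIAMOND = "  1\n 234\n56789\n ABC\n  D"
sample = ["ULL", "RRDDD", "LURDL", "UUUUD"]

print(follow(sample, parse_keypad(SQUARE)))   # 1985
print(follow(sample, parse_keypad(DIAMOND)))  # 5DB3
```

Storing only the real buttons in a dictionary means the bounds check and the "is this a button" check collapse into a single membership test.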

Day 3: Squares With Three Sides

part1

Now that you can think clearly, you move deeper into the labyrinth of hallways and office furniture that makes up this part of Easter Bunny HQ. This must be a graphic design department; the walls are covered in specifications for triangles.

Or are they?

The design document gives the side lengths of each triangle it describes, but... 5 10 25? Some of these aren't triangles. You can't help but mark the impossible ones.

In a valid triangle, the sum of any two sides must be larger than the remaining side. For example, the "triangle" given above is impossible, because 5 + 10 is not larger than 25.

In your puzzle input, how many of the listed triangles are possible?


In [81]:
input_data = None
with open("day3_input.txt") as f:
    input_data = f.read().strip().split("\n")

In [89]:
input_data = [list(map(int, l.strip().split())) for l in input_data]

In [97]:
result = [ (sides[0]+sides[1] > sides[2]) and (sides[2]+sides[1] > sides[0]) and (sides[0]+sides[2] > sides[1]) for sides in input_data]

In [98]:
sum(result)


Out[98]:
862
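The three inequalities can be collapsed into one by sorting the sides first: if the two shortest sides together exceed the longest, the other two inequalities hold automatically. A small sketch of that idea, with the hypothetical helper name `is_possible_triangle`:

```python
def is_possible_triangle(sides):
    """True if the three side lengths satisfy the strict triangle inequality."""
    shortest, middle, longest = sorted(sides)
    # After sorting, only the comparison against the longest side can fail.
    return shortest + middle > longest


print(is_possible_triangle([5, 10, 25]))  # False
print(is_possible_triangle([3, 4, 5]))    # True
```

The first call is the impossible "triangle" from the puzzle text, since 5 + 10 is not larger than 25.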

part2

Now that you've helpfully marked up their design documents, it occurs to you that triangles are specified in groups of three vertically. Each set of three numbers in a column specifies a triangle. Rows are unrelated.

For example, given the following specification, numbers with the same hundreds digit would be part of the same triangle:

101 301 501
102 302 502
103 303 503
201 401 601
202 402 602
203 403 603

In your puzzle input, and instead reading by columns, how many of the listed triangles are possible?


In [50]:
input_data = None
with open("day31_input.txt") as f:
    input_data = f.read().strip().split("\n")

In [51]:
input_data = [list(map(int, l.strip().split())) for l in input_data]
input_data[:5]


Out[51]:
[[785, 516, 744],
 [272, 511, 358],
 [801, 791, 693],
 [572, 150, 74],
 [644, 534, 138]]

In [56]:
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]
        
single_list = [input_data[r][c] for c in [0,1,2] for r in range(len(input_data))]
result = [ (sides[0]+sides[1] > sides[2]) and (sides[2]+sides[1] > sides[0]) and (sides[0]+sides[2] > sides[1]) for sides in chunks(single_list, 3)]
sum(result)


Out[56]:
1577
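The column-wise regrouping can also be sketched by transposing three rows at a time with zip. This is an alternative illustration that assumes the number of rows is a multiple of three; `triangles_by_column` is a hypothetical name, and the triangles come out in a different order than in the cell above, which does not affect the count.

```python
def triangles_by_column(rows):
    """Regroup rows of three numbers into triangles read down each column, three rows at a time."""
    triangles = []
    # Any trailing rows that do not fill a complete block of three are skipped.
    for i in range(0, len(rows) - len(rows) % 3, 3):
        block = rows[i:i + 3]
        # zip(*block) transposes the 3x3 block, so each column becomes one triangle.
        triangles.extend(list(column) for column in zip(*block))
    return triangles


example = [
    [101, 301, 501],
    [102, 302, 502],
    [103, 303, 503],
    [201, 401, 601],
    [202, 402, 602],
    [203, 403, 603],
]
for triangle in triangles_by_column(example):
    print(triangle)
```

On the example from the puzzle text, every printed triangle shares a single hundreds digit, as the description requires.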

Day 4: Security Through Obscurity

part1

Finally, you come across an information kiosk with a list of rooms. Of course, the list is encrypted and full of decoy data, but the instructions to decode the list are barely hidden nearby. Better remove the decoy data first.

Each room consists of an encrypted name (lowercase letters separated by dashes) followed by a dash, a sector ID, and a checksum in square brackets.

A room is real (not a decoy) if the checksum is the five most common letters in the encrypted name, in order, with ties broken by alphabetization. For example:

- aaaaa-bbb-z-y-x-123[abxyz] is a real room because the most common letters are a (5), b (3), and then a tie between x, y, and z, which are listed alphabetically.
- a-b-c-d-e-f-g-h-987[abcde] is a real room because although the letters are all tied (1 of each), the first five are listed alphabetically.
- not-a-real-room-404[oarel] is a real room.
- totally-real-room-200[decoy] is not.

Of the real rooms from the list above, the sum of their sector IDs is 1514.

What is the sum of the sector IDs of the real rooms?


In [150]:
input_data = None
with open("day4_input.txt") as f:
    input_data = f.read().strip().split("\n")
len(input_data), input_data[:5]


Out[150]:
(1091,
 ['gbc-frperg-pubpbyngr-znantrzrag-377[rgbnp]',
  'nij-mywlyn-wlsiayhcw-jfumncw-alumm-mbcjjcha-422[mcjwa]',
  'pualyuhapvuhs-ibuuf-zhslz-227[uhalp]',
  'xlrypetn-prr-lylwjdtd-665[dzoya]',
  'zilqwikbqdm-rmttgjmiv-mvoqvmmzqvo-278[mqvio]'])

In [151]:
answer = 0
for code in input_data:
    m = re.match(r'(.+)-(\d+)\[([a-z]*)\]', code)
    code, sector, checksum = m.groups()
    code = code.replace("-","")
    counts = collections.Counter(code).most_common()
    counts.sort(key=lambda k: (-k[1], k[0]))
    if ''.join([ch for ch,_ in counts[:5]]) == checksum:
        answer += int(sector)
answer


Out[151]:
409147
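The checksum rule can be checked against the four example rooms from the puzzle text. The sketch below factors the ranking into its own function; `checksum`, `real_sector_sum` and `ROOM_PATTERN` are hypothetical names, and the regular expression is a slightly stricter variant of the one used above.

```python
import collections
import re

ROOM_PATTERN = re.compile(r"([a-z-]+)-(\d+)\[([a-z]+)\]")


def checksum(name):
    """Five most common letters of the name, ties broken alphabetically."""
    counts = collections.Counter(name.replace("-", ""))
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return "".join(ranked[:5])


def real_sector_sum(rooms):
    """Sum the sector IDs of the rooms whose stated checksum matches the computed one."""
    total = 0
    for room in rooms:
        name, sector, given = ROOM_PATTERN.fullmatch(room).groups()
        if checksum(name) == given:
            total += int(sector)
    return total


examples = [
    "aaaaa-bbb-z-y-x-123[abxyz]",
    "a-b-c-d-e-f-g-h-987[abcde]",
    "not-a-real-room-404[oarel]",
    "totally-real-room-200[decoy]",
]
print(real_sector_sum(examples))  # 1514
```

Sorting on the key (-count, letter) gives descending frequency with alphabetical tie-breaking in a single pass.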

part2

With all the decoy data out of the way, it's time to decrypt this list and get moving.

The room names are encrypted by a state-of-the-art shift cipher, which is nearly unbreakable without the right software. However, the information kiosk designers at Easter Bunny HQ were not expecting to deal with a master cryptographer like yourself.

To decrypt a room name, rotate each letter forward through the alphabet a number of times equal to the room's sector ID. A becomes B, B becomes C, Z becomes A, and so on. Dashes become spaces.

For example, the real name for qzmt-zixmtkozy-ivhz-343 is very encrypted name.

What is the sector ID of the room where North Pole objects are stored?


In [167]:
for code in input_data:
    m = re.match(r'(.+)-(\d+)\[([a-z]*)\]', code)
    code, sector, checksum = m.groups()
    sector = int(sector)
    code = code.replace("-","")
    counts = collections.Counter(code).most_common()
    counts.sort(key=lambda k: (-k[1], k[0]))
    string_maps = string.ascii_lowercase
    cipher_table = str.maketrans(string_maps, string_maps[sector%26:] + string_maps[:sector%26])
    if ''.join([ch for ch,_ in counts[:5]]) == checksum:
        if "north" in code.translate(cipher_table):
            print(code.translate(cipher_table))
            print("sector",sector)


northpoleobjectstorage
sector 991
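The cell above strips the dashes before translating, so the decrypted name has no spaces. A variant that maps dashes to spaces, as the puzzle text describes, might look like the following sketch; the function name `decrypt` is hypothetical.

```python
import string


def decrypt(name, sector):
    """Rotate each lowercase letter forward by the sector ID and turn dashes into spaces."""
    shift = sector % 26
    alphabet = string.ascii_lowercase
    rotated = alphabet[shift:] + alphabet[:shift]
    # The extra "-" -> " " pair handles the separators in the same translation table.
    table = str.maketrans(alphabet + "-", rotated + " ")
    return name.translate(table)


print(decrypt("qzmt-zixmtkozy-ivhz", 343))  # very encrypted name
```

Only the sector ID modulo 26 matters, because rotating by 26 brings every letter back to itself.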

Day 5: How About a Nice Game of Chess?

part1

You are faced with a security door designed by Easter Bunny engineers that seem to have acquired most of their security knowledge by watching hacking movies.

The eight-character password for the door is generated one character at a time by finding the MD5 hash of some Door ID (your puzzle input) and an increasing integer index (starting with 0).

A hash indicates the next character in the password if its hexadecimal representation starts with five zeroes. If it does, the sixth character in the hash is the next character of the password.

For example, if the Door ID is abc:

- The first index which produces a hash that starts with five zeroes is 3231929, which we find by hashing abc3231929; the sixth character of the hash, and thus the first character of the password, is 1.
- 5017308 produces the next interesting hash, which starts with 000008f82..., so the second character of the password is 8.
- The third time a hash starts with five zeroes is for abc5278568, discovering the character f.

In this example, after continuing this search a total of eight times, the password is 18f47a30.

Given the actual Door ID, what is the password?
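The search described above can be sketched with the standard library's hashlib. This is a minimal illustration of the procedure, not a solution recorded in this notebook; `md5_hex`, `next_character` and `password` are hypothetical names, and the Door ID "abc" is the example from the puzzle text rather than the actual input.

```python
import hashlib
import itertools


def md5_hex(text):
    """Hexadecimal MD5 digest of an ASCII string."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


def next_character(door_id, start=0):
    """Return (index, character) for the first index >= start whose hash starts with five zeroes."""
    for index in itertools.count(start):
        digest = md5_hex(f"{door_id}{index}")
        if digest.startswith("00000"):
            # The sixth hex character is the next password character.
            return index, digest[5]


def password(door_id, length=8):
    """Collect `length` password characters by repeating the search from the last hit onward."""
    chars, index = [], 0
    while len(chars) < length:
        index, ch = next_character(door_id, index)
        chars.append(ch)
        index += 1
    return "".join(chars)


# Starting at the known first hit keeps this check fast.
print(next_character("abc", 3231929))  # (3231929, '1')
```

Calling password("abc") runs the full search, which hashes several million candidates and can take a while in pure Python.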


In [ ]:


In [ ]:


In [ ]:

part2


In [ ]: