# Selected Exercise Solutions

## 3. Thinking in Binary

### 2.

A simple solution:

``````

In [1]:

def convert(number):
return str(number), bin(number), hex(number)

convert(0b1001)

``````
``````

Out[1]:

('9', '0b1001', '0x9')

``````

The exercise says that we should only output the two other formats. As far as I'm aware, it's not possible for a Python script to distinguish between integer literals (without hacking the Python parser), so we'll use a string as the input instead:

``````

In [2]:

def convert2(string_number):
if string_number[1] == "x":
num = int(string_number, 16)
return str(num), bin(num)
elif string_number[1] == "b":
num = int(string_number, 2)
return str(num), hex(num)
else:
num = int(string_number)
return bin(num), hex(num)

convert2("0b1001")

``````
``````

Out[2]:

('9', '0x9')

``````

### 5.

There are several ways of doing this, but the fastest way I've found so far (that doesn't involve writing C code or anything crazy) uses bitwise operations. This encoder is based on code for a base32 encoder taken from the `base64` module.

We first need to generate a base64 dictionary. To avoid mistakes, I've copied out the standard base64 dictionary, but you could generate this programmatically:

``````

In [3]:

b64 = {0:"A", 1:"B", 2:"C", 3:"D", 4:"E", 5:"F", 6:"G", 7:"H", 8:"I", 9:"J", 10:"K", 11:"L", 12:"M", 13:"N", 14:"O", 15:"P", 16:"Q", 17:"R", 18:"S", 19:"T", 20:"U", 21:"V", 22:"W", 23:"X", 24:"Y", 25:"Z", 26:"a", 27:"b", 28:"c", 29:"d", 30:"e", 31:"f", 32:"g", 33:"h", 34:"i", 35:"j", 36:"k", 37:"l", 38:"m", 39:"n", 40:"o", 41:"p", 42:"q", 43:"r", 44:"s", 45:"t", 46:"u", 47:"v", 48:"w", 49:"x", 50:"y", 51:"z", 52:"0", 53:"1", 54:"2", 55:"3", 56:"4", 57:"5", 58:"6", 59:"7", 60:"8", 61:"9", 62:"+", 63:"/"}

``````

We also need the `struct` module:

``````

In [4]:

import struct

``````

Remember that if the string's character length isn't a multiple of three, we'll have to pad the encoded string with `=`. Notice how we build the encoded string by analyzing it in chunks of three bytes (24 bits) at a time. The `struct.unpack(3B, string)` function is the same as calling `ord` on all three characters in `string` (but faster). The bitwise operations simply shift and concatenate each set of 24 bits to convert them from three sets of 8 bits to four sets of 6 bits.

``````

In [5]:

def slow_b64encode(string):

encode = []
quanta, leftover = divmod(len(string), 3)

if leftover:
string += '\0' * (3 - leftover)
quanta += 1

for i in xrange(quanta):
b0, b1, b2 = struct.unpack('3B', string[i*3:(i+1)*3])
encode.extend([b64[b0 >> 2],
b64[(b0 & 0b11) << 4 | b1 >> 4],
b64[(b1 & 0b1111) << 2 | b2 >> 6],
b64[(b2 &  0b111111)]])

encoded = ''.join(encode)

if leftover == 1:
return encoded[:-2] + "=="
elif leftover == 2:
return encoded[:-1] + "="
return encoded

``````

Testing:

``````

In [6]:

from base64 import b64encode

slow_b64encode(string) == b64encode(string)

``````
``````

Out[6]:

True

``````

The decoder is left for you to try on your own!

### 7.

``````

In [7]:

def ROR(num, size=None):

if size is None:
size = num.bit_length()
else:
size = max(size, num.bit_length())

return bin((num >> 1) | (num & 1) << (size - 1))

``````

The function `ROR` takes an integer `num` and rotates it to right by one bit. If `size` is not given, the function assumes that the intended size of the integer is `num.bit_length()`. If you specify a different size, the function will use that size as the integer bit length (with `max` used to avoid truncation in case the given `size` is smaller than `num.bit_length()`). For example:

``````

In [8]:

ROR(0b1001)

``````
``````

Out[8]:

'0b1100'

``````
``````

In [9]:

ROR(0b1001, size=8)

``````
``````

Out[9]:

'0b10000100'

``````

If you're having trouble understanding the bitwise operations, think of them this way:

``````1. Identify the rightmost bit with (num & 1).
2. Create a new number by shifting the rightmost bit to the left by (size - 1) spots. For example, if the last bit is 1 and size = 8, then we have 0b10000000.
3. Shift the original number to the right by one spot with (num >> 1) so the last bit "falls off" the end.
4. Combine the numbers found in steps 2 and 3. The original rightmost bit is now the leftmost bit.

``````

The equivalent `ROL` function is slightly more complicated:

``````

In [10]:

def ROL(num, size=None):
if size is None:
size = num.bit_length()
else:
size = max(size, num.bit_length())

return bin((num << 1) & ((2**size) - 1) | ((num & (2**(size - 1))) >> (size - 1)))

``````

`2**X` is a shortcut for getting the binary number with `1` at the (X + 1)th position, and zeros elsewhere. `(2**X) - 1` returns the binary number of size `X` with `1`s in all positions. We use these shortcuts to get masks for the leftmost bit, and for every bit except the leftmost bit, respectively.

``````

In [11]:

ROL(0b1001)

``````
``````

Out[11]:

'0b11'

``````
``````

In [12]:

ROL(0b1001, size=16)

``````
``````

Out[12]:

'0b10010'

``````

Note that we could also write functions that efficiently shift more than one bit at the same time. For instance, see https://www.falatic.com/index.php/108/python-and-bitwise-rotation

### 8.

One way to do this is to register a new encoding error option with the `codecs` module:

``````

In [13]:

from __future__ import unicode_literals
import codecs
import unicodedata

def unicode_name(ex):
"""The object `ex` follows the
UnicodeError exception format. See:
https://docs.python.org/2/library/exceptions.html#exceptions.UnicodeError"""

char = ex.object[ex.start:ex.end]
return "{" + unicodedata.name(char, "unknown") + "}", ex.end

codecs.register_error("name", unicode_name)

``````

`unicode_name` is an error handler that takes a `UnicodeError` object as an argument, and returns the name of the Unicode character that raised the encoding error. See https://docs.python.org/2/library/codecs.html#codecs.register_error for more information. The `unicodedata` module is used to produce the name of the Unicode character.

`to_asicii` encodes `string` using the ASCII encoding, handling errors with `unicode_name`, which we registered previously as an error option called `name`:

``````

In [14]:

def to_ascii(string):
return string.encode("ascii", errors="name")

``````
``````

In [15]:

print to_ascii(u"† and ►")
print to_ascii(u"\u00A5 and \u00A3")

``````
``````

{DAGGER} and {BLACK RIGHT-POINTING POINTER}
{YEN SIGN} and {POUND SIGN}

``````

If you're on a "narrow" Python build (i.e., UCS-2 under the hood), `unicodedata.name` will not support Unicode code points above `\uffff`, so this function won't work for, say, `\U0001F330`. You can find the highest-supported Unicode code point on your system by calling `sys.maxunicode`:

``````

In [16]:

import sys
print hex(sys.maxunicode)

``````
``````

0xffff

``````

Python 3 has no such limitations, as most Python 3 versions (>=3.3) support code points up to `0x10FFFF` (the maximum possible code point). Since our script is compatible with Python 3, we can run it with the cell magic `%%python3` (you may need to install Python 3 on your system):

``````

In [17]:

%%python3
# from __future__ import unicode_literals
import codecs
import unicodedata

def unicode_name(ex):
"""The object `ex` follows the UnicodeError exception format. See:
https://docs.python.org/2/library/exceptions.html#exceptions.UnicodeError"""

char = ex.object[ex.start:ex.end]
return "{" + unicodedata.name(char, "unknown") + "}", ex.end

codecs.register_error("name", unicode_name)

def to_ascii(string):
return string.encode("ascii", errors="name")

print(to_ascii(u"A \U0001F330"))

``````
``````

b'A {CHESTNUT}'

``````

## 4. Cryptography

### 1.

Given four sets of characters (lowercase, uppercase, digits, punctuation) there are 15 different combinations of character sets we can choose (assuming we have to choose at least one set). One way to indicate our choice is to use a binary mask, where a number like 0b1010 indicates that we'd like a random password with lowercase letters and digits only. Likewise, the number 0b11 indicates we only want digits and punctuation. Here's one way to implement this securely:

``````

In [18]:

from itertools import compress
import random
import string

char_list = [string.ascii_lowercase, string.ascii_uppercase,
string.digits, string.punctuation + " "]

# 0b1010 -> [1, 0, 1, 0]
bitmask = [(char_no >> bit) & 1 for bit in range(3, -1, -1)]

return "".join(random.SystemRandom().choice(chars) for _ in range(length))

``````
``````

In [19]:

``````
``````

Out[19]:

u'thh6a5qtwmize6isxg8t'

``````
``````

In [20]:

``````
``````

Out[20]:

u'BSxMqrQWiIqVTxIPqNyFLQJrShobOzKUMhhgMFmvmTQtEJAJBefhrjGGuUHs'

``````
``````

In [21]:

``````
``````

Out[21]:

u':#:\\_-<.:_(+7?2~'

``````

We use a list comprehension to turn the binary `char_no` into a "bit array" (so `0b1010` becomes `[1, 0, 1, 0]`). The `itertools.compress` function applies the bitmask to the list of character sets.

If you need a quick command line function, you can use the following one-liner (with modifications depending on your desired password length and character set):

``````

In [22]:

''.join(random.SystemRandom().choice(string.printable) for _ in range(20))

``````
``````

Out[22]:

u'&o&)!6sE,SidD|VvcK{!'

``````

### 3

a. We need to generate a list of dictionaries of all possible permutations with 3-bit block size (so using numbers in the range `0` through `7`):

``````

In [23]:

from itertools import permutations

p = ({i:j for i, j in zip(range(8), i)} for i in permutations(range(8)))

``````

b. Since we're using a 3-bit block size, we have to encrypt our data 3 bits at a time. We won't go through the trouble of building a function that regroups bytes (see exercise 1.5, which shows you how to make 6-bit groups out of regular bytes using bitwise operations). This `encrypt` function works with 3-bit numbers directly:

``````

In [24]:

from itertools import islice

def encrypt(data, key):
"""data is a 3-bit number and key is
an integer in range(40321)"""

p = ({i:j for i, j in zip(range(8), i)} for i in permutations(range(8)))

return next(islice(p, key, None))[data]

``````
``````

In [25]:

encrypt(4, 2210)

``````
``````

Out[25]:

2

``````

c. Permutations grow extremely fast. A 3-bit block cipher has `(2**3)! = 40320` possible mappings (dictionaries) but a 4-bit block cipher has around `2.1e13` mappings, which would put a significant strain on the computer's resources. Python's generators are helpful because they won't store the entire list of dictionaries in memory, but accessing arbitrary dictionaries is still a very slow operation for large indices (keys).

### 7.

The modulus operation `mod n` only affects numbers that are larger than `n`. If `m**e < n`, the `mod` operation during encryption does nothing. For example:

``````

In [26]:

m = 2
e = 3
n = 221

m**e < n

``````
``````

Out[26]:

True

``````
``````

In [27]:

c = pow(m, e, n)
c == pow(m, e)

``````
``````

Out[27]:

True

``````

This means we can easily reverse the encryption process by taking the `e`th root of the ciphertext:

``````

In [28]:

pow(c, 1./e) == m

``````
``````

Out[28]:

True

``````

## 5. Networking

### 4.

An IPv4 address has the form `X.X.X.X`, where each `X` represents a byte. You can easily convert this to an integer:

``````

In [29]:

return reduce(lambda x, y: (int(x) << 8) | int(y), address.split("."))

``````
``````

In [30]:

IPv4_to_int("192.168.2.1")

``````
``````

Out[30]:

3232236033

``````

Likewise for IPv6:

``````

In [31]:

return reduce(lambda x, y: (x << 16) | y, addr)

``````
``````

In [32]:

IPv6_to_int("2001:cdba:0000:0000:0000:0000:3257:9651")

``````
``````

Out[32]:

42544660792382819006154058011833243217L

``````

For a module with IP utilities, see `ipaddress`, which is part of the Python 3 standard library, and is available for Python 2 with `\$ pip install ipaddress`.

### 5.

There are a few ways you could do this. I prefer to use the `dig` command because it tends to be the most informative of all DNS utilities. For the root servers:

``````

In [33]:

! dig

``````
``````

; <<>> DiG 9.8.3-P1 <<>>
;; global options: +cmd
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21704
;; flags: qr rd ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 14

;; QUESTION SECTION:
;.				IN	NS

.			6145	IN	NS	i.root-servers.net.
.			6145	IN	NS	j.root-servers.net.
.			6145	IN	NS	k.root-servers.net.
.			6145	IN	NS	l.root-servers.net.
.			6145	IN	NS	m.root-servers.net.
.			6145	IN	NS	a.root-servers.net.
.			6145	IN	NS	b.root-servers.net.
.			6145	IN	NS	c.root-servers.net.
.			6145	IN	NS	d.root-servers.net.
.			6145	IN	NS	e.root-servers.net.
.			6145	IN	NS	f.root-servers.net.
.			6145	IN	NS	g.root-servers.net.
.			6145	IN	NS	h.root-servers.net.

j.root-servers.net.	3770	IN	A	192.58.128.30
k.root-servers.net.	13544	IN	A	193.0.14.129
l.root-servers.net.	3770	IN	A	199.7.83.42
m.root-servers.net.	9837	IN	A	202.12.27.33
m.root-servers.net.	2879	IN	AAAA	2001:dc3::35
a.root-servers.net.	1859	IN	A	198.41.0.4
a.root-servers.net.	8520	IN	AAAA	2001:503:ba3e::2:30
b.root-servers.net.	3769	IN	A	192.228.79.201
c.root-servers.net.	3769	IN	A	192.33.4.12
c.root-servers.net.	4860	IN	AAAA	2001:500:2::c
d.root-servers.net.	3770	IN	A	199.7.91.13
e.root-servers.net.	3769	IN	A	192.203.230.10
e.root-servers.net.	4860	IN	AAAA	2001:500:a8::e
f.root-servers.net.	3770	IN	A	192.5.5.241

;; Query time: 52 msec
;; SERVER: 192.168.2.1#53(192.168.2.1)
;; WHEN: Mon May 22 13:06:33 2017
;; MSG SIZE  rcvd: 500

``````

For the `com.` servers:

``````

In [34]:

! dig com. NS

``````
``````

; <<>> DiG 9.8.3-P1 <<>> com. NS
;; global options: +cmd
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44692
;; flags: qr rd ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 3

;; QUESTION SECTION:
;com.				IN	NS

com.			7487	IN	NS	h.gtld-servers.net.
com.			7487	IN	NS	k.gtld-servers.net.
com.			7487	IN	NS	f.gtld-servers.net.
com.			7487	IN	NS	m.gtld-servers.net.
com.			7487	IN	NS	e.gtld-servers.net.
com.			7487	IN	NS	l.gtld-servers.net.
com.			7487	IN	NS	i.gtld-servers.net.
com.			7487	IN	NS	c.gtld-servers.net.
com.			7487	IN	NS	b.gtld-servers.net.
com.			7487	IN	NS	j.gtld-servers.net.
com.			7487	IN	NS	d.gtld-servers.net.
com.			7487	IN	NS	g.gtld-servers.net.
com.			7487	IN	NS	a.gtld-servers.net.

k.gtld-servers.net.	13982	IN	A	192.52.178.30
f.gtld-servers.net.	12137	IN	A	192.35.51.30
e.gtld-servers.net.	3178	IN	A	192.12.94.30

;; Query time: 51 msec
;; SERVER: 192.168.2.1#53(192.168.2.1)
;; WHEN: Mon May 22 13:06:34 2017
;; MSG SIZE  rcvd: 293

``````

### 6.

``````

In [35]:

import urlparse

def simplify_url(url, size=1):
""" size is an integer that controls
the complexity of the output URL:
< 1 : no change
1 : query removed
2 : query and path removed
> 2 : query, path, and scheme removed
"""

p = urlparse.urlparse(url)

if size < 1:
return url
elif size == 1:
return "{uri.scheme}://{uri.hostname}{uri.path}".format(uri=p)
elif size == 2:
return "{uri.scheme}://{uri.hostname}".format(uri=p)
elif size > 2:
return p.hostname

``````
``````

In [36]:

url = "http://www.example.com/path/to/page.html?arg1=a&arg2=b"

for i in range(4):
print simplify_url(url, size=i)

``````
``````

http://www.example.com/path/to/page.html?arg1=a&arg2=b
http://www.example.com/path/to/page.html
http://www.example.com
www.example.com

``````

### 7.

``````

In [37]:

import socket

# carriage return + line feed
CRLF = "\r\n"

# create a socket using the AF_INET (*IPv4*) address family
# and the SOCK_STREAM transport protocol (*TCP*)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# connect to port 80 (the HTTP port) on www.google.com

# print the remote IP address

# form and send an HTTP GET request

# receive data in two chunks of 4096 bytes,
# which should be enough to get the <title>
response = s.recv(4096)
response += s.recv(4096)

# finally, close the TCP connection
s.close()

``````
``````

``````
``````

In [38]:

print response[:2000]

``````
``````

HTTP/1.1 200 OK
Date: Mon, 22 May 2017 17:06:34 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Accept-Ranges: none
Vary: Accept-Encoding
Transfer-Encoding: chunked

469
20a7

``````
``````

In [39]:

from bs4 import BeautifulSoup

soup = BeautifulSoup(response, "lxml")
print soup.title.text

``````
``````