In [1]:
%matplotlib inline
from notebook.auth import passwd
import numpy as np
import struct
from ctypes import c_uint32
import matplotlib.pyplot as plt
This notebook paves the way for the 52 Things You Should Know About Geocomputing: [Seismic] Data Encryption chapter. Since we'll be encrypting and decrypting things in this notebook, we'll always use the same password:
is_52_ytpeieonncr_iTghsn_kye._oru_hTsi
Strong passwords are long passwords. Strength is added when the number of possible symbols increases, hence the punctuation and capitals in our password. Mixing up letters and words reduces the possibility of attack by dictionary methods. Just start with a sentence like "This is our 52 Things encryption key." Jumble up the letters and words from there. And I've used underscores in place of spaces here.
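As a rough illustration of why both length and symbol variety matter: if every character were drawn uniformly at random from an alphabet of N symbols, a password of length L would carry L * log2(N) bits of entropy. The helper name entropy_bits is my own, and a jumbled sentence like ours is not uniformly random, so treat these numbers as an upper bound rather than a measurement.

```python
import math

def entropy_bits(length, alphabet_size):
    """Upper-bound entropy in bits, assuming uniformly random characters."""
    return length * math.log2(alphabet_size)

password = 'is_52_ytpeieonncr_iTghsn_kye._oru_hTsi'

# Compare a lowercase-only alphabet (26 symbols) with the 94 printable
# ASCII symbols (letters, digits, punctuation; space excluded).
lower_only = entropy_bits(len(password), 26)
full_ascii = entropy_bits(len(password), 94)
print(len(password), round(lower_only), round(full_ascii))
```

The larger alphabet adds bits per character, but each extra character adds bits too, which is why long passwords win.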
By the way, in the event you're storing your password on a computer (or a GitHub repo), never store it in plaintext. That is to say, don't just make a Google Document containing all your passwords. Anyone who gains access to the document (like this one) would now have your password (like the above). There's a process called hashing a password, which we'll now do using a simple Jupyter tool.
In [2]:
passwd()
Out[2]:
And that's it. This hash was built using SHA-1, which is no longer considered secure: practical collision attacks have been demonstrated, and a fast hash like this is a poor fit for password storage anyway. Python has some other awesome tools to perform hashing if you need realistically strong hashes (with salt, etc.). Again, for the remainder of this notebook we'll use the above password in its plaintext form. I only mentioned hashing to get your wheels turning.
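For a salted hash using only the standard library, one option is hashlib.pbkdf2_hmac. This is a minimal sketch, not a recommendation of specific parameters: the function names and the iteration count are my own assumptions for illustration.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each password
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, iterations)
    return salt, digest

def check_password(password, salt, digest, iterations=200_000):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt, iterations)
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password('is_52_ytpeieonncr_iTghsn_kye._oru_hTsi')
print(check_password('is_52_ytpeieonncr_iTghsn_kye._oru_hTsi', salt, digest))  # True
```

You would store the salt and digest (and the iteration count), never the password itself.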
The file we're loading is located in this repo under the data folder. I've already stripped a SEG-Y file down into an array of binary floats, so we don't require any special seismic manipulation tools. The array holds 390 traces of 276 samples each.
The seismic data used in this publication was obtained from RMOTC and the U.S. Department of Energy via the SEG.
First we set our size (NumPy was already imported at the top):
In [3]:
ntr = 390
ns = 276
Then we build an array
In [4]:
plain = np.zeros((ns, ntr))
and populate it with samples
In [5]:
with open('data/seis.bin', 'rb') as f:
    for i in range(ntr):
        for j in range(ns):
            plain[j,i] = struct.unpack('f', f.read(4))[0]
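The same load can be done without the explicit loops. This sketch assumes the file holds native-byte-order 32-bit floats stored trace by trace (which is what the loop above reads); the function name load_section is hypothetical.

```python
import numpy as np

def load_section(path, ntr, ns):
    """Read ntr traces of ns float32 samples and return an (ns, ntr) array."""
    flat = np.fromfile(path, dtype=np.float32, count=ntr * ns)
    # Samples are stored trace by trace, so reshape to (ntr, ns) and
    # transpose to get samples down the rows and traces across the columns.
    return flat.reshape(ntr, ns).T.astype(np.float64)

# Usage with the notebook's file (not executed here):
# plain = load_section('data/seis.bin', 390, 276)
```

The transpose matters: reshaping straight to (ns, ntr) would scramble the traces.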
In this section let's explore how we can encrypt an image. The algorithm we'll build here is based on TEA (the Tiny Encryption Algorithm), a block cipher. It is a symmetric encryption algorithm, meaning we'll use the same password (the one above) to both encrypt and decrypt the data.
This code is not acceptable for real encryption. It is a demonstration toy. Do not use it to protect sensitive data!!!
Additionally, even a good implementation of TEA isn't strong against modern cryptanalysis. But it is a really good demonstration of how modern cryptography works. Also, yes, I realize Python is a horrible language to do this in.
In [6]:
class tea(object):
    def __init__(self, data, pswrd):
        self.data = data
        self.pswrd = pswrd
        self.cycles = 32
        self.d_const = 0x9e3779b9 # Wheeler's key schedule constant, from Golden Ratio
        self.s_const = 0xc6ef3720 # Wheeler's Delta << 5 (left bit shift)
        self.precision = 1000000 # precision for converting floats to c_uint32's

    def encrypt_piece(self, dat, key):
        # dat, here, is two 4 byte words. key is four 4 byte words
        summ = c_uint32(0)
        delta = c_uint32(self.d_const)
        for idx in range(0, self.cycles):
            summ.value += delta.value
            # at each cycle, bit shift and XOR, recombining key words recursively (see image above)
            dat[0].value += ((dat[1].value << 4) + key[0].value) ^ (dat[1].value + summ.value) ^ ((dat[1].value >> 5) + key[1].value)
            dat[1].value += ((dat[0].value << 4) + key[2].value) ^ (dat[0].value + summ.value) ^ ((dat[0].value >> 5) + key[3].value)
        return dat

    def byte_splitta(self):
        # this sucker splits out float data into c_uint32 words
        byts = []
        for idx in range(len(self.data)):
            # use the data handed to the constructor, not a notebook global
            temp = c_uint32(int(struct.unpack('f', self.data[idx])[0] * self.precision))
            byts.append(temp)
        return byts

    def build_key(self, passin):
        # builds c_uint32 words from a key
        ckey = []
        bytelist = [ord(char) for char in passin] # get ints from string chars
        for index in range(0, len(bytelist), 4):
            chunk = bytelist[index:index + 4] # chunk out ints for words
            temp = ''.join((str(chunk[0]),str(chunk[1]),str(chunk[2]),str(chunk[3]))) # concatenate ints
            ckey.append(c_uint32(int(temp))) # string -> int -> c_uint32 (wraps modulo 2**32)
        return ckey

    def forward(self):
        # push plaindata through the bit splitter and encryption
        d_bytes = self.byte_splitta() # build word list from data
        passin = self.pswrd[0:32] # to eliminate padding
        k_bytes = self.build_key(passin) # build word list from key
        cipherdata = [] # blank for enciphered image
        count = 0 # to ease indexing
        for idx2 in range(0, len(d_bytes), 2):
            d_piece = d_bytes[idx2: idx2 + 2] # split out a couple of data words
            start = count % len(k_bytes) # cycle through the key words, four at a time
            k_piece = k_bytes[start: start + 4] # split out 4 key words
            count += 4
            temp = self.encrypt_piece(d_piece, k_piece) # encrypt a block
            cipherdata.append(temp[0]) # append to list
            cipherdata.append(temp[1]) # append to list
        return cipherdata
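The class above only goes forward. Since the cipher is symmetric, the rounds can be undone with the same key by running them in reverse and subtracting instead of adding. The sketch below is a standalone version that uses plain Python ints masked to 32 bits rather than c_uint32; the function names and the demo key are my own, and it is still a toy.

```python
MASK = 0xFFFFFFFF   # keep every result within 32 bits
DELTA = 0x9E3779B9  # same constant as d_const above

def tea_encrypt_block(v0, v1, key, cycles=32):
    """Encrypt two 32-bit words with four 32-bit key words."""
    total = 0
    for _ in range(cycles):
        total = (total + DELTA) & MASK
        v0 = (v0 + (((v1 << 4) + key[0]) ^ (v1 + total) ^ ((v1 >> 5) + key[1]))) & MASK
        v1 = (v1 + (((v0 << 4) + key[2]) ^ (v0 + total) ^ ((v0 >> 5) + key[3]))) & MASK
    return v0, v1

def tea_decrypt_block(v0, v1, key, cycles=32):
    """Undo tea_encrypt_block by reversing the rounds."""
    total = (DELTA * cycles) & MASK  # 0xC6EF3720 for 32 cycles, i.e. s_const
    for _ in range(cycles):
        v1 = (v1 - (((v0 << 4) + key[2]) ^ (v0 + total) ^ ((v0 >> 5) + key[3]))) & MASK
        v0 = (v0 - (((v1 << 4) + key[0]) ^ (v1 + total) ^ ((v1 >> 5) + key[1]))) & MASK
        total = (total - DELTA) & MASK
    return v0, v1

key = (0x01234567, 0x89ABCDEF, 0xFEDCBA98, 0x76543210)  # arbitrary demo key words
block = (0xDEADBEEF, 0x00C0FFEE)                        # arbitrary demo data words
enc = tea_encrypt_block(block[0], block[1], key)
dec = tea_decrypt_block(enc[0], enc[1], key)
print(dec == block)  # True
```

This is also where s_const comes from: it is the running sum after 32 cycles, which is the value decryption starts counting down from.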
Now let's do an encryption run of the data and show its output. First we load our data as a bunch of four byte floats, and input our key as a string:
In [7]:
temp_data = []
with open('data/seis.bin', 'rb') as f:
    for i in range(ns*ntr):
        temp_data.append(f.read(4))
temp_key = 'is_52_ytpeieonncr_iTghsn_kye._oru_hTsi'
proto = tea(temp_data, temp_key)
cipher_bytes = proto.forward()
Then we reshape our list of encrypted data into a 2D array for plotting against the plain image.
In [8]:
cipher = np.zeros((ns, ntr))
for i in range(ntr):
    for j in range(ns):
        # sample j of trace i sits at flat index i*ns + j, matching the load order
        cipher[j,i] = cipher_bytes[i*ns + j].value
And plot.
In [17]:
fig = plt.figure(figsize=(20, 6)) # set the size when the figure is created so it applies to this plot
a=fig.add_subplot(1,2,1)
imgplot = plt.imshow(plain, cmap='gray')
a.set_title('Plain image data', fontsize=30, color='red')
plt.colorbar( orientation ='vertical')
a=fig.add_subplot(1,2,2)
imgplot = plt.imshow(cipher, cmap='gray')
imgplot.set_clim(-4e9,4e9)
a.set_title('Cipher image data', fontsize=30, color='red')
plt.colorbar(orientation='vertical')
Out[17]:
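That's pretty jumbled. Our encryption algorithm made quick work of that data. The basic tenets of encryption workflows are confusion and diffusion (as per Shannon, 1949). Confusion means the relationship between the key and the ciphertext is made as complicated as possible. Diffusion means each bit of the plaintext influences many bits of the ciphertext, so local structure in the input gets spread out.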
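Diffusion can be checked numerically. The sketch below is a compact integer version of the TEA rounds (not the class above, and still a toy): it flips a single plaintext bit and counts how many of the 64 ciphertext bits change. The function names and the demo key are my own choices for illustration.

```python
MASK = 0xFFFFFFFF
DELTA = 0x9E3779B9

def tea_encrypt(v0, v1, key, cycles=32):
    """Encrypt two 32-bit words with four 32-bit key words (toy TEA)."""
    total = 0
    for _ in range(cycles):
        total = (total + DELTA) & MASK
        v0 = (v0 + (((v1 << 4) + key[0]) ^ (v1 + total) ^ ((v1 >> 5) + key[1]))) & MASK
        v1 = (v1 + (((v0 << 4) + key[2]) ^ (v0 + total) ^ ((v0 >> 5) + key[3]))) & MASK
    return v0, v1

def bits_changed(a, b):
    """Count the differing bits between two (v0, v1) blocks."""
    return sum(bin(x ^ y).count('1') for x, y in zip(a, b))

key = (1, 2, 3, 4)                                   # arbitrary demo key
base = tea_encrypt(0x00000000, 0x00000000, key)
flipped = tea_encrypt(0x00000001, 0x00000000, key)   # one plaintext bit flipped
print(bits_changed(base, flipped), 'of 64 ciphertext bits changed')
```

With good diffusion, roughly half of the 64 bits should flip. Try lowering cycles to 1 or 2 to watch the effect weaken.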
Let's now use a free tool called AES Crypt to encrypt the above seismic section. Below, the -p flag specifies the password, where we've used the string from above. The -e flag specifies "encrypt" (instead of -d: "decrypt").
In [10]:
!aescrypt -p is_52_ytpeieonncr_iTghsn_kye._oru_hTsi -e data/seis.bin
Now we'll load the enciphered data into an array for manipulation.
In [11]:
cipher = np.zeros((ns, ntr))
with open('data/seis.bin.aes', 'rb') as f:
    for i in range(ntr):
        for j in range(ns):
            cipher[j,i] = struct.unpack('f', f.read(4))[0]
Let's now view the plainimage data and the cipherimage data side-by-side.
In [15]:
fig = plt.figure(figsize=(20, 6)) # set the size when the figure is created so it applies to this plot
a=fig.add_subplot(1,2,1)
imgplot = plt.imshow(plain, cmap='gray')
a.set_title('Plain image data', fontsize=30, color='red')
plt.colorbar( orientation ='vertical')
a=fig.add_subplot(1,2,2)
imgplot = plt.imshow(cipher, cmap='gray')
imgplot.set_clim(-4,4)
a.set_title('Cipher image data', fontsize=30, color='red')
plt.colorbar(ticks=range(-4, 5), orientation='vertical')
Out[15]:
Obviously, you'd much rather an attacker have access to the cipherimage, not the plainimage.
The discerning reader will notice the grid in the cipher image. This is due to the blockwise operation of the AES algorithm. AES uses fixed-size blocks of 128 bits, whereas other substitution-permutation cryptography algorithms can use blocks of varying size.
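To put the block size in terms of our data, here is the arithmetic for the plain section. It only counts the raw samples; the .aes file will also carry whatever header and metadata the tool writes, which this ignores.

```python
ntr, ns = 390, 276        # section dimensions from above
bytes_per_sample = 4      # 32-bit floats
block_bytes = 128 // 8    # AES block size: 128 bits = 16 bytes

total_bytes = ntr * ns * bytes_per_sample
samples_per_block = block_bytes // bytes_per_sample
n_blocks, remainder = divmod(total_bytes, block_bytes)
print(total_bytes, samples_per_block, n_blocks, remainder)  # 430560 4 26910 0
```

So each AES block covers four consecutive samples of a trace, and the section divides into whole blocks with nothing left over.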