Jonathan Lowery

CSCI 466 - Project 4

100 Points

For this assignment you will investigate hash function and hash chains. You may complete the assignment in any language. For a given hash function h, a hash chain is defined as follows:

$H_0=x$

$H_i=h(H_{i-1})\ for\ i\ge 1$

For the purpose of this assignment, our hash function is the last k bits of MD5. In other words, $h_k(x)$ = least significant k bits of MD5(x). Before completing the assignment, read Section 2.1.6 from the Handbook of Applied Crypto (available: http://www.cacr.math.uwaterloo.ca/hac/about/chap2.pdf).

Part I:

(70 points) Write a computer program to compute the number of components, average/max tail length, min/average/max cycle length when we use $h_{16}(x)$. Your output should print these 6 numbers. Tail and cycle length are defined in HAC Definition 2.35. For the following figure, there are 2 components, 4 tails of lengths 3, 2, 1, and 1 for an average tail length of 1.75. The two cycles are length 4 and 3.

Solution:


In [78]:
from hashlib import md5 #Hash source
from collections import defaultdict as dd #Dynamic Dictionary
import networkx as nx #Graph Theory
import time
from multiprocessing import Pool

DG=nx.DiGraph() #Main DiGraph
CE=[] #Cycle Elements
CL=[] #Cycle Lengths
TL=[] #Tail Lengths
#RevMap=dd()
def get_hash(name,size): #Hash function
    temp=bin(int(md5(name).hexdigest(),16)%2**size)[2:]
    temp='0'*(size-len(temp))+temp
    return temp

def trymap(size): #Create Graph
    for i in range(2**size):
        name=bin(i)[2:]
        temp=get_hash(name,size)
        name="0"*(size-len(name))+name
        DG.add_edge(int(name,2),int(temp,2))

#Return a path from start to the
#Beginning of a cycle if 0
#End of a cycle if 1
def findpath(start):
    templist=[start]
    temp=start
    while not(DG.successors(temp)[0] in templist):
        if (DG.successors(temp)[0] in CE):
            return (templist,0)
        temp=DG.successors(temp)[0]
        templist+=[temp]
    return (templist,1) #templist[-1] is the start of a cycle

def taillen(tailpath): 
    global CE
    global CL
    global TL
    cycle=findpath(tailpath[-1])[0]
    CE+=cycle #Add to Cycle Watchlist
    CL+= [len(cycle)] #Add cycle length to 
    TL+= [len(tailpath)-len(cycle)]

def pathrunner(start):
    global TL
    temp=findpath(start)
    if temp[1]:
        taillen(temp[0])
    else:
        TL+=[len(temp[0])]
    
complexity=16
#pool = Pool(processes=(complexity/2))
timer=time.time()
trymap(complexity)
if complexity<=8: 
    nx.draw(DG)
    pylab.show()
tails=[i for i in DG.pred.keys() if len(DG.pred[i])==0] #Nodes with no predeccesors
timer1=time.time()-timer
print "Tail-enumeration time: {0:.2}s".format(timer1)
#pool.map(pathrunner,tails)
map(pathrunner,tails)
#print "Cycle Elements: ",CE
print "Cycle Lengths: ",CL
#print "Tail Lengths: ",TL
print "Cycle Lengths: Max: {0}, Avg: {1:.3}, Min: {2}, Count: {3}".format(max(CL),average(CL),min(CL),len(CL))
print "Tail Lengths: Max: {0}, Avg: {1:.3}, Min: {2}, Count: {3}".format(max(TL),average(TL),min(TL),len(TL))
print "Total Components: {0}".format(DG.number_of_selfloops()+len(CL))
timer2=time.time()-timer1-timer
print "Time: {0:.6}s for {1} hashes, {2:.6} hashes per second".format(timer2,2**complexity,(2**complexity)/timer2)

#Clean memory since a lot is being stored
del DG #Main DiGraph
del CE #Cycle Elements
del CL #Cycle Lengths
del TL #Tail Lengths


Tail-enumeration time: 1.4s
Cycle Lengths:  [160, 159, 65, 12, 3, 11, 2, 8, 4, 8]
Cycle Lengths: Max: 160, Avg: 43.2, Min: 2, Count: 10
Tail Lengths: Max: 593, Avg: 192.0, Min: 1, Count: 24098
Total Segments: 10
Time: 176.619s for 65536 hashes, 371.059 hashes per second

The variable above labeled 'complexity' determines k in $h_k(x)$

Part II:

  1. (30 points) Under the assumption that MD5 is a random function, design a new hash chain that does not have a cycle. Why does your algorithm not have a cycle?

Solution:

For the purposes of chaining, keep track of an iteration variable i and add i to the input of each number before calculating the next value.

$H_i=h(H_{i-1}+i)\ for\ i\ge 1$


In [ ]: