Instagram Engineering Challenge


In [1]:
from IPython.display import HTML
url = 'http://instagram-engineering.tumblr.com/post/12651721845/instagram-engineering-challenge-the-unshredder'
HTML('<iframe src=' + url + ' width=900 height=350></iframe>')


Out[1]:

First let's download the file to the local directory.


In [14]:
import requests

imgurl = 'http://instagram-static.s3.amazonaws.com/images/TokyoPanoramaShredded.png'
response = requests.get(imgurl)
if response.status_code == 200:
    f = open("tokyoshred.png", "w")
    f.write(response.content)
    f.close()

This code is from the Instagram page and explains how the pixel data is stored in an Image object and how it can be retrieved.


In [54]:
from PIL import Image
path = 'tokyoshred.png'
image = Image.open(path)
data = image.getdata() # This gets pixel data
print image.info

def get_pixel_value(x, y):
   width, height = image.size
   pixel = data[y * width + x]
   return pixel

get_pixel_value(20, 30)


{}
Out[54]:
(203, 204, 198, 255)

Here is the shredded file:


In [19]:
import IPython.display
IPython.display.Image('./tokyoshred.png')


Out[19]:

In [158]:
def pixel_compare(p1, p2):
    '''
    Takes in two length four tuple and returns a naive distance between them, simply the sum of the absolute values of the 
    differences.
    '''
    error = 0
    assert len(p1) == 4
    for i in range(4):
        error += abs(p1[i]-p2[i])
    return error

pixel_compare(get_pixel_value(31, 245),get_pixel_value(32, 245))


Out[158]:
87

In [207]:
import random

NUMBER_OF_COLUMNS = 20

def shred_compare(shred, shreds_left, left=True):
    errors = []
    for oshred in shreds_left:
        error = 0
        for i in range(image.size[1]):
            if left == True:
                p1 = get_pixel_value(shred*shred_width, i)
                p2 = get_pixel_value((oshred+1)*shred_width - 1, i)
                error += pixel_compare(p1,p2)
            else:
                p1 = get_pixel_value((shred+1)*shred_width - 1, i)
                p2 = get_pixel_value(oshred*shred_width, i)
                error += pixel_compare(p1,p2)
        errors.append((error, oshred))
    return min(errors)
    
shreds_left = [i for i in range(NUMBER_OF_COLUMNS)]
shred = random.choice(shreds_left)
shreds_left.remove(shred)
print 'Shred %s' % shred
print 'Best shred on the left: {0}'.format(shred_compare(shred, shreds_left))
print 'Best shred on the right: {0}'.format(shred_compare(shred, shreds_left, left=False))


Shred 3
Best shred on the left: (36009, 7)
Best shred on the right: (26985, 2)

Now find the best candidate for the leftmost shred (this technique turns out to be not so effective in practice, hence the alteration to the algorithm below):


In [202]:
lefterrors = []
for shred in range(NUMBER_OF_COLUMNS):
    shreds_left = [i for i in range(NUMBER_OF_COLUMNS) if i != shred]
    lefterrors.append([shred_compare(shred, shreds_left), shred])
print sorted(lefterrors, key=lambda triple: -triple[0][0])


[[(50508, 12), 6], [(43988, 19), 17], [(42182, 0), 8], [(36009, 7), 3], [(33656, 13), 7], [(33250, 18), 13], [(31697, 6), 15], [(30924, 15), 1], [(29927, 4), 19], [(29257, 14), 16], [(26985, 3), 2], [(26391, 17), 12], [(25169, 5), 0], [(24941, 2), 11], [(22474, 1), 5], [(22235, 11), 4], [(18990, 8), 10], [(18091, 16), 18], [(17093, 10), 14], [(13999, 0), 9]]

The algorithm chooses a random starting place, computes the smallest errors on the left and on the right and then chooses the minimum one. That shred gets added to the appropriate side and then we use the new end shreds (one of which will be the original shred) to find the next shred via the same process. The benefit of doing this is that the end shreds do not cause problems like they did in my previous methods.


In [208]:
shreds_left = [i for i in range(NUMBER_OF_COLUMNS)]
shred = random.choice(shreds_left)
shreds_left.remove(shred)
shred_order = [shred]
lshred, rshred = shred, shred

while len(shred_order) < NUMBER_OF_COLUMNS:
    shreds_left = [i for i in range(NUMBER_OF_COLUMNS) if i not in shred_order]
    lerror, lshred = shred_compare(lshred, shreds_left)
    rerror, rshred = shred_compare(rshred, shreds_left, left=False)

    if lerror < rerror:
        shreds_left.remove(lshred)
        if lshred in shred_order:
            print shreds_left, shred_order, lshred, lerror
        shred_order = [lshred] + shred_order
        rshred = shred
    else:
        shreds_left.remove(rshred)
        if rshred in shred_order:
            print shreds_left, shred_order, lshred, lerror
        shred_order.append(rshred)
        lshred = shred

print shred_order


[8, 10, 14, 16, 18, 13, 7, 3, 2, 11, 4, 19, 17, 12, 6, 15, 1, 5, 0, 9]

This code from the Instagram page creates a new image object of the correct size and pastes the shred in the correct positions using the list of shreds obtained above:


In [209]:
# Create a new image of the same size as the original and copy a region into the new image
unshredded = Image.new('RGBA', image.size)
shred_width = unshredded.size[0]/NUMBER_OF_COLUMNS

for idx, shred_number in enumerate(shred_order):
    x1, y1 = shred_width * shred_number, 0
    x2, y2 = x1 + shred_width, unshredded.size[1]
    source_region = image.crop((x1, y1, x2, y2))
    destination_point = (shred_width*idx, 0)
    unshredded.paste(source_region, destination_point)

# Output the new image
newpath = path[:-9] + 'unshred.png'
unshredded.save(newpath, 'PNG')

In [210]:
import IPython.display
IPython.display.Image('./tokyounshred.png')


Out[210]:

In [109]:
%load_ext version_information
%version_information PIL


Out[109]:
SoftwareVersion
Python2.7.3 (default, Feb 27 2014, 20:00:17) [GCC 4.6.3]
IPython2.0.0
OSposix [linux2]
PIL1.1.7
Tue May 20 10:44:25 2014 BST