Analyzing the GIFGIF dataset

GIFGIF is a project from the MIT Media Lab that aims to understand the emotional content of animated GIF images. The project covers 17 emotions, including happiness, fear, amusement and shame. To collect feedback from users, the web site shows two images at a time and asks the following question.

Which of the left or right image better expresses [emotion]?

where [emotion] is one of the 17 emotions. The raw data therefore consists of outcomes of pairwise comparisons between images. Just the kind of data that choix is built for!

In this notebook, we will use choix to make sense of the raw pairwise-comparison data. In particular, we would like to embed the images on a scale (for a given emotion).

Dataset

We will use a dump of the raw data available at http://lucas.maystre.ch/gifgif-data. Download and uncompress the dataset (you don't need to download the images).


In [1]:
import choix
import collections
import numpy as np

from IPython.display import Image, display

In [2]:
# Change this to the path to the data on your computer.
PATH_TO_DATA = "/tmp/gifgif/gifgif-dataset-20150121-v1.csv"

We also define a short utility function to display an image based on its identifier.


In [3]:
def show_gif(idx):
    template = "http://media.giphy.com/media/{idx}/giphy.gif"
    display(Image(url=template.format(idx=idx)))
    
# A random image.
show_gif("k39w535jFPYrK")


Processing the raw data

First, we need to transform the raw dataset into a format that choix can process. Remember that choix encodes pairwise-comparison outcomes as tuples (i, j) (meaning "$i$ won over $j$"), and that items are assumed to be numbered by consecutive integers.
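To make this encoding concrete, here is a small toy example (unrelated to the GIFGIF data) with three items numbered 0, 1 and 2:

# The tuple (0, 1) means "item 0 won a comparison over item 1".
toy_data = [(0, 1), (0, 2), (2, 1)]
# Functions such as `choix.opt_pairwise` take data in this format, together
# with the total number of items (here, 3).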

We begin by mapping all distinct images that appear in the dataset to consecutive integers.


In [4]:
# First pass over the data to transform GIFGIF IDs to consecutive integers.
image_ids = set()
with open(PATH_TO_DATA) as f:
    next(f)  # First line is header.
    for line in f:
        emotion, left, right, choice = line.strip().split(",")
        if len(left) > 0 and len(right) > 0:
            # `if` condition eliminates corrupted data.
            image_ids.add(left)
            image_ids.add(right)
int_to_idx = dict(enumerate(image_ids))
idx_to_int = dict((v, k) for k, v in int_to_idx.items())

n_items = len(idx_to_int)
print("Number of distinct images: {:,}".format(n_items))


Number of distinct images: 6,170

Next, we parse the comparisons in the data and convert the image IDs to the corresponding integers. We collect all the comparisons and group them by emotion.


In [5]:
data = collections.defaultdict(list)
with open(PATH_TO_DATA) as f:
    next(f)  # First line is header.
    for line in f:
        emotion, left, right, choice = line.strip().split(",")
        if len(left) == 0 or len(right) == 0:
            # Datum is corrupted, continue.
            continue
        # Map ids to integers.
        left = idx_to_int[left]
        right = idx_to_int[right]
        if choice == "left":
            # Left image won the comparison.
            data[emotion].append((left, right))
        elif choice == "right":
            # Right image won the comparison.
            data[emotion].append((right, left))
            
print("Number of comparisons for each emotion")
for emotion, comps in data.items():
    print("{: <14} {: >7,}".format(emotion, len(comps)))


Number of comparisons for each emotion
pleasure        87,941
contentment     71,493
embarrassment   49,579
surprise        65,030
sadness         64,631
excitement      82,054
happiness      106,877
pride           50,394
disgust         60,361
shame           47,099
amusement       76,653
anger           65,880
relief          39,609
contempt        50,976
guilt           44,971
satisfaction    79,612
fear            51,703

Parameter inference

Now we are ready to fit a Bradley-Terry model to the data, in order to embed the images on a quantitative scale (for a given emotion). In the following, we consider happiness.
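Under the Bradley-Terry model, each image $i$ is associated with a strength parameter $\theta_i$, and the probability that $i$ wins a comparison against $j$ is $p(i \succ j) = e^{\theta_i} / (e^{\theta_i} + e^{\theta_j})$. Fitting the model means estimating these parameters from the observed outcomes.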


In [6]:
# What does the data look like?
data["happiness"][:3]


Out[6]:
[(5081, 4390), (3685, 3717), (2774, 3672)]

In [7]:
%%time
params = choix.opt_pairwise(n_items, data["happiness"])


CPU times: user 3min, sys: 3.32 s, total: 3min 3s
Wall time: 1min 43s
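Note that choix provides several estimators for the same model. For instance, if the optimizer above is too slow, the spectral estimator ilsr_pairwise is usually faster; a minimal sketch is given below (the regularization strength alpha=0.01 is an arbitrary choice, not a recommendation).

# Alternative estimator: iterative Luce spectral ranking.
# `alpha` adds a small amount of regularization, useful e.g. for images
# that never win a comparison.
params_ilsr = choix.ilsr_pairwise(n_items, data["happiness"], alpha=0.01)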

The parameters induce a ranking over the images: images ranked at the bottom are consistently found to express less happiness, and images ranked at the top to express more.


In [8]:
# `np.argsort` returns the indices that sort `params` in increasing order,
# i.e., from least to most happy.
ranking = np.argsort(params)

Visualizing the results

The top three images that best express happiness are the following:


In [9]:
for i in ranking[::-1][:3]:
    show_gif(int_to_idx[i])


Conversely, the three images that least express happiness are the following:


In [10]:
for i in ranking[:3]:
    show_gif(int_to_idx[i])


Predicting future comparison outcomes

Based on the model learnt from the data, it is also possible to predict, for any pair of images, which one a user would select as better expressing happiness. Below is an example.


In [11]:
rank = 2500
# Image at rank 2,500 counting from the top of the ranking...
top = ranking[::-1][rank]
show_gif(int_to_idx[top])

# ...and image at rank 2,500 counting from the bottom.
bottom = ranking[rank]
show_gif(int_to_idx[bottom])



In [12]:
prob_top_wins, _ = choix.probabilities((top, bottom), params)
print("Prob(user selects top image) = {:.2f}".format(prob_top_wins))


Prob(user selects top image) = 0.80
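As a sanity check, we can recompute this probability by hand from the Bradley-Terry formula given earlier; the result should match the output of choix.probabilities.

# Recompute the win probability of `top` directly from the parameters.
manual_prob = 1.0 / (1.0 + np.exp(params[bottom] - params[top]))
print("Prob(user selects top image) = {:.2f}".format(manual_prob))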