from planout.ops.random import *
from planout.experiment import SimpleExperiment
import pandas as pd
import json

Log data

This section walks through the fields that appear in the log data. Start by running this:

class LoggedExperiment(SimpleExperiment):
    def assign(self, params, userid):
        params.x = UniformChoice(choices=["What's on your mind?", "Say something."], unit=userid)
        params.y = BernoulliTrial(p=0.5, unit=userid)

print(LoggedExperiment(userid=5).get('x'))

Then open your terminal, navigate to the directory this notebook is in, and type:

> tail -f LoggedExperiment.log

You can now watch data being logged as your experiment runs.

Exposure logs

Whenever you request a parameter, an exposure is automatically logged. In a production environment, one would use caching (e.g., memcache) so that an exposure is logged only once per unit. SimpleExperiment logs an exposure once per instance.
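As a rough illustration of the once-per-unit idea, here is a minimal sketch that does not use PlanOut at all. The class OncePerUnitLogger and its arguments are hypothetical, and a plain Python set stands in for a shared cache such as memcache.

```python
class OncePerUnitLogger(object):
    """Hypothetical helper: log an exposure at most once per (experiment, unit)."""

    def __init__(self, log_fn, cache=None):
        self.log_fn = log_fn
        # Assumption: an in-memory set stands in for a shared cache (e.g., memcache).
        self.seen = cache if cache is not None else set()

    def log_exposure(self, experiment_name, unit, payload):
        key = (experiment_name, unit)
        if key in self.seen:
            return False  # this unit was already exposure-logged
        self.seen.add(key)
        self.log_fn(payload)
        return True


logged = []  # collects payloads instead of writing to a log file
logger = OncePerUnitLogger(logged.append)
logger.log_exposure('LoggedExperiment', 4, {'event': 'exposure', 'userid': 4})
logger.log_exposure('LoggedExperiment', 4, {'event': 'exposure', 'userid': 4})
logger.log_exposure('LoggedExperiment', 5, {'event': 'exposure', 'userid': 5})
```

Only two payloads end up in logged, because the second call for userid 4 is skipped. A real deployment would need a cache shared across processes and some expiry policy, which this sketch leaves out.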

e = LoggedExperiment(userid=4)
print(e.get('x'))
print(e.get('y'))

Manual exposure logging

Calling log_exposure() will force PlanOut to log an exposure event. You can optionally pass in additional data.

e.log_exposure({'endpoint': 'home.py'})

Event logging

You can also log arbitrary events. The first argument to log_event() is a required parameter that specifies the event type.

e.log_event('post_status_update', {'type': 'photo'})

Putting it all together

We simulate the components of a PlanOut-driven website and show how data analysis would work in conjunction with the data generated from the simulation.

This hypothetical experiment looks at the effect of sorting a music album's songs by popularity (instead of, say, track number) in a web-based music store.

Our website simulation consists of four main parts:

  • Code to render the web page (which uses PlanOut to decide how to display items)
  • Code to handle item purchases (this logs the "conversion" event)
  • Code to simulate the process of users' purchase decision-making
  • A loop that simulates many users viewing many albums

class MusicExperiment(SimpleExperiment):
    def assign(self, params, userid, albumid):
        params.sort_by_rating = BernoulliTrial(p=0.2, unit=[userid, albumid])

import random

def get_price(albumid):
    "look up the price of an album"
    # this would realistically hook into a database
    return 11.99

Rendering the web page

def render_webpage(userid, albumid):
    'simulated web page rendering function'
    # get experiment for the given user / album pair.
    e = MusicExperiment(userid=userid, albumid=albumid)
    # use log_exposure() so that we can also record the price
    e.log_exposure({'price': get_price(albumid)})
    # use a default value with get() in production settings, in case
    # your experimentation system goes down
    if e.get('sort_by_rating', False):
        songs = "some sorted songs" # this would sort the songs by rating
    else:
        songs = "some non-sorted songs"
    html = "some HTML code involving %s" % songs  # most valid html ever.
    # render html

Logging outcomes

def handle_purchase(userid, albumid):
    'handles purchase of an album'
    e = MusicExperiment(userid=userid, albumid=albumid)
    e.log_event('purchase', {'price': get_price(albumid)})
    # start album download

Generative model of user decision making

def simulate_user_decision(userid, albumid):
    'simulate user experience'
    # This function should be thought of as simulating a user's decision-making
    # process for the given stimulus - and so we don't actually want to do any
    # logging here.
    e = MusicExperiment(userid=userid, albumid=albumid)
    e.set_auto_exposure_logging(False)  # turn off auto-logging
    # users with sorted songs have a higher purchase rate
    if e.get('sort_by_rating'):
        prob_purchase = 0.15
    else:
        prob_purchase = 0.10
    # make purchase with probability prob_purchase
    return random.random() < prob_purchase

Running the simulation

# Simulate 500 users each visiting 20 albums, and their decisions to purchase
for u in range(500):
    for a in range(20):
        render_webpage(u, a)
        if simulate_user_decision(u, a):
            handle_purchase(u, a)

Loading data into Python for analysis

Data is logged to MusicExperiment.log. Each line is a JSON-encoded dictionary that contains information about the event type, inputs, and parameter assignments.

with open('MusicExperiment.log') as f:
    raw_log_data = [json.loads(line) for line in f]

It's preferable to deal with the data as a flat set of columns. We use this handy-dandy function Eytan found on Stack Overflow to flatten nested dictionaries.

# stolen from http://stackoverflow.com/questions/23019119/converting-multilevel-nested-dictionaries-to-pandas-dataframe
from collections import OrderedDict
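def flatten(d):
    "Flatten a (possibly nested) dictionary into a single OrderedDict"
    result = OrderedDict()
    for k, v in d.items():
        if isinstance(v, dict):
            # merge the nested dictionary's keys into the top level
            result.update(flatten(v))
        else:
            result[k] = v
    return result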
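To make the effect of flattening concrete, here is a self-contained sketch applied to a single record. The function name flatten_record and the sample line are made up for illustration; the field layout (event, inputs, params, extra_data) is an assumption, and real log lines may carry additional fields.

```python
import json
from collections import OrderedDict


def flatten_record(d):
    "Recursively merge nested dictionaries into one flat OrderedDict."
    result = OrderedDict()
    for k, v in d.items():
        if isinstance(v, dict):
            result.update(flatten_record(v))
        else:
            result[k] = v
    return result


# Made-up log line for illustration only; not taken from a real log file.
sample_line = ('{"event": "purchase", '
               '"inputs": {"userid": 3, "albumid": 7}, '
               '"params": {"sort_by_rating": 1}, '
               '"extra_data": {"price": 11.99}}')

flat = flatten_record(json.loads(sample_line))
```

After flattening, the nested keys (userid, albumid, sort_by_rating, price) sit next to event at the top level, which is the shape a data frame expects. One caveat of this approach is that identically named keys in different nested dictionaries would overwrite each other.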

Here is what the flattened dataframe looks like:

log_data = pd.DataFrame([flatten(i) for i in raw_log_data])

Joining exposure data with event data

We first extract all user-album pairs that were exposed to an experimental treatment, along with their parameter assignments.

all_exposures = log_data[log_data.event=='exposure']
unique_exposures = all_exposures[['userid','albumid','sort_by_rating']].drop_duplicates()

Tabulating the assignments of the user-album pairs, we find that the assignment probabilities correspond to the design of MusicExperiment, in which sort_by_rating is turned on with probability 0.2.
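A tabulation along these lines could be sketched with the standard library as follows. The rows below are made-up stand-ins for unique_exposures, given as (userid, albumid, sort_by_rating) tuples; they are not output from the simulation.

```python
from collections import Counter

# Hypothetical stand-in for unique_exposures: (userid, albumid, sort_by_rating)
rows = [
    (0, 0, 0), (0, 1, 1), (0, 2, 0), (0, 3, 0), (0, 4, 0),
    (1, 0, 0), (1, 1, 0), (1, 2, 1), (1, 3, 0), (1, 4, 0),
]

# Count how many user-album pairs received each assignment
counts = Counter(assignment for _, _, assignment in rows)
total = sum(counts.values())

# Share of pairs per assignment, to compare against the designed probability
shares = {value: n / float(total) for value, n in counts.items()}
```

With the actual data frame, a similar count can be obtained with unique_exposures.groupby('sort_by_rating').size().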

Now we can merge with the conversion data.

conversions = log_data[log_data.event=='purchase'][['userid', 'albumid','price']]
df = pd.merge(unique_exposures, conversions, on=['userid', 'albumid'], how='left')
df['purchased'] = df.price.notnull()
df['revenue'] = df.purchased * df.price.fillna(0)

Here is a sample of the merged rows. Most rows contain missing values for price, because the user didn't purchase the item.

Restricted to those who bought something...

df[df.price > 0][:5]

Analyzing the experimental results

df.groupby('sort_by_rating')[['purchased', 'price', 'revenue']].agg('mean')

If you were actually analyzing the experiment, you would also want to compute confidence intervals.
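As one possible starting point, here is a minimal sketch of a normal-approximation confidence interval for the difference in purchase rates between the two conditions. The function name and the counts are hypothetical, and the sketch assumes that user-album pairs are independent observations; since each simulated user views many albums, that assumption may be too optimistic for real data.

```python
import math


def diff_in_proportions_ci(x_treat, n_treat, x_ctrl, n_ctrl, z=1.96):
    """Approximate CI for (treatment rate - control rate).

    Uses the normal approximation with an unpooled standard error.
    z=1.96 corresponds to a roughly 95% interval.
    """
    p_t = x_treat / float(n_treat)
    p_c = x_ctrl / float(n_ctrl)
    se = math.sqrt(p_t * (1 - p_t) / n_treat + p_c * (1 - p_c) / n_ctrl)
    diff = p_t - p_c
    return diff - z * se, diff + z * se


# Made-up counts for illustration; these are not results from the simulation.
low, high = diff_in_proportions_ci(x_treat=300, n_treat=2000,
                                   x_ctrl=800, n_ctrl=8000)
```

If the interval excludes zero, the difference in purchase rates is distinguishable from noise under these assumptions. Accounting for repeated observations per user would call for a different method, such as resampling at the user level.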