Lesson 04 - Designing an experiment

Choosing population

  • can be based on logins (user id), cookies, or random events
  • all of the above are forms of diversion, i.e. how we decide which traffic goes into the experiment group vs. the control group
  • Unit of diversion: defines what a subject is in your experiment

    • User id
      • stable, unchanging
      • personally identifiable
    • Anonymous id (cookie)
      • changes when person changes device/browser
      • cookies can be cleared
    • Event
      • No consistent experience
    • Device ID
      • only available on mobile devices
      • unchangeable by user
      • personally identifiable
    • IP address
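
One common way to implement diversion is to hash the unit of diversion into a bucket, so the same id always lands in the same group. The sketch below assumes that approach; it is not from the lesson, and the function name `assign_group`, the experiment name, and the 50/50 split are illustrative choices.

```python
import hashlib


def assign_group(unit_id: str, experiment: str, treatment_fraction: float = 0.5) -> str:
    """Deterministically map a unit of diversion to 'experiment' or 'control'.

    unit_id can be a user id, cookie id, device id, or event id, depending on
    the chosen unit of diversion (hypothetical helper, for illustration only).
    """
    # Salt with the experiment name so different experiments split independently.
    key = f"{experiment}:{unit_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Turn the first 8 bytes into a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "experiment" if bucket < treatment_fraction else "control"


# The same cookie always gets the same group, which gives a consistent experience.
print(assign_group("cookie-123", "new-button"))
print(assign_group("cookie-123", "new-button"))
```

With event-based diversion the id passed in would be unique per event, so the same person could see both variants across events.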

Considerations

  • Consistency
  • Ethics
  • Variability
    • empirical variability can be larger than analytical variability when the unit of analysis differs from the unit of diversion.
    • the unit of analysis is whatever goes in the denominator of the metric, e.g. for click-through rate (CTR) = clicks / page views, the unit of analysis is a page view. If cookies are the unit of diversion, the two units differ.
    • the reason for the difference: with event-based diversion every page view is assigned independently at random, but with cookie-based diversion all page views from one cookie are diverted together, so the data points are correlated. That increases empirical variability, while the analytical estimate assumes independent data points.
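
The effect can be illustrated with a small simulation. This is a sketch under made-up assumptions, not data from the lesson: each cookie gets its own click propensity drawn from a Beta(1, 9) distribution, every cookie generates 10 page views, and a group contains 200 cookies. The analytical standard error treats every page view as independent; the empirical one is the spread of the metric across many simulated groups.

```python
import math
import random
import statistics


def simulate_group_ctr(rng, n_cookies, views_per_cookie):
    """Clicks / page views for one group under cookie-based diversion."""
    clicks = 0
    views = 0
    for _ in range(n_cookies):
        # Assumption: each cookie has its own click propensity (mean 0.1).
        p = rng.betavariate(1, 9)
        for _ in range(views_per_cookie):
            if rng.random() < p:
                clicks += 1
            views += 1
    return clicks / views


rng = random.Random(0)
n_cookies, views_per_cookie = 200, 10
ctrs = [simulate_group_ctr(rng, n_cookies, views_per_cookie) for _ in range(300)]

p_hat = statistics.mean(ctrs)
n_views = n_cookies * views_per_cookie
# Analytical SE assumes page views are independent binomial trials.
analytical_se = math.sqrt(p_hat * (1 - p_hat) / n_views)
# Empirical SE is measured directly from repeated groups.
empirical_se = statistics.stdev(ctrs)

print(f"analytical SE: {analytical_se:.4f}")
print(f"empirical SE:  {empirical_se:.4f}")
```

Because page views from the same cookie share a click propensity, the empirical value comes out larger than the analytical one in this setup.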

More details about the graph are in this paper http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/36500.pdf

We usually want the unit of diversion to be at least as large as the unit of analysis. E.g. a cookie is larger than a page view, since a single cookie can generate multiple page views. If the unit of diversion were smaller than the unit of analysis, a single unit of analysis could end up in both control and experiment.

Inter- vs. Intra-User Experiments

  • Anything other than event-driven diversion is a proxy for users
  • If using intra-user experiments (same users in control and experiment), make sure the two exposures are comparable, e.g. not control before Christmas and experiment after Christmas, since behaviour would differ for reasons unrelated to the change
  • More on interleaved experiments: http://www.cs.cornell.edu/People/tj/publications/chapelle_etal_12a.pdf
  • Internet experiments are mainly inter-user experiments

Target Population

  • can be based on age, gender, etc., depending on whether you have access to that information
  • when launching something we may want to restrict it to a geographic location for cultural, language, or legal reasons
  • sometimes targeting is not necessary, e.g. when the change applies to the global population
  • when targeting, it is necessary to talk to the engineers who implement it
  • for a big change, it may be a good idea to run a global experiment before the actual launch, to ensure the change does not have a bad effect on populations that were not targeted

Population vs. Cohort

  • A cohort is the set of people who enter the experiment at the same time. It can also be defined by other criteria, e.g. people who use both mobile & desktop
  • we use a cohort when we want to measure a change in behaviour relative to the users' own history

When to use a cohort instead of population

  • looking for learning effects
  • examining user retention
  • want to increase user activity
  • anything requiring user to be established

Duration & Sizing

Metric choice, unit of diversion, and population all affect variability, and variability determines how large the experiment needs to be. E.g. in the latency example we want to see whether latency affects our users. For a global population that may require a fair amount of data. Given the variability we can size the experiment and then check whether an experiment of that size is really practical.
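
As a rough sketch of how variability translates into size, the snippet below uses the standard two-sample z-test approximation with equal group sizes. The significance level of 0.05 and power of 0.8 are conventional defaults, not values from these notes, and `sample_size_per_group` is a hypothetical helper. When the unit of diversion differs from the unit of analysis, `sd` should be the empirical standard deviation rather than the analytical one.

```python
import math
from statistics import NormalDist


def sample_size_per_group(sd, min_detectable_effect, alpha=0.05, power=0.8):
    """Approximate units needed per group to detect a difference in means.

    sd: standard deviation of the metric per unit (empirical if available).
    min_detectable_effect: smallest absolute change we care about detecting.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / min_detectable_effect ** 2
    return math.ceil(n)


# Higher variability or a smaller effect both increase the required size.
print(sample_size_per_group(sd=1.0, min_detectable_effect=0.1))
print(sample_size_per_group(sd=2.0, min_detectable_effect=0.1))
```

Doubling the standard deviation roughly quadruples the required size, which is why the choice of metric and unit of diversion matters so much for sizing.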

Duration vs Exposure

  • when should the experiment run? e.g. during holidays, on working days, etc.
  • how long the experiment runs depends on how much of the total traffic you are willing to divert
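
The trade-off between duration and exposure is simple arithmetic. A minimal sketch, assuming traffic is constant from day to day and using made-up numbers; `days_needed` is a hypothetical helper:

```python
import math


def days_needed(units_per_group, n_groups, daily_units, traffic_fraction):
    """Days required to collect enough units at a given exposure.

    traffic_fraction: share of total daily traffic diverted into the
    experiment (all groups combined), between 0 (exclusive) and 1.
    """
    if not 0 < traffic_fraction <= 1:
        raise ValueError("traffic_fraction must be in (0, 1]")
    total_units = units_per_group * n_groups
    return math.ceil(total_units / (daily_units * traffic_fraction))


# Halving the exposure doubles the duration.
print(days_needed(100_000, 2, 40_000, 0.5))   # 10
print(days_needed(100_000, 2, 40_000, 0.25))  # 20
```

In practice traffic varies by weekday and season, so this gives only a lower bound to start from.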

Learning effects

  • people may hate or love the change at first. Over time their behaviour adapts and reaches a plateau. This is the learning effect
  • to measure a learning effect
    • it needs some time
    • the same users need to keep seeing the change, so the unit of diversion has to be chosen accordingly (e.g. user id or cookie)