Statistics of pod check-ins

Kyle Willett

17 Jul 2016

Jimmy noted last week that he and Noah got slotted into the same pod for daily check-ins three days in a row. That made me wonder how likely that was, and I thought it'd be a good stats problem.

I tried tackling this using an analytical/combinatorial solution and didn't get to a reasonable answer (even with help from Amelia and Lois). So I decided to Monte Carlo it and see what the result was.

Question: what is the probability of *any* two people being in the same pod check-in at least three days in a row?

My (very naive) prediction: I wasn't sure at all how likely this situation would be, other than the fact that it happened once (with what Katie and Jen assured us were random selections. So I'll guess that the odds are ~50%.


In [38]:
import numpy as np

In [39]:
# Parameters for this problem

N = 1000                # Number of trials to run

n_people = 15           # Number of individual people to schedule
n_groups = 5            # Number of groups per day
n_days = 5              # Number of days with scheduled groups
n_inarow = 3            # Number of days in a row in which two people the same pod (target)

In [40]:
# Set up the initial schedule
people = range(n_people)
modval = n_people//n_groups
sched = np.mgrid[0:n_days,0:n_people][1,:,:]

In [41]:
# A couple of useful print functions for later on

def breakline():

    print "------------------------------------------" 

def print_example(s):
    breakline()
    header = ["Day {:d}\t".format(x+1) for x in range(n_days)]
    print ' '.join(header)
    nrows = s.shape[0]
    for idx,row in enumerate(s.T):
        if not idx % modval:
            breakline()
        templine = "{}\t"*nrows
        print templine.format(*row)
    breakline()

In [42]:
# The actual Monte Carlo loop

def run_trials(N,verbose=False):

    foundone = 0
    successful_example,params = None,None

    for i in range(N):
    
        # Randomly shuffle the groups in place for each day
        map(np.random.shuffle,sched)
    
        # Set if successful match was found in this trial
        daysinarow = False
    
        # Keep looking through possible pairs until one is found
        keep_looking = True
        while keep_looking:
            for person1 in people:
    
                # Groups for Person #1
                groups_p1 = [list(row).index(person1)//modval for row in sched]
    
                for person2 in people:
                    # Can't compare to oneself
                    if person1 != person2:
                        # Groups for Person #2
                        groups_p2 = [list(row).index(person2)//modval for row in sched]
    
                        # Look over each sliding window of N days for a match
                        for j in range(n_days - n_inarow + 1):
                            sumarr = [x-y for x,y in zip(groups_p1,groups_p2)]
    
                            # Check if conditions match
                            if all(s is 0 for s in sumarr[j:j+n_inarow]):
                                daysinarow = True
                                successful_example = sched[::]
                                params = (person1,person2,j+1,j+n_inarow)
    
                                if verbose:
                                    print "\nPersons {} and {} on Days {}-{}".format(
                                        person1,person2,j+1,j+n_inarow)
                                    print_example(sched)
    
                                # Found a match; can stop looking in this trial
                                keep_looking = False
    
            # No pairs on consecutive days found in this trial; exit the loop.
            keep_looking = False
    
        # Mark that this trial was successful
        if daysinarow:
            foundone += 1

    return foundone,successful_example,params

In [43]:
# Run trials and report result

successes,example,params = run_trials(N,verbose=False)

if example is not None:
    print "Example of a successful trial:"
    print "\nPersons {} and {} are on the same pod on Days {}-{}".format(*params)
    print_example(example)

print "\n{:.1f}% of the time, two people are \
in the same group at least {} days in a row.\n".format(
    successes /float(N)*100.,n_inarow)


Example of a successful trial:

Persons 13 and 0 are on the same pod on Days 3-5
------------------------------------------
Day 1	 Day 2	 Day 3	 Day 4	 Day 5	
------------------------------------------
2	9	13	0	13	
1	8	6	6	7	
7	5	7	14	14	
------------------------------------------
12	12	0	10	12	
11	10	11	3	5	
6	6	4	1	1	
------------------------------------------
0	0	3	5	6	
8	11	8	12	10	
10	2	5	11	11	
------------------------------------------
5	7	9	13	3	
4	4	1	8	2	
14	3	12	9	8	
------------------------------------------
9	13	10	2	0	
3	14	2	7	9	
13	1	14	4	4	
------------------------------------------

56.6% of the time, two people are in the same group at least 3 days in a row.

That is shockingly close to my naive prediction. I'd still really love to get a proper mathematical justification of this result, but it's very much in line with the observed sample size of 1 success in 1 trial.


In [ ]: