Chair Selection for a Conference

Some very simple code to help automate the chair selection for the AAS meeting.

This assumes that the information about all potential chairs is compiled in a single spreadsheet with columns for each person's name and, more importantly, their areas of expertise and the date and time of the session in which they give a talk, as in the example in the data folder.

Requirements

python 2.7 or 3.5
numpy
pandas



In [1]:

    
import numpy as np
import pandas as pd

Let's make some example data. This also requires the names package, so if you don't have it, or don't want to install it, skip the following section and go to Selecting a Chair for a Session below.

We're going to generate 100 names and then define some areas of expertise as well as some sessions for each of them.



In [2]:

    
import names



In [3]:

    
## range works in both Python 2.7 and 3.5 (xrange exists only in Python 2)
last_names = np.array([names.get_last_name() for i in range(100)])
first_names = np.array([names.get_first_name() for i in range(100)])



In [4]:

    
## Set up areas of expertise for all of them
expertises = ["Stars", "Galaxies", "Black Holes", "Neutron Stars", 
             "Planets", "Cosmology", "Gas and Dust", "Aliens", "Education"]

area_of_expertise_1 = np.random.choice(expertises, size=last_names.shape[0])
area_of_expertise_2 = np.random.choice(expertises, size=last_names.shape[0])
area_of_expertise_3 = np.random.choice(expertises, size=last_names.shape[0])



In [5]:

    
## set up session dates and times
dates = ["10/04/2015", "10/05/2015", "10/06/2015"]
session_times = ["9:00:00 AM,10:30:00 PM",
                 "11:00:00 AM,12:30:00 PM",
                 "2:00:00 PM,3:30:00 PM"]

session_date = np.random.choice(dates, size=last_names.shape[0])
session_times = np.random.choice(session_times, size=last_names.shape[0])



In [6]:

    
session_start = [s.split(",")[0] for s in session_times]
session_end =  [s.split(",")[1] for s in session_times]

Now we can set up a DataFrame:



In [7]:

    
df_dict = {"last_name":last_names,
          "first_name":first_names,
          "area_of_expertise_1":area_of_expertise_1,
          "area_of_expertise_2":area_of_expertise_2,
          "area_of_expertise_3":area_of_expertise_3,
          "session_date":session_date,
          "session_start":session_start,
          "session_end": session_end
          }

df = pd.DataFrame(df_dict)



In [8]:

    
df.head()









    Out[8]:

    area_of_expertise_1  area_of_expertise_2  area_of_expertise_3  first_name  last_name   session_date  session_end  session_start
0   Black Holes          Stars                Aliens               Ramona      Walshe      10/06/2015    12:30:00 PM  11:00:00 AM
1   Planets              Aliens               Cosmology            Toshiko     Ungar       10/05/2015    12:30:00 PM  11:00:00 AM
2   Stars                Aliens               Aliens               James       Soria       10/05/2015    10:30:00 PM  9:00:00 AM
3   Stars                Cosmology            Planets              Tomas       Wesson      10/06/2015    10:30:00 PM  9:00:00 AM
4   Neutron Stars        Gas and Dust         Cosmology            Jeremy      Laday       10/06/2015    12:30:00 PM  11:00:00 AM

Let's dump this into a .csv file for future use:



In [11]:

    
df.to_csv("../data/example_chairs.csv",index=False,index_label="Index")

Selecting a Chair for a Session

Imagine you have a session on black holes and neutron stars, which will take place on 10/05/2015 from 2:00:00 PM to 3:30:00 PM.

You'd like to pick a person from your sample of possible chairs who has the right expertise, but also does not give a talk in that session.

Let's first load some sample data. As shown above, that data was randomly generated, so all names are fictitious. Any similarity to living or dead persons is entirely coincidental.



In [12]:

    
df = pd.read_csv("../data/example_chairs.csv")

What does this table look like?



In [13]:

    
df.head()









    Out[13]:

    area_of_expertise_1  area_of_expertise_2  area_of_expertise_3  first_name  last_name   session_date  session_end  session_start
0   Black Holes          Stars                Aliens               Ramona      Walshe      10/06/2015    12:30:00 PM  11:00:00 AM
1   Planets              Aliens               Cosmology            Toshiko     Ungar       10/05/2015    12:30:00 PM  11:00:00 AM
2   Stars                Aliens               Aliens               James       Soria       10/05/2015    10:30:00 PM  9:00:00 AM
3   Stars                Cosmology            Planets              Tomas       Wesson      10/06/2015    10:30:00 PM  9:00:00 AM
4   Neutron Stars        Gas and Dust         Cosmology            Jeremy      Laday       10/06/2015    12:30:00 PM  11:00:00 AM

Your table might have more columns, but it needs to have at least the ones above. You can easily export it from, for example, Excel into a .csv file, which Pandas can read.
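Before running any queries on your own table, it can help to confirm that these columns are really there. Here is a minimal sketch of such a check; the helper name missing_columns and the tiny in-memory table are made up for illustration and are not part of the original code.

```python
import pandas as pd

# Columns the queries in this notebook rely on.
REQUIRED_COLUMNS = [
    "last_name", "first_name",
    "area_of_expertise_1", "area_of_expertise_2", "area_of_expertise_3",
    "session_date", "session_start", "session_end",
]


def missing_columns(table, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `table`."""
    return [col for col in required if col not in table.columns]


# Illustrative table (hypothetical data) that lacks the session columns.
example = pd.DataFrame({
    "last_name": ["Doe"],
    "first_name": ["Jane"],
    "area_of_expertise_1": ["Stars"],
    "area_of_expertise_2": ["Planets"],
    "area_of_expertise_3": ["Cosmology"],
})

# An empty list would mean the table is ready for the queries.
print(missing_columns(example))
```

If the list that comes back is not empty, rename or add the missing columns in your spreadsheet before exporting it again.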

Two important things to note:

Your table might have NaN values (e.g. if a person does not give a talk at all). That's fine.
Apple's Numbers and Microsoft's Excel count rows from 1, and also count the header row. Pandas DataFrames count from 0 and do not count the header, which means that the number in the index column of the DataFrame is two lower than the row number shown in your spreadsheet for any given person. We will fix this by adding 2 to the index column, but if your software does something different, you might want to change that!



In [14]:

    
df.index += 2



In [15]:

    
df.head()









    Out[15]:

    area_of_expertise_1  area_of_expertise_2  area_of_expertise_3  first_name  last_name   session_date  session_end  session_start
2   Black Holes          Stars                Aliens               Ramona      Walshe      10/06/2015    12:30:00 PM  11:00:00 AM
3   Planets              Aliens               Cosmology            Toshiko     Ungar       10/05/2015    12:30:00 PM  11:00:00 AM
4   Stars                Aliens               Aliens               James       Soria       10/05/2015    10:30:00 PM  9:00:00 AM
5   Stars                Cosmology            Planets              Tomas       Wesson      10/06/2015    10:30:00 PM  9:00:00 AM
6   Neutron Stars        Gas and Dust         Cosmology            Jeremy      Laday       10/06/2015    12:30:00 PM  11:00:00 AM

Okay, so now we have the data!

Let's pull out some columns to make our queries easier to read:



In [16]:

    
area1 = df["area_of_expertise_1"]
area2 = df["area_of_expertise_2"]
area3 = df["area_of_expertise_3"]

session_date = df["session_date"]
session_start = df["session_start"]
session_end = df["session_end"]

First task: let's pick a person whose first area of expertise is black holes:



In [17]:

    
## Candidates for the black hole session on 10/05/2015, 2:00:00 PM to 3:30:00 PM
black_holes = df[(area1 == "Black Holes") &
                 (((session_date == "10/05/2015") &
                   (session_start != "2:00:00 PM") &
                   (session_end != "3:30:00 PM")) |
                  (session_date != "10/05/2015"))]

The query above basically says the following:

Pick all participants whose value for "area1" is "Black Holes", and who either do not give a talk on "10/05/2015" at all, or who give a talk on "10/05/2015" that neither starts at "2:00:00 PM" nor ends at "3:30:00 PM".

Note: Make sure that the strings you use in your query match exactly the entries in your table!
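One way to check the spelling is to list the distinct values of a column before writing the query. The sketch below uses a made-up table in which a trailing space and a lowercase entry are planted on purpose; it is only meant to show how such near-duplicates slip past an exact comparison.

```python
import pandas as pd

# Made-up entries, including a trailing space and inconsistent capitalisation.
example = pd.DataFrame({
    "area_of_expertise_1": ["Black Holes", "Black Holes ", "black holes", "Stars"],
})

# Distinct values exactly as stored; near-duplicates show up here.
print(example["area_of_expertise_1"].unique())

# Count rows that match the exact string used in the query.
exact_matches = (example["area_of_expertise_1"] == "Black Holes").sum()
print(exact_matches)

# Stripping surrounding whitespace first recovers the entry with the stray space.
stripped = example["area_of_expertise_1"].str.strip()
stripped_matches = (stripped == "Black Holes").sum()
print(stripped_matches)
```

Whether you clean up the spreadsheet or normalise the strings in pandas is a matter of taste; the point is only that the comparison is exact.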

Here's the resulting list:



In [18]:

    
black_holes









    Out[18]:

    area_of_expertise_1  area_of_expertise_2  area_of_expertise_3  first_name  last_name   session_date  session_end  session_start
2   Black Holes          Stars                Aliens               Ramona      Walshe      10/06/2015    12:30:00 PM  11:00:00 AM
11  Black Holes          Galaxies             Planets              David       Bertholf    10/06/2015    12:30:00 PM  11:00:00 AM
15  Black Holes          Gas and Dust         Black Holes          Charles     Boles       10/04/2015    3:30:00 PM   2:00:00 PM
18  Black Holes          Aliens               Cosmology            Ethel       Villalobos  10/04/2015    12:30:00 PM  11:00:00 AM
20  Black Holes          Gas and Dust         Aliens               Adela       Guevara     10/06/2015    12:30:00 PM  11:00:00 AM
28  Black Holes          Gas and Dust         Education            Paul        Robles      10/04/2015    10:30:00 PM  9:00:00 AM
29  Black Holes          Neutron Stars        Gas and Dust         Kevin       Violette    10/06/2015    10:30:00 PM  9:00:00 AM
35  Black Holes          Aliens               Stars                Jessica     Bucknell    10/04/2015    12:30:00 PM  11:00:00 AM
40  Black Holes          Cosmology            Planets              Jeanne      Gomez       10/04/2015    10:30:00 PM  9:00:00 AM
41  Black Holes          Cosmology            Black Holes          Valerie     Reeves      10/06/2015    3:30:00 PM   2:00:00 PM
45  Black Holes          Galaxies             Stars                Tomas       Williams    10/05/2015    12:30:00 PM  11:00:00 AM
62  Black Holes          Stars                Education            Lori        Poole       10/04/2015    12:30:00 PM  11:00:00 AM
63  Black Holes          Planets              Gas and Dust         Corey       Gaines      10/06/2015    12:30:00 PM  11:00:00 AM
94  Black Holes          Aliens               Cosmology            Elizabeth   Kin         10/06/2015    12:30:00 PM  11:00:00 AM

If you want a random sample from that list, here is your solution:



In [19]:

    
black_holes.loc[np.random.choice(np.array(black_holes.index))]









    Out[19]:





area_of_expertise_1    Black Holes
area_of_expertise_2      Cosmology
area_of_expertise_3    Black Holes
first_name                 Valerie
last_name                   Reeves
session_date            10/06/2015
session_end             3:30:00 PM
session_start           2:00:00 PM
Name: 41, dtype: object
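Pandas also offers DataFrame.sample, which draws rows directly and returns them as a DataFrame rather than a single Series. Here is a minimal sketch on a made-up candidate table; random_state is fixed only to make the draw repeatable, and you would normally leave it out.

```python
import pandas as pd

# Hypothetical candidate list with spreadsheet-style index labels.
candidates = pd.DataFrame(
    {"first_name": ["Ann", "Bob", "Cleo"],
     "last_name": ["Ames", "Burke", "Cho"]},
    index=[2, 11, 15],
)

# Draw one row at random; the original index label is preserved.
pick = candidates.sample(n=1, random_state=0)
print(pick)
```

Passing a larger n gives you several distinct candidates at once, which is handy if your first choice declines.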

We can make our query more complex. For example, because our session is about both neutron stars and black holes, we might be happy with anyone whose expertise is either in neutron stars or black holes:



In [20]:

    
## Same session, but the first area of expertise may be black holes or neutron stars
black_holes = df[((area1 == "Black Holes") | (area1 == "Neutron Stars")) &
                 (((session_date == "10/05/2015") &
                   (session_start != "2:00:00 PM") &
                   (session_end != "3:30:00 PM")) |
                  (session_date != "10/05/2015"))]



In [21]:

    
black_holes









    Out[21]:

    area_of_expertise_1  area_of_expertise_2  area_of_expertise_3  first_name  last_name   session_date  session_end  session_start
2   Black Holes          Stars                Aliens               Ramona      Walshe      10/06/2015    12:30:00 PM  11:00:00 AM
6   Neutron Stars        Gas and Dust         Cosmology            Jeremy      Laday       10/06/2015    12:30:00 PM  11:00:00 AM
7   Neutron Stars        Education            Planets              Mark        Skimehorn   10/05/2015    10:30:00 PM  9:00:00 AM
11  Black Holes          Galaxies             Planets              David       Bertholf    10/06/2015    12:30:00 PM  11:00:00 AM
12  Neutron Stars        Planets              Planets              Ann         Boyd        10/06/2015    3:30:00 PM   2:00:00 PM
15  Black Holes          Gas and Dust         Black Holes          Charles     Boles       10/04/2015    3:30:00 PM   2:00:00 PM
17  Neutron Stars        Cosmology            Aliens               David       Johnston    10/05/2015    12:30:00 PM  11:00:00 AM
18  Black Holes          Aliens               Cosmology            Ethel       Villalobos  10/04/2015    12:30:00 PM  11:00:00 AM
20  Black Holes          Gas and Dust         Aliens               Adela       Guevara     10/06/2015    12:30:00 PM  11:00:00 AM
25  Neutron Stars        Cosmology            Cosmology            Cassandra   Hubertus    10/04/2015    12:30:00 PM  11:00:00 AM
28  Black Holes          Gas and Dust         Education            Paul        Robles      10/04/2015    10:30:00 PM  9:00:00 AM
29  Black Holes          Neutron Stars        Gas and Dust         Kevin       Violette    10/06/2015    10:30:00 PM  9:00:00 AM
35  Black Holes          Aliens               Stars                Jessica     Bucknell    10/04/2015    12:30:00 PM  11:00:00 AM
38  Neutron Stars        Education            Cosmology            Carman      Tucker      10/05/2015    10:30:00 PM  9:00:00 AM
40  Black Holes          Cosmology            Planets              Jeanne      Gomez       10/04/2015    10:30:00 PM  9:00:00 AM
41  Black Holes          Cosmology            Black Holes          Valerie     Reeves      10/06/2015    3:30:00 PM   2:00:00 PM
45  Black Holes          Galaxies             Stars                Tomas       Williams    10/05/2015    12:30:00 PM  11:00:00 AM
46  Neutron Stars        Planets              Galaxies             Daniel      Kennison    10/06/2015    10:30:00 PM  9:00:00 AM
62  Black Holes          Stars                Education            Lori        Poole       10/04/2015    12:30:00 PM  11:00:00 AM
63  Black Holes          Planets              Gas and Dust         Corey       Gaines      10/06/2015    12:30:00 PM  11:00:00 AM
67  Neutron Stars        Neutron Stars        Gas and Dust         Joseph      Cate        10/06/2015    12:30:00 PM  11:00:00 AM
70  Neutron Stars        Neutron Stars        Black Holes          Patrick     Scurry      10/06/2015    3:30:00 PM   2:00:00 PM
90  Neutron Stars        Galaxies             Cosmology            George      Rodriquez   10/04/2015    10:30:00 PM  9:00:00 AM
94  Black Holes          Aliens               Cosmology            Elizabeth   Kin         10/06/2015    12:30:00 PM  11:00:00 AM

That's a much larger list! We might also want to narrow it down to people whose first two areas of expertise both fit the session. That means that their first area of expertise needs to be either black holes or neutron stars, and their second area of expertise also needs to be one of the two:



In [22]:

    
## Same session, but both the first and second area of expertise must fit
black_holes = df[((area1 == "Black Holes") | (area1 == "Neutron Stars")) &
                 ((area2 == "Black Holes") | (area2 == "Neutron Stars")) &
                 (((session_date == "10/05/2015") &
                   (session_start != "2:00:00 PM") &
                   (session_end != "3:30:00 PM")) |
                  (session_date != "10/05/2015"))]



In [23]:

    
black_holes









    Out[23]:

    area_of_expertise_1  area_of_expertise_2  area_of_expertise_3  first_name  last_name   session_date  session_end  session_start
29  Black Holes          Neutron Stars        Gas and Dust         Kevin       Violette    10/06/2015    10:30:00 PM  9:00:00 AM
67  Neutron Stars        Neutron Stars        Gas and Dust         Joseph      Cate        10/06/2015    12:30:00 PM  11:00:00 AM
70  Neutron Stars        Neutron Stars        Black Holes          Patrick     Scurry      10/06/2015    3:30:00 PM   2:00:00 PM

And that's about as difficult as our queries get!
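If you find yourself retyping these queries, they can be wrapped in a small function. The sketch below is one possible way to do that, not a definitive implementation: the name select_chairs, its signature, and the example table are assumptions made for illustration. It uses Series.isin for the expertise lists and keeps the same availability rule as the queries above (a candidate is excluded only if their talk is on the session date and starts or ends at the session's start or end time). Candidates with NaN session entries are kept, since NaN never equals the session date.

```python
import numpy as np
import pandas as pd


def select_chairs(table, areas, date, start, end, areas2=None):
    """Return rows of `table` with matching expertise and no clashing talk.

    `areas` lists acceptable first areas of expertise; `areas2`, if given,
    lists acceptable second areas. All comparisons are exact string matches.
    """
    expertise = table["area_of_expertise_1"].isin(areas)
    if areas2 is not None:
        expertise = expertise & table["area_of_expertise_2"].isin(areas2)

    # Busy: talk on the session date that shares the start or end time.
    busy = ((table["session_date"] == date) &
            ((table["session_start"] == start) |
             (table["session_end"] == end)))
    return table[expertise & ~busy]


# Hypothetical candidates; the third one gives no talk at all (NaN entries).
example = pd.DataFrame({
    "last_name": ["Ames", "Burke", "Cho", "Diaz"],
    "area_of_expertise_1": ["Black Holes", "Neutron Stars", "Black Holes", "Stars"],
    "area_of_expertise_2": ["Stars", "Black Holes", "Planets", "Black Holes"],
    "session_date": ["10/05/2015", "10/06/2015", np.nan, "10/04/2015"],
    "session_start": ["2:00:00 PM", "9:00:00 AM", np.nan, "2:00:00 PM"],
    "session_end": ["3:30:00 PM", "10:30:00 AM", np.nan, "3:30:00 PM"],
})

free = select_chairs(example, ["Black Holes", "Neutron Stars"],
                     "10/05/2015", "2:00:00 PM", "3:30:00 PM")
print(free["last_name"].tolist())

both = select_chairs(example, ["Black Holes", "Neutron Stars"],
                     "10/05/2015", "2:00:00 PM", "3:30:00 PM",
                     areas2=["Black Holes", "Neutron Stars"])
print(both["last_name"].tolist())
```

Note that this still compares times as plain strings, exactly like the queries above, so it only catches talks in the very same time slot and not talks that merely overlap with it.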

I've put this entire workflow into a little script (chair_selection.py) that should be easy to use. Make sure your csv file with the candidates is in the same folder as the script.

Some sample queries:

See the help message for the script: $> python chair_selection.py --help

All candidates whose primary expertise is "Black Holes" and who don't give a talk on Oct 5 2015 between 2:00 and 3:30 PM:

$> python chair_selection.py -f "../data/example_chairs.csv" -a "Black Holes" -d "10/05/2015" -s "2:00:00 PM" -e "3:30:00 PM" -m "all"

One random candidate whose primary expertise is "Black Holes" and who doesn't give a talk on Oct 5 2015 between 2:00 and 3:30 PM:

$> python chair_selection.py -f "../data/example_chairs.csv" -a "Black Holes" -d "10/05/2015" -s "2:00:00 PM" -e "3:30:00 PM" -m "random"

All candidates whose primary expertise is either "Black Holes" or "Neutron Stars" and who don't give a talk on Oct 5 2015 between 2:00 and 3:30 PM:

$> python chair_selection.py -f "../data/example_chairs.csv" -a "Black Holes" "Neutron Stars" -d "10/05/2015" -s "2:00:00 PM" -e "3:30:00 PM" -m "all"

All candidates whose first expertise is either "Black Holes" or "Neutron Stars", whose second expertise is also either "Black Holes" or "Neutron Stars", and who don't give a talk on Oct 5 2015 between 2:00 and 3:30 PM:

$> python chair_selection.py -f "../data/example_chairs.csv" -a "Black Holes" "Neutron Stars" -d "10/05/2015" -s "2:00:00 PM" -e "3:30:00 PM" -m "all" --area2 "Black Holes" "Neutron Stars"
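If you are curious how flags like these can be parsed, here is a hedged sketch using argparse. It is not the source of chair_selection.py: only the flags themselves (-f, -a, -d, -s, -e, -m, --area2) and the modes "all" and "random" come from the examples above, while the dest names, the required settings, and the default mode are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that accepts the flags used in the sample queries."""
    parser = argparse.ArgumentParser(
        description="Pick session chairs (illustrative sketch only).")
    # dest names below are hypothetical; they only label the parsed values.
    parser.add_argument("-f", dest="filename", required=True,
                        help="csv file with the candidates")
    parser.add_argument("-a", dest="area1", nargs="+", required=True,
                        help="acceptable first areas of expertise")
    parser.add_argument("--area2", dest="area2", nargs="+", default=None,
                        help="acceptable second areas of expertise")
    parser.add_argument("-d", dest="date", required=True, help="session date")
    parser.add_argument("-s", dest="start", required=True, help="session start")
    parser.add_argument("-e", dest="end", required=True, help="session end")
    parser.add_argument("-m", dest="mode", choices=["all", "random"],
                        default="all", help="list everyone or pick one")
    return parser


# Parse the arguments of the third sample query from a list instead of sys.argv.
args = build_parser().parse_args([
    "-f", "../data/example_chairs.csv",
    "-a", "Black Holes", "Neutron Stars",
    "-d", "10/05/2015", "-s", "2:00:00 PM", "-e", "3:30:00 PM",
    "-m", "all",
])
print(args.area1, args.mode)
```

Using nargs="+" is what lets a flag such as -a take several quoted areas in a row.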

Good luck with selecting your chairs! :)
