Scatter Plots of Rank, OPR, and Obstacle Success



In [3]:

    
import matplotlib
import matplotlib.pyplot as plot
import pandas as pd
import numpy as np
%matplotlib inline

A quick test of how the arguments to plot.scatter work and what types they take:



In [16]:

    
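a = [3, 4, 5, 6, 7, 15]
b = [4, 5, 6, 7, 7, 3]
# s is the marker area in points^2 and needs one entry per point (six here)
size = [2, 200, 10, 2, 15, 40]
# one value per point, available for the c argument; not used in this plot
hues = [0.1, 0.2, 0.6, 0.7, 0.9, 0.3]

# alpha sets the transparency (0 = invisible, 1 = opaque)
plot.scatter(a, b, s=size, alpha=0.5)
plot.show()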
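
The per-point arguments (s and c) need one entry for every (x, y) pair, and recent matplotlib versions raise an error when the lengths disagree. A minimal sketch of a pre-flight check follows; check_scatter_args is a hypothetical helper, not part of matplotlib, and it treats a bare number or a color-name string as a single value applied to every point. It does not handle a single RGB tuple such as (1, 0.2, 0.2).

```python
def check_scatter_args(x, y, s=None, c=None):
    """Return the number of points if every per-point argument has that length.

    Hypothetical helper for illustration only; not a matplotlib function.
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("x has %d entries but y has %d" % (n, len(y)))
    for name, value in (("s", s), ("c", c)):
        # None, a bare number, or a color name applies to every point.
        if value is None or isinstance(value, (int, float, str)):
            continue
        if len(value) != n:
            raise ValueError("%s has %d entries but there are %d points"
                             % (name, len(value), n))
    return n
```

Calling check_scatter_args(a, b, s=size) before plot.scatter reports a length mismatch with a message that names the offending argument.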

Fake data in the same format that the actual data will use:



In [21]:

    
team_data = pd.read_csv("fake_features.csv")
team_data









    Out[21]:

	Team Number	Rank	NormalOPR	Obstacles
0	118	1	1.000000	30
1	296	7	0.516667	20
2	610	4	0.889855	10
3	746	8	0.304348	14
4	1075	9	0.343841	15
5	1114	2	0.999638	3
6	1241	5	0.672826	55
7	1246	6	0.490217	60
8	1285	3	0.860145	12
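
The fake data shown in Out[21] can be regenerated without the original file. The sketch below rebuilds a CSV from the displayed values. It assumes the file has exactly these four columns with no index column, and it uses NormalOPR as rounded in the display, so the real file may carry more digits. COLUMNS, ROWS, fake_features_csv and write_fake_features are hypothetical names.

```python
# Column names and values copied from the displayed table; NormalOPR is kept
# as text so the six-decimal rounding shown there is preserved.
COLUMNS = ["Team Number", "Rank", "NormalOPR", "Obstacles"]
ROWS = [
    (118, 1, "1.000000", 30),
    (296, 7, "0.516667", 20),
    (610, 4, "0.889855", 10),
    (746, 8, "0.304348", 14),
    (1075, 9, "0.343841", 15),
    (1114, 2, "0.999638", 3),
    (1241, 5, "0.672826", 55),
    (1246, 6, "0.490217", 60),
    (1285, 3, "0.860145", 12),
]


def fake_features_csv():
    """Return the fake data as CSV text: a header line, then one line per team."""
    lines = [",".join(COLUMNS)]
    for row in ROWS:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


def write_fake_features(path="fake_features.csv"):
    """Write the CSV text to path (assumed file name, as used by read_csv)."""
    with open(path, "w") as handle:
        handle.write(fake_features_csv())
```

Running write_fake_features() once in the notebook's working directory creates a file that pd.read_csv("fake_features.csv") can load.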

Notes

Color a dot red when that team's obstacle rating is greater than the median or mean obstacle rating of the top eight or sixteen teams.
A nice shade of red: c = (1, 0.2, 0.2)
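
One way the rule in this note could be sketched: take the top_n teams by rank, compute the mean or median of their obstacle ratings, and mark every team above that threshold. This is an illustration under assumptions, not the notebook's final method. obstacle_colors and its parameters are hypothetical names, "top" is taken to mean the lowest Rank numbers, and plain color names are used in place of the RGB tuple.

```python
def _mean(values):
    return sum(values) / float(len(values))


def _median(values):
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return float(ordered[mid])
    return (ordered[mid - 1] + ordered[mid]) / 2.0


def obstacle_colors(ranks, obstacles, top_n=8, use_median=False):
    """Return (colors, threshold) with one color per team.

    A team is 'red' when its obstacle rating is strictly greater than the
    mean (or median) rating of the top_n teams by rank, otherwise 'blue'.
    """
    # Pair each rank with its rating and sort so the best ranks come first.
    by_rank = sorted(zip(ranks, obstacles))
    top = [obstacle for _, obstacle in by_rank[:top_n]]
    threshold = _median(top) if use_median else _mean(top)
    colors = ['red' if o > threshold else 'blue' for o in obstacles]
    return colors, threshold


# Rank and Obstacles columns from the fake data
ranks = [1, 7, 4, 8, 9, 2, 5, 6, 3]
obstacles = [30, 20, 10, 14, 15, 3, 55, 60, 12]
colors, threshold = obstacle_colors(ranks, obstacles, top_n=8)
# threshold is 25.5, so only the teams with 30, 55 and 60 come out red
```

The returned list can be passed directly as the c argument of plot.scatter, since it has one entry per team.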

Extract the columns from the dataframe so they can be passed as arguments to the scatter function:



In [127]:

    
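ranks = team_data['Rank']
OPR = team_data['NormalOPR']
# scale the obstacle counts up so the markers are large enough to see
obstacles = 5 * team_data['Obstacles']
# placeholder colors, one per team; the computed hues list is only built in a later cell
test_hues = ['red', 'red', 'red', 'blue', 'red', 'green', 'red', 'yellow', 'red']
plot.scatter(ranks, OPR, s=obstacles, c=test_hues)
plot.show()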
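
The s argument of scatter is a marker area in points squared, so multiplying by a constant leaves a team with very few obstacles almost invisible. As an alternative to the fixed factor of 5 (a swapped-in technique, not what the cell above does), the values could be min-max scaled into a chosen area range. scale_sizes, min_area and max_area are hypothetical names and defaults.

```python
def scale_sizes(values, min_area=20.0, max_area=300.0):
    """Map values linearly onto marker areas between min_area and max_area."""
    values = list(values)
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: give every marker the same mid-range area.
        return [(min_area + max_area) / 2.0] * len(values)
    span = float(hi - lo)
    return [min_area + (v - lo) / span * (max_area - min_area) for v in values]


# Obstacles column from the fake data
sizes = scale_sizes([30, 20, 10, 14, 15, 3, 55, 60, 12])
# the smallest value (3) maps to 20.0 and the largest (60) maps to 300.0
```

The result has one entry per team and can be passed as s=sizes in place of the scaled obstacles column.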



In [88]:

    
obstacle_list = list(obstacles)
print(obstacle_list)









    



[150, 100, 50, 70, 75, 15, 275, 300, 60]



In [126]:

    
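# mean obstacle value of the first three rows of the table (by row order, not by rank)
top_mean = sum(obstacle_list[0:3]) / len(obstacle_list[0:3])
hues = []
for index, obstacle in enumerate(obstacle_list):
    # the first four rows are always blue; later rows are red at or above the mean
    if obstacle < top_mean or index <= 3:
        hues.append('blue')
    else:
        hues.append('red')
hues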









    Out[126]:





['blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'red', 'red', 'blue']



In [120]:

    
# testerino = []
# for i, v in enumerate(obstacle_list):
#     if v < top_mean or i <= 3:
#         testerino.append('blue')
#     elif v >= top_mean:
#         testerino.append('red')



In [121]:

    
testerino









    Out[121]:





['blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'red', 'red', 'blue']



In [97]:

    
top_mean









    Out[97]:





100



In [96]:

    
obstacle_list









    Out[96]:





[150, 100, 50, 70, 75, 15, 275, 300, 60]
