Scatter Plots of Rank, OPR, and Obstacle Success


In [3]:
import matplotlib
import matplotlib.pyplot as plot
import pandas as pd
import numpy as np
%matplotlib inline

Quick test to show how the arguments to plot.scatter work and what types they take:


In [16]:
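# x and y must have the same length; s takes a scalar or one value per point
a = [3, 4, 5, 6, 7, 15]
b = [4, 5, 6, 7, 7, 3]
size = [2, 200, 10, 2, 15, 50]  # sixth value added (arbitrary) to match the six points
hues = [0.1, 0.2, 0.6, 0.7, 0.9, 0.5]  # sixth value added (arbitrary); not used in this plot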

plot.scatter(a, b, s=size, alpha=0.5)
plot.show()
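
As a rough sketch of what the two styling arguments accept: s takes either one number or one number per point (marker area in points squared), and c takes a single color, a list of colors, or a list of numbers that get mapped through a colormap. The variable names and the choice of the "viridis" colormap below are assumptions made for illustration only.

```python
import matplotlib.pyplot as plot

x = [3, 4, 5, 6, 7]
y = [4, 5, 6, 7, 7]

# s: a single number gives every marker the same area;
# a list needs one entry per point.
sizes = [2, 200, 10, 2, 15]

# c: a list of plain numbers is mapped through the colormap named by cmap.
shades = [0.1, 0.2, 0.6, 0.7, 0.9]

points = plot.scatter(x, y, s=sizes, c=shades, cmap="viridis", alpha=0.5)
plot.colorbar(points)  # shows how the numbers in shades map to colors
```

With %matplotlib inline the figure renders when the cell finishes; outside a notebook, follow it with plot.show().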


Fake data in the same format the real data will use:


In [21]:
team_data = pd.read_csv("fake_features.csv")
team_data


Out[21]:
   Team Number  Rank  NormalOPR  Obstacles
0          118     1   1.000000         30
1          296     7   0.516667         20
2          610     4   0.889855         10
3          746     8   0.304348         14
4         1075     9   0.343841         15
5         1114     2   0.999638          3
6         1241     5   0.672826         55
7         1246     6   0.490217         60
8         1285     3   0.860145         12

Notes

  • Color a dot red when its obstacle rating is greater than the median or mean obstacle rating of the top eight or sixteen teams.
  • Nice shade of red: c = (1, 0.2, 0.2)
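
One possible reading of the first note, sketched in plain Python: sort the teams by rank, take the obstacle ratings of the best top_n, reduce them with the mean or the median, and mark every team above that threshold in red. The function name colors_by_threshold, its parameters, and the BLUE fallback color are hypothetical; the ranks and obstacle values are copied from the fake data table.

```python
from statistics import mean, median

RED = (1, 0.2, 0.2)   # the shade of red from the notes
BLUE = (0.2, 0.2, 1)  # assumption: any contrasting color would do

def colors_by_threshold(ranks, obstacles, top_n=8, reducer=mean):
    """Return one RGB tuple per team plus the threshold that was used."""
    # Pair each rank with its obstacle rating and sort so rank 1 comes first.
    by_rank = sorted(zip(ranks, obstacles))
    top_obstacles = [obs for _, obs in by_rank[:top_n]]
    threshold = reducer(top_obstacles)
    # Keep the original row order so the colors line up with the plotted points.
    colors = [RED if obs > threshold else BLUE for obs in obstacles]
    return colors, threshold

# Values from the fake data table, in row order.
ranks = [1, 7, 4, 8, 9, 2, 5, 6, 3]
obstacles = [30, 20, 10, 14, 15, 3, 55, 60, 12]

colors, threshold = colors_by_threshold(ranks, obstacles, top_n=8, reducer=median)
print(threshold)
```

The returned list of RGB tuples can be passed to the scatter call as c. Swapping reducer=median for reducer=mean, or top_n=8 for top_n=16, covers the other variants mentioned in the note.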

Extract the columns from the DataFrame so that they can be passed as arguments to the scatter function.


In [127]:
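ranks = team_data['Rank']
OPR = team_data['NormalOPR']
# scale the obstacle counts up so they are usable as marker areas
obstacles = 5 * team_data['Obstacles']
# placeholder colors, one per team (nine rows)
test_hues = ['red', 'red', 'red', 'blue', 'red', 'green', 'red', 'yellow', 'red']
plot.scatter(ranks, OPR, s=obstacles, c=test_hues)
plot.show()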



In [88]:
obstacle_list = list(obstacles)
print(obstacle_list)


[150, 100, 50, 70, 75, 15, 275, 300, 60]

In [126]:
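# mean of the first three rows (rows are in team-number order, not rank order)
top_mean = sum(obstacle_list[0:3]) / len(obstacle_list[0:3])
hues = []
for index, obstacle in enumerate(obstacle_list):
    # the first four rows are always blue; later rows turn red at or above the mean
    if obstacle < top_mean or index <= 3:
        hues.append('blue')
    else:
        hues.append('red')
hues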


Out[126]:
['blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'red', 'red', 'blue']
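
The same coloring rule can also be written as a single list comprehension. This is only a sketch of an equivalent form, with the obstacle values copied from the printed list so that it runs on its own: a point is red exactly when it is past the first four rows and its value is at or above top_mean.

```python
obstacle_list = [150, 100, 50, 70, 75, 15, 275, 300, 60]
top_mean = sum(obstacle_list[0:3]) / len(obstacle_list[0:3])

# red only when index > 3 AND obstacle >= top_mean; everything else is blue
hues = ['red' if index > 3 and obstacle >= top_mean else 'blue'
        for index, obstacle in enumerate(obstacle_list)]
print(hues)
```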

In [120]:
# same loop as above with shorter names, kept as a cross-check
testerino = []
for i, v in enumerate(obstacle_list):
    if v < top_mean or i <= 3:
        testerino.append('blue')
    elif v >= top_mean:
        testerino.append('red')

In [121]:
testerino


Out[121]:
['blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'red', 'red', 'blue']

In [97]:
top_mean


Out[97]:
100

In [96]:
obstacle_list


Out[96]:
[150, 100, 50, 70, 75, 15, 275, 300, 60]