This 2013 NY Times piece shows that strikeouts are on the rise in Major League Baseball. Its simple infographic charts the gradual increase in strikeouts per game from 1900 to 2012.
This is a brief example of how you can use pybaseball to replicate this graphic and answer other historical questions about team-level baseball statistics.
In [1]:
from pybaseball import team_pitching
import matplotlib.pyplot as plt
%matplotlib inline
In [2]:
# collect historical team pitching data (1900 through 2016) from pybaseball
pitching_data = team_pitching(1900,2016)
In [3]:
# a quick look at the data
# note: .format() must be called on the string, inside print()
print("data shape: {}".format(pitching_data.shape))
print(pitching_data.head())
In [4]:
# some summary stats
pitching_data.describe()
In [5]:
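# get the league-average K/9 (strikeouts per nine innings) by season;
# this is an unweighted mean across teams, used here as a stand-in for SO/game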
league_average = pitching_data.groupby('Season', as_index=False)['K/9'].mean()
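To make the groupby step above concrete, here is a minimal sketch on a made-up table. The `toy` frame, its team names, and its values are invented for illustration and are not real pybaseball output; only the `Season` and `K/9` column names are taken from the cell above.

```python
import pandas as pd

# Hypothetical toy data standing in for the team_pitching result (values are invented).
toy = pd.DataFrame({
    "Season": [1900, 1900, 2012, 2012],
    "Team": ["A", "B", "A", "B"],
    "K/9": [2.0, 3.0, 7.0, 8.0],
})

# One row per season: the unweighted mean of the team-level K/9 values.
# as_index=False keeps 'Season' as a regular column, which is convenient for plotting.
toy_average = toy.groupby("Season", as_index=False)["K/9"].mean()
print(toy_average)
```

Because every team counts equally in this mean, a team that pitched fewer innings has the same influence as one that pitched more. That is fine for a quick trend line, but it is not an innings-weighted league rate.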
In [6]:
# plot each team's K/9 (scatter) and the league-average K/9 (red line) over time
plt.scatter(pitching_data['Season'], pitching_data['K/9'])
plt.plot(league_average['Season'], league_average['K/9'], c='red')
plt.xlim(1900,2016)
plt.ylim(2,10)
plt.xlabel('Season')
plt.ylabel('Strikeouts per 9 Innings (K/9)')
plt.title('Strikeouts per Game over Time');
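The same groupby pattern can be pointed at other historical questions. As one possible sketch, the helper below finds the season with the highest league-average value of a given stat. The function name `season_with_highest_average` and the toy data are hypothetical and not part of pybaseball; it only assumes a frame with a `Season` column and a numeric stat column, like the one used above.

```python
import pandas as pd

def season_with_highest_average(df, stat):
    """Return (season, value) for the season whose unweighted team average of `stat` is highest."""
    yearly = df.groupby("Season")[stat].mean()
    return yearly.idxmax(), yearly.max()

# Hypothetical toy data (values are invented, not real results).
toy = pd.DataFrame({
    "Season": [1968, 1968, 2000, 2000],
    "K/9": [5.0, 6.0, 6.0, 7.0],
})

season, value = season_with_highest_average(toy, "K/9")
print(season, value)
```

With the real data, the call would look like `season_with_highest_average(pitching_data, 'K/9')`, assuming `pitching_data` has been loaded as in cell 2.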