Which variable most affects an NBA team's "success"?

Which variable leads to more game wins? Team Assists or Three Pointers?

Spring 2017 Data Bootcamp

By: Leland Sutton, Danielle Bennett and Michael Hou

We initially wanted to look at the relationship between NBA team profitability and team wins/losses. We wanted to see whether the number of wins in a season affects the amount of money a team brings in through ticket and merchandise sales. After finding a website that housed the revenue data and trying to scrape it, we decided to change course, as the website was written in Java.

Below is the code that scrapes the data we use to compare the success of NBA teams over an eleven-season period, from the 2006-2007 season to the 2016-2017 season. We define “success” as the percentage of games a team wins, rather than the raw number of wins, to account for the shortened season caused by the 2011 NBA lockout. We compare this measure to the number of three-point shots attempted (3PA) and the number of assists (AST) completed. We expect to see an increase in 3PA for “successful” teams in recent years, but do not expect to see big variations in the number of assists over the time frame.
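
To make the success measure concrete, here is a minimal sketch of the win-percentage calculation. The function name `win_pct` is ours, for illustration only; the two records are Golden State seasons that appear in the scraped tables later in this notebook (a full 82-game season and the 66-game lockout season).

```python
def win_pct(wins, losses):
    """Return the percentage of games won, independent of season length."""
    games = wins + losses
    if games == 0:
        raise ValueError("a team must have played at least one game")
    return 100.0 * wins / games


# Golden State, 2015-16: 73 wins and 9 losses over 82 games
full_season = win_pct(73, 9)

# Golden State, 2011-12 lockout season: 23 wins and 43 losses over 66 games
lockout_season = win_pct(23, 43)

print(round(full_season, 2), round(lockout_season, 2))
```

Because the denominator is the number of games actually played, the lockout season can be compared with 82-game seasons without any extra adjustment.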


In [24]:
import pandas as pd             # data package
import matplotlib.pyplot as plt # graphics 
import datetime as dt           # date tools, used to note current date  
import seaborn as sns
from IPython.display import display, HTML

pd.options.mode.chained_assignment = None  # default='warn'

%matplotlib inline

#Abbreviations for teams we are using; Golden State Warriors, Los Angeles Lakers, Los Angeles Clippers, Phoenix Suns, 
#Sacramento Kings 

teams = ["GSW", "LAL", "LAC", "PHO", "SAC"]

urls = []

for team in teams:
    urls.append("http://www.basketball-reference.com/teams/" + team + "/stats_basic_totals.html")

In [121]:
from bs4 import BeautifulSoup
from requests import get

dfs = []
for url in urls:
    team_data = get(url)
    soup = BeautifulSoup(team_data.content, 'html.parser')
    # Load site, find table
    table = soup.findAll('table')[0]
    
    # Find columns
    columns = table.findAll('thead')[0].findAll('th')
    headers = []

    # Only use columns that have a header longer than 0
    for column in columns:
        header = column.getText().strip()
        if (len(header) > 0):
            headers.append(header)
    
    # Finds the rows that we want for the project 
    rows = table.findAll('tbody')[0].findAll('tr')
    team = []
    for row in rows[0:11]:
        season = row.findAll('th')[0].getText()
        cells = row.findAll('td')
        data = [season]
        
        # Extracts the specific data from each row 
        for cell in cells:
            c = cell.getText().strip()
            if (len(c) > 0):
                data.append(c)
        
        team.append(data)
    
    df = pd.DataFrame(columns=headers, data=team)
    dfs.append(df)
    print('Scraped ' + df['Tm'][0] + ' data')

print('Headers: ' + str(list(dfs[0])))


Scraped GSW data
Scraped LAL data
Scraped LAC data
Scraped PHO data
Scraped SAC data
Headers: ['Season', 'Lg', 'Tm', 'W', 'L', 'Finish', 'Age', 'Ht.', 'Wt.', 'G', 'MP', 'FG', 'FGA', 'FG%', '3P', '3PA', '3P%', '2P', '2PA', '2P%', 'FT', 'FTA', 'FT%', 'ORB', 'DRB', 'TRB', 'AST', 'STL', 'BLK', 'TOV', 'PF', 'PTS']
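
The scraping loop above drops every empty header and every empty cell. That works for the rows used here, but if a single cell under a named column were blank, all later values in that row would shift one column to the left. Below is a hedged sketch of a positional alternative. The helper `row_to_record` and the sample row are hypothetical, and the sketch assumes the header list and the cell list have the same length, including any unnamed spacer columns.

```python
def row_to_record(all_headers, all_cells):
    """Pair headers and cells by position.

    Columns with a blank header (spacers) are skipped entirely;
    blank cells under a named header are kept as None.
    """
    if len(all_headers) != len(all_cells):
        raise ValueError(
            "expected %d cells, got %d" % (len(all_headers), len(all_cells))
        )
    record = {}
    for header, text in zip(all_headers, all_cells):
        header = header.strip()
        if not header:
            continue  # spacer column: nothing to store
        text = text.strip()
        record[header] = text if text else None
    return record


# Hypothetical row: the 'W' cell is blank and the fifth column is a spacer
headers = ["Season", "Tm", "W", "L", "", "Age"]
cells = ["2016-17", "GSW", "", "15", "", "28.2"]

record = row_to_record(headers, cells)

# What dropping blanks independently would produce: '15' lands under 'W'
naive_headers = [h for h in headers if h]
naive_cells = [c for c in cells if c]
naive_record = dict(zip(naive_headers, naive_cells))

print(record)
print(naive_record)
```

In the positional version the missing value stays attached to its own column, where pandas can later treat it as missing data.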

In [122]:
# Only keep the data that we need

scrubbed = []

for df in dfs:
    # .copy() gives an independent frame, so the column assignments below are safe
    df = df[['Season', 'Tm', 'W', 'L', '3PA', '3P%', 'AST']].copy()
    df['Season'] = df['Season'].str[:4]
    df['W'] = df['W'].astype(float)
    df['L'] = df['L'].astype(float)
    # Add a column for the win/loss percentage to account for the lockout
    df['WLP'] = (df['W'] / (df['W'] + df['L']) * 100).astype(float)
    print(df)
    scrubbed.append(df)


   Season   Tm     W     L   3PA   3P%   AST        WLP
0    2016  GSW  67.0  15.0  2563  .383  2490  81.707317
1    2015  GSW  73.0   9.0  2592  .416  2373  89.024390
2    2014  GSW  67.0  15.0  2217  .398  2248  81.707317
3    2013  GSW  51.0  31.0  2037  .380  1912  62.195122
4    2012  GSW  47.0  35.0  1632  .403  1845  57.317073
5    2011  GSW  23.0  43.0  1351  .388  1470  34.848485
6    2010  GSW  36.0  46.0  1749  .392  1847  43.902439
7    2009  GSW  26.0  56.0  1687  .375  1839  31.707317
8    2008  GSW  29.0  53.0  1475  .373  1711  35.365854
9    2007  GSW  48.0  34.0  2185  .348  1833  58.536585
10   2006  GSW  42.0  40.0  1967  .356  1950  51.219512
   Season   Tm     W     L   3PA   3P%   AST        WLP
0    2016  LAL  26.0  56.0  2110  .346  1716  31.707317
1    2015  LAL  17.0  65.0  2016  .317  1478  20.731707
2    2014  LAL  21.0  61.0  1546  .344  1715  25.609756
3    2013  LAL  27.0  55.0  2032  .381  2006  32.926829
4    2012  LAL  45.0  37.0  2015  .355  1818  54.878049
5    2011  LAL  41.0  25.0  1112  .326  1485  62.121212
6    2010  LAL  57.0  25.0  1487  .352  1801  69.512195
7    2009  LAL  57.0  25.0  1562  .341  1730  69.512195
8    2008  LAL  65.0  17.0  1516  .361  1908  79.268293
9    2007  LAL  57.0  25.0  1751  .378  2003  69.512195
10   2006  LAL  42.0  40.0  1724  .353  1850  51.219512
   Season   Tm     W     L   3PA   3P%   AST        WLP
0    2016  LAC  51.0  31.0  2245  .375  1848  62.195122
1    2015  LAC  53.0  29.0  2190  .364  1873  64.634146
2    2014  LAC  56.0  26.0  2202  .376  2031  68.292683
3    2013  LAC  57.0  25.0  1966  .352  2016  69.512195
4    2012  LAC  56.0  26.0  1752  .358  1958  68.292683
5    2011  LAC  40.0  26.0  1441  .357  1385  60.606061
6    2010  LAC  32.0  50.0  1519  .338  1813  39.024390
7    2009  LAC  29.0  53.0  1457  .332  1810  35.365854
8    2008  LAC  19.0  63.0  1513  .354  1723  23.170732
9    2007  LAC  23.0  59.0  1079  .324  1732  28.048780
10   2006  LAC  40.0  42.0   903  .348  1762  48.780488
   Season   Tm     W     L   3PA   3P%   AST        WLP
0    2016  PHO  24.0  58.0  1854  .332  1604  29.268293
1    2015  PHO  23.0  59.0  2118  .348  1701  28.048780
2    2014  PHO  39.0  43.0  2048  .341  1659  47.560976
3    2013  PHO  48.0  34.0  2055  .372  1563  58.536585
4    2012  PHO  25.0  57.0  1455  .330  1855  30.487805
5    2011  PHO  33.0  33.0  1295  .343  1486  50.000000
6    2010  PHO  40.0  42.0  1857  .377  1945  48.780488
7    2009  PHO  54.0  28.0  1770  .412  1912  65.853659
8    2008  PHO  46.0  36.0  1445  .383  1905  56.097561
9    2007  PHO  55.0  27.0  1764  .393  2188  67.073171
10   2006  PHO  61.0  21.0  1965  .399  2122  74.390244
   Season   Tm     W     L   3PA   3P%   AST        WLP
0    2016  SAC  32.0  50.0  1961  .376  1844  39.024390
1    2015  SAC  33.0  49.0  1839  .359  2009  40.243902
2    2014  SAC  29.0  53.0  1350  .341  1667  35.365854
3    2013  SAC  28.0  54.0  1475  .333  1547  34.146341
4    2012  SAC  28.0  54.0  1681  .363  1708  34.146341
5    2011  SAC  22.0  44.0  1301  .316  1271  33.333333
6    2010  SAC  24.0  58.0  1277  .335  1675  29.268293
7    2009  SAC  25.0  57.0  1383  .349  1679  30.487805
8    2008  SAC  17.0  65.0  1594  .368  1619  20.731707
9    2007  SAC  38.0  44.0  1367  .373  1567  46.341463
10   2006  SAC  33.0  49.0  1513  .350  1665  40.243902

Three-Pointers

We wanted to compare both three-pointers attempted (3PA) and three-point percentage (3P%) to win rates over an eleven-year period because we wanted to see how strongly the rising popularity of three-pointers affects a team's success rate.

In the past few years, especially with the rising super-stardom of Stephen Curry and the coaching style of Steve Kerr, many teams have been focusing on improving three-point shooting. We gathered data from the Pacific Division because it has one of the largest ranges of success in the league, from the Golden State Warriors, who are peaking in terms of team “success”, to the Sacramento Kings and Phoenix Suns, who are at the opposite end of the spectrum. Each team's success has also varied greatly from year to year, yet almost all five teams have shown an increase in 3PA.


In [86]:
# PLOT...LY 

from plotly.offline import iplot             # offline plotting function
import plotly.graph_objs as go               # trace types such as Scatter and Bar
import plotly                                # just to print version and init notebook
import cufflinks as cf                       # gives us df.iplot that feels like df.plot
cf.set_config_file(offline=True, offline_show_link=False)

# these lines make our graphics show up in the notebook
%matplotlib inline             
plotly.offline.init_notebook_mode(connected=True)


#we took some of this code from the Plotly website that we found recently 

#comparing 3PA to W/L%
for df in scrubbed:
    data = []
    data.append(go.Scatter(
        x = df['Season'],
        y = df['WLP'],
        name= 'Win Loss Percentage'
    ))
    
    data.append(go.Scatter(
        x = df['Season'],
        y = df['3PA'],
        name= '3 Point Attempts'
    ))
    
    layout = dict(title= df['Tm'][0],xaxis = dict(title = 'Years'), yaxis = dict(title ='W/L % and 3PA'))
    fig = dict(data=data, layout=layout)
    iplot(fig, filename='basic-line')



In [87]:
#next we compare 3P% to W/L%

for df in scrubbed:
    data = []
    data.append(go.Scatter(
        x = df['Season'],
        y = df['WLP'],
        name= 'Win Loss Percentage'
    ))
    
    data.append(go.Scatter(
        x = df['Season'],
        y = df['3P%'],
        name= '3 Point Percentage'
    ))
    
    layout = dict(title= df['Tm'][0],xaxis = dict(title = 'Years'), yaxis = dict(title ='Percentage'))
    fig = dict(data=data, layout=layout)
    iplot(fig, filename='basic-line')




In [124]:
for df in scrubbed:
    # Collect every series in one list so that all four appear in the grouped bar chart
    data = []
    data.append(go.Bar(
        x = df['Season'],
        y = df['AST'],
        name= '# of Assists'
    ))
    data.append(go.Bar(
        x = df['Season'],
        y = df['3P%'],
        name= '3 Point Percentage'
    ))
    data.append(go.Bar(
        x = df['Season'],
        y = df['WLP'],
        name= 'Win Loss Percentage'
    ))
    data.append(go.Bar(
        x = df['Season'],
        y = df['3PA'],
        name= '3 Point Attempts'
    ))
    layout = dict(title= 'Comparison for '+ df['Tm'][0],xaxis = dict(title = 'Years')) 
    fig = dict(data=data, layout=layout)
    iplot(fig, filename='basic-line')


Explanation

Above we looked first at 3 point attempts and then at 3 point percentage. We looked at both variables because, although we figured that the percentage of 3 point shots made would be indicative of the strength of an NBA team and potentially directly correlated to the team's "success", we also expected 3 point attempts to increase over time. Because a team attempts more shots than it makes, 3PA draws on a larger sample each season, which we hoped would give us a clearer picture of the effect 3 point shots have on a team's success.


In [94]:
for df in scrubbed:
    data = []
    data.append(go.Scatter(
        x = df['Season'],
        y = df['WLP'],
        name= 'Win Loss Percentage'
    ))
    layout = dict(title= df['Tm'][0] + "'s Win Loss Percentage",xaxis = dict(title = 'Years'), yaxis = dict(title ='Percentage'))
    fig = dict(data=data, layout=layout)
    iplot(fig, filename='basic-line')


We found that while 3PA generally increases across most teams (except the Phoenix Suns), there does not seem to be a clear relationship between three-point attempts and team success, as measured by the W/L percentage. Next, we turned to 3P%, which we plotted in separate graphs in order to compare it against Win Loss Percentage and to see the variation in 3P% over time more accurately. On average, 3P% followed the same curvature as the win/loss percentage, which suggests a correlation between 3P% and the win/loss percentage.
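
The visual comparison above can be put on a numeric footing with a Pearson correlation coefficient. The following is a minimal sketch for 3PA, using the Warriors and Lakers values copied from the tables printed earlier. The helper `pearson` is written for this example and is not part of the notebook's own code.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Seasons 2016 down to 2006, in the same order as the printed tables
seasons = {
    "GSW": {
        "W": [67, 73, 67, 51, 47, 23, 36, 26, 29, 48, 42],
        "L": [15, 9, 15, 31, 35, 43, 46, 56, 53, 34, 40],
        "3PA": [2563, 2592, 2217, 2037, 1632, 1351, 1749, 1687, 1475, 2185, 1967],
    },
    "LAL": {
        "W": [26, 17, 21, 27, 45, 41, 57, 57, 65, 57, 42],
        "L": [56, 65, 61, 55, 37, 25, 25, 25, 17, 25, 40],
        "3PA": [2110, 2016, 1546, 2032, 2015, 1112, 1487, 1562, 1516, 1751, 1724],
    },
}

correlations = {}
for team, stats in seasons.items():
    wlp = [100.0 * w / (w + l) for w, l in zip(stats["W"], stats["L"])]
    correlations[team] = pearson(stats["3PA"], wlp)
    print(team, round(correlations[team], 2))
```

With only eleven seasons per team, these coefficients are rough indicators rather than firm evidence. Coefficients that differ in sign from one team to the next would be consistent with the lack of a clear trend across teams.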

Assists

We wanted to add another variable, so we looked at assists. We took the assist totals from the same scraped data, for the same 5 teams over the same period of time (2006-2016). Below you will find the tables corresponding to the scraped data for each of the five teams in the NBA's Pacific Division.


In [119]:
scrubbed = []

for df in dfs:
    df = df[['Season', 'Tm', 'AST']]
    df['Season'] = df['Season'].str[:4]
    print(df)
    scrubbed.append(df)


   Season   Tm   AST
0    2016  GSW  2490
1    2015  GSW  2373
2    2014  GSW  2248
3    2013  GSW  1912
4    2012  GSW  1845
5    2011  GSW  1470
6    2010  GSW  1847
7    2009  GSW  1839
8    2008  GSW  1711
9    2007  GSW  1833
10   2006  GSW  1950
   Season   Tm   AST
0    2016  LAL  1716
1    2015  LAL  1478
2    2014  LAL  1715
3    2013  LAL  2006
4    2012  LAL  1818
5    2011  LAL  1485
6    2010  LAL  1801
7    2009  LAL  1730
8    2008  LAL  1908
9    2007  LAL  2003
10   2006  LAL  1850
   Season   Tm   AST
0    2016  LAC  1848
1    2015  LAC  1873
2    2014  LAC  2031
3    2013  LAC  2016
4    2012  LAC  1958
5    2011  LAC  1385
6    2010  LAC  1813
7    2009  LAC  1810
8    2008  LAC  1723
9    2007  LAC  1732
10   2006  LAC  1762
   Season   Tm   AST
0    2016  PHO  1604
1    2015  PHO  1701
2    2014  PHO  1659
3    2013  PHO  1563
4    2012  PHO  1855
5    2011  PHO  1486
6    2010  PHO  1945
7    2009  PHO  1912
8    2008  PHO  1905
9    2007  PHO  2188
10   2006  PHO  2122
   Season   Tm   AST
0    2016  SAC  1844
1    2015  SAC  2009
2    2014  SAC  1667
3    2013  SAC  1547
4    2012  SAC  1708
5    2011  SAC  1271
6    2010  SAC  1675
7    2009  SAC  1679
8    2008  SAC  1619
9    2007  SAC  1567
10   2006  SAC  1665

In [99]:
for df in scrubbed:
    data = []
    data.append(go.Scatter(
        x = df['Season'],
        y = df['AST'],
        name= 'Assists'
    ))
    layout = dict(title= df['Tm'][0] + "'s No. of Assists",xaxis = dict(title = 'Years'), yaxis = dict(title ='Number'))
    fig = dict(data=data, layout=layout)
    iplot(fig, filename='basic-line')


Explanation

On average, we also saw that, in contrast to 3P%, a team's win/loss percentage is not directly correlated with its number of assists over the course of a season.

Additional Stats

To add even more depth to our analysis we wanted to analyze some more assist-related stats. We took four stats from nbaminer.com:
1) Assists/Turnover ratio of each team --- used to measure ball control of a team
2) Opposing Team's Assists/Turnover ratio for each team --- used to measure opposing team's performance
3) Assisted Field Goals Made % of each team --- rough measure for teamwork capabilities
4) Opposing Team's Assisted Field Goals Made % for each team --- same, but for opposing teams

We took this data for each team in the NBA over five years so we could see if there was a particular focus in coaching style for each team that led to varying levels of success, i.e. W/L ratio.
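
One caveat about the W/L ratio is that it is unbounded and grows quickly for very good teams, while the win percentage used earlier stays between 0 and 100. The sketch below shows how the two measures relate. The function name `win_fraction_from_ratio` is ours, for illustration only; the three records (73-9, 41-41 and 21-61) all appear in the tables in this notebook.

```python
def win_fraction_from_ratio(wl_ratio):
    """Convert a W/L ratio (wins divided by losses) into the fraction of games won.

    If r = W / L, then W / (W + L) = r / (1 + r).
    """
    if wl_ratio < 0:
        raise ValueError("a W/L ratio cannot be negative")
    return wl_ratio / (1.0 + wl_ratio)


records = [(73, 9), (41, 41), (21, 61)]
fractions = []
for wins, losses in records:
    ratio = wins / losses
    fraction = win_fraction_from_ratio(ratio)
    fractions.append(fraction)
    print(wins, losses, round(ratio, 3), round(fraction, 3))
```

A 73-9 season has a W/L ratio of roughly 8.1, more than eight times that of a 41-41 team, although its win fraction is less than twice as large. Averages of W/L ratios are therefore pulled upward by a few dominant seasons.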


In [100]:
import pandas as pd             # data package
import matplotlib.pyplot as plt # graphics 
import datetime as dt           # date tools
import numpy as np              # numerical tools
import seaborn as sns
import statistics
import csv

In [101]:
import requests

In [135]:
url = 'http://www.nbaminer.com/assist-details/'

In [103]:
miner = requests.get(url)
miner


Out[103]:
<Response [200]>

In [104]:
miner.status_code


Out[104]:
200

In [105]:
miner.content[:500]


Out[105]:
b'\n\t<!DOCTYPE html>\r\n<html lang="en-US">\r\n<head>\r\n\r\n\t<meta charset="UTF-8" />\r\n\t<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />\r\n\t\t<meta name=viewport content="width=device-width,initial-scale=1,user-scalable=no">\r\n\t\t<title>NBA Miner | NBA Advanced Stats- Assist Details</title>\r\n\t\t\t<meta name="description" content="Daily updated NBA Advanced Stats- Assist Details (Assist/TO Ratio, Assisted FG percentages and much more)">\r\n\t\t\t<meta name="keywords" content="NBA Stats, NBA Advanced S'

In [158]:
af = pd.read_csv('NBA_Asst.csv')

In [211]:
af.describe()


/Users/michaelhou/anaconda/lib/python3.5/site-packages/numpy/lib/function_base.py:3834: RuntimeWarning:

Invalid value encountered in percentile

Out[211]:
       Assist/TO Ratio  Opp. Assist/TO Ratio  Assisted FGM Pct.  Opp. Assisted FGM Pct.           W           L         W/L
count       150.000000            150.000000          150.00000              150.000000  150.000000  150.000000  150.000000
mean          1.621580              1.623027            0.58550                0.585633   40.993333   40.993333    1.260828
std           0.174728              0.172469            0.03821                0.031186   12.689105   12.693864    0.997601
min           1.215000              1.324000            0.47200                0.491000   10.000000    9.000000    0.138889
25%                NaN                   NaN                NaN                     NaN         NaN         NaN         NaN
50%                NaN                   NaN                NaN                     NaN         NaN         NaN         NaN
75%                NaN                   NaN                NaN                     NaN         NaN         NaN         NaN
max           2.106000              2.082000            0.70500                0.658000   73.000000   72.000000    8.111111

In [160]:
af.drop(af.columns[8:], axis=1, inplace=True)

In [161]:
af.head(59)  # preview the first 59 rows; af itself is left unchanged


Out[161]:
Team Assist/TO Ratio Opp. Assist/TO Ratio Assisted FGM Pct. Opp. Assisted FGM Pct. W L W/L
0 Atlanta Hawks 1.717 1.529 0.651 0.594 44.0 38.0 1.157895
1 Brooklyn Nets 1.463 1.654 0.567 0.573 49.0 33.0 1.484848
2 Boston Celtics 1.637 1.500 0.614 0.599 41.0 40.0 1.025000
3 Charlotte Bobcats 1.478 1.859 0.562 0.650 21.0 61.0 0.344262
4 Chicago Bulls 1.684 1.441 0.645 0.532 45.0 37.0 1.216216
5 Cleveland Cavaliers 1.540 1.626 0.566 0.631 24.0 58.0 0.413793
6 Dallas Mavericks 1.722 1.571 0.599 0.589 41.0 41.0 1.000000
7 Denver Nuggets 1.661 1.555 0.600 0.620 57.0 25.0 2.280000
8 Detroit Pistons 1.459 1.710 0.585 0.601 29.0 53.0 0.547170
9 Golden State Warriors 1.525 1.820 0.590 0.633 47.0 35.0 1.342857
10 Houston Rockets 1.465 1.595 0.609 0.584 45.0 37.0 1.216216
11 Indiana Pacers 1.396 1.474 0.578 0.556 49.0 32.0 1.531250
12 Los Angeles Clippers 1.730 1.436 0.620 0.639 56.0 26.0 2.153846
13 Los Angeles Lakers 1.514 1.890 0.598 0.590 45.0 37.0 1.216216
14 Memphis Grizzlies 1.589 1.325 0.579 0.582 56.0 26.0 2.153846
15 Miami Heat 1.737 1.372 0.600 0.567 66.0 16.0 4.125000
16 Milwaukee Bucks 1.680 1.511 0.600 0.601 38.0 44.0 0.863636
17 Minnesota Timberwolves 1.584 1.456 0.624 0.589 31.0 51.0 0.607843
18 New Orleans Hornets 1.533 1.834 0.582 0.632 27.0 55.0 0.490909
19 New York Knicks 1.666 1.326 0.527 0.542 54.0 28.0 1.928571
20 Oklahoma City Thunder 1.460 1.443 0.561 0.567 60.0 22.0 2.727273
21 Orlando Magic 1.631 1.968 0.605 0.614 20.0 62.0 0.322581
22 Philadelphia 76ers 1.813 1.655 0.610 0.620 34.0 48.0 0.708333
23 Phoenix Suns 1.514 1.546 0.606 0.588 25.0 57.0 0.438596
24 Portland Trailblazers 1.531 1.871 0.593 0.603 33.0 49.0 0.673469
25 Sacramento Kings 1.483 1.774 0.554 0.639 28.0 54.0 0.518519
26 San Antonio Spurs 1.776 1.500 0.641 0.570 58.0 24.0 2.416667
27 Toronto Raptors 1.628 1.594 0.593 0.605 34.0 48.0 0.708333
28 Utah Jazz 1.597 1.459 0.610 0.562 43.0 39.0 1.102564
29 Washington Wizards 1.480 1.547 0.610 0.600 29.0 53.0 0.547170
30 Atlanta Hawks 1.717 1.594 0.667 0.601 38.0 44.0 0.863636
31 Brooklyn Nets 1.498 1.396 0.585 0.578 44.0 38.0 1.157895
32 Boston Celtics 1.460 1.613 0.576 0.585 25.0 57.0 0.438596
33 Charlotte Bobcats 1.864 1.641 0.597 0.569 43.0 39.0 1.102564
34 Chicago Bulls 1.620 1.426 0.654 0.560 48.0 34.0 1.411765
35 Cleveland Cavaliers 1.573 1.860 0.573 0.658 33.0 49.0 0.673469
36 Dallas Mavericks 1.788 1.436 0.596 0.584 49.0 33.0 1.484848
37 Denver Nuggets 1.459 1.614 0.584 0.569 36.0 46.0 0.782609
38 Detroit Pistons 1.500 1.690 0.539 0.624 29.0 53.0 0.547170
39 Golden State Warriors 1.561 1.420 0.591 0.558 51.0 31.0 1.645161
40 Houston Rockets 1.388 1.662 0.563 0.570 54.0 28.0 1.928571
41 Indiana Pacers 1.393 1.406 0.560 0.535 56.0 26.0 2.153846
42 Los Angeles Clippers 1.846 1.586 0.628 0.622 57.0 25.0 2.280000
43 Los Angeles Lakers 1.654 1.903 0.639 0.614 27.0 55.0 0.490909
44 Memphis Grizzlies 1.691 1.458 0.574 0.558 50.0 32.0 1.562500
45 Miami Heat 1.584 1.371 0.588 0.591 54.0 28.0 1.928571
46 Milwaukee Bucks 1.502 1.682 0.596 0.594 15.0 67.0 0.223881
47 Minnesota Timberwolves 1.777 1.431 0.616 0.565 40.0 42.0 0.952381
48 New Orleans Pelicans 1.639 1.611 0.563 0.604 34.0 48.0 0.708333
49 New York Knicks 1.614 1.426 0.542 0.552 37.0 45.0 0.822222
50 Oklahoma City Thunder 1.474 1.504 0.562 0.598 59.0 23.0 2.565217
51 Orlando Magic 1.482 1.668 0.571 0.591 23.0 59.0 0.389831
52 Philadelphia 76ers 1.329 1.610 0.576 0.653 19.0 63.0 0.301587
53 Phoenix Suns 1.285 1.381 0.493 0.534 48.0 34.0 1.411765
54 Portland Trailblazers 1.740 1.763 0.594 0.516 54.0 28.0 1.928571
55 Sacramento Kings 1.286 1.778 0.511 0.621 28.0 54.0 0.518519
56 San Antonio Spurs 1.790 1.533 0.621 0.539 62.0 20.0 3.100000
57 Toronto Raptors 1.581 1.471 0.581 0.581 48.0 34.0 1.411765
58 Utah Jazz 1.453 1.790 0.564 0.570 25.0 57.0 0.438596

In [110]:
af['Opp. Assisted FGM Pct.'] = pd.to_numeric(af['Opp. Assisted FGM Pct.'])

In [111]:
af['Assisted FGM Pct.'] = pd.to_numeric(af['Assisted FGM Pct.'])

In [112]:
af['Opp. Assist/TO Ratio'] = pd.to_numeric(af['Opp. Assist/TO Ratio'])

In [113]:
af['Assist/TO Ratio'] = pd.to_numeric(af['Assist/TO Ratio'])

In [163]:
af['W/L'] = pd.to_numeric(af['W/L'])

Explanation

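In order to group the data received from NBAMiner.com by team name, we decided to use the groupby function. We were then able to take a wider look at all the assist-related attributes of the teams in comparison to one another. Note that the grouping is by team name: the NBA has 30 teams, but a franchise that changed its name during the period (for example, the New Orleans Hornets became the New Orleans Pelicans) appears once under each name, which is why the grouped data has 32 entries.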
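
To illustrate what a per-team mean computes, here is a small pure-Python sketch of the same idea. It is not the pandas code used in this notebook; the helper `mean_by_team` is written for this example, and the six Assist/TO values are copied from the first two seasons shown in the table above.

```python
from collections import defaultdict
from statistics import mean

# (team, Assist/TO ratio) pairs: two seasons for each of three teams
rows = [
    ("Atlanta Hawks", 1.717),
    ("Golden State Warriors", 1.525),
    ("San Antonio Spurs", 1.776),
    ("Atlanta Hawks", 1.717),
    ("Golden State Warriors", 1.561),
    ("San Antonio Spurs", 1.790),
]


def mean_by_team(pairs):
    """Group values by team name and return the mean of each group."""
    groups = defaultdict(list)
    for team, value in pairs:
        groups[team].append(value)
    return {team: mean(values) for team, values in groups.items()}


team_means = mean_by_team(rows)
for team, value in sorted(team_means.items()):
    print(team, round(value, 3))
```

Because the grouping key is the literal team name, two spellings of the same franchise would end up in two separate groups, in pandas as well as in this sketch.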


In [184]:
afgb = af.groupby('Team')
# String names select pandas' built-in aggregations; the result columns are 'sum', 'mean' and 'median'
afgb_atr = afgb['Assist/TO Ratio'].agg(['sum', 'mean', 'median'])
afgb_oatr = afgb['Opp. Assist/TO Ratio'].agg(['sum', 'mean', 'median'])
afgb_afgm = afgb['Assisted FGM Pct.'].agg(['sum', 'mean', 'median'])
afgb_oafgm = afgb['Opp. Assisted FGM Pct.'].agg(['sum', 'mean', 'median'])
afgb_wl = afgb['W/L'].agg(['sum', 'mean', 'median'])

In [191]:
fig, ax = plt.subplots(nrows=2, ncols=1, sharex=True, figsize=(15,10))
fig.suptitle('Assist/TO Ratio compared to W/L', fontsize=18, fontweight='bold')
afgb_atr['mean'].plot(kind='bar', ax=ax[0], color='orchid')
afgb_wl['mean'].plot(kind='bar', ax=ax[1], color='green')


ax[0].set_ylabel('Assist/TO Ratio', fontsize=15)
ax[1].set_ylabel('W/L Ratio', fontsize=15)


Out[191]:
<matplotlib.text.Text at 0x123db03c8>

In [204]:
fig, ax = plt.subplots(nrows=2, ncols=1, sharex=True, figsize=(15,10))
fig.suptitle('Opposing Team Assist/TO ratio compared to W/L', fontsize=18, fontweight='bold')
afgb_oatr['mean'].plot(kind='bar', ax=ax[0], color='mediumslateblue')
afgb_wl['mean'].plot(kind='bar', ax=ax[1], color='green')


ax[0].set_ylabel('Opp. Assist/TO Ratio', fontsize=15)
ax[1].set_ylabel('W/L Ratio', fontsize=15)


Out[204]:
<matplotlib.text.Text at 0x129b79470>

In [210]:
fig, ax = plt.subplots(nrows=2, ncols=1, sharex=True, figsize=(15,10))
fig.suptitle('Assisted Field Goals Made % compared to W/L', fontsize=18, fontweight='bold')
afgb_afgm['mean'].plot(kind='bar', ax=ax[0], color='magenta')
afgb_wl['mean'].plot(kind='bar', ax=ax[1], color='green')

ax[0].set_ylabel('Assisted FGM Pct.', fontsize=15)
ax[1].set_ylabel('W/L Ratio', fontsize=15)


Out[210]:
<matplotlib.text.Text at 0x12b2f2940>

In [214]:
fig, ax = plt.subplots(nrows=2, ncols=1, sharex=True, figsize=(15,10))
fig.suptitle('Opposing Assisted Field Goals Made % compared to W/L', fontsize=18, fontweight='bold')
afgb_oafgm['mean'].plot(kind='bar', ax=ax[0], color='pink')
afgb_wl['mean'].plot(kind='bar', ax=ax[1], color='green')


ax[0].set_ylabel('Opp. Assisted FGM Pct.', fontsize=15)
ax[1].set_ylabel('W/L Ratio', fontsize=15)


Out[214]:
<matplotlib.text.Text at 0x12c4c6710>

Conclusion

After analyzing these assist-related stats, as well as assists, we found that none of these attributes has a strong connection to a team's success, which we defined by the team's W/L ratio. However, certain well-performing teams, such as the Golden State Warriors and the San Antonio Spurs, were consistently in the top ranks for statistics such as 3PA or 3P%, and ranked lower in tables that measured the opposing team's performance. Attributes such as opp. assist/TO ratio and opp. assisted FGM% measure the opposing team's performance, so it is actually good for a team when these statistics are low.

While three-pointers show a much stronger correlation to W/L ratios, this in-depth look at assists and other statistics that measure teamwork and ball handling suggests that teams must be strong on multiple fronts in order to be successful.

While three-pointers have undoubtedly contributed to the success of teams in recent years, as well as placed pressure on other teams to increase three-point attempts, this tactic must be supplemented by strength in other statistics and by factors that the data cannot show. Some major "soft" factors include coaching styles, playing environment, and even a team's chemistry.

In the future, we would like to find a way to analyze the relationship between the financial value of a team and its performance. We originally planned on this, but the website we found proved too challenging to scrape data from, as it was written in Java, not HTML/CSS. It would be interesting to see whether or not teams can "buy" their way to success, and maybe see how the success of a team is reflected in the team's financial bottom line.

