2016 Phillies Games Broadcast on National Television

I like watching the Phillies. I do not have cable. Some Phillies games are broadcast on national television. This is how I made a list of those games.

Pandas

Pandas is a data analysis tool for the Python programming language. It can do a tremendous amount of really powerful data analysis and visualization. It's a gun in this CSV knife fight.


In [1]:
import pandas as pd

A downloadable CSV schedule is available from mlb.com. Here is a direct link to the Phillies schedule.

The CSV schedule will be used to instantiate a Pandas DataFrame object.


In [2]:
schedule = pd.read_csv("phillies-2016.csv", index_col=0, parse_dates=True)
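
To see what the loader does without downloading anything, here is a minimal sketch on a hypothetical three-row sample (the column subset and ISO-style dates are assumptions, not the real file). It uses `pd.read_csv` with `index_col=0` and `parse_dates=True`, which match the defaults of the older `DataFrame.from_csv` helper.

```python
import io

import pandas as pd

# Hypothetical sample; the real file has 16 columns and 190 rows.
csv_text = """START DATE,START TIME,SUBJECT,DESCRIPTION
2016-03-07,01:05 PM,Phillies at Pirates,Local TV: MLB.TV ----- Local Radio: MLB.com
2016-03-08,01:05 PM,Pirates at Phillies,Local TV: TCN- MLB.TV
2016-03-09,01:05 PM,Phillies at Twins,
"""

# index_col=0 makes the first column the index; parse_dates=True parses that index as dates.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)
print(sample.index)

# An empty DESCRIPTION field is read as NaN, which matters for the parsing later on.
print(sample.DESCRIPTION.isnull().sum())
```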

What does the schedule metadata look like?


In [3]:
schedule.info()


<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 190 entries, 2016-03-07 to 2016-10-02
Data columns (total 16 columns):
START TIME          189 non-null object
START TIME ET       189 non-null object
SUBJECT             190 non-null object
LOCATION            190 non-null object
DESCRIPTION         187 non-null object
END DATE            190 non-null object
END DATE ET         190 non-null object
END TIME            189 non-null object
END TIME ET         189 non-null object
REMINDER OFF        190 non-null bool
REMINDER ON         190 non-null bool
REMINDER DATE       190 non-null object
REMINDER TIME       189 non-null object
REMINDER TIME ET    189 non-null object
SHOWTIMEAS FREE     190 non-null object
SHOWTIMEAS BUSY     190 non-null object
dtypes: bool(2), object(14)
memory usage: 22.6+ KB

The schedule has 190 games, spring training included, with 16 columns of data for each game.

What does the schedule data itself look like?


In [4]:
schedule.head()


Out[4]:
START TIME START TIME ET SUBJECT LOCATION DESCRIPTION END DATE END DATE ET END TIME END TIME ET REMINDER OFF REMINDER ON REMINDER DATE REMINDER TIME REMINDER TIME ET SHOWTIMEAS FREE SHOWTIMEAS BUSY
START DATE
2016-03-07 01:05 PM 01:05 PM Phillies at Pirates McKechnie Field - Bradenton Local TV: MLB.TV ----- Local Radio: MLB.com 03/07/16 03/07/16 04:05 PM 04:05 PM False True 03/07/16 12:05 PM 12:05 PM FREE BUSY
2016-03-08 01:05 PM 01:05 PM Pirates at Phillies Bright House Field - Clearwater Local TV: TCN- MLB.TV 03/08/16 03/08/16 04:05 PM 04:05 PM False True 03/08/16 12:05 PM 12:05 PM FREE BUSY
2016-03-09 01:05 PM 01:05 PM Phillies at Twins CenturyLink Sports Complex - Fort Myers NaN 03/09/16 03/09/16 04:05 PM 04:05 PM False True 03/09/16 12:05 PM 12:05 PM FREE BUSY
2016-03-09 01:05 PM 01:05 PM Orioles at Phillies Bright House Field - Clearwater Local TV: TCN- MLB.TV 03/09/16 03/09/16 04:05 PM 04:05 PM False True 03/09/16 12:05 PM 12:05 PM FREE BUSY
2016-03-10 01:05 PM 01:05 PM Tigers at Phillies Bright House Field - Clearwater Local TV: TCN- MLBN- MLB.TV 03/10/16 03/10/16 04:05 PM 04:05 PM False True 03/10/16 12:05 PM 12:05 PM FREE BUSY

Cleaning up the schedule

The DESCRIPTION column contains the broadcast information. Less interesting columns can be removed.


In [6]:
schedule.drop(["REMINDER OFF", 
             "REMINDER ON",
             "START TIME ET",
             "END DATE",
             "END DATE ET",
             "END TIME",
             "END TIME ET",
             "REMINDER TIME",
             "REMINDER TIME ET",
             "SHOWTIMEAS FREE",
             "SHOWTIMEAS BUSY",
             "REMINDER DATE"], axis=1, inplace=True)
schedule.head()


Out[6]:
            START TIME  SUBJECT              LOCATION                                 DESCRIPTION
START DATE
2016-03-07  01:05 PM    Phillies at Pirates  McKechnie Field - Bradenton              Local TV: MLB.TV ----- Local Radio: MLB.com
2016-03-08  01:05 PM    Pirates at Phillies  Bright House Field - Clearwater          Local TV: TCN- MLB.TV
2016-03-09  01:05 PM    Phillies at Twins    CenturyLink Sports Complex - Fort Myers  NaN
2016-03-09  01:05 PM    Orioles at Phillies  Bright House Field - Clearwater          Local TV: TCN- MLB.TV
2016-03-10  01:05 PM    Tigers at Phillies   Bright House Field - Clearwater          Local TV: TCN- MLBN- MLB.TV
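
Since only four columns survive, the same cleanup could also be written as a selection of the columns to keep rather than a list of columns to drop. A minimal sketch on a hypothetical one-row frame (the `sample`, `keep` and `trimmed` names are illustrative):

```python
import pandas as pd

# Hypothetical one-row frame with a subset of the schedule's columns.
sample = pd.DataFrame({
    "START TIME": ["01:05 PM"],
    "SUBJECT": ["Phillies at Pirates"],
    "LOCATION": ["McKechnie Field - Bradenton"],
    "DESCRIPTION": ["Local TV: MLB.TV ----- Local Radio: MLB.com"],
    "REMINDER ON": [True],
    "SHOWTIMEAS BUSY": ["BUSY"],
})

keep = ["START TIME", "SUBJECT", "LOCATION", "DESCRIPTION"]

# Indexing with a list of names returns a new frame; `sample` itself is unchanged.
trimmed = sample[keep]
print(list(trimmed.columns))
```

Unlike `drop(..., inplace=True)`, this leaves the original frame untouched, which can be handy when re-running cells out of order.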

What are all of the stations that games are broadcast on this season?

The DESCRIPTION column is nice because it mentions the stations that games are broadcast on. Sometimes a game is broadcast on two channels at once. There is also radio broadcast information that I'm not interested in right now.


In [11]:
schedule.DESCRIPTION.head(50)


Out[11]:
START DATE
2016-03-07        Local TV: MLB.TV ----- Local Radio: MLB.com
2016-03-08                              Local TV: TCN- MLB.TV
2016-03-09                                                NaN
2016-03-09                              Local TV: TCN- MLB.TV
2016-03-10                        Local TV: TCN- MLBN- MLB.TV
2016-03-11                               Local Radio: MLB.com
2016-03-12    Local TV: CSN- MLB.TV ----- Local Radio: 94 WIP
2016-03-13         Local TV: MLB.TV ----- Local Radio: 94 WIP
2016-03-14                               Local Radio: MLB.com
2016-03-15                               Local Radio: MLB.com
2016-03-17                              Local TV: TCN- MLB.TV
2016-03-18                              Local TV: TCN- MLB.TV
2016-03-19         Local TV: MLB.TV ----- Local Radio: 94 WIP
2016-03-20    Local TV: CSN- MLB.TV ----- Local Radio: 94 WIP
2016-03-21                               Local Radio: MLB.com
2016-03-22                              Local TV: TCN- MLB.TV
2016-03-23                                Local Radio: 94 WIP
2016-03-24         Local TV: MLB.TV ----- Local Radio: 94 WIP
2016-03-25    Local TV: CSN- MLB.TV ----- Local Radio: 94 WIP
2016-03-26    Local TV: CSN- MLB.TV ----- Local Radio: 94 WIP
2016-03-27         Local TV: MLB.TV ----- Local Radio: 94 WIP
2016-03-28                                                NaN
2016-03-29                        Local TV: TCN- MLBN- MLB.TV
2016-03-30                              Local TV: TCN- MLB.TV
2016-03-31    Local TV: TCN- MLB.TV ----- Local Radio: 94 WIP
2016-04-01    Local TV: TCN- MLB.TV ----- Local Radio: 94 WIP
2016-04-02    Local TV: TCN- MLB.TV ----- Local Radio: 94 WIP
2016-04-04                                      Local TV: CSN
2016-04-06                               Local TV: CSN- ESPN2
2016-04-07                                      Local TV: CSN
2016-04-08                                      Local TV: CSN
2016-04-09                                      Local TV: CSN
2016-04-10                                      Local TV: TCN
2016-04-11                                   Local TV: NBC 10
2016-04-12                                      Local TV: TCN
2016-04-13                                      Local TV: TCN
2016-04-14                                      Local TV: CSN
2016-04-15                                      Local TV: CSN
2016-04-16                                      Local TV: CSN
2016-04-17                                      Local TV: CSN
2016-04-18                                      Local TV: CSN
2016-04-19                                      Local TV: CSN
2016-04-20                                      Local TV: CSN
2016-04-22                                      Local TV: TCN
2016-04-23                                      Local TV: CSN
2016-04-24                                      Local TV: CSN
2016-04-26                                      Local TV: CSN
2016-04-27                                      Local TV: CSN
2016-04-28                                      Local TV: CSN
2016-04-29                                      Local TV: CSN
Name: DESCRIPTION, dtype: object

Parse television station broadcast channels from DESCRIPTION

Thankfully, the DESCRIPTION column data is parseable. Getting a list of television broadcast stations for each game is not too difficult.


In [73]:
description = schedule.DESCRIPTION.iloc[6]
print(description)


Local TV: CSN- MLB.TV ----- Local Radio: 94 WIP

Grab the rough station string with a regular expression.


In [123]:
import re

TV_STATION_RE = re.compile(r"""Local\s+TV:\s+    # TV token
                               (?P<stations>.*)  # Capture everything after it (greedy) as stations
                               """, re.X)

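The `.*` in the stations group is greedy: it runs to the end of the line, radio information included, which is why the text still needs trimming afterwards. A lazy `.*?` in the same position would not help, because with nothing after it in the pattern it is satisfied by matching nothing at all. A small sketch of the difference, using the description printed earlier and a single-line version of the pattern:

```python
import re

description = "Local TV: CSN- MLB.TV ----- Local Radio: 94 WIP"

# Same pattern as above, written on one line, in greedy and lazy variants.
greedy = re.search(r"Local\s+TV:\s+(?P<stations>.*)", description)
lazy = re.search(r"Local\s+TV:\s+(?P<stations>.*?)", description)

print(repr(greedy.group("stations")))  # 'CSN- MLB.TV ----- Local Radio: 94 WIP'
print(repr(lazy.group("stations")))    # ''
```
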
Use that to pull them out and do some text wrangling.


In [ ]:
def tv_stations_from_description(description):
    """Return a list of television stations embedded in the given description."""
    tv_stations = []
    result = TV_STATION_RE.search(str(description))
    if result:
        media_delimiter = "-----"
        tv_station_str = result.group("stations").split(media_delimiter)[0]
        tv_stations = tv_station_str.split("- ")
        tv_stations = [s.strip() for s in tv_stations]
    return tv_stations
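
Before running it over the whole column, it may help to check the function against the kinds of descriptions seen so far: TV plus radio, TV only, radio only, and a missing value. The sketch below restates the function in compact form so that it runs on its own; the sample strings are taken from the output above, and `float("nan")` stands in for a missing DESCRIPTION.

```python
import re

TV_STATION_RE = re.compile(r"Local\s+TV:\s+(?P<stations>.*)")


def tv_stations_from_description(description):
    """Same logic as the cell above, restated so this block is self-contained."""
    result = TV_STATION_RE.search(str(description))
    if not result:
        return []
    # Keep only the part before the TV/radio delimiter, then split on "- ".
    tv_station_str = result.group("stations").split("-----")[0]
    return [s.strip() for s in tv_station_str.split("- ")]


print(tv_stations_from_description("Local TV: CSN- MLB.TV ----- Local Radio: 94 WIP"))
print(tv_stations_from_description("Local TV: NBC 10"))
print(tv_stations_from_description("Local Radio: MLB.com"))
# str(nan) is "nan", which has no "Local TV:" token, so the result is empty.
print(tv_stations_from_description(float("nan")))
```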

Test it out on all of the descriptions.


In [126]:
tv_stations = set()
for d in schedule.DESCRIPTION:
    tv_stations |= set(tv_stations_from_description(d))
tv_stations


Out[126]:
{'CSN', 'ESPN2', 'MLB.TV', 'MLBN', 'NBC 10', 'TCN'}

Applying this function to the DESCRIPTION column yields a Series holding, for each game, the list of television stations on which it is broadcast.


In [127]:
stations_series = schedule.DESCRIPTION.apply(tv_stations_from_description)
stations_series


Out[127]:
START DATE
2016-03-07               [MLB.TV]
2016-03-08          [TCN, MLB.TV]
2016-03-09                     []
2016-03-09          [TCN, MLB.TV]
2016-03-10    [TCN, MLBN, MLB.TV]
2016-03-11                     []
2016-03-12          [CSN, MLB.TV]
2016-03-13               [MLB.TV]
2016-03-14                     []
2016-03-15                     []
2016-03-17          [TCN, MLB.TV]
2016-03-18          [TCN, MLB.TV]
2016-03-19               [MLB.TV]
2016-03-20          [CSN, MLB.TV]
2016-03-21                     []
2016-03-22          [TCN, MLB.TV]
2016-03-23                     []
2016-03-24               [MLB.TV]
2016-03-25          [CSN, MLB.TV]
2016-03-26          [CSN, MLB.TV]
2016-03-27               [MLB.TV]
2016-03-28                     []
2016-03-29    [TCN, MLBN, MLB.TV]
2016-03-30          [TCN, MLB.TV]
2016-03-31          [TCN, MLB.TV]
2016-04-01          [TCN, MLB.TV]
2016-04-02          [TCN, MLB.TV]
2016-04-04                  [CSN]
2016-04-06           [CSN, ESPN2]
2016-04-07                  [CSN]
                     ...         
2016-08-31                  [CSN]
2016-09-02                  [CSN]
2016-09-03                  [CSN]
2016-09-04                  [CSN]
2016-09-05                  [CSN]
2016-09-06                  [CSN]
2016-09-07                  [CSN]
2016-09-08                  [CSN]
2016-09-09                  [CSN]
2016-09-10                  [CSN]
2016-09-11                  [CSN]
2016-09-12                  [CSN]
2016-09-13                  [CSN]
2016-09-14                  [CSN]
2016-09-15                  [CSN]
2016-09-16                  [CSN]
2016-09-17                  [CSN]
2016-09-18                  [CSN]
2016-09-20                  [CSN]
2016-09-21                  [CSN]
2016-09-22                  [CSN]
2016-09-23                  [CSN]
2016-09-24                  [CSN]
2016-09-25                  [CSN]
2016-09-27                  [CSN]
2016-09-28                  [CSN]
2016-09-29                  [CSN]
2016-09-30                  [CSN]
2016-10-01                  [CSN]
2016-10-02                  [CSN]
Name: DESCRIPTION, dtype: object

Double check the set of stations from that Series.


In [129]:
{station for stations in stations_series.values for station in stations}


Out[129]:
{'CSN', 'ESPN2', 'MLB.TV', 'MLBN', 'NBC 10', 'TCN'}
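
The set says which stations appear but not how often. One way to count games per station is `collections.Counter` over the same flattened lists. This is a sketch on hypothetical per-game lists shaped like the values of `stations_series`, not the real schedule, so the counts are illustrative only.

```python
from collections import Counter

# Hypothetical per-game station lists; an empty list means no TV broadcast.
games = [["MLB.TV"], ["TCN", "MLB.TV"], [], ["CSN", "ESPN2"], ["CSN"], ["NBC 10"]]

# Flatten the lists and tally each station.
games_per_station = Counter(station for stations in games for station in stations)
print(games_per_station.most_common())
```

With the real data, passing `stations_series.values` in place of `games` should give the per-station totals for the season.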

The 190 Phillies games are spread across 6 television stations. Unfortunately, only 1 of those 6 is available without a cable television subscription, which means that I can only watch the games on NBC 10.

The Phillies national television broadcast schedule

Filtering the DESCRIPTION column for the national television broadcast station, NBC 10, yields only the games that I can watch over the air with my HD antenna.


In [117]:
national_broadcast_schedule = schedule[schedule.DESCRIPTION.str.contains("NBC 10") == True]
national_broadcast_schedule


Out[117]:
            START TIME  SUBJECT                LOCATION                           DESCRIPTION
START DATE
2016-04-11  03:05 PM    Padres at Phillies     Citizens Bank Park - Philadelphia  Local TV: NBC 10
2016-06-03  07:05 PM    Brewers at Phillies    Citizens Bank Park - Philadelphia  Local TV: NBC 10
2016-06-10  07:05 PM    Phillies at Nationals  Nationals Park - Washington        Local TV: NBC 10
2016-06-17  07:05 PM    D-backs at Phillies    Citizens Bank Park - Philadelphia  Local TV: NBC 10
2016-06-23  01:10 PM    Phillies at Twins      Target Field - Minneapolis         Local TV: NBC 10
2016-07-15  07:05 PM    Mets at Phillies       Citizens Bank Park - Philadelphia  Local TV: NBC 10
2016-07-16  07:05 PM    Mets at Phillies       Citizens Bank Park - Philadelphia  Local TV: NBC 10
2016-07-22  07:05 PM    Phillies at Pirates    PNC Park - Pittsburgh              Local TV: NBC 10
2016-07-30  07:10 PM    Phillies at Braves     Turner Field - Atlanta             Local TV: NBC 10
2016-08-04  01:05 PM    Giants at Phillies     Citizens Bank Park - Philadelphia  Local TV: NBC 10
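
The `== True` comparison in the filter is there because some games have no description at all: for those rows `str.contains` has no string to test, and a mask with missing values typically cannot be used for boolean indexing. `str.contains` also takes an `na` argument that fills those gaps directly. A minimal sketch on a hypothetical three-element Series:

```python
import pandas as pd

# Hypothetical descriptions; the middle game has no broadcast information.
descriptions = pd.Series(["Local TV: NBC 10", float("nan"), "Local TV: CSN"])

# Comparing with True turns anything that is not exactly True into False.
compared = descriptions.str.contains("NBC 10") == True

# na=False treats a missing description as "does not contain".
na_false = descriptions.str.contains("NBC 10", na=False)

print(compared.tolist())
print(na_false.tolist())
```

Both masks select the same rows; `na=False` just states the intent a little more directly.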

In [118]:
national_broadcast_schedule.describe()


Out[118]:
        START TIME  SUBJECT           LOCATION                           DESCRIPTION
count   10          10                10                                 10
unique  5           9                 5                                  1
top     07:05 PM    Mets at Phillies  Citizens Bank Park - Philadelphia  Local TV: NBC 10
freq    6           2                 6                                  10

This means that I can watch at most 10 of the 190 Phillies games this season, which is roughly 5%.
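
For the record, the percentage works out as follows. The two counts come from the outputs above; the variable names are just for illustration.

```python
national_games = 10   # rows in national_broadcast_schedule
total_games = 190     # rows in schedule

# float() keeps the division exact on Python 2 as well as Python 3.
share = national_games / float(total_games)
print("{:.1%}".format(share))  # 5.3%
```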