Follow the README in Brandon Rhodes' Pycon Pandas Tutorial. The steps are briefly summarised below.
Clone the pycon-pandas-tutorial repository.
Download actors.list.gz, actresses.list.gz, genres.list.gz, and release-dates.list.gz.
Place them in the build folder of the repository.
Notes:
The python command is generally to be run in the terminal.
If you missed class, please watch Brandon's Pandas Tutorial Video from Pycon 2015 in Montréal.
Brandon Rhodes (Website | Twitter | Github | StackOverflow)
Complete Exercises-1.ipynb.
After completing the exercise, copy it into the workspace folder of the DAT-DC-12 repository. If you're having trouble with the command line, just copy and paste the file from one folder into the other. Commit and push the code.
[OPTIONAL] You'll notice that trying to run the Exercises-1.ipynb notebook from the workspace folder results in an error, because it can no longer find the csv files it needs. To fix this, copy the three csv data files (titles.csv, release_dates.csv, and cast.csv) into the data folder of the class repository. Then, in Exercises-1.ipynb, change the line that reads the csv from data/titles.csv to ../data/titles.csv. With the .. added in front, the notebook can be run from the workspace folder.
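To see why the .. prefix helps, here is a minimal sketch of how the two relative paths resolve when the notebook runs from the workspace folder. The resolve helper and the folder locations are illustrative assumptions modelled on the walkthrough on this page, not part of the tutorial itself.

```python
import posixpath


def resolve(cwd, relative_path):
    """Return the absolute path that relative_path points to when run from cwd."""
    return posixpath.normpath(posixpath.join(cwd, relative_path))


# Hypothetical locations, modelled on the folders used on this page.
repo = "/Users/johria/Development/DAT-DC-12"
workspace = repo + "/workspace"

# Original path: resolved inside workspace, which is not where the csv files live.
before = resolve(workspace, "data/titles.csv")

# Updated path: ".." steps up to the repository root before entering data.
after = resolve(workspace, "../data/titles.csv")

print(before)  # /Users/johria/Development/DAT-DC-12/workspace/data/titles.csv
print(after)   # /Users/johria/Development/DAT-DC-12/data/titles.csv
```

The same change applies to whichever line in the notebook reads release_dates.csv or cast.csv.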
My current working directory is my ~/Development folder. Within this folder I have my DAT-DC-12 folder and my pycon-pandas-tutorial folder.
➜ Development pwd
/Users/johria/Development
➜ Development cd DAT-DC-12
➜ DAT-DC-12 [master] pwd
/Users/johria/Development/DAT-DC-12
➜ DAT-DC-12 [master] cd ..
➜ Development cd pycon-pandas-tutorial
➜ pycon-pandas-tutorial [master] pwd
/Users/johria/Development/pycon-pandas-tutorial
➜ pycon-pandas-tutorial [master] cd ..
Copy the three csv files (titles.csv, release_dates.csv, and cast.csv) from pycon-pandas-tutorial/data/ to DAT-DC-12/data/.
➜ Development cp pycon-pandas-tutorial/data/titles.csv DAT-DC-12/data
'pycon-pandas-tutorial/data/titles.csv' -> 'DAT-DC-12/data/titles.csv'
➜ Development cp pycon-pandas-tutorial/data/release_dates.csv DAT-DC-12/data
'pycon-pandas-tutorial/data/release_dates.csv' -> 'DAT-DC-12/data/release_dates.csv'
➜ Development cp pycon-pandas-tutorial/data/cast.csv DAT-DC-12/data
'pycon-pandas-tutorial/data/cast.csv' -> 'DAT-DC-12/data/cast.csv'
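If the command line is giving you trouble, the same copy can be sketched in Python with the standard library. This is only an illustration: the copy_csvs function and the CSV_NAMES list are hypothetical names, and the folder paths in the usage comment assume you run it from the folder that contains both repositories.

```python
import shutil
from pathlib import Path

# The three data files named in the step above.
CSV_NAMES = ["titles.csv", "release_dates.csv", "cast.csv"]


def copy_csvs(src_dir, dst_dir, names=CSV_NAMES):
    """Copy each named file from src_dir into dst_dir and return the new paths."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    # Create the destination folder if it is missing.
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        target = dst_dir / name
        shutil.copy(src_dir / name, target)  # raises FileNotFoundError if the source is missing
        copied.append(target)
    return copied


# Example usage, run from the folder that holds both repositories:
# copy_csvs("pycon-pandas-tutorial/data", "DAT-DC-12/data")
```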
These csv files are quite large, so we want to ensure that we don't commit them to the repository. This is why you may notice that they are gitignored.