Example of how to set up your lab notebook

Analysis in this notebook

  • [Dead end] Does year predict production?
  • Does "hours worked" correlate with production?

Tip

Standard imports at the top

Imports should be grouped in the following order:

  1. Magics
  2. Imports, alphabetized within each group:
    1. standard library imports
    2. related third-party imports
    3. local application/library-specific imports

In [7]:
# Magics first (server issues)
%matplotlib inline
# Uncomment the line below instead for interactive matplotlib plots
# %matplotlib notebook

# https://ipython.org/ipython-doc/dev/config/extensions/autoreload.html
%load_ext autoreload
%autoreload 2

# %install_ext http://raw.github.com/jrjohansson/version_information/master/version_information.py
%load_ext version_information
%version_information numpy, scipy, matplotlib, pandas


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
The version_information extension is already loaded. To reload it, use:
  %reload_ext version_information
Out[7]:
Software    Version
Python      2.7.10 64bit [GCC 4.2.1 (Apple Inc. build 5577)]
IPython     3.2.1
OS          Darwin 14.4.0 x86_64 i386 64bit
numpy       1.9.2
scipy       0.15.1
matplotlib  1.4.3
pandas      0.16.2
Thu Jul 23 19:55:17 2015 PDT

In [8]:
# Standard library
import os
import sys
sys.path.append("../src/")

# Third party imports
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Local imports
from simpleexample import example_func

In [9]:
# Customizations
sns.set()  # apply seaborn's default theme to matplotlib

# Any tweaks that normally go in .matplotlibrc, etc., should explicitly go here
plt.rcParams['figure.figsize'] = (12, 12)

In [10]:
# Prefix saved figure filenames so each figure can be traced back to this notebook
fig_prefix = "../figures/2015-07-16-jw-"
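
The prefix above looks like a date followed by initials. As a minimal sketch, assuming that pattern is intended, such a prefix could be built programmatically rather than typed by hand. The helper `make_fig_prefix` and its parameters are hypothetical names used for illustration only; they are not part of this notebook or of any library.

```python
import datetime


def make_fig_prefix(date, initials, fig_dir="../figures"):
    """Return a path prefix such as '../figures/2015-07-16-jw-'.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    return "{}/{}-{}-".format(fig_dir, date.isoformat(), initials)


# Rebuild the prefix used in the cell above from its parts.
fig_prefix = make_fig_prefix(datetime.date(2015, 7, 16), "jw")

# Every saved figure then carries the notebook's date and author in its name.
fig_path = fig_prefix + "production-vs-hours-worked.png"
print(fig_path)
```

Passing the date explicitly, rather than calling `datetime.date.today()`, keeps the prefix stable when the notebook is re-run on a later day.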

In [ ]:
example_func()


Importing cleaned data

See ../deliver/coal_data_cleanup.ipynb for how the raw data was cleaned.


In [3]:
from IPython.display import FileLink

In [4]:
FileLink("../deliver/coal_data_cleanup.ipynb")





In [6]:
dframe = pd.read_csv("../data/coal_prod_cleaned.csv")

[Dead end] Does year predict production?


In [7]:
plt.scatter(dframe['Year'], dframe['Production_short_tons'])


Out[7]:
<matplotlib.collections.PathCollection at 0x10bb83710>

Does "hours worked" correlate with production?


In [10]:
df2 = dframe.groupby('Mine_State').sum()

In [29]:
# Pass x and y by keyword; newer seaborn releases no longer accept them positionally
sns.jointplot(x='Labor_Hours', y='Production_short_tons', data=df2, kind="reg")
plt.xlabel("Labor Hours Worked")
plt.ylabel("Total Amount Produced")
plt.tight_layout()
# plt.savefig(fig_prefix + "production-vs-hours-worked.png", dpi=350)
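
The grouping and plotting cells above can be read as two steps: sum each column per state, then look at how strongly the two totals move together. The sketch below mirrors those steps in plain Python on made-up toy rows (not values from the coal data). The names `rows`, `totals`, and `pearson_r` are illustrative, and Pearson's r is used here only as one common measure of linear correlation, not necessarily the statistic the plot reports.

```python
from collections import defaultdict
from math import sqrt

# Made-up toy rows (state, labor hours, short tons); NOT values from the coal data.
rows = [
    ("A", 10.0, 20.0),
    ("A", 20.0, 40.0),
    ("B", 5.0, 12.0),
    ("C", 40.0, 70.0),
]

# Similar in spirit to dframe.groupby('Mine_State').sum(): one total per state.
totals = defaultdict(lambda: [0.0, 0.0])
for state, hours, tons in rows:
    totals[state][0] += hours
    totals[state][1] += tons


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Compare the per-state totals, in a fixed (sorted) state order.
states = sorted(totals)
hours_by_state = [totals[s][0] for s in states]
tons_by_state = [totals[s][1] for s in states]
r = pearson_r(hours_by_state, tons_by_state)
print(dict(totals), round(r, 3))
```

Note that summing per state before correlating means large states dominate both axes, so a high r on the totals partly reflects state size rather than productivity per hour.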



In [23]:
%load_ext autoreload
%autoreload 2

In [24]:
import sys
sys.path.append("../src/")

In [26]:
from simpleexample import example_func
example_func()


Out[26]:
'This works.'

In [27]:
example_func()


Out[27]:
'This works, seriously you can update this.'
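
The two different outputs above come from editing the module on disk between calls: with `%autoreload 2`, IPython reloads modules before running each cell. As a rough sketch of the same idea outside IPython, the code below writes a tiny module, imports it, rewrites it, and reloads it by hand with `importlib.reload`. It assumes Python 3, and the module name `simpleexample_demo` and the temporary directory are hypothetical stand-ins for `../src/simpleexample.py`, whose real contents are not shown in this notebook.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale cached bytecode in this demo

# Hypothetical stand-in for ../src/: a throwaway directory with a tiny module.
src_dir = tempfile.mkdtemp()
module_path = os.path.join(src_dir, "simpleexample_demo.py")


def write_module(message):
    """Write a module whose example_func returns the given message."""
    with open(module_path, "w") as handle:
        handle.write("def example_func():\n    return {!r}\n".format(message))


write_module("This works.")
sys.path.append(src_dir)
importlib.invalidate_caches()  # make sure the new file is visible to the importer
import simpleexample_demo

first = simpleexample_demo.example_func()

# Edit the file on disk, then reload explicitly; autoreload automates this step.
write_module("This works, seriously you can update this.")
importlib.reload(simpleexample_demo)
second = simpleexample_demo.example_func()

print(first)
print(second)
```

The manual version uses `simpleexample_demo.example_func()` rather than `from ... import example_func`, because a name imported with `from` keeps pointing at the old function after a plain `importlib.reload`; handling that case is part of what the autoreload extension adds.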

In [ ]: