In [ ]:
# this is a python comment
# this cell contains python code
# executing the cell yields the results of the python command
In [ ]:
# live code some graphics here
In [ ]:
# your turn: plot some additional digits of pi
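One possible way to approach this exercise, as a minimal sketch: it assumes `matplotlib` is installed, and it takes the digits from the printed form of `math.pi`, which only gives about 16 of them. The helper name `plot_digits` is made up for illustration.

```python
import math

# Digits from the printed form of math.pi; a float only carries about 16 of them.
digits = [int(ch) for ch in str(math.pi) if ch.isdigit()]
print(digits)

def plot_digits(values):
    """Plot each digit against its position (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported here so the digit list works without it
    fig, ax = plt.subplots()
    ax.plot(range(len(values)), values, marker="o")
    ax.set_xlabel("position")
    ax.set_ylabel("digit")
    ax.set_title("Digits of pi")
    return fig
```

Calling `plot_digits(digits)` in a notebook cell should draw the figure inline.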
In [ ]:
# live code an example of loading the va data csv with pandas here
In [ ]:
df = pd.read_csv('../3-data/
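To see what `pd.read_csv` returns without needing the data file, here is a small sketch that reads CSV text from memory instead of from disk. The rows and the `site` and `age` columns are invented for illustration; only the column name `gs_text34` comes from this notebook.

```python
import io
import pandas as pd

# Hypothetical stand-in for a CSV file; the values are made up.
csv_text = """site,age,gs_text34
A,34,Stroke
B,61,Pneumonia
C,47,Stroke
"""

# read_csv accepts a file path or any file-like object
toy = pd.read_csv(io.StringIO(csv_text))
print(toy.shape)
print(toy.head())
```

The result is a DataFrame with one row per line of the file and one column per header entry.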
In [ ]:
# DataFrame.iloc method selects row and columns by "integer location"
df.iloc[5:10,
In [ ]:
# If you are new to this sort of thing, what do you think this does?
df.iloc[5:10, :10]
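If you want to check your guess, here is a minimal sketch on a made-up table of integers (the name `toy` and the column labels are assumptions, not the real data). Slices in `iloc` work like Python list slices, so the end point is excluded.

```python
import numpy as np
import pandas as pd

# Hypothetical 12 x 15 table, filled with 0, 1, 2, ... row by row
toy = pd.DataFrame(np.arange(12 * 15).reshape(12, 15),
                   columns=[f"c{j}" for j in range(15)])

# rows 5 through 9 and columns 0 through 9 (end points excluded)
block = toy.iloc[5:10, :10]
print(block.shape)
print(block)
```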
In [ ]:
# I don't have time to show you the details now, but I find that
# pandas DataFrames are really well designed. For example,
# each column is available as an attribute:
df.gs_text34
In [ ]:
df.gs_text34.value_counts()
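A small sketch of what `value_counts` does, using a made-up Series (the labels here are invented, not taken from the data): it tallies how often each distinct value occurs and sorts the result from most to least frequent.

```python
import pandas as pd

# Hypothetical labels, for illustration only
causes = pd.Series(["Stroke", "Pneumonia", "Stroke",
                    "Malaria", "Stroke", "Pneumonia"])

# one row per distinct value, most frequent first
counts = causes.value_counts()
print(counts)
```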
In [ ]:
# you can guess what the next line does,
# even if you have never used python before:
import sklearn.neighbors
In [ ]:
# here is how sklearn creates a "classifier":
clf =
In [ ]:
# I didn't mention `numpy` before, but it is "the fundamental
# package for scientific computing with Python"
import numpy as np
In [ ]:
# sklearn can get mixed up with pandas DataFrames and Series,
# so it is safest to turn things into np.arrays:
X =
y =
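A minimal sketch of that conversion on a made-up table. The column names `f1`, `f2`, and `label` are assumptions for illustration; which columns of the real data belong in `X` and `y` is left open above.

```python
import numpy as np
import pandas as pd

# Hypothetical table: two feature columns and one label column
toy = pd.DataFrame({"f1": [0.0, 0.1, 1.0, 1.1],
                    "f2": [0.0, 0.2, 1.0, 0.9],
                    "label": ["a", "a", "b", "b"]})

X_toy = np.array(toy[["f1", "f2"]])   # 2-d array: one row per example
y_toy = np.array(toy["label"])        # 1-d array: one label per example
print(X_toy.shape, y_toy.shape)
```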
In [ ]:
# one nice thing about sklearn is that it has all sorts of
# fancy machine learning methods, and they all follow a
# common pattern:
# fit
In [ ]:
# predict
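The fit/predict pattern can be sketched on toy data as follows. This assumes a k-nearest-neighbors classifier, `KNeighborsClassifier`, which is one of the classifiers in `sklearn.neighbors`; the notebook does not say which one is meant, and the points and labels here are invented.

```python
import numpy as np
import sklearn.neighbors

# Hypothetical training data: two well-separated groups of points
X_toy = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [1.1, 0.9]])
y_toy = np.array(["a", "a", "b", "b"])

# create the classifier (assumed choice: 1 nearest neighbor)
clf_toy = sklearn.neighbors.KNeighborsClassifier(n_neighbors=1)

# fit: learn from examples with known labels
clf_toy.fit(X_toy, y_toy)

# predict: label new points by their nearest training example
pred = clf_toy.predict(np.array([[0.05, 0.1], [0.9, 1.0]]))
print(pred)
```

Other sklearn estimators are used the same way: create the object, call `fit` on the training data, then call `predict` on new data.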