In [ ]:

    
import pandas as pd
import seaborn as sbn
import statsmodels.api as sm
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')
%matplotlib inline



In [ ]:

    
df = pd.read_csv("data/brain_size.csv", sep=";", index_col=0, na_values='.')
df.head()

Plotting options of pandas series and dataframes



In [ ]:

    
df.plot?

Exercise: Try different values of the `kind` parameter of the `plot` method
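
As a starting point, here is a minimal sketch of how `kind` changes the output. It uses a small synthetic frame (the `demo` data and its column names `a` and `b` are placeholders, not the brain-size data). The `Agg` backend line is only there so the snippet runs as a plain script; in the notebook, `%matplotlib inline` takes its place.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for script use; not needed in the notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic placeholder data (assumption: not the real brain_size.csv values)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "a": rng.normal(0, 1, 30),
    "b": rng.normal(2, 1, 30),
})

# Draw the same frame with four different values of `kind`
kinds = ["line", "bar", "hist", "box"]
fig, axes = plt.subplots(2, 2, figsize=(8, 6))
for ax, kind in zip(axes.ravel(), kinds):
    demo.plot(kind=kind, ax=ax, title=kind)

# kind="scatter" needs explicit x and y columns
scatter_ax = demo.plot(kind="scatter", x="a", y="b", title="scatter")
```

Other values such as `"kde"` or `"hexbin"` also exist; `"kde"` additionally requires SciPy.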



In [ ]:

    
# pd.tools.plotting was removed from pandas; scatter_matrix now lives in pd.plotting
ax = pd.plotting.scatter_matrix(df[['Weight', 'Height', 'MRI_Count']])



In [ ]:

    
# pd.tools.plotting was removed from pandas; scatter_matrix now lives in pd.plotting
ax = pd.plotting.scatter_matrix(df[['FSIQ', 'PIQ', 'VIQ']])

Q: Do the clusters mean anything?

Exercise: Plot scatter matrices for males and females separately. What can you infer from the comparison?



In [ ]:

    
# Enter code here
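
One possible approach to the exercise, sketched on synthetic data: split the frame with `groupby` and call `scatter_matrix` once per group. The `Gender` column name and its `Female`/`Male` labels are assumptions here; check `df.head()` for the actual names in `brain_size.csv` and substitute `df` for `demo`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for script use; not needed in the notebook
import numpy as np
import pandas as pd

# Synthetic stand-in; the "Gender" column and its labels are assumptions
rng = np.random.default_rng(1)
n = 20
demo = pd.DataFrame({
    "Gender": ["Female"] * n + ["Male"] * n,
    "FSIQ": rng.normal(110, 20, 2 * n),
    "PIQ": rng.normal(110, 20, 2 * n),
    "VIQ": rng.normal(110, 20, 2 * n),
})

# One scatter matrix (and one figure) per group
matrices = {}
for gender, group in demo.groupby("Gender"):
    axes = pd.plotting.scatter_matrix(group[["FSIQ", "PIQ", "VIQ"]])
    axes[0, 0].get_figure().suptitle(gender)
    matrices[gender] = axes
```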

Introducing Seaborn

Combining simple statistics with visualization



In [ ]:

    
df = pd.read_csv("data/wages.csv")
df.head()



In [ ]:

    
ax = sbn.pairplot(df, vars=['WAGE', 'AGE', 'EDUCATION'], kind="reg")

Q: What about categorical variables?



In [ ]:

    
ax = sbn.pairplot(df, vars=['WAGE', 'AGE', 'EDUCATION'], kind="reg", hue="SEX")

Simple regression with `lmplot`



In [ ]:

    
ax = sbn.lmplot(y="WAGE", x="EDUCATION", data=df)

Exercise:

1. Do wages depend on age or experience?

2. Answer the question above separately for men and women.



In [ ]:

    
# enter code here
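
A hedged sketch for the second part of the exercise: passing `hue` to `lmplot` fits one regression line per group on the same axes. The data below is synthetic (the wage values and the 0/1 coding of `SEX` are made up for illustration), so replace `demo` with the `df` loaded from `wages.csv`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for script use; not needed in the notebook
import numpy as np
import pandas as pd
import seaborn as sbn

# Synthetic stand-in for wages.csv; values and the SEX coding are assumptions
rng = np.random.default_rng(2)
n = 60
demo = pd.DataFrame({
    "AGE": rng.integers(20, 60, n),
    "SEX": rng.integers(0, 2, n),
})
demo["WAGE"] = 5 + 0.1 * demo["AGE"] + rng.normal(0, 2, n)

# One fitted line per value of SEX, drawn on shared axes
grid = sbn.lmplot(x="AGE", y="WAGE", hue="SEX", data=demo)
```

Using `col="SEX"` instead of `hue="SEX"` would draw each group in its own panel, which can be easier to read when the point clouds overlap.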
