Building a pandas Cheat Sheet, Part 1
Import pandas with the right name
In [11]:
    
import pandas as pd
    
In [12]:
    
df = pd.read_csv("07-hw-animals.csv")
    
Set all graphics from matplotlib to display inline
In [13]:
    
#!pip install matplotlib
    
In [14]:
    
import matplotlib.pyplot as plt
%matplotlib inline
# This makes your graphs show up inside the notebook
    
In [15]:
    
df
    
    Out[15]:
Display the names of the columns in the csv
In [16]:
    
df.columns
    
    Out[16]:
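A quick sketch of the difference between asking for the column names and asking for one column. The DataFrame here is a made-up stand-in for 07-hw-animals.csv (the rows are hypothetical, only the column names animal, name and length are taken from the cells in this notebook).

```python
import pandas as pd

# Hypothetical stand-in for 07-hw-animals.csv; the real rows may differ.
df = pd.DataFrame({
    "animal": ["cat", "cat", "dog", "dog", "cat"],
    "name": ["Anne", "Bob", "Egglesburg", "Devon", "Charlie"],
    "length": [35, 45, 65, 50, 25],
})

column_names = list(df.columns)   # the column labels only
one_column = df["name"]           # the values in the "name" column, not the labels

print(column_names)
print(df.dtypes)                  # labels together with each column's dtype
```

df.columns gives the labels, while df['name'] gives the contents of a single column, which is easy to mix up with "the names of the columns."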
Display the first 3 animals.
In [17]:
    
df.head(3)
    
    Out[17]:
Sort the animals to see the 3 longest animals.
In [18]:
    
df.sort_values('length', ascending=False).head(3)
    
    Out[18]:
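Another way to get the same three rows, sketched on made-up data (the rows below are hypothetical, not the real CSV): nlargest should give the same result as sorting descending and taking the head, as long as there are no ties at the cutoff.

```python
import pandas as pd

# Hypothetical stand-in for 07-hw-animals.csv; the real rows may differ.
df = pd.DataFrame({
    "animal": ["cat", "cat", "dog", "dog", "cat"],
    "name": ["Anne", "Bob", "Egglesburg", "Devon", "Charlie"],
    "length": [35, 45, 65, 50, 25],
})

# Sort longest-first, then keep the first three rows
top3_sorted = df.sort_values("length", ascending=False).head(3)

# Ask pandas directly for the three largest values of "length"
top3_nlargest = df.nlargest(3, "length")

print(top3_nlargest)
```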
What are the counts of the different values of the "animal" column? In other words, how many cats and how many dogs?
In [19]:
    
df['animal'].value_counts()
    
    Out[19]:
Only select the dogs.
In [20]:
    
# is_dog is a True/False value for every row, not a DataFrame
is_dog = df['animal'] == 'dog'
df[is_dog]
    
    Out[20]:
Display all of the animals that are longer than 40 cm.
In [21]:
    
long_animals = df['length'] > 40
df[long_animals]
    
    Out[21]:
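What the comparison actually produces, sketched on made-up data (hypothetical rows, same column names as the notebook): the comparison returns one True/False per row, and putting that inside df[...] keeps only the True rows.

```python
import pandas as pd

# Hypothetical stand-in for 07-hw-animals.csv; the real rows may differ.
df = pd.DataFrame({
    "animal": ["cat", "cat", "dog", "dog", "cat"],
    "name": ["Anne", "Bob", "Egglesburg", "Devon", "Charlie"],
    "length": [35, 45, 65, 50, 25],
})

long_mask = df["length"] > 40     # boolean Series, one value per row
long_rows = df[long_mask]         # keeps only the rows where the mask is True
same_rows = df.loc[long_mask]     # .loc with a mask selects the same rows

print(long_mask.tolist())
print(long_rows)
```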
Length is currently in cm. Add a column called "length_inches" that is the length in inches.
In [22]:
    
df['length_inches'] = df['length'] / 2.54
df
    
    Out[22]:
Save the cats to a separate variable called "cats." Save the dogs to a separate variable called "dogs."
In [23]:
    
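# These are boolean masks (True/False per row), not DataFrames.
# Use df[cats] or df[dogs] to get the actual rows.
cats = df['animal'] == 'cat'
dogs = df['animal'] == 'dog'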
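The cell above stores True/False masks rather than the rows themselves. If you want variables that hold the actual cat and dog rows, one possible sketch looks like this. The names cats_df and dogs_df and the data are made up for illustration, and .copy() is used so that later edits do not touch the original df.

```python
import pandas as pd

# Hypothetical stand-in for 07-hw-animals.csv; the real rows may differ.
df = pd.DataFrame({
    "animal": ["cat", "cat", "dog", "dog", "cat"],
    "name": ["Anne", "Bob", "Egglesburg", "Devon", "Charlie"],
    "length": [35, 45, 65, 50, 25],
})

# cats_df and dogs_df are illustrative names, not from the notebook
cats_df = df[df["animal"] == "cat"].copy()
dogs_df = df[df["animal"] == "dog"].copy()

# Adding a column to the copy leaves df unchanged
cats_df["length_inches"] = cats_df["length"] / 2.54

print(cats_df)
print(dogs_df)
```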
    
Display all of the animals that are cats and above 12 inches long. First do it using the "cats" variable, then do it using your normal dataframe.
In [24]:
    
long_animals = df['length_inches'] > 12
df[cats & long_animals]
    
    Out[24]:
In [25]:
    
df[(df['length_inches'] > 12) & (df['animal'] == 'cat')]
#Amazing!
    
    Out[25]:
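Each condition needs its own parentheses because & binds more tightly than > and ==. A sketch on made-up data (hypothetical rows) that also shows DataFrame.query as an alternative way to write the same filter:

```python
import pandas as pd

# Hypothetical stand-in for 07-hw-animals.csv; the real rows may differ.
df = pd.DataFrame({
    "animal": ["cat", "cat", "dog", "dog", "cat"],
    "name": ["Anne", "Bob", "Egglesburg", "Devon", "Charlie"],
    "length": [35, 45, 65, 50, 25],
})
df["length_inches"] = df["length"] / 2.54

# Parentheses around each comparison, joined with & (element-wise "and")
mask_version = df[(df["length_inches"] > 12) & (df["animal"] == "cat")]

# The same filter written as a query string
query_version = df.query("length_inches > 12 and animal == 'cat'")

print(mask_version)
```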
What is the mean length of a cat?
In [26]:
    
# numeric_only=True skips text columns such as name
df[cats].mean(numeric_only=True)
    
    Out[26]:
What is the mean length of a dog?
In [27]:
    
# numeric_only=True skips text columns such as name
df[dogs].mean(numeric_only=True)
    
    Out[27]:
Use groupby to accomplish both of the above tasks at once.
In [28]:
    
# One row per animal type, with the mean of every numeric column
df.groupby('animal').mean(numeric_only=True)
    
    Out[28]:
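A sketch of two common variations, on made-up data (hypothetical rows): picking one column before aggregating, and asking for several statistics at once with agg.

```python
import pandas as pd

# Hypothetical stand-in for 07-hw-animals.csv; the real rows may differ.
df = pd.DataFrame({
    "animal": ["cat", "cat", "dog", "dog", "cat"],
    "name": ["Anne", "Bob", "Egglesburg", "Devon", "Charlie"],
    "length": [35, 45, 65, 50, 25],
})

# Mean length per animal type, as a Series indexed by animal
mean_by_animal = df.groupby("animal")["length"].mean()

# Several statistics per animal type, as a DataFrame
summary = df.groupby("animal")["length"].agg(["mean", "max", "count"])

print(mean_by_animal)
print(summary)
```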
Make a histogram of the length of dogs. I apologize that it is so boring.
In [29]:
    
df[dogs].plot.hist(y='length_inches')
    
    Out[29]:
    
Change your graphing style to be something else (anything else!)
In [30]:
    
# Switch the matplotlib style, then draw a chart to see the new look
plt.style.use('ggplot')
df[dogs].plot.bar(x='name', y='length_inches')
    
    Out[30]:
    
Make a horizontal bar graph of the length of the animals, with their name as the label (look at the billionaires notebook I put on Slack!)
In [31]:
    
df.plot.barh(x='name', y='length_inches')
#Fontaine is such an annoying name for a dog
    
    Out[31]:
    
Make a sorted horizontal bar graph of the cats, with the larger cats on top.
In [34]:
    
# barh draws the first row at the bottom, so sort ascending to put the largest cat on top
df[cats].sort_values('length_inches').plot(kind='barh', x='name', y='length_inches')
    
    
    Out[34]:
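A way to check the bar order before plotting, sketched on made-up data (hypothetical rows). A horizontal bar chart draws the first row at the bottom, so the row order after sorting is the bottom-to-top order of the bars.

```python
import pandas as pd

# Hypothetical stand-in for 07-hw-animals.csv; the real rows may differ.
df = pd.DataFrame({
    "animal": ["cat", "cat", "dog", "dog", "cat"],
    "name": ["Anne", "Bob", "Egglesburg", "Devon", "Charlie"],
    "length": [35, 45, 65, 50, 25],
})

cat_rows = df[df["animal"] == "cat"]

# Ascending sort: the longest cat ends up as the last row
cats_for_plot = cat_rows.sort_values("length")

# Row order is the bottom-to-top order of the bars, so the last name lands on top
order_bottom_to_top = cats_for_plot["name"].tolist()
print(order_bottom_to_top)

# To draw it (needs matplotlib):
# cats_for_plot.plot(kind="barh", x="name", y="length", legend=False)
```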
    