In [1]:
import seaborn as sns
%matplotlib inline
flights = sns.load_dataset('flights')
flights.head()
Out[1]:
In [2]:
flights.shape
Out[2]:
Let us pivot the flights data into a 2D matrix, with the months as row indices and the years as columns.
In [3]:
flights_pv = flights.pivot_table(index='month', columns='year', values='passengers')
flights_pv.head()
Out[3]:
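By using pivot_table, we have also aggregated the data by month and year. When several rows share the same month and year, pivot_table combines them, taking their mean by default.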
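To make the aggregation step concrete, here is a minimal sketch on a small made-up table (the `toy` frame and its values are illustrative assumptions, not the flights data). It has two rows for the same month and year, so you can see pivot_table average them.

```python
import pandas as pd

# Hypothetical toy data: note that ('Jan', 1949) appears twice.
toy = pd.DataFrame({
    'month': ['Jan', 'Jan', 'Feb', 'Feb', 'Jan'],
    'year': [1949, 1950, 1949, 1950, 1949],
    'passengers': [100, 120, 110, 130, 140],
})

# Same call shape as in the cell above; aggfunc is left at its default (mean).
toy_pv = toy.pivot_table(index='month', columns='year', values='passengers')

# The two ('Jan', 1949) rows, 100 and 140, collapse into one averaged cell.
jan_1949 = toy_pv.loc['Jan', 1949]
print(toy_pv)
```

If you want a different aggregation, such as the total, pivot_table accepts an aggfunc argument, for example aggfunc='sum'.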
In [4]:
sns.heatmap(flights_pv)
Out[4]:
From the heatmap above, we see that passenger numbers peak in the summer months (June, July, August) and also rise from year to year.
In [2]:
# Read the Titanic data used in the ML chapter
import pandas as pd
titanic = pd.read_csv('../udemy_ml_bootcamp/Machine Learning Sections/Logistic-Regression/titanic_train.csv')
titanic.head()
Out[2]:
In [5]:
titanic.shape
Out[5]:
In [3]:
titanic.isnull().head()
Out[3]:
In [4]:
sns.heatmap(titanic.isnull(), yticklabels=False, cbar=False, cmap='viridis')
Out[4]:
You can see that the Age and Cabin columns have many nulls, while the other columns have none or very few.
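The heatmap gives a visual impression of the missing values. To put numbers on it, you can count the nulls per column. The following is a minimal sketch on a made-up frame (the `toy` data and its values are assumptions for illustration, not the Titanic file).

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with the same kind of gaps as Age and Cabin.
toy = pd.DataFrame({
    'Age': [22.0, np.nan, 26.0, np.nan],
    'Cabin': [np.nan, 'C85', np.nan, np.nan],
    'Fare': [7.25, 71.28, 7.92, 8.05],
})

# isnull() gives a boolean frame; summing counts the True values per column.
null_counts = toy.isnull().sum()

# The mean of a boolean column is the fraction of nulls in that column.
null_share = toy.isnull().mean()

print(null_counts)
print(null_share)
```

The same two calls on the titanic frame would give the exact counts behind the heatmap.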
In [5]:
sns.clustermap(flights_pv)
Out[5]:
A cluster map hierarchically clusters the rows and columns and reorders them so that rows and columns with similar values sit next to each other.
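The reordering idea can be sketched directly with SciPy's hierarchical clustering. This is only an illustration on made-up numbers, assuming average linkage with Euclidean distance; it is not a reproduction of seaborn's internals, and the `data` array is hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list

# Hypothetical matrix: rows 0 and 2 are similar, rows 1 and 3 are similar.
data = np.array([
    [1.0, 1.1, 0.9],
    [9.0, 9.2, 8.8],
    [1.2, 1.0, 1.1],
    [9.1, 8.9, 9.0],
])

# Build the row dendrogram (assumed settings: average linkage, Euclidean).
row_linkage = linkage(data, method='average', metric='euclidean')

# leaves_list returns the row order along the dendrogram leaves.
row_order = [int(i) for i in leaves_list(row_linkage)]

# Reordering the rows puts similar rows next to each other.
reordered = data[row_order]
print(row_order)
print(reordered)
```

A cluster map applies the same kind of reordering to the columns as well and then draws the reordered matrix as a heatmap with the dendrograms alongside.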
In [6]:
tips = sns.load_dataset('tips')
tips.head()
Out[6]:
In [7]:
# Regress tip on total bill
sns.lmplot(x='total_bill', y='tip', data=tips)
Out[7]:
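The line in the plot is a straight-line fit of tip against total_bill. As a rough sketch of what such a fit computes, here is a least-squares line fitted with NumPy on made-up numbers (the arrays below are assumptions for illustration, not the tips data).

```python
import numpy as np

# Hypothetical bills and tips that lie exactly on a straight line.
total_bill = np.array([10.0, 20.0, 30.0, 40.0])
tip = np.array([2.0, 3.0, 4.0, 5.0])

# Fit a degree-1 polynomial: tip is approximately slope * total_bill + intercept.
slope, intercept = np.polyfit(total_bill, tip, deg=1)

# Predict the tip for a new bill from the fitted line.
predicted_tip = slope * 25.0 + intercept
print(slope, intercept, predicted_tip)
```

The plot adds to this a scatter of the raw points and a shaded confidence band around the line.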
You can refine this by splitting the data by sex with hue, which fits a separate regression for males and females and draws each in a different color.
In [9]:
sns.lmplot(x='total_bill', y='tip', data=tips, hue='sex')
Out[9]:
You can bring in another factor, such as day of the week, with col to draw a separate regression panel for each day.
In [10]:
sns.lmplot(x='total_bill', y='tip', data=tips, hue='sex', col='day')
Out[10]:
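Splitting by a factor amounts to fitting one line per group. A minimal sketch of that idea, using a made-up frame and a plain groupby loop (the `toy` data and the `slopes` dictionary are illustrative assumptions, not part of seaborn):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: each group lies exactly on its own line.
toy = pd.DataFrame({
    'total_bill': [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    'tip': [2.0, 4.0, 6.0, 1.0, 2.0, 3.0],
    'sex': ['Male', 'Male', 'Male', 'Female', 'Female', 'Female'],
})

# Fit a separate straight line for each group and keep its slope.
slopes = {}
for name, group in toy.groupby('sex'):
    slope, intercept = np.polyfit(group['total_bill'], group['tip'], deg=1)
    slopes[name] = slope

print(slopes)
```

Grouping by two columns, for example toy.groupby(['sex', 'day']) on a frame that has a day column, would correspond to one fit per color within each panel.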