The next module we will explore is Seaborn, a Python visualization library built on top of matplotlib. It is tightly integrated with the PyData stack, including support for numpy and pandas data structures and statistical routines from scipy and statsmodels. It provides a high-level interface for drawing attractive statistical graphics, with the emphasis on STATISTICS: you don't want to use Seaborn as a general-purpose charting library.
http://web.stanford.edu/~mwaskom/software/seaborn/index.html
In [1]:
%matplotlib inline
import matplotlib
import seaborn as sns
import pandas as pd
import numpy as np
import warnings
sns.set(color_codes=True)
warnings.filterwarnings("ignore")
In [2]:
tips = pd.read_csv('input/tips.csv')
tips['tip_percent'] = (tips['tip'] / tips['total_bill'] * 100)
tips.head()
Out[2]:
In [3]:
tips.describe()
Out[3]:
http://web.stanford.edu/~mwaskom/software/seaborn/tutorial/regression.html
In [4]:
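# Scatter plot with a fitted regression line plus marginal distributions.
# Keyword arguments are used because newer seaborn releases no longer accept x, y, data positionally.
sns.jointplot(x="total_bill", y="tip_percent", data=tips, kind='reg');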
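As a rough illustration of what the regression line in the plot above represents, the sketch below fits an ordinary least-squares line with numpy. The four data points are hypothetical stand-ins, not values from tips.csv, and np.polyfit is used here only as a convenient way to reproduce a straight-line fit by hand; it is not a claim about how Seaborn computes the line internally.

```python
import numpy as np

# Hypothetical bill totals and tip percentages (NOT taken from tips.csv).
total_bill = np.array([10.0, 20.0, 30.0, 40.0])
tip_percent = np.array([20.0, 18.0, 15.0, 13.0])

# Degree-1 polynomial fit = least-squares straight line.
slope, intercept = np.polyfit(total_bill, tip_percent, 1)

# Predicted tip percentage at each bill total, i.e. points on the fitted line.
fitted = slope * total_bill + intercept

print(f"tip_percent ~ {slope:.2f} * total_bill + {intercept:.2f}")
```

A negative slope in a fit like this would mean that the tip percentage tends to fall as the bill grows, which is the kind of relationship the joint plot is meant to make visible.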
In [5]:
sns.lmplot(x="total_bill", y="tip_percent", hue="ordered_alc_bev", data=tips)
Out[5]:
In [6]:
sns.lmplot(x="total_bill", y="tip_percent", col="day", data=tips, aspect=.5)
Out[6]:
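Faceting with col (or hue, or row) draws a separate regression for each subset of the data. A minimal sketch of the same idea without plotting is to group the frame and fit one line per group. The toy frame and its values are hypothetical, and np.polyfit stands in for whatever fitting routine the plotting library uses.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two days, three bills each (NOT taken from tips.csv).
toy = pd.DataFrame({
    "day": ["Sat", "Sat", "Sat", "Sun", "Sun", "Sun"],
    "total_bill": [10.0, 20.0, 30.0, 10.0, 20.0, 30.0],
    "tip_percent": [20.0, 18.0, 16.0, 15.0, 15.0, 15.0],
})

# One least-squares line per day, mirroring one facet per day in the plot.
fits = {}
for day, group in toy.groupby("day"):
    slope, intercept = np.polyfit(
        group["total_bill"].to_numpy(), group["tip_percent"].to_numpy(), 1
    )
    fits[day] = (slope, intercept)

for day, (slope, intercept) in fits.items():
    print(f"{day}: slope={slope:.2f}, intercept={intercept:.2f}")
```

Comparing the per-group slopes numerically can be a useful check on what the faceted plots appear to show.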
In [7]:
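# Note: 'height' is the current name of the facet-size argument; releases before seaborn 0.9 called it 'size'.
sns.lmplot(x="total_bill", y="tip_percent", hue='ordered_alc_bev', col="time", row='gender', height=6, data=tips);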
http://web.stanford.edu/~mwaskom/software/seaborn/tutorial/regression.html
In [8]:
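# Let's add some calculated columns
# Binary flag: 1 if the tip percentage is at or above the mean, otherwise 0
tips['tip_above_avg'] = np.where(tips['tip_percent'] >= tips['tip_percent'].mean(), 1, 0)
# Recode every 'Yes'/'No' value in the frame as 1/0 so it can be used as a logistic-regression variable
tips.replace({'Yes': 1, 'No': 0}, inplace=True)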
tips.head()
Out[8]:
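To see what the two recoding steps in the cell above do, here is a small sketch on a hypothetical three-row frame. The column names follow the notebook, but the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical rows (NOT taken from tips.csv).
toy = pd.DataFrame({
    "tip_percent": [20.0, 15.0, 10.0],
    "ordered_alc_bev": ["Yes", "No", "Yes"],
})

# 1 where the tip percentage is at or above the column mean (15.0 here), else 0.
toy["tip_above_avg"] = np.where(toy["tip_percent"] >= toy["tip_percent"].mean(), 1, 0)

# Map 'Yes'/'No' to 1/0. DataFrame.replace applies to every column,
# so any other column holding these strings would be recoded as well.
toy = toy.replace({"Yes": 1, "No": 0})

print(toy)
```

Because replace works across the whole frame, a column-scoped mapping such as toy["ordered_alc_bev"].map({"Yes": 1, "No": 0}) may be a safer choice when other columns could contain the same strings.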
In [9]:
sns.lmplot(x="tip_percent", y="ordered_alc_bev", col='gender', data=tips, logistic=True)
Out[9]:
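With logistic=True the plotted curve is a logistic regression, which models the probability of a 0/1 outcome as 1 / (1 + exp(-(b0 + b1 * x))). The sketch below fits such a model by Newton's method in plain numpy to show the mechanics. It is an illustrative stand-in under stated assumptions, not Seaborn's own fitting code (Seaborn relies on statsmodels for this option), and the eight data points are hypothetical.

```python
import numpy as np

# Hypothetical predictor and binary outcome (NOT taken from tips.csv).
x = np.array([5.0, 8.0, 10.0, 12.0, 15.0, 18.0, 20.0, 25.0])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0])

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Newton's method on the log-likelihood, starting from zero coefficients.
beta = np.zeros(2)
for _ in range(25):
    p = sigmoid(X @ beta)
    gradient = X.T @ (y - p)                        # score vector
    hessian = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
    beta = beta + np.linalg.solve(hessian, gradient)

# Fitted probabilities of y == 1 at each observed x.
probs = sigmoid(X @ beta)
print("intercept, slope:", beta)
```

The fitted probabilities always lie strictly between 0 and 1, which is why a logistic curve is preferred over a straight line when the outcome is binary.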
In [10]:
sns.lmplot(x="ordered_alc_bev", y="tip_above_avg", col='gender', data=tips, logistic=True)
Out[10]: