This notebook provides a brief introduction to the plotting functions in seaborn that help visualize linear relationships between variables.
For much more detail, you can check out the documentation on graphing quantitative and categorical linear models.
In [1]:
%matplotlib inline
In [2]:
import seaborn as sns
import matplotlib.pyplot as plt
In [3]:
tips = sns.load_dataset("tips")
iris = sns.load_dataset("iris")
exercise = sns.load_dataset("exercise")
titanic = sns.load_dataset("titanic")
Seaborn visualizes linear regressions with regplot by plotting a regression line and confidence band over a scatterplot of the data:
In [4]:
sns.regplot(x="total_bill", y="tip", data=tips);
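For intuition about the two elements in that plot, here is a minimal sketch in plain Python. It assumes the line is an ordinary least-squares fit and the band is a bootstrap percentile interval around the fitted value. The helper names `fit_line` and `bootstrap_band` and the data are made up for illustration; this is not seaborn's own implementation.

```python
import random


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_band(xs, ys, x0, n_boot=500, ci=95, seed=0):
    """Percentile interval for the fitted value at x0 (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        # Resample (x, y) pairs with replacement and refit the line
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip degenerate resamples with a single x value
        slope, intercept = fit_line(bx, by)
        preds.append(slope * x0 + intercept)
    preds.sort()
    tail = (100 - ci) / 200
    low = preds[int(tail * (len(preds) - 1))]
    high = preds[int((1 - tail) * (len(preds) - 1))]
    return low, high


# Made-up bill/tip pairs for illustration only (not the tips dataset)
bills = [10.0, 15.0, 20.0, 25.0, 30.0, 40.0]
tip_amounts = [1.5, 2.5, 3.0, 3.5, 5.0, 6.0]

slope, intercept = fit_line(bills, tip_amounts)
low, high = bootstrap_band(bills, tip_amounts, x0=20.0)
print("slope:", slope, "intercept:", intercept)
print("interval for the fitted value at x=20:", low, high)
```

Repeating `bootstrap_band` over a grid of `x0` values would trace out the lower and upper edges of a band like the one drawn around the line.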
The higher-level function lmplot can draw this plot separately for different parts of a dataset.
In [5]:
sns.lmplot(x="total_bill", y="tip", hue="smoker", col="sex", data=tips);
You can also use the jointplot function to show the marginal distributions of the two variables.
In [6]:
sns.jointplot(x="total_bill", y="tip", data=tips, kind="reg");
The interactplot function draws a contour plot that can help you understand an interaction between two continuous predictors.
In [7]:
sns.interactplot("age", "fare", "survived", titanic.dropna(), logistic=True);
The low-level barplot and pointplot functions, and the higher-level factorplot function, can draw similar plots when your predictor variables are categorical.
In [8]:
sns.factorplot("time", "pulse", hue="kind", col="diet", data=exercise);
In [9]:
sns.factorplot("sex", "survived", hue="class", data=titanic, kind="bar", palette="Purples_d");
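As a rough sketch of the quantity such plots summarize, the snippet below computes the mean of an outcome within each level of a categorical predictor. It assumes the plotted point or bar height is the group mean; the helper `group_means` and the records are made up for illustration and are not part of seaborn.

```python
from collections import defaultdict


def group_means(rows, category, value):
    """Mean of `value` within each level of `category` (hypothetical helper)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[category]].append(row[value])
    return {level: sum(vals) / len(vals) for level, vals in groups.items()}


# Made-up records for illustration only (not the titanic dataset)
passengers = [
    {"sex": "female", "survived": 1},
    {"sex": "female", "survived": 1},
    {"sex": "female", "survived": 0},
    {"sex": "male", "survived": 0},
    {"sex": "male", "survived": 1},
    {"sex": "male", "survived": 0},
    {"sex": "male", "survived": 0},
]

rates = group_means(passengers, "sex", "survived")
print(rates)
```

Because the outcome here is coded 0/1, each group mean can be read as a survival rate for that group.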
To explore many relationships simultaneously, you might want to look at scatterplots for each pair of variables.
In [10]:
sns.pairplot(iris, hue="species", size=2.5);
For a similar visualization, you can draw a heatmap of correlation values.
In [11]:
f, ax = plt.subplots(figsize=(9, 9))
sns.corrplot(titanic.dropna(), ax=ax);
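The values behind such a heatmap can be sketched in plain Python. The snippet assumes Pearson correlation between every pair of numeric columns; the helpers `pearson` and `correlation_matrix` and the data are made up for illustration and are not seaborn functions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations for a dict of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up measurements for illustration only
measurements = {
    "length": [1.0, 2.0, 3.0, 4.0],
    "width": [2.0, 4.0, 6.0, 8.0],
    "depth": [4.0, 3.0, 2.0, 1.0],
}

corr = correlation_matrix(measurements)
for name, row in corr.items():
    print(name, row)
```

Each cell of the heatmap corresponds to one entry of a matrix like this, with the color encoding the sign and strength of the correlation.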
You can also fit a linear model and plot the model coefficients.
In [12]:
sns.coefplot("survived ~ pclass + scale(age) + sibsp + parch + scale(fare)", titanic.dropna());