In [1]:
%matplotlib inline
from ggplot import *
ggplot provides two main ways to visualize distributions: histograms and density plots. Both are easy to create, but putting them on the same plot is not recommended. Their scales are very different (a histogram's y-axis shows counts, while a density plot's y-axis shows probability density), so combining them tends to confuse rather than help. So before you ask: no, there is no easy (or at least sanctioned) way to overlay a density plot on a histogram.
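To make the scale mismatch concrete, here is a minimal sketch in plain Python (no ggplot involved). The data is a hypothetical stand-in for a price column, and the helper names `histogram_counts` and `to_density` are made up for this illustration; they are not part of ggplot.

```python
import random

def histogram_counts(values, bins, lo, hi):
    """Count how many values fall into each of `bins` equal-width bins on [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi lands in the last bin instead of overflowing.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
    return counts, width

def to_density(counts, width):
    """Rescale counts so the total bar area is 1, as a density curve would be."""
    total = sum(counts)
    return [c / (total * width) for c in counts]

random.seed(0)
# Hypothetical data: 10,000 skewed positive values standing in for prices.
prices = [random.lognormvariate(8, 1) for _ in range(10000)]

lo, hi = min(prices), max(prices)
counts, width = histogram_counts(prices, 30, lo, hi)
density = to_density(counts, width)

# The two y-axes differ by orders of magnitude.
print("tallest count bar:  ", max(counts))
print("tallest density bar:", max(density))
```

The count bars add up to the number of rows, while the density bars are scaled so their area adds up to 1. Drawn on a shared y-axis, the density values would be squashed flat against zero.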
The stat_density and geom_density geoms can be applied to a ggplot base object to create density plots. They are exactly the same thing; whether you use the stat or the geom version is purely a matter of preference. Both use a Gaussian kernel density estimator to estimate the probability density function shown in the plot.
In [2]:
ggplot(diamonds, aes(x='price')) + geom_density()
Out[2]:
In [3]:
ggplot(diamonds, aes(x='price')) + stat_density()
Out[3]:
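For intuition about what a Gaussian kernel density estimator does, here is a minimal sketch in plain Python. It is not ggplot's implementation: the function name `gaussian_kde`, the sample values, and the fixed `bandwidth` of 0.5 are all assumptions made for illustration, and real libraries usually choose the bandwidth automatically.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x.

    Each sample contributes a Gaussian bump of standard deviation `bandwidth`;
    the estimate is the average of those bumps.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical data: a cluster near 2 and a lone point at 5.
samples = [1.0, 1.2, 1.9, 2.1, 2.3, 5.0]
pdf = gaussian_kde(samples, bandwidth=0.5)

# Approximate the area under the curve with a Riemann sum over [-5, 10].
step = 0.01
grid = [-5.0 + i * step for i in range(1501)]
area = sum(pdf(x) for x in grid) * step

print("density near the cluster (x=2):", pdf(2.0))
print("density in the gap (x=4):      ", pdf(4.0))
print("approximate area under curve:  ", area)
```

The estimate is higher where samples cluster and the area under the curve is approximately 1, which is why the y-axis of a density plot shows small fractional values rather than counts.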
Just as you can with other geoms, you can add aesthetics to your plot in order to visualize multi-dimensional data.
In [4]:
ggplot(diamonds, aes(x='price', color='clarity')) + stat_density()
Out[4]:
Careful, it's easy to get carried away.
In [5]:
ggplot(diamonds, aes(x='price', color='clarity', linetype='cut')) + stat_density()
Out[5]:
In [6]:
ggplot(diamonds, aes(x='price')) + geom_histogram()
Out[6]:
Again, just as you can with other geoms, you can add aesthetics to your plot in order to visualize multi-dimensional data.
In [7]:
ggplot(diamonds, aes(x='price', fill='clarity')) + geom_histogram()
Out[7]: