In [1]:
%matplotlib inline
from ggplot import *
In [2]:
mtcars.head()
Out[2]:
The simplest plot you can make is an x/y plot. To do this in ggplot, first create a base layer using the ggplot function. Pass in your data and aesthetics that map columns in your DataFrame to x and y. In this case, we're going to look at the relationship between car weight (wt) and miles per gallon (mpg), so we'll set the x value of our aesthetics to wt and the y value to mpg.
Once the aesthetics are defined, we just need to add a geom_point layer to our plot.
In [3]:
p = ggplot(mtcars, aes(x='wt', y='mpg'))
p + geom_point()
Out[3]:
In addition to the x and y variables, you can also control the shape, size, color, alpha (transparency), and other aesthetics of your scatterplot. To do this, include a definition for each aesthetic you'd like to control. For instance, let's map the color of each point in our graph to the quarter-mile time (qsec) of each car, a rough measure of its acceleration.
In [4]:
p = ggplot(mtcars, aes(x='wt', y='mpg', color='qsec')) + geom_point()
print(p)
Since qsec is a continuous variable, the last plot gets a graduated legend on the right side that shows the value each color represents. If you use a discrete variable such as name for the color instead, each name is assigned its own color and listed in the legend. This can get unruly, especially if you're plotting a discrete variable with hundreds of possible values, so be careful.
In [5]:
p = ggplot(mtcars, aes(x='wt', y='mpg', color='name')) + geom_point()
p
Out[5]:
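Before mapping a discrete column to color, it can help to check how many distinct values it holds, since each one becomes a legend entry. The following is a minimal sketch using plain pandas rather than ggplot. The small cars DataFrame is a hand-typed excerpt of the first rows of mtcars, and the legend_entries helper and the MAX_LEGEND_ENTRIES threshold are hypothetical names chosen for illustration, not part of ggplot.

```python
import pandas as pd

# Hand-typed excerpt of the first rows of mtcars (assumption: the column
# names match the DataFrame used in the cells above).
cars = pd.DataFrame({
    "name": ["Mazda RX4", "Mazda RX4 Wag", "Datsun 710", "Hornet 4 Drive"],
    "cyl": [6, 6, 4, 6],
    "wt": [2.620, 2.875, 2.320, 3.215],
    "mpg": [21.0, 21.0, 22.8, 21.4],
})


def legend_entries(df, column):
    """Count distinct values, i.e. the legend entries a discrete color mapping needs."""
    return df[column].nunique()


# Hypothetical cut-off for a readable legend; adjust to taste.
MAX_LEGEND_ENTRIES = 10

for column in ["name", "cyl"]:
    n = legend_entries(cars, column)
    verdict = "ok" if n <= MAX_LEGEND_ENTRIES else "too many for a readable legend"
    print(column, n, verdict)
```

On the full mtcars data, name has one distinct value per car, which is why its legend grows so long, while cyl has only a handful.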
You can also use the ggplot shorthand factor to treat a numeric variable as categorical. For example, let's change cyl from a numerical to a categorical variable.
In [6]:
ggplot(mtcars, aes(x='wt', y='mpg', color='factor(cyl)')) + geom_point()
Out[6]:
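As a rough analogy for what factor(cyl) asks for, the sketch below converts a numeric column to a categorical one in plain pandas. This is only an illustration of the idea and makes no claim about how ggplot implements factor internally. The cyl values are the first five from mtcars, typed in by hand.

```python
import pandas as pd

# First five cyl values from mtcars, typed in by hand for illustration.
cyl = pd.Series([6, 6, 4, 6, 8], name="cyl")

# As numbers, the values sit on a continuous scale (a color gradient).
print(cyl.dtype)

# As a categorical, each distinct value becomes its own level
# (one legend entry and one color per level).
cyl_factor = cyl.astype("category")
print(list(cyl_factor.cat.categories))  # [4, 6, 8]
```

With three levels instead of a gradient, the plot gets one distinct color each for 4, 6, and 8 cylinders.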