This tutorial will guide you through the basic process of creating visualizations in Altair. First, make sure the Altair package and its dependencies are installed (see Installation) and that you understand how Altair plots are displayed (see Displaying Altair Charts). The tutorial assumes you are working in a Jupyter notebook, where plots are rendered automatically.
Data in Altair is built around the pandas DataFrame. One of the defining characteristics of statistical visualization is that it begins with tidy DataFrames, in which each row is an observation and each column is a variable. For the purposes of this tutorial, we'll start by importing pandas and creating a simple DataFrame to visualize, with a categorical variable in column a and a numerical variable in column b:
import pandas as pd
data = pd.DataFrame({'a': list('CCCDDDEEE'),
                     'b': [2, 7, 4, 1, 2, 6, 8, 4, 7]})
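Before plotting, it can help to confirm that the DataFrame holds what we expect. The following is a minimal sketch using standard pandas calls; the helper names (`n_rows`, `a_is_numeric`, `categories`, and so on) are ours and are not part of the tutorial:

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Same DataFrame as in the tutorial
data = pd.DataFrame({'a': list('CCCDDDEEE'),
                     'b': [2, 7, 4, 1, 2, 6, 8, 4, 7]})

# One row per observation, one column per variable
n_rows, n_cols = data.shape

# Column 'a' holds string labels; column 'b' holds numbers
a_is_numeric = is_numeric_dtype(data['a'])
b_is_numeric = is_numeric_dtype(data['b'])

# Distinct categories in 'a', in order of first appearance
categories = list(data['a'].unique())

print(data.head())
print(n_rows, n_cols, a_is_numeric, b_is_numeric, categories)
```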
When using Altair, datasets are most commonly provided as a DataFrame. As we will see, the labeled columns of the DataFrame are an essential piece of plotting with Altair.
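If your data starts out in a wide layout, with one column per category, it can be reshaped into tidy form with pandas. The sketch below is one possible approach, not something the tutorial prescribes: the `wide` table and its `trial` column are hypothetical and chosen so that the result reproduces the a and b columns used here.

```python
import pandas as pd

# Hypothetical wide table: one column per category, one row per trial
wide = pd.DataFrame({'trial': [1, 2, 3],
                     'C': [2, 7, 4],
                     'D': [1, 2, 6],
                     'E': [8, 4, 7]})

# Reshape to tidy form: each row becomes one (trial, a, b) observation
tidy = wide.melt(id_vars='trial', var_name='a', value_name='b')

print(tidy)
```

After the reshape, the category labels live in a single column and the values in another, which is the labeled-column layout described above.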
The fundamental object in Altair is the Chart, which takes a dataframe as a single argument:
import altair as alt
chart = alt.Chart(data)
So far, we have defined the Chart object, but we have not yet told the chart to do anything with the data. That will come next.