Title: Making A Matplotlib Scatterplot From A Pandas Dataframe
Slug: matplotlib_scatterplot_from_pandas
Summary: Making A Matplotlib Scatterplot From A Pandas Dataframe
Date: 2016-05-01 12:00
Category: Python
Tags: Data Visualization
Authors: Chris Albon

Based on: StackOverflow.

Import modules


In [1]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

Create dataframe


In [2]:
raw_data = {'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
            'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
            'female': [0, 1, 1, 0, 1],
            'age': [42, 52, 36, 24, 73],
            'preTestScore': [4, 24, 31, 2, 3],
            'postTestScore': [25, 94, 57, 62, 70]}
df = pd.DataFrame(raw_data, columns=['first_name', 'last_name', 'age', 'female',
                                     'preTestScore', 'postTestScore'])
df


Out[2]:
  first_name last_name  age  female  preTestScore  postTestScore
0      Jason    Miller   42       0             4             25
1      Molly  Jacobson   52       1            24             94
2       Tina       Ali   36       1            31             57
3       Jake    Milner   24       0             2             62
4        Amy     Cooze   73       1             3             70

Scatterplot of preTestScore and postTestScore, with the size of each point determined by age


In [3]:
plt.scatter(df.preTestScore, df.postTestScore, s=df.age)


Out[3]:
<matplotlib.collections.PathCollection at 0x112fbaac8>
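
Because `s` is interpreted as the marker area in points squared, raw ages produce fairly small points whose differences are hard to see. The following is a minimal sketch of one way to make the sizes more readable and to label the axes. The scaling factor of 5 and the `sizes` variable are illustrative assumptions, not part of the original example.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same values as the dataframe above, limited to the columns used here
df = pd.DataFrame({
    'age': [42, 52, 36, 24, 73],
    'preTestScore': [4, 24, 31, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70],
})

# s is the marker area in points squared; the factor 5 is an arbitrary
# choice for illustration that makes the age differences easier to see
sizes = df['age'] * 5

fig, ax = plt.subplots()
points = ax.scatter(df['preTestScore'], df['postTestScore'], s=sizes)

# Label the axes with the column names
ax.set_xlabel('preTestScore')
ax.set_ylabel('postTestScore')
```

Using bracket access such as `df['age']` instead of `df.age` is a matter of preference here; it also works for column names that are not valid Python identifiers.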

Scatterplot of preTestScore and postTestScore, with a fixed point size of 300 and the color determined by sex (the female column)


In [5]:
plt.scatter(df.preTestScore, df.postTestScore, s=300, c=df.female)


Out[5]:
<matplotlib.collections.PathCollection at 0x11320b400>
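
Passing `c=df.female` maps the 0/1 values through the default colormap, so the plot does not say which color means what. One possible way to make that explicit is to draw each group separately and add a legend. This is a sketch under stated assumptions: the `groups` mapping, the color names, and the legend labels are hypothetical choices, not part of the original example.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same values as the dataframe above, limited to the columns used here
df = pd.DataFrame({
    'female': [0, 1, 1, 0, 1],
    'preTestScore': [4, 24, 31, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70],
})

# Hypothetical mapping from each value of `female` to a legend label and color
groups = {0: ('female = 0', 'tab:blue'), 1: ('female = 1', 'tab:orange')}

fig, ax = plt.subplots()
for value, (label, color) in groups.items():
    # Plot only the rows that belong to this group
    subset = df[df['female'] == value]
    ax.scatter(subset['preTestScore'], subset['postTestScore'],
               s=300, color=color, label=label)

# The legend picks up the label given to each scatter call
ax.legend()
```

Each call to `ax.scatter` adds one collection to the axes, so the legend gets one entry per group.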