Matplotlib scatter plots

The aim of this tutorial is to reproduce the following figure. You can think of it as measurements of two quantities, A and B, taken from five samples numbered 0 through 4.


In [1]:
from IPython.display import Image
Image('../data/scatter_plot.png')


Out[1]:

Of course, the first thing we need is the data. Usually this data will come from your experiments or your computations, but here we are going to generate it. Generating synthetic data is a good way to check that our code works before applying it to real measurements.

The package scikit-learn provides many tools for machine learning, data mining and data analysis. We will introduce it in a future lecture in this course. Here we will only use it to generate our data. Its datasets module has several functions that generate different types of synthetic data. We will use make_blobs, which draws points from isotropic Gaussian clusters ("blobs").


In [4]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 6.0)
from sklearn.datasets import make_blobs  # the samples_generator submodule was removed from recent scikit-learn releases

In [5]:
# X holds the point coordinates, y the integer label of the cluster each point belongs to
X, y = make_blobs(n_samples=500, centers=5, cluster_std=1.5, random_state=8)
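
Before plotting, it can help to look at what make_blobs returned. The following is a minimal sketch that repeats the call above and prints the shapes of the two arrays, the distinct labels, and the number of points per cluster. The inspection calls (np.unique, np.bincount) are additions for illustration and are not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Same call as in the cell above
X, y = make_blobs(n_samples=500, centers=5, cluster_std=1.5, random_state=8)

print(X.shape)         # coordinates: one row per point, one column per feature
print(y.shape)         # one integer label per point
print(np.unique(y))    # the distinct cluster labels
print(np.bincount(y))  # number of points assigned to each cluster
```

Because n_features defaults to 2, X has two columns, which we will use as the horizontal and vertical coordinates of the scatter plot. The labels in y are what lets us tell the five samples apart later.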

Start with the default options for scatter and then change the code until you get as close as possible to the figure above.


In [10]:
plt.scatter(X[:,0], X[:,1])


Out[10]:
<matplotlib.collections.PathCollection at 0x7f2bb0a16898>
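
The default call draws every point in the same color, so the five samples cannot be told apart. One possible next step, sketched below, is to draw each label separately so that it gets its own color and legend entry. This is only an assumption about how the target figure was built: the axis names "A" and "B" are taken from the description at the top, while the legend text, transparency, and per-label loop are illustrative choices, not the original solution.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

# Same data as in the cells above
X, y = make_blobs(n_samples=500, centers=5, cluster_std=1.5, random_state=8)

fig, ax = plt.subplots(figsize=(10.0, 6.0))

# One scatter call per label: each call picks the next color in the
# default color cycle and contributes one entry to the legend.
for label in np.unique(y):
    mask = y == label
    ax.scatter(X[mask, 0], X[mask, 1], alpha=0.6, label=f"sample {label}")

ax.set_xlabel("A")  # assumed axis names, taken from the tutorial text
ax.set_ylabel("B")
ax.legend()
```

From here you can adjust marker size, marker shape, colors, and transparency through the s, marker, c, and alpha arguments of scatter until the plot matches the figure.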

In [ ]: