In [1]:
import numpy as np
from bqplot import *

Bins Mark

From a user's point of view, this mark is essentially the same as the Hist mark, but it is actually a Bars instance that bins sample data.

Unlike Hist, the binning is done in the backend, so it works better for large data: the whole sample does not have to be shipped back and forth to the frontend.
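To give a feel for why backend binning helps, the sketch below bins a large sample with plain numpy and compares the size of the raw sample with the size of the binned summary. It is an illustration only and not bqplot's internal code; the names sample, counts, edges and midpoints are made up for this example.

```python
import numpy as np

# A large sample that would be costly to send to the frontend as-is
rng = np.random.default_rng(0)
sample = rng.standard_normal(1_000_000)

# Bin on the Python side: only the bar positions and heights remain
counts, edges = np.histogram(sample, bins=10)
midpoints = (edges[:-1] + edges[1:]) / 2

# Compare the raw payload with the binned payload (in bytes)
raw_bytes = sample.nbytes
binned_bytes = counts.nbytes + midpoints.nbytes
print(raw_bytes, "bytes of raw data vs", binned_bytes, "bytes of binned data")
```

The binned payload depends only on the number of bins, not on the sample size.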


In [2]:
# Create a sample of Gaussian draws
np.random.seed(0)
x_data = np.random.randn(1000)

Pass the data you want to bin to the Bins mark as the sample argument, and also give it 'x' and 'y' scales.


In [3]:
x_sc = LinearScale()
y_sc = LinearScale()

hist = Bins(sample=x_data, scales={'x': x_sc, 'y': y_sc}, padding=0)
ax_x = Axis(scale=x_sc, tick_format='0.2f')
ax_y = Axis(scale=y_sc, orientation='vertical')

Figure(marks=[hist], axes=[ax_x, ax_y], padding_y=0)


The midpoints of the resulting bins and their number of elements can be recovered via the read-only traits x and y:


In [4]:
hist.x, hist.y


Out[4]:
(array([-2.75586815, -2.17531833, -1.59476851, -1.0142187 , -0.43366888,
         0.14688094,  0.72743075,  1.30798057,  1.88853039,  2.46908021]),
 array([  9,  20,  70, 146, 217, 239, 160,  86,  38,  15]))
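The relation between bin edges, midpoints and counts can be sketched with numpy alone. This assumes Bins behaves like numpy.histogram with 10 equal-width bins over the full data range; whether it matches the output above digit for digit is an assumption, not something checked against bqplot's source.

```python
import numpy as np

# Same sample as in the notebook
np.random.seed(0)
x_data = np.random.randn(1000)

# 10 equal-width bins spanning [min(x_data), max(x_data)]
counts, edges = np.histogram(x_data, bins=10)

# A midpoint is the average of the two edges that delimit a bin
midpoints = (edges[:-1] + edges[1:]) / 2

print(midpoints)
print(counts)
```

There is one more edge than there are bins, which is why the midpoints are computed from consecutive pairs of edges.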

Tuning the bins

Under the hood, the Bins mark is really a Bars mark, with some additional magic to control the binning. The data in sample is binned into equal-width bins. The parameters controlling the binning are the following traits:

  • bins sets the number of bins. It is either a fixed integer (10 by default), or the name of a method to determine the number of bins in a smart way ('auto', 'fd', 'doane', 'scott', 'rice', 'sturges' or 'sqrt').

  • min and max set the range of the data (sample) to be binned.

  • density, if set to True, normalizes the heights of the bars.

For more information, see the documentation of numpy.histogram.
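The effect of these three settings can be previewed with numpy.histogram directly. The sketch below assumes that bins, min/max and density correspond to numpy's bins, range and density arguments; that mapping is inferred from the description above rather than taken from bqplot's code.

```python
import numpy as np

np.random.seed(0)
x_data = np.random.randn(1000)

# bins: let an estimator pick the number of bins instead of a fixed integer
counts_sqrt, edges_sqrt = np.histogram(x_data, bins='sqrt')
print(len(counts_sqrt), "bins chosen by the 'sqrt' rule")

# min / max: only bin the part of the sample inside [0, max(x_data)]
counts_pos, edges_pos = np.histogram(x_data, bins=10, range=(0, x_data.max()))
print(counts_pos.sum(), "of", x_data.size, "values fall in the range")

# density: heights are normalized so that the bars integrate to 1
density, edges_d = np.histogram(x_data, bins=10, density=True)
area = np.sum(density * np.diff(edges_d))
print("total area under the bars:", area)
```

Note that with density enabled the heights are no longer counts: each one is the count divided by the sample size and by the bin width.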


In [5]:
x_sc = LinearScale()
y_sc = LinearScale()

hist = Bins(sample=x_data, scales={'x': x_sc, 'y': y_sc}, padding=0)
ax_x = Axis(scale=x_sc, tick_format='0.2f')
ax_y = Axis(scale=y_sc, orientation='vertical')

Figure(marks=[hist], axes=[ax_x, ax_y], padding_y=0)



In [6]:
# Changing the number of bins
hist.bins = 'sqrt'

In [7]:
# Changing the range
hist.min = 0

Histogram Styling

The styling of the Bins mark is identical to that of Bars.


In [8]:
# Normalizing the count

x_sc = LinearScale()
y_sc = LinearScale()

hist = Bins(sample=x_data, scales={'x': x_sc, 'y': y_sc}, density=True)
ax_x = Axis(scale=x_sc, tick_format='0.2f')
ax_y = Axis(scale=y_sc, orientation='vertical')

Figure(marks=[hist], axes=[ax_x, ax_y], padding_y=0)



In [9]:
# Changing the color
hist.colors = ['orangered']

In [10]:
# Updating the stroke and opacity
hist.stroke = 'orange'
hist.opacities = [0.5] * len(hist.x)

In [11]:
# Laying the histogram on its side
hist.orientation = 'horizontal'
ax_x.orientation = 'vertical'
ax_y.orientation = 'horizontal'
