In [1]:
import numpy as np
# Import only the names used in this notebook
from bqplot import Axis, Bins, Figure, LinearScale
The Bins mark is essentially the same as the Hist mark from a user's point of view, but it is actually a Bars instance that bins the sample data.
The difference from Hist is that the binning is done in the backend, so it works better for large data: the whole sample does not have to be shipped back and forth to the frontend.
In [2]:
# Create a sample of Gaussian draws
np.random.seed(0)
x_data = np.random.randn(1000)
Give the Bins mark the data you want to bin as the sample argument, and also give it 'x' and 'y' scales.
In [3]:
x_sc = LinearScale()
y_sc = LinearScale()
hist = Bins(sample=x_data, scales={'x': x_sc, 'y': y_sc}, padding=0)
ax_x = Axis(scale=x_sc, tick_format='0.2f')
ax_y = Axis(scale=y_sc, orientation='vertical')
Figure(marks=[hist], axes=[ax_x, ax_y], padding_y=0)
The midpoints of the resulting bins and their number of elements can be recovered via the read-only traits x and y:
In [4]:
hist.x, hist.y
Out[4]:
Under the hood, the Bins mark is really a Bars mark, with some additional magic to control the binning. The data in sample is binned into equal-width bins. The parameters controlling the binning are the following traits:
bins sets the number of bins. It is either a fixed integer (10 by default), or the name of a method to determine the number of bins in a smart way ('auto', 'fd', 'doane', 'scott', 'rice', 'sturges' or 'sqrt').
min and max set the range of the data (sample) to be binned.
density, if set to True, normalizes the heights of the bars.
For more information, see the documentation of numpy's histogram function.
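To make the role of these traits concrete, here is a minimal standard-library sketch of equal-width binning. It is not bqplot's implementation: the helper `equal_width_bins` and its parameters `n_bins`, `lower` and `upper` are hypothetical stand-ins for the bins, min and max traits, and the behavior of ignoring out-of-range values follows what numpy's histogram documents for its range argument.

```python
import random


def equal_width_bins(sample, n_bins=10, lower=None, upper=None):
    """Bin `sample` into `n_bins` equal-width bins over [lower, upper].

    Values outside the range are ignored. Returns (midpoints, counts).
    Hypothetical helper for illustration only, not part of bqplot.
    """
    lower = min(sample) if lower is None else lower
    upper = max(sample) if upper is None else upper
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for value in sample:
        if lower <= value <= upper:
            # Clamp so that a value equal to `upper` lands in the last bin.
            index = min(int((value - lower) / width), n_bins - 1)
            counts[index] += 1
    midpoints = [lower + (i + 0.5) * width for i in range(n_bins)]
    return midpoints, counts


random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(1000)]

# Default range: every draw falls into one of the 10 bins.
midpoints, counts = equal_width_bins(sample, n_bins=10)
print(len(midpoints), sum(counts))

# Raising the lower bound to 0 leaves the negative draws out of the counts.
midpoints_pos, counts_pos = equal_width_bins(sample, n_bins=10, lower=0.0)
print(midpoints_pos[0], sum(counts_pos))
```

The returned midpoints and counts play the same role as the x and y traits of the mark: one pair of values per bar.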
In [5]:
x_sc = LinearScale()
y_sc = LinearScale()
hist = Bins(sample=x_data, scales={'x': x_sc, 'y': y_sc}, padding=0)
ax_x = Axis(scale=x_sc, tick_format='0.2f')
ax_y = Axis(scale=y_sc, orientation='vertical')
Figure(marks=[hist], axes=[ax_x, ax_y], padding_y=0)
In [6]:
# Changing the number of bins
hist.bins = 'sqrt'
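For a rough idea of what the simpler named methods do, the sketch below evaluates the textbook formulas usually associated with 'sqrt', 'sturges' and 'rice', which depend only on the sample size. This is an illustration under that assumption, with hypothetical function names; numpy derives a bin width from the data range, so its results can differ in edge cases, and the data-dependent methods ('auto', 'fd', 'doane', 'scott') are not covered here.

```python
import math


def n_bins_sqrt(n):
    """Square-root rule: ceil(sqrt(n))."""
    return math.ceil(math.sqrt(n))


def n_bins_sturges(n):
    """Sturges' rule: ceil(log2(n) + 1)."""
    return math.ceil(math.log2(n) + 1)


def n_bins_rice(n):
    """Rice rule: ceil(2 * n^(1/3))."""
    return math.ceil(2 * n ** (1 / 3))


# Number of bins suggested by each rule for a few sample sizes.
for n in (100, 1000, 10000):
    print(n, n_bins_sqrt(n), n_bins_sturges(n), n_bins_rice(n))
```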
In [7]:
# Changing the range
hist.min = 0
In [8]:
# Normalizing the count
x_sc = LinearScale()
y_sc = LinearScale()
hist = Bins(sample=x_data, scales={'x': x_sc, 'y': y_sc}, density=True)
ax_x = Axis(scale=x_sc, tick_format='0.2f')
ax_y = Axis(scale=y_sc, orientation='vertical')
Figure(marks=[hist], axes=[ax_x, ax_y], padding_y=0)
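The effect of density can be sketched with plain Python. The snippet assumes the normalization used by numpy's histogram, where each count is divided by the total count times the bin width so that the bar areas sum to 1; the helper `density_heights` and the example counts are hypothetical.

```python
def density_heights(counts, width):
    """Convert raw bin counts into density heights for equal-width bins."""
    total = sum(counts)
    return [count / (total * width) for count in counts]


counts = [2, 5, 3]   # made-up counts for three bins
width = 0.5          # common width of the bins

heights = density_heights(counts, width)
area = sum(height * width for height in heights)
print(heights, area)
```

With this normalization the bar heights no longer depend on the sample size, which makes histograms of differently sized samples comparable.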
In [9]:
# Changing the color
hist.colors = ['orangered']
In [10]:
# Updating the stroke and opacity
hist.stroke = 'orange'
hist.opacities = [0.5] * len(hist.x)
In [11]:
# Laying the histogram on its side
hist.orientation = 'horizontal'
ax_x.orientation = 'vertical'
ax_y.orientation = 'horizontal'