This notebook is part of pybroom.
This notebook demonstrate using pybroom when fitting a set of curves (curve fitting) using lmfit.Model. We will show that pybroom greatly simplifies comparing, filtering and plotting fit results from multiple datasets.
In [ ]:
%matplotlib inline
%config InlineBackend.figure_format='retina' # for hi-dpi displays
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from lmfit import Model
import lmfit
print('lmfit: %s' % lmfit.__version__)
In [ ]:
sns.set_style('whitegrid')
In [ ]:
import pybroom as br
In [ ]:
N = 20
In [ ]:
x = np.linspace(-10, 10, 101)
In [ ]:
peak1 = lmfit.models.GaussianModel(prefix='p1_')
peak2 = lmfit.models.GaussianModel(prefix='p2_')
model = peak1 + peak2
In [ ]:
#params = model.make_params(p1_amplitude=1.5, p2_amplitude=1,
# p1_sigma=1, p2_sigma=1)
In [ ]:
Y_data = np.zeros((N, x.size))
Y_data.shape
In [ ]:
for i in range(Y_data.shape[0]):
Y_data[i] = model.eval(x=x, p1_center=-1, p2_center=2,
p1_sigma=0.5, p2_sigma=1.5,
p1_height=1, p2_height=0.5)
Y_data += np.random.randn(*Y_data.shape)/10
In [ ]:
plt.plot(x, Y_data.T, '-k', alpha=0.1);
In [ ]:
model1 = lmfit.models.GaussianModel()
In [ ]:
Results1 = [model1.fit(y, x=x) for y in Y_data]
In [ ]:
params = model.make_params(p1_center=0, p2_center=3,
p1_sigma=0.5, p2_sigma=1,
p1_amplitude=1, p2_amplitude=2)
In [ ]:
Results = [model.fit(y, x=x, params=params) for y in Y_data]
Fit results from an lmfit Model
can be inspected with
with fit_report
or params.pretty_print()
:
In [ ]:
#print(Results[0].fit_report())
#Results[0].params.pretty_print()
This is good for peeking at the results. However, extracting these data from lmfit objects is quite a chore and requires good knowledge of lmfit objects structure.
pybroom helps in this task: it extracts data from fit results and returns familiar pandas DataFrame (in tidy format). Thanks to the tidy format these data can be much more easily manipulated, filtered and plotted.
In [ ]:
dg = br.glance(Results, var_names='dataset')
dg.drop('model', 1).drop('message', 1).head()
A summary of the one-peak model fit:
In [ ]:
dg1 = br.glance(Results1, var_names='dataset')
dg1.drop('model', 1).drop('message', 1).head()
In [ ]:
dt = br.tidy(Results, var_names='dataset')
Let's see the results for a single dataset:
In [ ]:
dt.query('dataset == 0')
or for a single parameter across datasets:
In [ ]:
dt.query('name == "p1_center"').head()
In [ ]:
dt.query('name == "p1_center"')['value'].std()
In [ ]:
dt.query('name == "p2_center"')['value'].std()
Note that there is a much larger error in fitting p2_center
than p1_center
.
In [ ]:
dt.query('name == "p1_center"')['value'].hist()
dt.query('name == "p2_center"')['value'].hist(ax=plt.gca());
In [ ]:
da = br.augment(Results, var_names='dataset')
In [ ]:
da1 = br.augment(Results1, var_names='dataset')
In [ ]:
r = Results[0]
Let's see the results for a single dataset:
In [ ]:
da.query('dataset == 0').head()
Plotting a single dataset is simplified compared to a manual plot:
In [ ]:
da0 = da.query('dataset == 0')
In [ ]:
plt.plot('x', 'data', data=da0, marker='o', ls='None')
plt.plot('x', "Model(gaussian, prefix='p1_')", data=da0, lw=2, ls='--')
plt.plot('x', "Model(gaussian, prefix='p2_')", data=da0, lw=2, ls='--')
plt.plot('x', 'best_fit', data=da0, lw=2);
plt.legend()
But keep in mind that, for a single dataset, we could use the lmfit method as well (which is even simpler):
In [ ]:
Results[0].plot_fit();
However, things become much more interesting when we want to plot multiple datasets or models as in the next section.
In [ ]:
grid = sns.FacetGrid(da.query('dataset < 6'), col="dataset", hue="dataset", col_wrap=3)
grid.map(plt.plot, 'x', 'data', marker='o', ls='None', ms=3, color='k')
grid.map(plt.plot, 'x', "Model(gaussian, prefix='p1_')", ls='--')
grid.map(plt.plot, 'x', "Model(gaussian, prefix='p2_')", ls='--')
grid.map(plt.plot, "x", "best_fit");
Here we plot a comparison of the two fitted models (one or two peaks) for different datasets.
First we create a single tidy DataFrame with data from the two models:
In [ ]:
da['model'] = 'twopeaks'
da1['model'] = 'onepeak'
da_tot = pd.concat([da, da1], ignore_index=True)
Then we perfom a facet plot with seaborn:
In [ ]:
grid = sns.FacetGrid(da_tot.query('dataset < 6'), col="dataset", hue="model", col_wrap=3)
grid.map(plt.plot, 'x', 'data', marker='o', ls='None', ms=3, color='k')
grid.map(plt.plot, "x", "best_fit")
grid.add_legend();
Note that the "tidy" organization of data allows plot libraries such as seaborn to automatically infer most information to create complex plots with simple commands. Without tidy data, instead, a manual creation of such plots becomes a daunting task.