gatspy
This shows an example of finding single-band periodograms for RR Lyrae data with gatspy
We'll start by doing some preparatory imports:
In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
# use seaborn's default plotting styles for matplotlib
import seaborn; seaborn.set()
gatspy
includes loaders for some convenient datasets.
Here we'll take a look at some RR Lyrae light curves from Sesar 2010
In [2]:
from gatspy.datasets import fetch_rrlyrae
rrlyrae = fetch_rrlyrae()
This dataset object has an ids
attribute showing the light curve ids available, and a get_lightcurve
method which returns the light curve:
In [3]:
len(rrlyrae.ids)
Out[3]:
In [4]:
lcid = rrlyrae.ids[0]
t, y, dy, filts = rrlyrae.get_lightcurve(lcid)
Let's quickly visualize this data, to see what we're working with:
In [5]:
for filt in 'ugriz':
mask = (filts == filt)
plt.errorbar(t[mask], y[mask], dy[mask], fmt='.', label=filt)
plt.gca().invert_yaxis()
plt.legend(ncol=3, loc='upper left');
The dataset has a metadata
attribute, which can tell us things like the period (as determined by Sesar 2010:
In [6]:
period = rrlyrae.get_metadata(lcid)['P']
period
Out[6]:
Let's fold the lightcurve on this period and re-plot the points:
In [7]:
phase = t % period
for filt in 'ugriz':
mask = (filts == filt)
plt.errorbar(phase[mask], y[mask], dy[mask], fmt='.', label=filt)
plt.gca().invert_yaxis()
plt.legend(ncol=3, loc='upper left');
Now the characteristic shape of the RR Lyrae light curve can be seen!
We can now use the Lomb-Scargle periodogram to attempt to fit the period. We'll use the classic (single-band) version, and examine the best period for each band individually. Because of the large (~10 year) baseline, the fit takes a few seconds per band:
In [8]:
from gatspy.periodic import LombScargle, LombScargleFast
# LombScargleFast is slightly faster than LombScargle for the simplest case
# Both have the same interface.
model = LombScargleFast()
model.optimizer.set(quiet=True, period_range=(0.2, 1.2))
print("Sesar 2010:", period)
for filt in 'ugriz':
mask = (filts == filt)
model.fit(t[mask], y[mask], dy[mask])
print(filt + ': ', model.best_period)
We see that within each band, the lomb-scargle periodogram lands on the correct peak.
This model.best_period
hides some computation. To find the best period, the model optimizer determines the required resolution for a linear scan search, and computes the periodogram at each of these values.
We can view the periodogram for each model by calling the periodogram
method.
We'll put this in a function for later use
In [9]:
def plot_periodogram(model, lcid):
plt.figure()
rrlyrae = fetch_rrlyrae()
t, y, dy, filts = rrlyrae.get_lightcurve(lcid)
period = rrlyrae.get_metadata(lcid)['P']
periods = np.linspace(0.2, 1.0, 5000)
for i, filt in enumerate('ugriz'):
mask = (filts == filt)
model.fit(t[mask], y[mask], dy[mask])
power = model.periodogram(periods)
plt.plot(periods, i + power, lw=1, label=filt)
plt.xlim(periods[0], periods[-1])
plt.ylim(0, 5)
plt.legend(ncol=3)
plot_periodogram(LombScargle(), lcid)
The periodogram has very narrow peaks, of width approximately $\Delta P \approx P^2 / T$, where $T = (t_{max} - t_{min})$ is the baseline of the measurements. For 10 years of data, the width near the 0.6 day peak turns out to be around $10^{-4}$ days (about 8 seconds), which is why we need such fine sampling of the periodogram to see the peaks.
In [10]:
def plot_model(model, lcid):
plt.figure()
rrlyrae = fetch_rrlyrae()
t, y, dy, filts = rrlyrae.get_lightcurve(lcid)
period = rrlyrae.get_metadata(lcid)['P']
phase = t % period
tfit = np.linspace(0, period, 100)
for filt in 'ugriz':
mask = (filts == filt)
pts = plt.errorbar(phase[mask], y[mask], dy[mask], fmt='.', label=filt)
model.fit(t[mask], y[mask], dy[mask])
yfit = model.predict(tfit, period=period)
plt.plot(tfit, yfit, color=pts[0].get_color(), alpha=0.5)
plt.gca().invert_yaxis()
plt.legend(ncol=3, loc='upper left');
plot_model(LombScargle(), lcid)
The model is clearly fairly biased: that is, it doesn't have enough flexibility to truly fit the data. We can do a bit better by adding more terms to the Fourier series with the Nterms
parameter
In [11]:
plot_model(LombScargle(Nterms=4), lcid)
This model can also be used within the periodogram, using the same interface as above:
In [12]:
plot_periodogram(LombScargle(Nterms=4), lcid)
In [13]:
from gatspy.periodic import SuperSmoother
plot_periodogram(SuperSmoother(), lcid)
plot_model(SuperSmoother(), lcid)
Nice and easy!