In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
In [2]:
%matplotlib notebook
In this lecture we will examine alternative methods to search for periodic signals in astronomical time series. Earlier, we spent a great deal of time discussing the Lomb-Scargle periodogram. This is the "standard" in astronomy, in part because it was the first (good) method developed for noisy and sparse data.
We will now explore some alternatives to the L-S periodogram. This is worthwhile because LS (i) does not handle outliers well, and (ii) works best on purely sinusoidal signals.
The string length method (Dworetsky 1983) phase folds the data at trial periods and then minimizes the length of the line needed to connect the phase-ordered observations.
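As a point of reference, a minimal sketch of this statistic might look like the following (the function name is ours, the magnitude normalization is one common choice, and the wrap-around segment from the last phase point back to the first is omitted for brevity):

def string_length(m, t, period):
    '''Sketch of the string length statistic at a single trial period (smaller is better)'''
    phases = (t / period) % 1
    order = np.argsort(phases)
    # normalize the magnitudes so both axes contribute comparably
    m_norm = (m - np.min(m)) / (np.max(m) - np.min(m))
    dphi = np.diff(phases[order])
    dm = np.diff(m_norm[order])
    return np.sum(np.hypot(dphi, dm))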
Phase Dispersion Minimization (PDM; Jurkevich 1971, Stellingwerf 1978), like LS, folds the data at a large number of trial frequencies $f$.
The phased data are then binned, and the variance is calculated in each bin, combined, and compared to the overall variance of the signal. No functional form of the signal is assumed, and thus, non-sinusoidal signals can be found.
Challenge: how to select the number of bins?
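For concreteness, a minimal sketch of the PDM statistic (an illustrative helper, not from any standard library; m and t are assumed to be numpy arrays) compares the pooled within-bin variance of the phase-folded data to the total variance; values well below 1 indicate a good trial period:

def pdm_theta(m, t, period, n_bins=10):
    '''Sketch of the PDM statistic at a single trial period (smaller is better)'''
    phases = (t / period) % 1
    bin_idx = np.floor(phases * n_bins).astype(int)
    total_var = np.var(m, ddof=1)
    num, dof = 0.0, 0
    for b in range(n_bins):
        in_bin = m[bin_idx == b]
        if len(in_bin) > 1:
            num += (len(in_bin) - 1) * np.var(in_bin, ddof=1)
            dof += len(in_bin) - 1
    # pooled within-bin variance relative to the overall variance
    return (num / dof) / total_var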
Analysis of Variance (AOV; Schwarzenberg-Czerny 1989) is similar to PDM. Optimal periods are defined via hypothesis testing, and these methods are found to perform best for certain types of astronomical signals.
Supersmoother (Reimann 1994) is a least-squares approach wherein a flexible, non-parametric model is fit to the folded observations at many trial frequencies. The use of this flexible model reduces aliasing issues relative to models that assume a sinusoidal shape; however, this comes at the cost of considerable computational time.
There have been some efforts to frame the period-finding problem in a Bayesian framework. Bretthorst 1988 developed Bayesian generalized LS models, while Gregory & Loredo 1992 applied Bayesian techniques to phase-binned models.
More recently, efforts to use Gaussian processes (GPs) to model and extract a period from the light curve have been developed (Wang et al. 2012). These methods have proved to be especially useful for detecting stellar rotation in Kepler light curves (Angus et al. 2018).
(There will be more on GPs later during this session)
Conditional Entropy (CE; Graham et al. 2013), and other entropy-based methods, aim to minimize the entropy in binned (normalized magnitude, phase) space. CE, in particular, is good at suppressing signals due to the window function.
When tested on real observations, CE outperforms most of the alternatives (e.g., LS, PDM).
The focus of today's exercise is conditional entropy (CE), which uses information theory and thus, in principle, works better in the presence of noise and outliers. Furthermore, CE does not make any assumptions about the underlying shape of the signal, which is useful when looking for non-sinusoidal patterns (such as transiting planets or eclipsing binaries).
For full details on the CE algorithm, see Graham et al. (2013).
Briefly, CE builds on the Shannon entropy (Cincotta et al. 1995), which is determined as follows:
Calculate the Shannon entropy, $H_0$, over the $k$ partitions in $(\phi, m)$:
$$H_0 = - \sum_{i=1}^{k} \mu_i \ln{(\mu_i)}\;\; \forall \mu_i \ne 0,$$ where $\mu_i$ is the occupation probability for the $i^{th}$ partition, which is just the number of data points in that partition divided by the total number of points in the data set.
As discussed in Graham et al. (2013), minimizing the Shannon entropy can be influenced by the window function, so they introduce the conditional entropy, $H_c(m|\phi)$, to help mitigate these effects. The CE can be calculated as:
$$H_c = \sum_{i,j} p(m_i, \phi_j) \ln \left( \frac{p(\phi_j)}{p(m_i, \phi_j)} \right), $$where $p(m_i, \phi_j)$ is the occupation probability for the $i^{th}$ partition in normalized magnitude and the $j^{th}$ partition in phase, and $p(\phi_j)$ is the occupation probability of the $j^{th}$ phase partition.
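To make the two sums concrete, here is a small numerical illustration (separate from the exercises) that evaluates $H_0$ and $H_c$ for a made-up joint occupation matrix, where rows index the normalized-magnitude partitions and columns index the phase partitions:

# counts[i, j] = number of points in magnitude partition i and phase partition j
counts = np.array([[4., 0., 1.],
                   [1., 3., 0.],
                   [0., 2., 4.]])
p_joint = counts / counts.sum()                 # p(m_i, phi_j)
p_phase = p_joint.sum(axis=0, keepdims=True)    # p(phi_j), marginal over magnitude
occupied = p_joint > 0
H0 = -np.sum(p_joint[occupied] * np.log(p_joint[occupied]))
p_phase_b = np.broadcast_to(p_phase, p_joint.shape)
Hc = np.sum(p_joint[occupied] * np.log(p_phase_b[occupied] / p_joint[occupied]))
print("H0 = {:.4f}, Hc = {:.4f}".format(H0, Hc))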
In this problem we will first calculate the Shannon entropy, then the CE to find the best-fit period of the eclipsing binary from the LS lecture.
Problem 1a
Create a function, gen_periodic_data
, that creates simulated data (including noise) over a grid of user-supplied positions:
$$y = A\sin\left(\frac{2\pi x}{P} - \phi\right) + \sigma_y,$$
where $A, P, \phi$ are inputs to the function. gen_periodic_data
should include Gaussian noise, $\sigma_y$, for each output $y_i$.
In [3]:
def gen_periodic_data(x, period=1, amplitude=1, phase=0, noise=0):
'''Generate periodic data given the function inputs
y = A*sin(2*pi*x/period - phase) + noise
Parameters
----------
x : array-like
input values to evaluate the array
period : float (default=1)
period of the periodic signal
amplitude : float (default=1)
amplitude of the periodic signal
phase : float (default=0)
phase offset of the periodic signal
noise : float (default=0)
variance of the noise term added to the periodic signal
Returns
-------
y : array-like
Periodic signal evaluated at all points x
'''
y = amplitude*np.sin(2*np.pi*x/(period) - phase) + np.random.normal(0, np.sqrt(noise), size=len(x))
return y
Problem 1b
Create a function, phase_plot
, that takes x, y, and $P$ as inputs to create a phase-folded light curve (i.e., plot the data at their respective phase values given the period $P$).
Include an optional argument, y_unc
, to include uncertainties on the y
values, when available.
In [29]:
def phase_plot(x, y, period, y_unc = 0.0):
'''Create phase-folded plot of input data x, y
Parameters
----------
x : array-like
data values along abscissa
y : array-like
data values along ordinate
period : float
period to fold the data
y_unc : array-like
uncertainty of the y values
'''
phases = (x/period) % 1
if type(y_unc) == float:
y_unc = np.zeros_like(x)
plot_order = np.argsort(phases)
norm_y = (y - np.min(y))/(np.max(y) - np.min(y))
norm_y_unc = (y_unc)/(np.max(y) - np.min(y))
plt.rc('grid', linestyle=":", color='0.8')
fig, ax = plt.subplots()
ax.errorbar(phases[plot_order], norm_y[plot_order], norm_y_unc[plot_order],
fmt='o', mec="0.2", mew=0.1)
ax.set_xlabel("phase")
ax.set_ylabel("signal")
ax.set_yticks(np.linspace(0,1,11))
ax.set_xticks(np.linspace(0,1,11))
ax.grid()
fig.tight_layout()
Problem 1c
Generate a signal with $A = 2$, $p = \pi$, and Gaussian noise with variance = 0.01 over a regular grid between 0 and 10. Plot the phase-folded results (and make sure the results behave as you would expect).
Hint - your simulated signal should have at least 100 data points.
In [30]:
x = np.linspace( # complete
y = # complete
# complete plot
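One possible completion of this cell, assuming the functions defined above (the choice of 500 grid points is arbitrary but satisfies the hint):

x = np.linspace(0, 10, 500)
y = gen_periodic_data(x, period=np.pi, amplitude=2, noise=0.01)
phase_plot(x, y, np.pi)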
Note a couple of changes from the previous helper function: we have added a grid to the plot (this will be useful for visualizing the entropy), and we have normalized the brightness measurements to lie between 0 and 1.
As noted above, to calculate the Shannon entropy we need to sum the data over partitions in the normalized $(\phi, m)$ plane.
This is straightforward using histogram2d
from numpy.
Problem 2a
Write a function shannon_entropy
to calculate the Shannon entropy, $H_0$, for a timeseries, $m(t_i)$, at a given period, p
.
Hint - use histogram2d
and a 10 x 10 grid (as plotted above).
In [35]:
def shannon_entropy(m, t, p):
'''Calculate the Shannon entropy
Parameters
----------
m : array-like
brightness measurements of the time-series data
t : array-like
timestamps corresponding to the brightness measurements
p : float
period of the periodic signal
Returns
-------
H0 : float
Shannon entropy for m(t) at period p
'''
m_norm = # complete
phases = # complete
H, _, _ = np.histogram2d( # complete
occupied = np.where(H > 0)
H0 = # complete
return H0
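For reference, one possible implementation (a sketch, not the only valid answer) normalizes the magnitudes to [0, 1], folds the times at the trial period, and evaluates the $H_0$ sum over a 10 x 10 histogram:

def shannon_entropy_sketch(m, t, p):
    '''Possible implementation of the Shannon entropy at trial period p'''
    m_norm = (m - np.min(m)) / (np.max(m) - np.min(m))
    phases = (t / p) % 1
    # 10 x 10 partition of the (phase, normalized magnitude) plane
    H, _, _ = np.histogram2d(phases, m_norm, bins=[10, 10],
                             range=[[0, 1], [0, 1]])
    occupied = H > 0
    mu = H[occupied] / len(m)    # occupation probabilities of the non-empty partitions
    return -np.sum(mu * np.log(mu))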
Problem 2b
What is the Shannon entropy for the simulated signal at periods = 1, $\pi$-0.05, and $\pi$?
Do these results make sense given your understanding of the Shannon entropy?
In [46]:
print('For p = 1, \t\tH_0 = {:.5f}'.format( # complete
print('For p = pi - 0.05, \tH_0 = {:.5f}'.format( # complete
print('For p = pi, \t\tH_0 = {:.5f}'.format( # complete
We know the correct period of the simulated data is $\pi$, so it makes sense that this period minimizes the Shannon entropy.
Problem 2c
Write a function, se_periodogram
to calculate the Shannon entropy for observations $m$, $t$ over a frequency grid f_grid
.
In [147]:
def se_periodogram(m, t, f_grid):
'''Calculate the Shannon entropy at every freq in f_grid
Parameters
----------
m : array-like
brightness measurements of the time-series data
t : array-like
timestamps corresponding to the brightness measurements
f_grid : array-like
trial frequencies for the period search
Returns
-------
se_p : array-like
Shannon entropy for m(t) at every trial freq
'''
# complete
for # complete in # complete
# complete
return se_p
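A possible completion, assuming a completed shannon_entropy function (e.g., the sketch above):

def se_periodogram_sketch(m, t, f_grid):
    '''Possible implementation: Shannon entropy at each trial frequency'''
    se_p = np.empty_like(f_grid)
    for i, f in enumerate(f_grid):
        se_p[i] = shannon_entropy(m, t, 1 / f)   # assumes shannon_entropy is defined
    return se_p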
Problem 2d
Plot the Shannon entropy periodogram, and return the best-fit period from the periodogram.
Hint - recall what we learned about frequency grids earlier today.
In [148]:
f_grid = # complete
se_p = # complete
fig,ax = plt.subplots()
# complete
# complete
# complete
print("The best fit period is: {:.4f}".format( # complete
The CE is very similar to the Shannon entropy, though we need to condition the calculation on the occupation probability of the partitions in phase.
Problem 3a
Write a function conditional_entropy
to calculate the CE, $H_c$, for a timeseries, $m(t_i)$, at a given period, p
.
Hint - if you use histogram2d
be sure to sum along the correct axes
Hint 2 - recall from session 8 that we want to avoid for
loops, try to vectorize your calculation.
In [198]:
def conditional_entropy(m, t, p):
'''Calculate the conditional entropy
Parameters
----------
m : array-like
brightness measurements of the time-series data
t : array-like
timestamps corresponding to the brightness measurements
p : float
period of the periodic signal
Returns
-------
Hc : float
Conditional entropy for m(t) at period p
'''
m_norm = # complete
phases = # complete
# complete
# complete
# complete
Hc = # complete
return Hc
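For reference, one possible vectorized implementation (a sketch, not the only valid answer); note that with histogram2d(phases, m_norm, ...) the phase partitions lie along axis 0:

def conditional_entropy_sketch(m, t, p):
    '''Possible implementation of the conditional entropy at trial period p'''
    m_norm = (m - np.min(m)) / (np.max(m) - np.min(m))
    phases = (t / p) % 1
    H, _, _ = np.histogram2d(phases, m_norm, bins=[10, 10],
                             range=[[0, 1], [0, 1]])
    p_joint = H / len(m)                              # p(m_i, phi_j)
    p_phase = p_joint.sum(axis=1, keepdims=True)      # p(phi_j), marginal over magnitude
    occupied = p_joint > 0
    p_phase_b = np.broadcast_to(p_phase, p_joint.shape)
    return np.sum(p_joint[occupied] * np.log(p_phase_b[occupied] / p_joint[occupied]))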
Problem 3b
What is the conditional entropy for the simulated signal at periods = 1, $\pi$-0.05, and $\pi$?
Do these results make sense given your understanding of CE?
In [199]:
print('For p = 1, \t\tH_c = {:.5f}'.format( # complete
print('For p = pi - 0.05, \tH_c = {:.5f}'.format( # complete
print('For p = pi, \t\tH_c = {:.5f}'.format( # complete
Problem 3c
Write a function, ce_periodogram
, to calculate the conditional entropy for observations $m$, $t$ over a frequency grid f_grid
.
In [200]:
def ce_periodogram(m, t, f_grid):
'''Calculate the conditional entropy at every freq in f_grid
Parameters
----------
m : array-like
brightness measurements of the time-series data
t : array-like
timestamps corresponding to the brightness measurements
f_grid : array-like
trial frequencies for the period search
Returns
-------
ce_p : array-like
conditional entropy for m(t) at every trial freq
'''
# complete
for # complete in # complete
# complete
return ce_p
Problem 3d
Plot the conditional entropy periodogram, and return the best-fit period from the periodogram.
In [202]:
f_grid = # complete
ce_p = # complete
fig,ax = plt.subplots()
# complete
# complete
# complete
print("The best fit period is: {:.4f}".format( # complete
The Shannon entropy and CE return nearly identical results for a simulated sinusoidal signal. Now we will examine how each performs with actual astronomical observations.
Problem 4a
Load the data from our favorite eclipsing binary from this morning's LS exercise. Plot the light curve.
Hint - if you haven't already, download the example light curve.
In [3]:
data = pd.read_csv("example_asas_lc.dat")
fig, ax = plt.subplots()
ax.errorbar( # complete
ax.set_xlabel('HJD (d)')
ax.set_ylabel('V (mag)')
ax.set_ylim(ax.get_ylim()[::-1])
fig.tight_layout()
Problem 4b
Using the Shannon entropy, determine the best period for this light curve.
Hint - recall this morning's discussion about the optimal grid for a period search
In [4]:
f_min = # complete
f_max = # complete
delta_f = # complete
f_grid = # complete
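One possible grid, following this morning's discussion. The oversampling factor and maximum frequency are illustrative choices, and the 'hjd' column name is assumed from the LS exercise:

t_obs = data['hjd'].values                 # column name assumed
baseline = t_obs.max() - t_obs.min()       # the baseline sets the frequency resolution
f_min = 1 / baseline
f_max = 10                                 # illustrative: search periods down to 0.1 d
delta_f = 1 / (10 * baseline)              # oversample the natural resolution by ~10x
f_grid = np.arange(f_min, f_max, delta_f)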
In [208]:
se_p = # complete
print("The best fit period is: {:.9f}".format( # complete
Problem 4c
Plot the Shannon entropy periodogram.
In [210]:
fig, ax = plt.subplots()
# complete
# complete
# complete
Problem 4d
Plot the light curve phase-folded on the best-fit period, as measured by the Shannon entropy periodogram.
Does this look reasonable? Why or why not?
Hint - it may be helpful to zoom in on the periodogram.
In [212]:
phase_plot(# complete
Problem 4e
Using the conditional entropy, determine the best period for this light curve.
In [204]:
ce_p = # complete
print("The best fit period is: {:.9f}".format( # complete
Problem 4f
Plot the CE periodogram.
In [216]:
fig, ax = plt.subplots()
# complete
# complete
# complete
Problem 4g
Plot the light curve phase-folded on the best-fit period, as measured by the CE periodogram.
Does this look reasonable? If not - can you make it look better?
In [214]:
phase_plot( # complete
This example demonstrates the primary strength of CE over the Shannon entropy.
If you zoom in on the CE periodogram, there is no power at $p \approx 1\,\mathrm{d}$, unlike the LS periodogram or the Shannon entropy method. This will not be the case for every single light curve, but it is a very nice feature of the CE method, and one reason why it may be preferred to something like LS when analyzing every light curve in LSST.
In the previous example we used a simple uniform grid to identify the best-fit period for the eclipsing binary. However, the "best-fit" turned out to be an estimate of half the true period. One way to improve upon this estimate is to build a grid with overlapping phase bins, which enforces better continuity in the phase-folded light curves (K. Burdge, private communication).
Challenge Problem
Build a function conditional_entropy_overlap
that utilizes overlapping bins in the CE calculation.
Can you use this function to identify the correct period of the binary?
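As a starting point only, here is one (heavily hedged) interpretation of the overlapping-bin idea: compute the CE for two binnings whose phase bins are offset by half a bin width, so that every observation effectively contributes to two overlapping bins, and average the result. This is not necessarily the exact scheme referred to above, and the function name and bin counts are illustrative:

def conditional_entropy_overlap_sketch(m, t, p, n_phase=10, n_mag=10):
    '''Possible CE with overlapping phase bins (two passes offset by half a bin)'''
    m_norm = (m - np.min(m)) / (np.max(m) - np.min(m))
    phases = (t / p) % 1
    Hc = 0.0
    for shift in (0.0, 0.5 / n_phase):
        H, _, _ = np.histogram2d((phases + shift) % 1, m_norm,
                                 bins=[n_phase, n_mag], range=[[0, 1], [0, 1]])
        p_joint = H / len(m)
        p_phase = p_joint.sum(axis=1, keepdims=True)
        occupied = p_joint > 0
        p_phase_b = np.broadcast_to(p_phase, p_joint.shape)
        Hc += np.sum(p_joint[occupied] *
                     np.log(p_phase_b[occupied] / p_joint[occupied]))
    return Hc / 2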