In standard Lomb-Scargle, we start with our data
$$ D = \{t_i,y_i,\sigma_i\}_{i=1}^N $$We assume here that the true values of $y$ are centered at zero (we'll relax this momentarily)
The model is a simple 1-term sinusoid given by
$$ M(t,\omega~|~\theta) = \theta_0 \sin(\omega t) + \theta_1\cos(\omega t) $$The likelihood for the dataset is
$$ L(\{t,y,dy\},\omega~|~\theta) = \sum_i \frac{1}{\sqrt{2\pi \sigma_i^2}} \exp\left[ \frac{-(y_i - M(t_i,\omega~|~\theta)^2}{2\sigma_i^2} \right] $$Which leads to the chi-squared function (derived from the log-likelihood)
$$ \chi^2(\omega, \theta) = \sum_i\frac{[y_i - M(t_i,\omega~|~\theta)]^2}{2\sigma_i^2} $$If we re-express the model by defining
$$ X_\omega = \left[ \begin{array}{ll} \sin(\omega t_1) & \cos(\omega t_1)\\ \sin(\omega t_2) & \cos(\omega t_2)\\ \vdots & \vdots \\ \sin(\omega t_N) & \cos(\omega t_N)\\ \end{array} \right],~~~~ y = \left[ \begin{array}{l} y_1 \\ y_2\\ \vdots \\ y_N\\ \end{array} \right] $$And create the error matrix
$$ \Sigma_y = \left[ \begin{array}{lllll} \sigma_1^2 & 0 & 0 & \cdots & 0\\ 0 & \sigma_2^2 & 0 & \cdots & 0\\ 0 & 0 & \sigma_3^2 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & \sigma_N^2 \end{array} \right] $$then our model is given by
$$ M(\omega, \theta) = X_\omega\theta $$and our $\chi^2$ can be written
$$ \chi^2(\omega, \theta) = (y - X_\omega\theta)^T\Sigma_y^{-1}(y - X_\omega\theta) $$Minimizing this cost funciton with respect to $\theta$ gives the maximum likelihood parameters:
$$ \hat{\theta} = (X_\omega^T\Sigma_y^{-1}X_\omega)^{-1}X_\omega^T\Sigma_y^{-1}y $$We can simplify this a bit by defining
$$ X_{\omega,\ast} = \Sigma_y^{-1/2}X_\omega \\ y_\ast = \Sigma_y^{-1/2}y $$And the above becomes
$$ \hat{\theta} = (X_{\omega,\ast}^TX_{\omega,\ast})^{-1}X_{\omega,\ast}^Ty_\ast $$Now the $\chi^2$ of the model fit is given by
$$ \chi^2(\omega,\hat{\theta}) = \left[ y_\ast^Ty_\ast - y_\ast^TX_{\omega,\ast} (X_{\omega,\ast}^TX_{\omega,\ast})^{-1}X_{\omega,\ast}^Ty_\ast \right] $$The reference $\chi^2$ is
$$ \chi_0^2 = \bar{y}_\ast^T\bar{y}_\ast $$So the power $P_{LS} = 1 - \chi^2/\chi_0^2$ is given by
$$ P_{LS}(\omega) = \frac{y_\ast^TX_{\omega,\ast} (X_{\omega,\ast}^TX_{\omega,\ast})^{-1}X_{\omega,\ast}^T\bar{y}_\ast}{\bar{y}_\ast^T\bar{y}_\ast} $$The generalized lomb-scargle fits for the mean of $y$ as part of the model, rather than as a separete step.
So what changes is that the $X_\omega$ becomes:
$$ X_\omega = \left[ \begin{array}{lll} 1 & \sin(\omega t_1) & \cos(\omega t_1)\\ 1 & \sin(\omega t_2) & \cos(\omega t_2)\\ \vdots & \vdots & \vdots \\ 1 & \sin(\omega t_N) & \cos(\omega t_N)\\ \end{array} \right] $$With this, we can relax the requirement that $y$ is centered: it will be centered as part of the model. Everything else carries through, and we have
$$ P_{LS}(\omega) = \frac{y_\ast^TX_{\omega,\ast} (X_{\omega,\ast}^TX_{\omega,\ast})^{-1}X_{\omega,\ast}^T\bar{y}_\ast}{\bar{y}_\ast^T\bar{y}_\ast} $$Where the quantities $y_\ast$ and $X_{\omega,\ast}$ are the noise-corrected matrices as used above.
For multiple bands, we'll assume that there exists some fundamental model
$$ \theta_{0} = \{\omega, y_0, A_0, B_0\} $$which defines an underlying oscillation
$$ M_0(t~|~\theta_0) = y_0 + A_0\sin(\omega t) + B_0\cos(\omega_0 t) $$We'll assume that each band indexed by $b \in \{1, 2, 3...\}$ has a periodic offset function $Q_b$ parametrized by some $\theta_b$, such that the model for that band is
$$ M_b(t~|~\theta_0,\omega_0) = M_0(t~|~\theta_0) + Q_b(t~|~\omega, \theta_b) $$Where, in general, $Q_b$ can be any periodic function.
This will give us a single global underlying model $\theta_0$, plus a $\theta_f^\ast$ for each of the have $N_f$ filters.
But this problem is over-specified: one set of parameters here is redundant; note that we can easily reparametrize the model for each filter as
$$ \theta_f = \{y_f, A_f, B_f\} $$where we've defined
$$ y_f = y_f^\ast + y_0\\ A_f = A_f^\ast + A_0\\ B_f = B_f^\ast + B_0 $$and we've thus eliminated the ability to solve explicitly for $\{y_0, A_0, B_0\}$.
Say we go back to our $\chi^2$ expression:
$$ \chi^2(\omega, \theta) = (y - X_\omega\theta)^T\Sigma_y^{-1}(y - X_\omega\theta) $$We using a Tikhonov regularization, we can penalize the $\theta$ values:
$$ \chi^2(\omega, \theta) = (y - X_\omega\theta)^T\Sigma_y^{-1}(y - X_\omega\theta) + \theta^T \Gamma^T\Gamma \theta $$(Note that while this form is often assumed within a frequentist context, it can be derived within a Bayesian context where the priors on $\theta$ are Gaussian, centered at zero, with a covariance matrix $\Sigma_\theta = [\Gamma^T\Gamma]^{-1}$)
Minimizing this with respect to theta gives:
$$ X_\omega^T\Sigma_y^{-1}(X_\omega\theta - y) + \Gamma^T\Gamma\theta = 0 $$Or
$$ \hat{\theta} = \left(X_\omega^T\Sigma_y^{-1}X_\omega + \Gamma^T\Gamma\right)^{-1}\left(X_\omega^T\Sigma_y^{-1}y\right) $$Plugging this in to the expression for $\chi^2$, we get the following result:
$$ \chi^2 = y^Ty - y^TX\left(X^TX + \Gamma^T\Gamma\right)^{-1}X^Ty $$(TODO: fill in the missing sigma in the above expression)
That is, it's the same expression as above with the regularization term added to the pseudoinverse!!
So the regularized lomb-scargle power is given by:
$$ P_{LS} = \frac{y^TX\left(X^TX + \Gamma^T\Gamma\right)^{-1}X^Ty}{y^Ty} $$
In [ ]: