The augmented pseudo-data representation

"Normal equations" formulation

In Fitting Linear Mixed-Effects Models using lme4 we showed that the profiled log-likelihood for a linear mixed-effects model can be determined from the solution to a penalized least squares problem. The pseudo-data representation of the problem is $$ \min_{\bf u,\beta} \left\| \begin{bmatrix} \bf y\\ \bf0 \end{bmatrix} - \begin{bmatrix} \bf Z\Lambda & \bf X\\ \bf I & \bf 0 \end{bmatrix} \begin{bmatrix} \bf u\\ \bf\beta \end{bmatrix} \right\|^2 $$
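As a concrete check, the stacked least squares problem above can be solved directly. The following is a minimal numpy sketch; the dimensions, the contents of $\bf Z$, $\bf X$, $\bf y$, and the choice of $\bf\Lambda$ are all made-up illustrative values, not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, p = 30, 4, 2                     # observations, random effects, fixed effects
Z = rng.standard_normal((n, q))        # random-effects model matrix
X = np.column_stack([np.ones(n), rng.standard_normal(n)])  # fixed-effects model matrix
y = rng.standard_normal(n)
Lambda = 0.5 * np.eye(q)               # relative covariance factor (illustrative)

# Stack the pseudo-data [y; 0] and the extended model matrix [Z*Lambda, X; I, 0]
yaug = np.concatenate([y, np.zeros(q)])
Maug = np.block([[Z @ Lambda, X],
                 [np.eye(q), np.zeros((q, p))]])

# Ordinary least squares on the stacked system gives the penalized solution
sol, *_ = np.linalg.lstsq(Maug, yaug, rcond=None)
u, beta = sol[:q], sol[q:]

# The residual norm of the stacked problem is the penalized rss
prss = np.sum((y - X @ beta - Z @ Lambda @ u) ** 2) + np.sum(u ** 2)
```

The residual of the stacked system is $[\bf y - X\hat{\beta} - Z\Lambda\tilde{u};\ -\tilde{u}]$, so its squared norm is exactly the penalized residual sum of squares.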

The solution is determined by forming the cross-product of the extended model matrix, $$ {\bf A}=\begin{bmatrix}\bf \Lambda'Z'Z\Lambda+I & \bf\Lambda'Z'X\\ \bf X'Z\Lambda & \bf X'X \end{bmatrix}, $$ and its Cholesky factor, which I will write in the upper-triangular form, $$ {\bf R}=\begin{bmatrix} \bf R_Z & \bf R_{ZX}\\ \bf 0 & \bf R_X \end{bmatrix}, $$ such that $$ \bf R'R = \bf A. $$ The solution is obtained by solving $$ \bf R'R \begin{bmatrix}\tilde{\bf u}\\\hat{\bf\beta}\end{bmatrix} = \begin{bmatrix}\bf\Lambda'Z'y\\\bf X'y\end{bmatrix}, $$ after which the penalized residual sum of squares (prss), $$ \bf\|y-X\hat{\beta}-Z\Lambda\tilde{u}\|^2+\|\tilde{\bf u}\|^2, $$ and the logarithm of the determinant, $\log|\bf R_Z|$, provide the profiled log-likelihood.
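This normal-equations route can be sketched in numpy/scipy; the data are made up, and the pair of triangular solves stands in for whatever solver a real implementation would use.

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(1)
n, q, p = 30, 4, 2
Z = rng.standard_normal((n, q))
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = rng.standard_normal(n)
Lambda = 0.5 * np.eye(q)               # illustrative relative covariance factor
ZL = Z @ Lambda

# Cross-product of the extended model matrix [Z*Lambda, X; I, 0]
A = np.block([[ZL.T @ ZL + np.eye(q), ZL.T @ X],
              [X.T @ ZL,              X.T @ X]])
R = np.linalg.cholesky(A).T            # upper-triangular factor: R'R = A

# Solve R'R x = [Lambda'Z'y; X'y] by two triangular solves
rhs = np.concatenate([ZL.T @ y, X.T @ y])
x = solve_triangular(R, solve_triangular(R.T, rhs, lower=True), lower=False)
u, beta = x[:q], x[q:]

# Pieces of the profiled log-likelihood
prss = np.sum((y - X @ beta - ZL @ u) ** 2) + np.sum(u ** 2)
logdet_RZ = np.sum(np.log(np.diag(R)[:q]))   # log|R_Z|: leading q diagonal entries
```

The leading $q\times q$ block of $\bf R$ is $\bf R_Z$, so $\log|\bf R_Z|$ is just the sum of the logarithms of the first $q$ diagonal elements.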

Augmenting the "normal equations"

The matrix $\bf A$ is of size $(q+p)\times(q+p)$, where $q$ is the number of columns in $\bf Z$ and $p$ is the number of columns in $\bf X$. Although $p$ is generally small, $q$ can be large, though not as large as $n$, the number of observations.

Evaluating the residual sum of squares requires evaluating the linear predictor, $\bf X\hat{\beta}+Z\Lambda\tilde{u}$, and the residuals. I have been experimenting with the augmented form of the model matrix for the penalized least squares problem, $$ \begin{bmatrix} \bf Z\Lambda & \bf X & \bf y\\ \bf I & \bf 0 & \bf 0 \end{bmatrix}, $$ the corresponding positive-definite system, $$ {\bf B} = \begin{bmatrix} \bf\Lambda'Z'Z\Lambda+I & \bf\Lambda'Z'X & \bf\Lambda'Z'y\\ \bf X'Z\Lambda & \bf X'X & \bf X'y\\ \bf y'Z\Lambda & \bf y'X & \bf y'y \end{bmatrix}, $$ and its upper Cholesky factor, $$ \bf R = \begin{bmatrix} \bf R_Z & \bf R_{ZX} & \bf r_{Zy}\\ \bf 0 & \bf R_{X} & \bf r_{Xy}\\ \bf 0 & \bf 0 & r_{\bf y} \end{bmatrix}. $$

I had expected that the scalar, $r_{\bf y}$, would be the square root of the residual sum of squares but discovered that it isn't. It is the square root of the penalized residual sum of squares, which makes sense if you consider the pseudo-data representation.
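A quick numerical check of this observation, again with made-up data: form the augmented system, factor it, and compare $r_{\bf y}^2$ to the prss computed from the pseudo-data solution.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, p = 30, 4, 2
Z = rng.standard_normal((n, q))
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = rng.standard_normal(n)
Lambda = 0.5 * np.eye(q)               # illustrative relative covariance factor
ZL = Z @ Lambda

# Augmented positive-definite system with y folded in as an extra column
B = np.block([[ZL.T @ ZL + np.eye(q), ZL.T @ X,        (ZL.T @ y)[:, None]],
              [X.T @ ZL,              X.T @ X,         (X.T @ y)[:, None]],
              [(y @ ZL)[None, :],     (y @ X)[None, :], np.array([[y @ y]])]])
R = np.linalg.cholesky(B).T            # upper-triangular factor of B
r_y = R[-1, -1]                        # last diagonal element

# Reference value: prss from the stacked pseudo-data least squares solution
Maug = np.block([[ZL, X], [np.eye(q), np.zeros((q, p))]])
sol, *_ = np.linalg.lstsq(Maug, np.concatenate([y, np.zeros(q)]), rcond=None)
u, beta = sol[:q], sol[q:]
prss = np.sum((y - X @ beta - ZL @ u) ** 2) + np.sum(u ** 2)
```

The last diagonal element squared is the Schur complement of $\bf y'y$ in $\bf B$, which is exactly the minimized value of the penalized least squares criterion.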

Thus it is only necessary to form the Cholesky factor of the augmented positive-definite system, $\bf B$, shown above to evaluate the profiled log-likelihood. This approach has two advantages:

  1. The matrix $\bf B$ can be generated from the set of cross-products, $\bf Z'Z$, $\bf Z'X$, $\bf Z'y$, $\bf X'X$, $\bf X'y$ and $\bf y'y$, plus the matrix $\bf\Lambda$. There is no need to use vectors of length $n$ or matrices with $n$ rows.
  2. Less computation and less fiddling around in the code are needed.
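Item 1 can be illustrated as follows; the data and the helper name `make_B` are my own, not from the post. The cross-products are formed once, and $\bf B$ is then assembled for any $\bf\Lambda$ without touching an array with $n$ rows.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, p = 30, 4, 2
Z = rng.standard_normal((n, q))
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = rng.standard_normal(n)

# One-time cross-products: the only pass over the n rows of the data
ZtZ, ZtX, Zty = Z.T @ Z, Z.T @ X, Z.T @ y
XtX, Xty, yty = X.T @ X, X.T @ y, y @ y

def make_B(Lambda):
    """Assemble the augmented system B for a given relative covariance factor."""
    LtZtZL = Lambda.T @ ZtZ @ Lambda
    LtZtX = Lambda.T @ ZtX
    LtZty = Lambda.T @ Zty
    return np.block([[LtZtZL + np.eye(q), LtZtX,          LtZty[:, None]],
                     [LtZtX.T,            XtX,            Xty[:, None]],
                     [LtZty[None, :],     Xty[None, :],   np.array([[yty]])]])

# Pieces of the profiled log-likelihood for one value of Lambda
R = np.linalg.cholesky(make_B(0.5 * np.eye(q))).T
prss = R[-1, -1] ** 2                  # penalized residual sum of squares
logdet_RZ = np.sum(np.log(np.diag(R)[:q]))
```

Each evaluation of the objective during optimization over the covariance parameters only rebuilds and factors the $(q+p+1)\times(q+p+1)$ matrix $\bf B$.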

In this package, I am working on a further partitioning of the augmented positive-definite system, $\bf B$. The partitioning is based on the vertical sections of $\bf Z$ determined by the grouping factors for the random-effects terms.