Representation of mixed-effects models

The representation of a linear mixed-effects model

type LinearMixedModel{T <: AbstractFloat} <: MixedModel
    formula::Formula
    mf::ModelFrame
    wttrms::Vector
    trms::Vector
    sqrtwts::Diagonal{T}
    Λ::Vector
    A::Hermitian
    L::LowerTriangular
    optsum::OptSummary{T}
end

underlies all the other types of mixed-effects models. The members of this struct are

  • formula: the formula for the model
  • mf: the model frame, the terms component for labelling fixed effects
  • wttrms: a length nt vector of weighted model matrices. The last two elements are X and y.
  • trms: a vector of unweighted model matrices. If isempty(sqrtwts) the same object as wttrms
  • Λ: a length nt - 2 vector of lower triangular matrices
  • sqrtwts: the Diagonal matrix of the square roots of the case weights. Allowed to be size 0
  • A: a Hermitian blocked matrix representing hcat(Z,X,y)'hcat(Z,X,y)
  • L: a LowerTriangular blocked matrix - the Cholesky factor of Λ'AΛ+I
  • opt: an OptSummary object

If there are no case weights then the size of sqrtwts is $0\times 0$ and wttrms is the same as trms. To describe the other components, it is helpful to consider a few simple examples.

A model with a single, scalar random-effects term


In [1]:
using Feather, LinearAlgebra, MixedModels

In [2]:
dsfilename = Pkg.dir("MixedModels", "test", "data", "Dyestuff.feather")
dyestuff = Feather.read(dsfilename, nullable = false);

In [3]:
m1 = lmm(Yield ~ 1 + (1 | Batch), dyestuff);

In the model formula there are three terms: the response term, Yield, the Intercept term, 1, which is part of the fixed-effects, and the scalar random-effects term, (1 | Batch). Random-effects terms consist of an expression on the left-hand side of the |, which is evaluated as a model matrix, and an expression on the right hand side: Batch, in this case, which is evaluated to a categorical array called the grouping factor for the term. When the left-hand side evaluates to a model matrix with a single column, as it does here, the term is said to be a scalar random-effects term.

The trms member contains these three terms in the order random-effects, fixed-effects, response.


In [4]:
m1.trms


Out[4]:
3-element Array{Any,1}:
 MixedModels.ScalarReMat{Float64,String,Int32}(CategoricalArrays.CategoricalValue{String,Int32}["A","A","A","A","A","B","B","B","B","B"  …  "E","E","E","E","E","F","F","F","F","F"],[1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0  …  1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0],:Batch,String["(Intercept)"])
 [1.0; 1.0; … ; 1.0; 1.0]                                                                                                                                                                                                                                                                               
 [1545.0; 1440.0; … ; 1480.0; 1445.0]                                                                                                                                                                                                                                                                   

The ScalarReMat is a compact representation of the indicator matrix for the grouping factor. Its columns are indicators of the 6 levels of the grouping factor, Batch.


In [5]:
full(m1.trms[1])


Out[5]:
30×6 Array{Float64,2}:
 1.0  0.0  0.0  0.0  0.0  0.0
 1.0  0.0  0.0  0.0  0.0  0.0
 1.0  0.0  0.0  0.0  0.0  0.0
 1.0  0.0  0.0  0.0  0.0  0.0
 1.0  0.0  0.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0  0.0  0.0
 0.0  0.0  1.0  0.0  0.0  0.0
 0.0  0.0  1.0  0.0  0.0  0.0
 0.0  0.0  1.0  0.0  0.0  0.0
 ⋮                        ⋮  
 0.0  0.0  0.0  1.0  0.0  0.0
 0.0  0.0  0.0  1.0  0.0  0.0
 0.0  0.0  0.0  0.0  1.0  0.0
 0.0  0.0  0.0  0.0  1.0  0.0
 0.0  0.0  0.0  0.0  1.0  0.0
 0.0  0.0  0.0  0.0  1.0  0.0
 0.0  0.0  0.0  0.0  1.0  0.0
 0.0  0.0  0.0  0.0  0.0  1.0
 0.0  0.0  0.0  0.0  0.0  1.0
 0.0  0.0  0.0  0.0  0.0  1.0
 0.0  0.0  0.0  0.0  0.0  1.0
 0.0  0.0  0.0  0.0  0.0  1.0

The A member is a blocked matrix with $3$ blocks in each dimension, corresponding to the 3 terms. The $(i, j)$ block is m1.trms[i]'m1.trms[j] Because of symmetry only the blocks in the lower triangle are stored explicitly. (Generally the term Hermitian is used in Julia rather than symmetric even for real-valued matrices, where they are synonyms.)


In [6]:
m1.A[1, 1]


Out[6]:
6×6 Diagonal{Float64}:
 5.0   ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅   5.0   ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅   5.0   ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅   5.0   ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅   5.0   ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅   5.0

In [7]:
m1.A[2, 1]


Out[7]:
1×6 Array{Float64,2}:
 5.0  5.0  5.0  5.0  5.0  5.0

In [8]:
m1.A[2, 2]


Out[8]:
1×1 Array{Float64,2}:
 30.0

In [9]:
m1.A[3, 1]


Out[9]:
1×6 Array{Float64,2}:
 7525.0  7640.0  7820.0  7490.0  8000.0  7350.0

In [10]:
m1.A[3, 2]


Out[10]:
1×1 Array{Float64,2}:
 45825.0

In [11]:
m1.A[3, 3]


Out[11]:
1×1 Array{Float64,2}:
 7.01129e7

Blocks on the diagonal of A will be positive-semi-definite Hermitian matrices. More importantly, diagonal blocks corresponding to scalar random-effects terms are diagonal matrices.

The Λ member is a vector of lower-triangular matrices whose values are modified during the iterations. They correspond to the random-effects terms. In the case of a scalar random-effects term, the corresponding element of Λ is a multiple of the identity. The multiple is the parameter $\theta$ over which the log-likelihood is optimized.


In [12]:
m1.Λ


Out[12]:
1-element Array{UniformScaling{Float64},1}:
 UniformScaling{Float64}
1.0*I

In [13]:
setθ!(m1, [0.75258]);
m1.Λ


Out[13]:
1-element Array{UniformScaling{Float64},1}:
 UniformScaling{Float64}
0.75258*I

After setting a value of $\theta$ the blocked lower Cholesky is updated.

function cholBlocked!{T}(m::LinearMixedModel{T})
    A, Λ, L = m.A.data.blocks, m.Λ, m.L.data.blocks
    n = LinAlg.checksquare(A)
    for j in 1:n, i in j:n
        inject!(L[i, j], A[i, j])  # like copy! but L can be more general than A
    end
    for (j, λ) in enumerate(Λ)
        for i in j:n
            A_mul_B!(L[i, j], λ)
        end
        for jj in 1:j
            Ac_mul_B!(λ, L[j, jj])
        end
        L[j, j] += I
    end
    for j in 1:n
        Ljj = L[j, j]
        cholUnblocked!(Ljj, Val{:L})
        Ljjlt = isa(Ljj, Diagonal) ? Ljj : LowerTriangular(Ljj)
        for i in (j + 1):n
            LinAlg.A_rdiv_Bc!(L[i, j], Ljjlt)
        end
        for i in (j + 1):n
            Lij = L[i, j]
            Lii = L[i, i]
            rankUpdate!(-one(T), Lij, isa(Lii, Diagonal) ? Lii : Hermitian(Lii, :L))
            for jj in (i + 1):n
                A_mul_Bc!(-one(T), L[jj, j], Lij, one(T), L[jj, i])
            end
        end
    end
    m
end