In [1]:
using DataFrames, Feather, MixedModels, RCall
INFO: Recompiling stale cache file /home/bates/.julia/lib/v0.5/DataStreams.ji for module DataStreams.
INFO: Recompiling stale cache file /home/bates/.julia/lib/v0.5/RCall.ji for module RCall.
INFO: Using R installation at /opt/anaconda3/lib/R
In [3]:
R"""
load("kb07.rda")
"""
@rget kb07;
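Since Feather is already loaded, an alternative way to move the data across is to write a Feather file from R and read it directly in Julia. A sketch only; the file name here is hypothetical:

# Alternative transfer (sketch): write a Feather file from the R session, read it in Julia.
R"feather::write_feather(kb07, 'kb07.feather')"
kb07 = Feather.read("kb07.feather")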
In [4]:
@time mm1 = fit!(lmm(RTtrunc ~ 1+P+S+C+SP+SC+PC+SPC+
(1+S+P+C+SP+SC+PC+SPC|subj)+
(1+S+P+C+SP+SC+PC+SPC|item), kb07))
53.580779 seconds (11.59 M allocations: 448.169 MB, 0.69% gc time)
Out[4]:
Linear mixed model fit by maximum likelihood
Formula: RTtrunc ~ 1 + P + S + C + SP + SC + PC + SPC + ((1 + S + P + C + SP + SC + PC + SPC) | subj) + ((1 + S + P + C + SP + SC + PC + SPC) | item)
     logLik        -2 logLik             AIC             BIC
-1.42931726×10⁴   2.85863452×10⁴   2.87483452×10⁴   2.91930328×10⁴
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 90612.3205 301.018804
S 5180.4346 71.975236 -0.44
P 5530.2096 74.365379 -0.47 0.08
C 7580.6305 87.066817 0.21 -0.20 0.41
SP 8838.5383 94.013501 0.19 -0.76 -0.54 -0.20
SC 1813.9315 42.590275 0.47 -0.53 -0.10 -0.44 0.28
PC 7393.0576 85.982892 -0.10 0.13 -0.05 -0.86 -0.06 0.70
SPC 3808.3909 61.712162 -0.48 0.41 -0.39 -0.09 0.18 -0.78 -0.39
item (Intercept) 129609.6341 360.013380
S 1842.1843 42.920674 -0.35
P 62348.4893 249.696795 -0.68 -0.45
C 2920.9622 54.045927 0.20 -0.05 -0.18
SP 1054.5444 32.473750 0.57 -0.78 0.02 0.02
SC 1663.1372 40.781579 0.27 0.05 -0.26 0.45 -0.21
PC 4721.5379 68.713448 0.08 -0.24 0.21 -0.13 -0.26 0.02
SPC 4813.1935 69.377183 0.04 -0.50 0.32 -0.69 0.65 -0.67 -0.10
Residual 399646.8438 632.176276
Number of obs: 1790; levels of grouping factors: 56, 32
Fixed-effects parameters:
Estimate Std.Error z value P(>|z|)
(Intercept) 2180.63 76.7573 28.4094 <1e-99
P -333.878 47.6492 -7.00702 <1e-11
S -66.9927 19.3229 -3.46701 0.0005
C 78.9886 21.2119 3.72378 0.0002
SP 22.1534 20.3491 1.08867 0.2763
SC -18.9216 17.5403 -1.07875 0.2807
PC 5.25918 22.4247 0.234527 0.8146
SPC -23.9526 21.0171 -1.13967 0.2544
In [5]:
mm1.optsum
Out[5]:
Initial parameter vector: [1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0 … 1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0]
Initial objective value: 30014.36976860628
Optimizer (from NLopt): LN_BOBYQA
Lower bounds: [0.0,-Inf,-Inf,-Inf,-Inf,-Inf,-Inf,-Inf,0.0,-Inf … 0.0,-Inf,-Inf,-Inf,0.0,-Inf,-Inf,0.0,-Inf,0.0]
ftol_rel: 1.0e-12
ftol_abs: 1.0e-8
xtol_rel: 0.0
xtol_abs: [1.0e-10,1.0e-10,1.0e-10,1.0e-10,1.0e-10,1.0e-10,1.0e-10,1.0e-10,1.0e-10,1.0e-10 … 1.0e-10,1.0e-10,1.0e-10,1.0e-10,1.0e-10,1.0e-10,1.0e-10,1.0e-10,1.0e-10,1.0e-10]
Function evaluations: 1742
Final parameter vector: [0.476163,-0.0495364,-0.055594,0.0292095,0.0289941,0.0319883,-0.0136437,-0.0464125,0.102512,-0.0167764 … 0.000173176,-0.0377045,0.0362933,0.0137344,0.029079,-0.0245135,-0.0120874,4.23458e-5,0.000132778,1.15826e-6]
Final objective value: 28586.345191687305
Return code: FTOL_REACHED
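Each of the two 8×8 unstructured covariance blocks contributes 36 free parameters, so the derivative-free optimizer is searching a 72-dimensional space, which helps explain the 1742 function evaluations. The count:

2 * ((8 * 9) ÷ 2)   # 36 lower-triangular elements per block, 72 θ parameters in all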
In [6]:
mm1.Λ[1]
Out[6]:
8×8 LowerTriangular{Float64,Array{Float64,2}}:
0.476163 ⋅ ⋅ ⋅ … ⋅ ⋅
-0.0495364 0.102512 ⋅ ⋅ ⋅ ⋅
-0.055594 -0.0167764 0.102302 ⋅ ⋅ ⋅
0.0292095 -0.0160742 0.0774595 0.108889 ⋅ ⋅
0.0289941 -0.11122 -0.094225 0.00518803 ⋅ ⋅
0.0319883 -0.024384 0.00557116 -0.053748 … ⋅ ⋅
-0.0136437 0.0124569 -0.0128618 -0.134133 0.0 ⋅
-0.0464125 0.0218546 -0.0653738 0.0512141 7.48771e-7 0.000132175
In [7]:
mm1.Λ[2]
Out[7]:
8×8 LowerTriangular{Float64,Array{Float64,2}}:
0.569483 ⋅ ⋅ … ⋅ ⋅ ⋅
-0.0234283 0.0637232 ⋅ ⋅ ⋅ ⋅
-0.266682 -0.288296 0.0421306 ⋅ ⋅ ⋅
0.0174978 0.00218032 -0.0211528 ⋅ ⋅ ⋅
0.0291642 -0.0320205 -0.025332 ⋅ ⋅ ⋅
0.017738 0.00989025 0.0212628 … 0.029079 ⋅ ⋅
0.00863582 -0.0246445 0.0955937 -0.0245135 4.23458e-5 ⋅
0.00489853 -0.0568596 -0.0284809 -0.0120874 0.000132778 1.15826e-6
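Each Λ block is the lower-triangular relative covariance factor for one grouping factor: the covariance matrix of its random effects is σ²ΛΛ'. A minimal check, assuming varest returns the estimated residual variance:

# Covariance matrix of the by-subject random effects is σ² Λ Λ' (sketch)
λ = mm1.Λ[1]
Σsubj = varest(mm1) * (λ * λ')
sqrt.(diag(Σsubj))   # should reproduce the subj Std.Dev. column in the summary above

The near-zero trailing diagonal elements of both blocks (0.0 and 0.000132175 for subj, 4.23458e-5 and 1.15826e-6 for item) suggest that the estimated covariance matrices are effectively singular.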
In [8]:
R"""
baboon <- within(readRDS("finaldat.tbt.rds"), {
stimulus <- factor(stimulus)
lex <- factor(lex)
Outcomes <- factor(Outcomes)
id <- factor(id)
dec.FINAL <- factor(dec.FINAL)
dec.GAM2 <- factor(dec.GAM2)
})
str(baboon)
"""
'data.frame': 316871 obs. of 44 variables:
$ ID : int 1 2 3 4 5 6 7 8 9 10 ...
$ trial : int 1 2 3 4 5 6 7 8 9 10 ...
$ subject : Factor w/ 6 levels "ARI","ART","CAU",..: 4 4 4 4 4 4 4 4 4 4 ...
$ stimulus : Factor w/ 8139 levels "ABBS","ABBT",..: 6929 1613 7705 7220 339 339 7244 6929 5483 92 ...
$ lex : Factor w/ 2 levels "NW","W": 2 1 2 1 2 2 1 2 1 1 ...
$ acc : int 0 1 0 1 0 0 1 0 1 1 ...
$ Response : Factor w/ 2 levels "NW","W": 1 1 1 1 1 1 1 1 1 1 ...
$ Outcomes : Factor w/ 2 levels "NW","W": 2 1 2 1 2 2 1 2 1 1 ...
$ Block : num 1 1 1 1 1 1 1 1 1 1 ...
$ dec.HOG : Factor w/ 2 levels "NW","W": 2 1 1 1 1 2 2 1 1 2 ...
$ dec.HOG.0 : logi FALSE TRUE FALSE TRUE FALSE TRUE ...
$ actN : num 0 0 0.279 0.147 0.308 ...
$ actW : num 0 0.144 0.112 0.278 0.266 ...
$ priorN : num 0 0 0.59 0.596 1.094 ...
$ priorW : num 0 0.507 0.544 0.979 1.025 ...
$ first : logi TRUE TRUE TRUE TRUE TRUE FALSE ...
$ agr : num 0 1 1 1 1 0 0 1 1 0 ...
$ acc.model : num 1 1 0 1 0 1 0 0 1 0 ...
$ diff : num 0 -0.1438 -0.1666 -0.1312 -0.0422 ...
$ diff.abs : num 0 0.1438 0.1666 0.1312 0.0422 ...
$ id : Factor w/ 308 levels "ACME","AHEM",..: 266 172 279 172 9 9 172 266 172 172 ...
$ first.app : Factor w/ 52 levels "1","3","4","5",..: 1 NA 1 NA 1 1 NA 1 NA NA ...
$ dist : Factor w/ 62 levels "0","1","2","3",..: 1 NA 1 NA 1 1 NA 1 NA NA ...
$ first.app.num: num 1 1 1 1 1 1 1 1 1 1 ...
$ first.trial : num 1 1 1 1 1 1 1 1 1 1 ...
$ priorsWDS : num 0 0 0 0 0 ...
$ actsWDS : num 0 0 0 0 0 ...
$ dec.FINAL : Factor w/ 2 levels "NW","W": 1 1 1 1 1 1 2 1 1 2 ...
$ unique : num 0 0 2 2 3 3 3 3 3 3 ...
$ old20 : num 0 0 NA NA NA NA NA NA NA NA ...
$ Correct : logi FALSE TRUE FALSE TRUE FALSE FALSE ...
$ acc.f : logi FALSE TRUE FALSE TRUE FALSE FALSE ...
$ first.tri : Factor w/ 1991 levels "ABB","ABL","ACH",..: 1693 390 1873 1747 69 69 1750 1693 1354 16 ...
$ last.tri : Factor w/ 1539 levels "ABS","ABT","ACE",..: 942 310 270 597 40 40 633 942 1232 365 ...
$ first.bi : Factor w/ 184 levels "AB","AC","AD",..: 159 44 175 165 17 17 165 159 132 6 ...
$ mid.bi : Factor w/ 205 levels "AB","AC","AD",..: 136 56 48 94 11 11 97 136 167 63 ...
$ last.bi : Factor w/ 209 levels "AB","AD","AF",..: 152 178 116 86 118 118 125 152 67 107 ...
$ id.f : Factor w/ 308 levels "ACME","AHEM",..: 266 172 279 172 9 9 172 266 172 172 ...
$ dec.GAM : logi NA NA NA NA NA NA ...
$ GAM : num 0.206 0.139 0.115 0.139 0.113 ...
$ acc.gam : num 1 1 0 0 0 0 1 1 1 1 ...
$ GAM2 : num 0.2599 0.1071 0.0872 0.1071 0.087 ...
$ dec.GAM2 : Factor w/ 2 levels "NW","W": 1 1 1 1 1 1 2 1 1 1 ...
$ acc.gam2 : num 0 1 0 1 0 0 0 0 1 1 ...
Out[8]:
RCall.RObject{RCall.NilSxp}
NULL
In [9]:
@rget(baboon);
In [10]:
for (n,v) in eachcol(baboon)
println(rpad(n, 20), typeof(v))
end
ID Array{Int32,1}
trial Array{Int32,1}
subject CategoricalArrays.CategoricalArray{String,1,UInt32}
stimulus CategoricalArrays.CategoricalArray{String,1,UInt32}
lex CategoricalArrays.CategoricalArray{String,1,UInt32}
acc Array{Int32,1}
Response CategoricalArrays.CategoricalArray{String,1,UInt32}
Outcomes CategoricalArrays.CategoricalArray{String,1,UInt32}
Block Array{Float64,1}
dec.HOG CategoricalArrays.CategoricalArray{String,1,UInt32}
dec.HOG.0 BitArray{1}
actN Array{Float64,1}
actW Array{Float64,1}
priorN Array{Float64,1}
priorW Array{Float64,1}
first BitArray{1}
agr Array{Float64,1}
acc.model Array{Float64,1}
diff Array{Float64,1}
diff.abs Array{Float64,1}
id CategoricalArrays.CategoricalArray{String,1,UInt32}
first.app CategoricalArrays.NullableCategoricalArray{String,1,UInt32}
dist CategoricalArrays.NullableCategoricalArray{String,1,UInt32}
first.app.num NullableArrays.NullableArray{Float64,1}
first.trial Array{Float64,1}
priorsWDS Array{Float64,1}
actsWDS Array{Float64,1}
dec.FINAL CategoricalArrays.CategoricalArray{String,1,UInt32}
unique Array{Float64,1}
old20 NullableArrays.NullableArray{Float64,1}
Correct BitArray{1}
acc.f BitArray{1}
first.tri CategoricalArrays.CategoricalArray{String,1,UInt32}
last.tri CategoricalArrays.CategoricalArray{String,1,UInt32}
first.bi CategoricalArrays.CategoricalArray{String,1,UInt32}
mid.bi CategoricalArrays.CategoricalArray{String,1,UInt32}
last.bi CategoricalArrays.CategoricalArray{String,1,UInt32}
id.f CategoricalArrays.CategoricalArray{String,1,UInt32}
dec.GAM NullableArrays.NullableArray{Bool,1}
GAM Array{Float64,1}
acc.gam Array{Float64,1}
GAM2 Array{Float64,1}
dec.GAM2 CategoricalArrays.CategoricalArray{String,1,UInt32}
acc.gam2 Array{Float64,1}
In Julia, column names with embedded . characters must be quoted in an awkward way. It is better to replace the . characters with _.
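For example, without the renaming a column such as dec.HOG.0 can only be indexed by constructing the Symbol explicitly:

baboon[Symbol("dec.HOG.0")]   # awkward; after renaming this is simply baboon[:dec_HOG_0]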
In [11]:
nms_ = [Symbol(replace(string(nm), '.', '_')) for nm in names(baboon)]
names!(baboon, nms_)
baboon[:dec_HOG_0] = convert(Vector{Int}, baboon[:dec_HOG_0])
for (n,v) in eachcol(baboon)
println(rpad(n, 20), typeof(v))
end
ID Array{Int32,1}
trial Array{Int32,1}
subject CategoricalArrays.CategoricalArray{String,1,UInt32}
stimulus CategoricalArrays.CategoricalArray{String,1,UInt32}
lex CategoricalArrays.CategoricalArray{String,1,UInt32}
acc Array{Int32,1}
Response CategoricalArrays.CategoricalArray{String,1,UInt32}
Outcomes CategoricalArrays.CategoricalArray{String,1,UInt32}
Block Array{Float64,1}
dec_HOG CategoricalArrays.CategoricalArray{String,1,UInt32}
dec_HOG_0 DataArrays.DataArray{Int64,1}
actN Array{Float64,1}
actW Array{Float64,1}
priorN Array{Float64,1}
priorW Array{Float64,1}
first BitArray{1}
agr Array{Float64,1}
acc_model Array{Float64,1}
diff Array{Float64,1}
diff_abs Array{Float64,1}
id CategoricalArrays.CategoricalArray{String,1,UInt32}
first_app CategoricalArrays.NullableCategoricalArray{String,1,UInt32}
dist CategoricalArrays.NullableCategoricalArray{String,1,UInt32}
first_app_num NullableArrays.NullableArray{Float64,1}
first_trial Array{Float64,1}
priorsWDS Array{Float64,1}
actsWDS Array{Float64,1}
dec_FINAL CategoricalArrays.CategoricalArray{String,1,UInt32}
unique Array{Float64,1}
old20 NullableArrays.NullableArray{Float64,1}
Correct BitArray{1}
acc_f BitArray{1}
first_tri CategoricalArrays.CategoricalArray{String,1,UInt32}
last_tri CategoricalArrays.CategoricalArray{String,1,UInt32}
first_bi CategoricalArrays.CategoricalArray{String,1,UInt32}
mid_bi CategoricalArrays.CategoricalArray{String,1,UInt32}
last_bi CategoricalArrays.CategoricalArray{String,1,UInt32}
id_f CategoricalArrays.CategoricalArray{String,1,UInt32}
dec_GAM NullableArrays.NullableArray{Bool,1}
GAM Array{Float64,1}
acc_gam Array{Float64,1}
GAM2 Array{Float64,1}
dec_GAM2 CategoricalArrays.CategoricalArray{String,1,UInt32}
acc_gam2 Array{Float64,1}
In [12]:
mm1 = fit!(glmm(acc ~ 1 + dec_HOG_0 + (1 | stimulus) + (1 | subject), baboon, Bernoulli()))
Out[12]:
Generalized Linear Mixed Model fit by minimizing the Laplace approximation to the deviance
Formula: acc ~ 1 + dec_HOG_0 + (1 | stimulus) + (1 | subject)
Distribution: Distributions.Bernoulli{Float64}
Link: GLM.LogitLink()
Deviance (Laplace approximation): 344103.0381
Variance components:
Column Variance Std.Dev.
stimulus (Intercept) 0.729003384 0.8538169
subject (Intercept) 0.019953909 0.1412583
Number of obs: 316871; levels of grouping factors: 8139, 6
Fixed-effects parameters:
Estimate Std.Error z value P(>|z|)
(Intercept) 1.06353 0.0587912 18.09 <1e-72
dec_HOG_0 -0.176357 0.0131468 -13.4144 <1e-40
In [13]:
@time mm1a = fit!(glmm(
acc ~ 1 + dec_HOG_0 + (1 | stimulus) + (1 | subject), baboon, Bernoulli()), fast = true)
20.402963 seconds (210.08 k allocations: 1.302 GB, 1.35% gc time)
Out[13]:
Generalized Linear Mixed Model fit by minimizing the Laplace approximation to the deviance
Formula: acc ~ 1 + dec_HOG_0 + (1 | stimulus) + (1 | subject)
Distribution: Distributions.Bernoulli{Float64}
Link: GLM.LogitLink()
Deviance (Laplace approximation): 344103.8657
Variance components:
Column Variance Std.Dev.
stimulus (Intercept) 0.728887359 0.85374900
subject (Intercept) 0.021818356 0.14771038
Number of obs: 316871; levels of grouping factors: 8139, 6
Fixed-effects parameters:
Estimate Std.Error z value P(>|z|)
(Intercept) 1.01788 0.0613767 16.5841 <1e-61
dec_HOG_0 -0.169316 0.0131424 -12.8832 <1e-37
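With fast = true the optimizer works only on the covariance parameters, with the fixed effects updated by penalized iteratively reweighted least squares (PIRLS) inside each objective evaluation, so the fit is considerably faster at the cost of a slightly larger deviance. Comparing with the default fit in Out[12], using the printed values:

344103.8657 - 344103.0381   # ≈ 0.83, the deviance penalty of the fast fit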
In [14]:
mm1a.LMM.optsum
Out[14]:
Initial parameter vector: [1.0,1.0]
Initial objective value: 344296.1817233196
Optimizer (from NLopt): LN_BOBYQA
Lower bounds: [0.0,0.0]
ftol_rel: 1.0e-12
ftol_abs: 1.0e-8
xtol_rel: 0.0
xtol_abs: [1.0e-10,1.0e-10]
Function evaluations: 35
Final parameter vector: [0.853749,0.14771]
Final objective value: 344103.86574724223
Return code: FTOL_REACHED
In [15]:
@time mm1 = fit!(glmm(
acc ~ 1 + dec_HOG_0 + (1 | stimulus) + (1 | subject), baboon, Bernoulli()))
89.008223 seconds (248.00 k allocations: 4.008 GB, 0.53% gc time)
Out[15]:
Generalized Linear Mixed Model fit by minimizing the Laplace approximation to the deviance
Formula: acc ~ 1 + dec_HOG_0 + (1 | stimulus) + (1 | subject)
Distribution: Distributions.Bernoulli{Float64}
Link: GLM.LogitLink()
Deviance (Laplace approximation): 344103.0381
Variance components:
Column Variance Std.Dev.
stimulus (Intercept) 0.729003384 0.8538169
subject (Intercept) 0.019953909 0.1412583
Number of obs: 316871; levels of grouping factors: 8139, 6
Fixed-effects parameters:
Estimate Std.Error z value P(>|z|)
(Intercept) 1.06353 0.0587912 18.09 <1e-72
dec_HOG_0 -0.176357 0.0131468 -13.4144 <1e-40
In [16]:
mm1.LMM.optsum
Out[16]:
Initial parameter vector: [1.0,1.0]
Initial objective value: 344295.93829161214
Optimizer (from NLopt): LN_BOBYQA
Lower bounds: [0.0,0.0]
ftol_rel: 1.0e-12
ftol_abs: 1.0e-8
xtol_rel: 0.0
xtol_abs: [1.0e-10,1.0e-10]
Function evaluations: 81
Final parameter vector: [1.06353,-0.176357,0.853817,0.141258]
Final objective value: 344103.0380669706
Return code: FTOL_REACHED
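The two optimization summaries show the difference directly: the fast fit optimized only the two standard-deviation parameters (35 evaluations), while the default fit optimized the fixed effects jointly with them (81 evaluations), so its final parameter vector is the concatenation [β; θ]. A quick confirmation (fixef extracts the fixed-effects estimates):

fixef(mm1)   # [1.06353, -0.176357], the leading entries of the final parameter vector above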
In [17]:
@time mm2 = fit!(glmm(
acc ~ 1 + dec_HOG_0 + (1 | stimulus) + (1 | first_tri) + (1 | last_tri) + (1 | subject),
baboon, Bernoulli()))
1122.436415 seconds (210.29 M allocations: 110.799 GB, 1.67% gc time)
Out[17]:
Generalized Linear Mixed Model fit by minimizing the Laplace approximation to the deviance
Formula: acc ~ 1 + dec_HOG_0 + (1 | stimulus) + (1 | first_tri) + (1 | last_tri) + (1 | subject)
Distribution: Distributions.Bernoulli{Float64}
Link: GLM.LogitLink()
Deviance (Laplace approximation): 337966.5911
Variance components:
Column Variance Std.Dev.
stimulus (Intercept) 0.016587679 0.12879316
first_tri (Intercept) 0.225358482 0.47471937
last_tri (Intercept) 0.331953946 0.57615445
subject (Intercept) 0.021949342 0.14815310
Number of obs: 316871; levels of grouping factors: 8139, 1991, 1539, 6
Fixed-effects parameters:
Estimate Std.Error z value P(>|z|)
(Intercept) 1.09759 0.064784 16.9424 <1e-63
dec_HOG_0 -0.132331 0.0130988 -10.1025 <1e-23
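Adding crossed random intercepts for the first and last trigrams lowers the Laplace deviance substantially and absorbs most of the by-stimulus variance (0.729 in mm1 versus 0.017 here). The reduction, computed from the printed values:

344103.0381 - 337966.5911   # ≈ 6136.4, gained from two additional variance components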
Content source: dmbates/MixedModelsinJulia