Take the dispersion into account for GLMM familes with a dispersion param#291
Take the dispersion into account for GLMM familes with a dispersion param#291
Conversation
Codecov ReportPatch coverage has no change and project coverage change:
Additional details and impacted files@@ Coverage Diff @@
## main #291 +/- ##
==========================================
- Coverage 95.83% 92.74% -3.10%
==========================================
Files 35 23 -12
Lines 3267 1722 -1545
==========================================
- Hits 3131 1597 -1534
+ Misses 136 125 -11
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
|
For tinkering and comparison to lme4, look at this gist. |
72ab191 to
0858435
Compare
ef8e380 to
7300ef3
Compare
|
After pulling my hair out for far too long trying to figure out why fast fits worked but 'slow' ones didn't, I tried a different optimizer. It turns out that I accidentally stumbled across a case where BOBYQA fails dramatically. The old two-stage optimization for slow fits didn't have this problem because it started from the fast fits don't. julia> ┌ Warning: Model has not been fit
└ @ MixedModels ~/Work/MixedModels.jl/src/generalizedlinearmixedmodel.jl:563
julia> bobyqa = GeneralizedLinearMixedModel(@formula(yield ~ 1 + (1|batch)), MixedModels.dataset(:dyestuff),Normal(),SqrtLink());
julia> fit!(bobyqa)
┌ Warning: Model has not been fit
└ @ MixedModels ~/Work/MixedModels.jl/src/generalizedlinearmixedmodel.jl:563
┌ Warning: Model has not been fit
└ @ MixedModels ~/Work/MixedModels.jl/src/generalizedlinearmixedmodel.jl:563
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
yield ~ 1 + (1 | batch)
Distribution: Normal{Float64}
Link: SqrtLink()
Deviance: 58893.7814
Variance components:
Column Variance Std.Dev.
batch (Intercept) 1954.5903 44.21075
Residual 2178.8889 46.67857
Number of obs: 30; levels of grouping factors: 6
Fixed-effects parameters:
──────────────────────────────────────────────────
Estimate Std.Error z value P(>|z|)
──────────────────────────────────────────────────
(Intercept) 39.0793 18.0493 2.17 0.0304
──────────────────────────────────────────────────
julia> bobyqa.optsum
Initial parameter vector: [39.08324449172631, 1.0]
Initial objective value: 58893.79526421842
Optimizer (from NLopt): LN_BOBYQA
Lower bounds: [-Inf, 0.0]
ftol_rel: 1.0e-12
ftol_abs: 1.0e-8
xtol_rel: 0.0
xtol_abs: [1.0e-10]
initial_step: [39.08324449172631, 0.75]
maxfeval: -1
Function evaluations: 26
Final parameter vector: [39.0788080375932, 0.998348758382195]
Final objective value: 58893.7814171595
Return code: FTOL_REACHED
julia> neldermead = GeneralizedLinearMixedModel(@formula(yield ~ 1 + (1|batch)), MixedModels.dataset(:dyestuff),Normal(),SqrtLink());
julia> ┌ Warning: Model has not been fit
└ @ MixedModels ~/Work/MixedModels.jl/src/generalizedlinearmixedmodel.jl:563
┌ Warning: Model has not been fit
└ @ MixedModels ~/Work/MixedModels.jl/src/generalizedlinearmixedmodel.jl:563
julia> neldermead.optsum.optimizer = :LN_NELDERMEAD;
julia> fit!(neldermead)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
yield ~ 1 + (1 | batch)
Distribution: Normal{Float64}
Link: SqrtLink()
Deviance: 58890.8511
Variance components:
Column Variance Std.Dev.
batch (Intercept) 599.565 24.486016
Residual 2178.889 46.678570
Number of obs: 30; levels of grouping factors: 6
Fixed-effects parameters:
──────────────────────────────────────────────────
Estimate Std.Error z value P(>|z|)
──────────────────────────────────────────────────
(Intercept) 39.0793 9.99691 3.91 <1e-4
──────────────────────────────────────────────────
julia> neldermead.optsum
Initial parameter vector: [39.08324449172631, 1.0]
Initial objective value: 58893.79526421842
Optimizer (from NLopt): LN_NELDERMEAD
Lower bounds: [-Inf, 0.0]
ftol_rel: 1.0e-12
ftol_abs: 1.0e-8
xtol_rel: 0.0
xtol_abs: [1.0e-10]
initial_step: [39.08324449172631, 0.75]
maxfeval: -1
Function evaluations: 80
Final parameter vector: [39.07930704835212, 0.5529134950119795]
Final objective value: 58890.851102667264
Return code: FTOL_REACHEDFor reference, here's the slow fit: julia> fastfit = fit(MixedModel, @formula(yield ~ 1 + (1|batch)), MixedModels.dataset(:dyestuff),Normal(),SqrtLink(), fast=true)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
yield ~ 1 + (1 | batch)
Distribution: Normal{Float64}
Link: SqrtLink()
Deviance: 58890.8511
Variance components:
Column Variance Std.Dev.
batch (Intercept) 599.6439 24.487628
Residual 2178.8889 46.678570
Number of obs: 30; levels of grouping factors: 6
Fixed-effects parameters:
──────────────────────────────────────────────────
Estimate Std.Error z value P(>|z|)
──────────────────────────────────────────────────
(Intercept) 39.0793 9.99757 3.91 <1e-4
──────────────────────────────────────────────────
julia> fastfit.optsum
Initial parameter vector: [1.0]
Initial objective value: 58893.79517237531
Optimizer (from NLopt): LN_BOBYQA
Lower bounds: [0.0]
ftol_rel: 1.0e-12
ftol_abs: 1.0e-8
xtol_rel: 0.0
xtol_abs: [1.0e-10]
initial_step: [0.75]
maxfeval: -1
Function evaluations: 26
Final parameter vector: [0.55294989485893]
Final objective value: 58890.85110260193
Return code: FTOL_REACHED |
|
Based on tinkering elsewhere (e.g. toying with fitting a Gamma model with identity and log links to the I don't know if the optimizer issues are particular to the datasets I've been testing on (which aren't ideal for the models I've been fitting) or indicative or a larger issue in optimization for GLMMs with dispersion parameters. But given all that, I think it's ready for an initial review. |
…mm_dispersion_deviance
…mm_dispersion_deviance
…mm_dispersion_deviance
…_dispersion_deviance
…_dispersion_deviance
|
Do you want to continue to use |
|
For tests, yes, because there could in theory be further optimizations to the MersenneTwister (or more specifically, methods that interpret the stream produced by MT). On the PRNG front: I'm still looking at how we can take advantage of Xoshiro and related improvements for our embarrassingly parallel things. |
…_dispersion_deviance
…_dispersion_deviance
Closes #206
varest(this is handled implictly)devianceloglikelihoodcontra)devianceloglikelihoodcbpp,verbagg)devianceloglikelihooddevianceloglikelihooddevianceloglikelihooddevianceloglikelihoodgrouseticks)devianceloglikelihoodNote (to self) that some distributions (e.g. Gamma) require changing the linear predictor + dispersion into a different parameterization (e.g. shape, scale).