example where lack of AD-based outer gradient (3rd deriv) makes a big (bad) difference #96

@paciorek

I just re-ran Laplace on the MSOM carnivores example (for the ISBA Bulletin piece). It runs much slower (something like 5x slower) if ADuseNormality=TRUE (which is now the default). It turns out the difference does not come from whether we use AD or analytic gradients with respect to the latent effects in the Laplace approximation. Rather, what makes a difference is that if we use normality we can't use an AD-based gradient of the Laplace approximation (for the outer optimization), and this turns out to matter a lot more than I would have expected (though I think Perry made a comment a while back indicating he was less sanguine than me about using finite difference derivatives for the outer optimization).
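To make the cost argument concrete, here is a toy sketch in plain Python (not NIMBLE code) of why a finite difference outer gradient can be expensive. Everything in it is an assumption for illustration: the quadratic objective stands in for the Laplace-approximated marginal likelihood, and `CountingObjective`, `fd_gradient`, `gradient_descent`, the step size, and the iteration count are all made-up names and values. It only counts how many objective evaluations a central finite difference gradient needs compared with a directly available gradient.

```python
class CountingObjective:
    """Toy stand-in for the outer objective; counts how often it is evaluated."""

    def __init__(self):
        self.calls = 0

    def __call__(self, theta):
        self.calls += 1
        # Hypothetical objective: a simple quadratic with its minimum at 1.
        return sum((t - 1.0) ** 2 for t in theta)


def exact_gradient(theta):
    """Gradient available directly (playing the role of an AD-based gradient)."""
    return [2.0 * (t - 1.0) for t in theta]


def fd_gradient(f, theta, h=1e-5):
    """Central finite difference gradient: two objective evaluations per parameter."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def gradient_descent(grad_fn, theta0, step=0.25, n_iter=20):
    """Plain fixed-step gradient descent, used only to drive gradient calls."""
    theta = list(theta0)
    for _ in range(n_iter):
        g = grad_fn(theta)
        theta = [t - step * gi for t, gi in zip(theta, g)]
    return theta


p = 10               # number of outer parameters (assumed)
theta0 = [0.0] * p

f_fd = CountingObjective()
theta_fd = gradient_descent(lambda th: fd_gradient(f_fd, th), theta0)
fd_calls = f_fd.calls  # 2 * p objective evaluations per gradient

theta_exact = gradient_descent(exact_gradient, theta0)

print("objective evaluations used by finite differences:", fd_calls)
```

In this sketch each gradient costs 2 * p objective evaluations under finite differences and none under the direct gradient. In a real Laplace approximation each objective evaluation would itself involve an inner optimization over the latent effects, so that multiplier would apply to a much more expensive step. Whether this accounts for the full 5x slowdown seen here is not something the toy example can show.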

I think that setting ADuseNormality=TRUE by default was a reasonable decision (see PR #60), but it seems like we might want to think more about rules for setting it to FALSE in some cases (perhaps if there are no large dmnorm nodes or no dmnorm nodes at all).

For a reproducible example, I have the code for the ISBA Bulletin piece (or we could go back to the MSOM report code).
