The epinowcast package currently uses Hamiltonian Monte Carlo (HMC) via Stan (cmdstanr) to perform Bayesian inference. HMC is an accurate but expensive (slow) inference method. Other inference methods (e.g. variational inference, Laplace approximations) are less accurate, but less expensive (faster).
Hypothetically, it would be preferable to estimate R(t) across all states simultaneously using a joint spatial model. This would allow for partial pooling of information. However, implementing a spatial nowcasting model which runs in a reasonable time using HMC is challenging.
To facilitate this application, as well as other potential extensions of the epinowcast methodology, I propose to implement the epinowcast model (it remains to define exactly what is meant by "the model") using Template Model Builder (TMB). The TMB R package allows users to specify the log-posterior for their model and perform inference using a marginal Laplace approximation. It is also possible to use TMB to perform integrated nested Laplace approximations (see Chapter 6 of my PhD thesis).
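To make the TMB workflow concrete, here is a minimal sketch for a toy random-walk plus Poisson count model (deliberately not the epinowcast model; the template, file names, priors, and simulated data are all placeholder assumptions):

```r
library(TMB)

# Toy C++ template: random-walk latent log-intensity with Poisson observations
cpp <- '
#include <TMB.hpp>
template<class Type>
Type objective_function<Type>::operator() ()
{
  DATA_VECTOR(y);               // observed counts
  PARAMETER_VECTOR(log_lambda); // latent log-intensity (random effects)
  PARAMETER(log_sigma);         // log innovation SD (fixed effect)
  Type sigma = exp(log_sigma);
  Type nll = 0;                 // negative log-posterior
  // Random-walk prior on the latent log-intensity
  for (int t = 1; t < log_lambda.size(); t++)
    nll -= dnorm(log_lambda(t), log_lambda(t - 1), sigma, true);
  // Poisson observation model
  for (int t = 0; t < y.size(); t++)
    nll -= dpois(y(t), exp(log_lambda(t)), true);
  // Weak prior on log_sigma
  nll -= dnorm(log_sigma, Type(0), Type(1), true);
  return nll;
}
'
writeLines(cpp, "rw_poisson.cpp")
compile("rw_poisson.cpp")
dyn.load(dynlib("rw_poisson"))

set.seed(1)
y <- rpois(50, lambda = exp(2 + cumsum(rnorm(50, sd = 0.1))))

obj <- MakeADFun(
  data = list(y = y),
  parameters = list(log_lambda = rep(log(mean(y)), length(y)), log_sigma = 0),
  random = "log_lambda",  # integrated out with a (marginal) Laplace approximation
  DLL = "rw_poisson"
)
opt <- nlminb(obj$par, obj$fn, obj$gr)  # optimise the Laplace-approximated marginal
sdr <- sdreport(obj)                    # standard errors for fixed and random effects
summary(sdr, "fixed")
```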
As the epinowcast package itself has a lot of Stan code in it, I imagine the best place to start with this project will be to port over some of the work @adrianlison has done in generative-nowcasting-study. The next step would be to test the inferential accuracy of the TMB version of the model against an HMC version. This comparison is aided by the tmbstan R package. Supposing the accuracy is sufficiently good, it would then be worth moving on to defining a good spatial extension. (Also, even users not looking to use a more complicated model may want a different trade-off between speed and accuracy.)
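Continuing the toy sketch above, tmbstan runs NUTS on the same TMB objective (reusing `obj` and `sdr` from the previous block), so the Laplace and HMC fits share identical model code:

```r
library(tmbstan)

# NUTS over all parameters (latent field and hyperparameters) of the same objective
fit_nuts <- tmbstan(obj, chains = 4, iter = 2000)
print(fit_nuts, pars = "log_sigma")

# tmbstan(obj, laplace = TRUE) would instead sample only the fixed effects, with the
# random effects Laplace-marginalised, giving a direct check of the approximation.

# Compare against the marginal Laplace estimate of the same hyperparameter
summary(sdr, "fixed")
```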
Moving forward, I am also excited about the integration of other data sources into the R(t) estimates. Enabling more complex model structures is a positive step in that direction.
Posting here for potential feedback, discussion, advice! (Perhaps I should edit to be more clear about my uncertainties and points I’d particularly appreciate input on!)
Hey @athowes, that's an exciting project! I was wondering to what extent the laplace() and variational() methods in cmdstanr could give us an idea of the approximation accuracy to expect (as a cheap test for different versions of the model), compared with the accuracy of marginal/nested Laplace approximations in TMB…
A full Laplace approximates the whole posterior by a Gaussian, whereas the marginal Laplace approximates just a marginal posterior distribution. I'm not that familiar with the model, but I would assume there are some parameters (e.g. those with non-Gaussian priors) for which using a Gaussian would seem quite inappropriate. I'm unsure exactly how laplace() works in cmdstanr. (Presumably they are using approximations on the unconstrained space, then mapping back?) So in that sense, I would expect the marginal Laplace to perform better, but it's unclear to me how much better (and at what cost).
About the variational() method, I would assume it is inaccurate in estimating the uncertainty.
I think you are making a good point, which, as I interpret it, is that I should start by trying to use (and testing the accuracy of) the pathfinder (#464) and Laplace methods (perhaps not yet in?) for the current cmdstanr version of the model, before trying to implement something in another language. I agree. Perhaps as an intermediate task I could try 1. getting Laplace with cmdstanr into epinowcast, and 2. writing a vignette comparing the different inference methods possible with epinowcast via cmdstanr (could skip straight to 2. by omitting Laplace). A rough sketch of what the comparison in 2. could look like is below.
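This is a hedged sketch using cmdstanr directly; the model file "nowcast.stan", the data object stan_data, and the variable name "R" are placeholders rather than actual epinowcast objects:

```r
library(cmdstanr)

# Placeholders: in the vignette the compiled model and data list would come from
# epinowcast's own model and data preparation steps.
mod <- cmdstan_model("nowcast.stan")
# stan_data <- ...  (data list prepared for the model)

fit_hmc <- mod$sample(data = stan_data, chains = 4, parallel_chains = 4)
fit_pf  <- mod$pathfinder(data = stan_data)
fit_lap <- mod$laplace(data = stan_data)      # Gaussian approximation at the mode
fit_vb  <- mod$variational(data = stan_data)  # mean-field ADVI

# Compare posterior summaries of a shared quantity of interest ("R" is a placeholder)
lapply(
  list(HMC = fit_hmc, pathfinder = fit_pf, laplace = fit_lap, ADVI = fit_vb),
  function(fit) fit$summary(variables = "R")
)
```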
With the caveat that I don't know exactly what epinowcast is doing, two concerns I would have and would investigate early:
If the strong correlation structure of time-dependent parameters makes the Hessian really dense, then TMB becomes really memory intensive / slow. This became an issue for Kai Hon.
The Gaussian approximation becomes poor for some epidemic model parameters, and so the Laplace-approximation confidence intervals become bad. This especially plays up for initial-pulse and growth-rate parameters. It may be less of an issue in the "nowcast" (vs. early-outbreak) scenario. I would expect it to also create sampling challenges for HMC.
I agree with @adrianlison that this seems like a good idea for a project and would definitely be useful for people trying to develop complex models (though I would push back a little on the idea that you can't do this using NUTS (when you have within-chain multithreading) in a semi-reasonable time).
This does seem like a good place to start for sure. As you say, I would expect inaccurate uncertainty estimates from both methods, but having an understanding of that would still be helpful for things like rapid prototyping. The only package-side change we would need here is helpers for variational and laplace in the vein of enw_sample and enw_pathfinder (a rough sketch of what these could look like is below).
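Purely as illustration, such helpers might look something like this; the argument names and internals are guessed by analogy and are not the actual epinowcast interface:

```r
# Hypothetical package-side helpers, by analogy with enw_sample()/enw_pathfinder();
# the real helpers would also need epinowcast's usual post-processing of the fit.
enw_laplace <- function(data, model, ...) {
  model$laplace(data = data, ...)      # cmdstanr's Laplace approximation
}

enw_variational <- function(data, model, ...) {
  model$variational(data = data, ...)  # cmdstanr's ADVI
}
```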
For both this and an exploration in TMB, I agree that starting with a limited subset of the model would make sense (i.e. using an AR model for Rt with latent delays and a non-parametric hazard model for observed reporting delays), or some set of these with slightly wider coverage (including multiple strata or different count datasets, for example).
Agree with these (though I don't think there is a clear delineation between an outbreak scenario and a nowcasting one: you can nowcast in both, just with less information). For me this adds weight to starting with a lighter subset of models and then trying to generalise, to give recommendations for when you do and don't need NUTS.
I think this comes down to the definitions of "complex" and "semi-reasonable". At sufficient complexity you can't do it using NUTS. I'd be interested in whether there are benefits from more complex models (more data) that make up for the loss of inferential accuracy.
In either case, I think the rapid prototyping argument is also a strong one. epinowcast already has lots of flexibility with its formula interface, and modelers might want to try out many options to get an overview of which modeling decisions have the strongest impact.
Of course, the performance of the approximations will limit the use cases: if a method only gives reliable MAP estimates but not good uncertainty quantification, it might not help me find out which modeling choices have the strongest influence on uncertainty towards the present…