Community Seminar - 2025-07-02 - Maile Thayer talking about Real-time modeling during the 2024-2025 Dengue Epidemic in Puerto Rico: Applications and insights

Today at 3pm!

Post async questions and thoughts here.

Really enjoyed this talk and great to see so much modelling being used in practice!

I just read https://www.medrxiv.org/content/10.1101/2024.11.09.24315999v2 in detail.

I really like how it presents nowcasting as a solvable problem and looks across different locations and models. The model-based failure evaluation (really a score) is nice and close to the kind of model-based evaluation that @kath-sherratt has been proposing.

That said, I think there are some issues with the epinowcast setup and details that reflect wider problems I have seen and that are worth discussing (this also ties back to my question at the talk about how to provide tools to people doing applied modelling).

(Note: I used an LLM to translate these points out of an email I sent to the paper authors, with minor clean-up.)

Framework

Epinowcast is designed as a framework for building context-specific models rather than providing a single pre-configured approach. This flexibility is powerful but can be challenging for users expecting an out-of-the-box solution. The getting started vignette provides one example implementation, but users often need to adapt it substantially for their specific context; in practice this often doesn't happen, leading to misspecified models.
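
To make the point concrete, here is a rough sketch of what the modular specification looks like, based on my memory of the getting started vignette; the module and argument names, and the placeholder `obs` input, are assumptions that may not match the current release.

```r
# Rough sketch of the modular interface, loosely following the getting
# started vignette. Module and argument names are from memory and may
# differ between package versions, so treat this as illustrative only.
library(epinowcast)

# Reporting-triangle data: columns reference_date, report_date, confirm
# (`obs` is a placeholder for your own line-list-derived counts)
pobs <- enw_preprocess_data(obs, max_delay = 20)

# The model is assembled from modules; each formula below is a choice
# the user is expected to adapt to their context, not reuse verbatim
nowcast <- epinowcast(
  data = pobs,
  reference = enw_reference(~1, distribution = "lognormal", data = pobs),
  report = enw_report(~ (1 | day_of_week), data = pobs),
  fit = enw_fit_opts(chains = 2, parallel_chains = 2)
)
```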

Timestep

When working with weekly data, it's important to use the timestep argument rather than the default daily model. The default data cleaning process can introduce artifacts when handling temporal mismatches that might not be immediately obvious. We could try to detect this automatically, but would that then cause issues for data that genuinely don't have a regular timestep?
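
For reference, a minimal sketch of what I mean, assuming the timestep argument to enw_preprocess_data() behaves as in recent releases (worth double checking against the current docs); the `weekly_obs` input is a placeholder.

```r
# Minimal sketch of preprocessing weekly data with an explicit timestep.
# Assumes enw_preprocess_data() takes a timestep argument (as in recent
# epinowcast releases); check ?enw_preprocess_data for your version.
library(epinowcast)

# `weekly_obs` is a placeholder: reference_date and report_date are both
# on a weekly grid, with a cumulative `confirm` count per pair of dates
pobs_weekly <- enw_preprocess_data(
  weekly_obs,
  timestep = "week",  # without this the default daily model is assumed
  max_delay = 8       # assumption: interpreted in timestep units (weeks)
)
```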

This is part of a broader issue of how users discover features: the documentation exists, but clearly not at a high enough level. We can make a point of highlighting this feature, but what about others, such as missing data handling?

MCMC Settings and Runtime

Different MCMC configurations (chains, iterations) between methods can make performance comparisons challenging (e.g. here they use 8 chains for epinowcast and 1 for nobs). It might be helpful to provide clearer guidance on appropriate settings for different use cases if we want to position ourselves as something to validate against, though I'm not sure what that guidance would look like. One option would be to focus on ESS per second as the key quantity to compare, as sketched below.
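
As a rough illustration, something like the following could be used to put methods on a common footing; it assumes each method exposes a cmdstanr fit, and where exactly epinowcast stores that fit (here `nowcast$fit[[1]]`) is an assumption worth checking.

```r
# Rough sketch of an ESS-per-second comparison. Assumes each method
# exposes a cmdstanr CmdStanMCMC fit; for epinowcast this is assumed to
# sit in the `fit` list column of the returned object.
ess_per_second <- function(cmdstan_fit) {
  # summarise_draws() reports bulk ESS for every parameter by default
  summ <- posterior::summarise_draws(cmdstan_fit$draws())
  # Worst-case bulk ESS divided by total sampling time in seconds
  min(summ$ess_bulk, na.rm = TRUE) / cmdstan_fit$time()$total
}

# Compare like-for-like across methods regardless of chain settings
ess_per_second(nowcast$fit[[1]])
```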

Model Diagnostics

I think we all agree that posterior predictive checks and examining the fitted delay distribution are crucial for model assessment. These diagnostics can reveal whether parametric assumptions (like lognormal delays) are appropriate or whether nonparametric alternatives might work better.
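
Something along these lines is what I have in mind; the specific plot type and the location of the underlying fit are recalled from memory and should be treated as assumptions.

```r
# Sketch only: the plot type and fit access below are recalled from the
# epinowcast documentation and may differ between versions, so treat
# them as assumptions and check ?plot.epinowcast locally.

# Posterior predictive check: predicted vs observed counts
plot(nowcast, type = "posterior_prediction")

# Fitted delay: summarise the parametric reference-date (delay)
# parameters from the underlying cmdstanr fit and compare them with the
# empirical delay distribution in the reporting triangle
nowcast$fit[[1]]$summary()
```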

Support for these checks exists in epinowcast, but it perhaps isn't that great. I wonder what we can do to improve it?

Potential Documentation Improvements

Here are some areas where we might improve our documentation:

  1. Clearer framework guidance: Better explain when and how to adapt the example models (though having just looked at the getting started vignette, I think it already does this fairly well?)
  2. Timestep examples: Add explicit examples for weekly, monthly, and other temporal resolutions
  3. Diagnostic tutorials: Expand guidance on model checking and validation
  4. Comparison guidelines: Provide advice for researchers conducting method comparisons

Questions for the Community

  • What aspects of the documentation have you found most/least helpful?
  • Are there specific examples or use cases we should prioritise?
  • Would a “choosing your model specification” flowchart be useful?
  • How can we better support researchers who are new to the package?
  • Should we create a wrapper package with preconfigured models for common epidemiological settings (e.g., weekly disease surveillance, daily COVID-19 reporting with a nonparametric time-varying delay)?