For the June seminar we have Alba Halliday. More here: Alba Halliday - Modelling diseases with delayed reporting and nested structures using a hierarchical framework – Epinowcast
Please post asynchronous questions below.
reminder that this is tomorrow!
Delayed reports hide
Bayesian models reveal
Truth in nested data
Also, it's quite a stealthy code base (reminded by @jamesazam), so see here: GitHub - AlbaMH/nimbleCast: What the Package Does (One Line, Title Case)
Really great stuff!
Here is an LLM-generated summary for people.
Presenter: Alba Halliday (University of Glasgow)
Collaborators: Oliver Stoner (University of Glasgow), Leonardo Bastos (Oswaldo Cruz Foundation, Brazil), Theo Economou (University of Exeter)
• Development of a hierarchical Bayesian framework for nowcasting disease surveillance data with reporting delays and nested structures
• Application to Brazilian SARI (Severe Acute Respiratory Illness) hospitalisations and nested COVID-19 cases
• Joint modelling approach that links total SARI cases with COVID-positive subset to improve prediction accuracy
• Implementation challenges and solutions for operational disease surveillance systems
• SARI hospitalisations: individuals with fever and cough onset within 10 days requiring hospitalisation
• Reporting delays: cases occur in week T but are reported with delays of 0, 1, 2+ weeks
• Nested structure: proportion of SARI cases that test positive for COVID-19
• Challenge: COVID test results have unknown reporting delays (no timestamp when results are added to records)
• Joint model for total counts (negative binomial) and partial counts (generalised Dirichlet multinomial)
• More flexible than standard multinomial approaches due to additional dispersion parameter
• Better captures covariance structure in partial counts
• Implemented as series of beta binomial distributions for improved MCMC efficiency
• COVID cases modelled as proportion of total SARI hospitalisations using beta binomial distribution
• Key innovation: link between expected SARI trend and expected COVID proportion through shared parameter (δ)
• Intuition: waves in SARI often driven by COVID waves
• Additional censoring layer to account for incomplete COVID reporting
• Improved prediction precision compared to existing nowcasting approaches for COVID fatalities
• Joint modelling shows better performance than separate models, particularly for capturing trend dynamics
• Age covariate further improved COVID predictions (elderly more likely to be hospitalised for COVID)
• Model successfully captures both temporal trends and delay distributions in real data
• Fiocruz runs the InfoGripe surveillance system using INLA (fast, 2-3 minutes runtime)
• Marginal model approach with simpler computational requirements
• More flexible model specification (handles non-standard distributions)
• Potentially more robust (INLA installation issues, struggles with sparse data)
• Enables joint modelling of multiple data streams
• Computational cost: ~12 hours for rolling nowcast experiment vs. minutes for INLA
• Requires MCMC parameter specification and more programming
• Barrier for new users compared to INLA’s simplicity
• General package for fitting various nowcasting models
• Similar syntax to INLA to facilitate transitions
• Supports both GDM and simpler approaches
• Available on presenter’s GitHub (still in development)
• Extension to multiple viruses (RSV, influenza) alongside COVID within SARI framework
• Forecasting capabilities beyond nowcasting for preventive interventions
• Application to other nested surveillance structures:
• Potential pathogen interaction modelling with semi-mechanistic approaches
• Data: Brazilian SARI surveillance 2021-2024, 27 federal units
• Delays: up to 20 weeks for SARI, 30 weeks for COVID
• Window: 60-week rolling window for analysis
• Implementation: R package Nimble for MCMC sampling
• Spatial effects: independent across federal units (extensible to spatiotemporal)
• Stability of COVID-SARI link over time and with changing epidemiology
• Potential for multiple pathogen interaction terms
• Window length optimisation for operational use
• Alternative approaches (e.g., downloading historic datasets to reconstruct delays, fitting to the counts with a joint model)
• Computational efficiency improvements (new sampling methods in development)
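To make the model structure in the summary a bit more concrete, here is a minimal simulation sketch in Python (not the nimbleCast code, which is in R with Nimble). It draws a weekly total from a negative binomial, splits it across delays with a chain of beta-binomials (the construction the summary describes for the generalised Dirichlet multinomial), and draws COVID cases as a beta-binomial share of the total. The function names, the parameter values, and the logistic form used to link the SARI trend to the COVID proportion through δ are all assumptions for illustration; the talk summary does not give the exact parameterisation.

```python
import numpy as np


def beta_binomial(n, mean, dispersion, rng):
    """Draw one beta-binomial count out of n trials.

    `mean` is the expected proportion (strictly between 0 and 1) and
    `dispersion` > 0 controls overdispersion (larger = closer to binomial).
    This mean/dispersion parameterisation is an assumption of the sketch.
    """
    if n == 0:
        return 0
    p = rng.beta(mean * dispersion, (1.0 - mean) * dispersion)
    return int(rng.binomial(n, p))


def simulate_week(expected_total, nb_size, cond_probs, dispersion, rng):
    """Simulate one week's total and its delay-specific partial counts.

    cond_probs[d] is the expected share of the *not yet reported* cases
    that get reported at delay d (hypothetical values supplied by the caller).
    Returns (total, partial_counts, still_unreported).
    """
    # Negative binomial total with mean `expected_total` and size `nb_size`.
    p = nb_size / (nb_size + expected_total)
    total = int(rng.negative_binomial(nb_size, p))

    remaining = total
    partial = []
    for cond in cond_probs:
        # Each delay takes a beta-binomial share of what is left over.
        z = beta_binomial(remaining, cond, dispersion, rng)
        partial.append(z)
        remaining -= z
    return total, partial, remaining


def covid_proportion(intercept, delta, sari_log_trend):
    """Assumed logistic link: delta scales the SARI trend's effect on the
    expected COVID-positive proportion."""
    return 1.0 / (1.0 + np.exp(-(intercept + delta * sari_log_trend)))


rng = np.random.default_rng(1)
total, partial, unreported = simulate_week(
    expected_total=200.0, nb_size=10.0,
    cond_probs=[0.4, 0.5, 0.5], dispersion=20.0, rng=rng,
)
# COVID cases as a nested beta-binomial share of the SARI total.
prop = covid_proportion(intercept=-1.0, delta=0.8, sari_log_trend=0.5)
covid = beta_binomial(total, prop, dispersion=20.0, rng=rng)
print(total, partial, unreported, covid)
```

In a nowcasting setting the total for recent weeks is unobserved and only the first few partial counts are available, so a fitted model would infer the total from those partial counts rather than simulate forward as done here.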
My main question is what would you recommend thinking about adding to epinowcast from this work and in what order?