Pre-preprint: Composable probabilistic models can lower barriers to rigorous infectious disease modelling

This work grew out of a proof of concept that @sambrand and I wrote a few years ago for how to do composable epi modelling, together with ideas that were floating around in a few recent grant applications about why that might be a good idea: it could solve a lot of the tooling problems we have (and, from that, science and policy problems).

The aim is to use this approach to improve/unlock the best practices from the A workflow for infectious disease modelling work.

Feedback is very welcome; the plan is to submit this to PLOS Comp Bio and post it as a preprint shortly.

Abstract

Recent outbreaks of Ebola, COVID-19 and mpox, alongside routine surveillance of endemic pathogens, have demonstrated the value of modelling for synthesising data to inform decision making. For modelling evidence to effectively inform policy it must be timely, rigorous, and collaborative, yet current approaches struggle to be all three. Methods broadly fall into approaches that chain separate models together, offering flexibility but losing information and introducing bias, or approaches that rigorously analyse all data together but cannot be separated into reusable parts. Composable models, where components can be reused across contexts, can be both rigorous and flexible, enabling rapid collaborative model development. We outline proposed requirements for a composable infectious disease modelling framework and present a proof of concept domain-specific language built on the Turing.jl probabilistic programming language in Julia with an R interface. We demonstrate our approach conceptually using models from published epidemiological analyses, and in practice through a worked autoregressive example. We replicate three published analyses, composing elements of our autoregressive example with shared and novel components: a COVID-19 analysis for South Korea using a renewal process, adding components for reporting delays and day-of-week effects to replicate EpiNow2 for real-time nowcasting, and an ordinary differential equation analysis of influenza outbreak data. We then discuss strengths, limitations, and alternative approaches. We find that our proof of concept can address the tension between rigour and flexibility, though work remains to realise this potential. Our approach enables interdisciplinary collaboration by lowering technical barriers for domain experts to contribute specialised components, supporting both routine surveillance and outbreak response. 
For multi-model efforts, common components enable attribution of differences to assumptions rather than implementation. Our approach is also well suited for large language model assisted model construction. Given the unpredictable nature of future infectious disease threats, investment in adaptable modelling infrastructure is critical.
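
To make the composability idea a bit more concrete, here is a minimal Python sketch. It is not the Julia/Turing.jl domain-specific language described in the abstract, and it only does forward simulation rather than inference. All names (`ar1`, `random_walk`, `cumulative`, `poisson_observations`) are hypothetical and chosen for illustration. The point is only that a latent process and an observation model can be written as separate pieces and swapped independently.

```python
import math
import random
from typing import Callable, List

# A latent process maps (rng, number of time steps) to a latent series.
LatentProcess = Callable[[random.Random, int], List[float]]


def ar1(damping: float, sd: float) -> LatentProcess:
    """Hypothetical AR(1) component: z_t = damping * z_{t-1} + Gaussian noise."""
    def simulate(rng: random.Random, n: int) -> List[float]:
        z, out = 0.0, []
        for _ in range(n):
            z = damping * z + rng.gauss(0.0, sd)
            out.append(z)
        return out
    return simulate


def random_walk(sd: float) -> LatentProcess:
    """A random walk is the AR(1) component reused with damping fixed at 1."""
    return ar1(1.0, sd)


def cumulative(inner: LatentProcess) -> LatentProcess:
    """Wrap any latent process so its output is cumulatively summed."""
    def simulate(rng: random.Random, n: int) -> List[float]:
        total, out = 0.0, []
        for step in inner(rng, n):
            total += step
            out.append(total)
        return out
    return simulate


def poisson_draw(rng: random.Random, mean: float) -> int:
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold, count, product = math.exp(-mean), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def poisson_observations(latent: LatentProcess):
    """Observation component: Poisson counts with log-mean given by any latent process."""
    def simulate(rng: random.Random, n: int) -> List[int]:
        return [poisson_draw(rng, math.exp(x)) for x in latent(rng, n)]
    return simulate


# The same observation model composed with three different latent processes.
rng = random.Random(42)
cases_ar = poisson_observations(ar1(0.8, 0.2))(rng, 30)
cases_rw = poisson_observations(random_walk(0.1))(rng, 30)
cases_int = poisson_observations(cumulative(ar1(0.5, 0.05)))(rng, 30)
```

Swapping `ar1` for `random_walk`, or wrapping either in `cumulative`, changes the latent dynamics without touching the observation code, which is the kind of reuse the abstract argues for.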

This is neat stuff (if I do say so myself).

Come for the epi modelling, stay for the composable approach to ARIMA.
