Addressing Critical Gaps in Generation Time Estimation During Outbreaks - grant application

samabbott · 30 June 2025 20:25

I just submitted the following application to the BMBR call in the UK and thought I would share it for interest, feedback (though as this is now submitted please be kind) and to see if anyone is thinking along similar lines and wants to collab in the future.

Thanks a lot to those of you who already gave feedback or agreed to be uncosted collaborators (whom I will bother if this gets funded). Note that this draws on ideas from "Generation interval estimation as a censoring problem" (#3 by samabbott) and "Handling multimodal delay distributions" (#2 by adrianlison), amongst other places.

Abstract

Generation times underpin infectious disease transmission modelling and so inform public health decision-making, help determine outbreak spread, and allow evaluation of control measures. During the COVID-19 pandemic, UK reproduction number estimates relied on potentially inappropriate generation time estimates from different populations, introducing systematic bias into transmission assessments. Generation times are part of a broader family of transmission-pair-dependent (TPD) delay distributions, including serial intervals (time between symptom onset in connected cases), test-to-test intervals, and other measures that require knowing relationships between individuals. Despite their critical importance, public health agencies lack robust methods for estimating these delays during outbreaks. Current approaches suffer from limitations: inadequate handling of observation biases, inflexibility to diverse transmission contexts (households versus community settings), implausible assumptions about infectiousness-symptom relationships, and inability to produce robust high-resolution estimates. The observation process adds further complexity: transmission pairs may be partially or completely unobserved, with events recorded at different stages. One reason these challenges persist is that delay estimation methods used in infectious disease modelling have developed in methodological isolation from survival analysis, despite both fields addressing fundamentally similar time-to-event problems. Recent work demonstrates the value of bridging these disciplines, but methods remain underutilised for TPD delays.

This project will develop a coherent framework for TPD delay estimation, extending our established methods for delays that depend on events from a single individual. Building on proven mixture model approaches for serial intervals with unknown transmission pairs, we will extend these methods to support the complex observation patterns typical of outbreak data. We will create flexible non-parametric methods that adjust for interval censoring and right truncation to enable estimation of delays in challenging scenarios where parametric delay distributions are not representative. For simpler surveillance scenarios, we will develop robust, granular, and fast estimation methods where the infection process can be incorporated through informative priors rather than requiring joint modelling. We will extend this approach to handle flexible infection processes across diverse reporting settings, addressing uncertainty around transmission pairs, unobserved transmission chains, and partial observation of events. Building on growing recognition of the need to bridge survival analysis and infectious disease modelling, we will establish cross-domain collaborations to identify transferable innovations for transmission-pair-dependent delay estimation. Finally, we will evaluate whether unified time-to-event frameworks for infection and delay processes provide advantages over current compartmental or renewal approaches with separate delay estimation.

Starting with enhancements to existing tools, the project will deliver a unified framework for transmission-pair-dependent delay estimation during outbreaks, implemented as open-source software. By addressing methodological gaps and providing robust implementations of our solutions, this project will enable more accurate real-time outbreak analysis and support evidence-based decisions.

Full research plan: research-plan.pdf (126.4 KB)

adrianlison · 1 July 2025 14:56

Nice plan, fingers crossed. I think you are in a great position to push forward on these problems based on your recent work and infrastructure building and the network we have around. Collaborating more closely with the survival analysis field sounds very sensible!

Some random remarks:

Mixtures

You write in WP1:

We will then extend this approach to include independent distributions (where different delay types combine to produce the observed interval)

and then in WP3:

… we will extend the Ward et al. latent variable framework using mixture models from WP1 to handle unknown transmission pairs and the mixture of the generation time and incubation period

First of all, supporting mixtures is really great. I was just wondering what exactly you are thinking of in WP3 in terms of mixing generation times and incubation periods - is this more in the sense of directly modeling a convolution of two parametric distributions rather than explicitly estimating the underlying distributions together? I guess I don’t fully understand how closely linked this is to modeling mixture distributions in the framework you envision.

epinowcast

We will also modularise epinowcast’s joint architecture to make infection process modelling optional, enabling standalone time-to-event based delay estimation

Nice, that would be very interesting as a feature in general, to see how much of the delay distribution estimates in nowcasting are informed simply by the delay data vs. autocorrelation of cases. Will you achieve this simply by supporting uncorrelated case counts (broad or improper prior for daily cases) or in a different way?

Transmission priors

Unlike joint infection process models, we will represent the infection process using priors, making methods more computationally tractable and flexible whilst retaining the ability to propagate uncertainty.

I didn’t fully get what you mean by priors for the infection process here and how this is different from joint modeling. Are you thinking of estimating the epidemic trajectory independently and inserting it through some kind of multivariate prior in the generation time model? I’m probably thinking in the wrong direction here.

Output and training materials

Super important and not easy. When you allow people

to mix and match delay distributions (parametric, non-parametric, mixture models)

I guess guidance on how to decide what a good fit is and what to report will be really crucial. In terms of downstream use, I think another important problem at the moment is that if uncertainty of distribution estimates is reported, it is typically for one specific set of parameters (e.g. the shape and scale of a Gamma distribution), and the correlation between those parameters is often not reported. This makes it hard to correctly include uncertainty information when you have a differently parameterized model (for example mean and sd of the Gamma). Maybe you can come up with some recommendations here.
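To make the reparameterisation issue concrete, here is a minimal sketch using synthetic numbers (the posterior below is hypothetical, not taken from any real estimate). It draws correlated shape and scale samples for a Gamma delay, converts them to mean and sd, and compares the result with what you would get if only the marginal uncertainties were reported and the correlation were dropped.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior on the log scale: log(shape) and log(scale) are
# negatively correlated (an assumption made purely for illustration).
mu = np.log([2.0, 2.5])
sd = np.array([0.2, 0.2])
rho = -0.8
cov = np.array([
    [sd[0] ** 2, rho * sd[0] * sd[1]],
    [rho * sd[0] * sd[1], sd[1] ** 2],
])

# Joint draws keep the correlation; independent draws keep only the marginals.
joint = np.exp(rng.multivariate_normal(mu, cov, size=20_000))
indep = np.exp(rng.normal(mu, sd, size=(20_000, 2)))


def mean_sd(samples):
    """Convert (shape, scale) draws of a Gamma into (mean, sd) draws."""
    shape, scale = samples[:, 0], samples[:, 1]
    return shape * scale, np.sqrt(shape) * scale


mean_joint, sd_joint = mean_sd(joint)
mean_indep, sd_indep = mean_sd(indep)

# Spread of the implied mean delay under each treatment of the correlation.
print(np.std(np.log(mean_joint)), np.std(np.log(mean_indep)))
```

In this toy setup, dropping the correlation inflates the uncertainty in the implied mean delay, which is why reporting joint samples (or at least the correlation) matters for downstream users with a different parameterisation.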

The crowd

aiming for 20,000+ downloads across the project lifespan

I know that EpiNow2 also has tens of thousands of downloads, so this is probably achievable? But where do all these modelers come from? I am probably underestimating the size of the community!

samabbott · 8 July 2025 15:20

Thanks for the comments @adrianlison!

Collaborating more closely with the survival analysis field sounds very sensible!

Yeah I am really keen on this idea more generally.

WP3 in terms of mixing generation times and incubation periods - is this more in the sense of directly modeling a convolution of two parametric distributions

Yeah, I realise now I should really have been clearer about this. The approach taken by mitey and similar tools is a mixture of convolved delays, and that is what we would want to take advantage of for generation time estimation (i.e. records in the data might have different delays from infection to report). Good point!
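As a rough illustration of what "a mixture of convolved delays" can look like, here is a deliberately simplified sketch. It is not the mitey implementation: it assumes a normal single-step delay and hypothetical weights for pairs that are one, two, or three transmission steps apart, so each mixture component is a k-fold convolution of the same delay.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def mixture_of_convolutions_pdf(x, mu, sigma, weights):
    """Density of an observed interval when the link type is unknown.

    weights[k - 1] is the (hypothetical) probability that the two cases are
    k transmission steps apart. The sum of k independent N(mu, sigma^2)
    delays is N(k * mu, k * sigma^2), which gives each component in closed form.
    """
    return sum(
        w * normal_pdf(x, k * mu, sigma * math.sqrt(k))
        for k, w in enumerate(weights, start=1)
    )


# Example: most pairs are direct, some skip one or two unobserved cases.
density_at_5 = mixture_of_convolutions_pdf(5.0, mu=4.0, sigma=1.5, weights=[0.7, 0.2, 0.1])
```

With non-normal delays the convolutions usually need to be done numerically, and the weights themselves would be estimated rather than fixed as they are here.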

Will you achieve this simply by supporting uncorrelated case counts (broad or improper prior for daily cases) or in a different way?

Yeah, I agree this is interesting. I thought this would be one option to play around with, but I was also thinking about a different likelihood, to be used only retrospectively, that depends on the final counts (i.e. a multinomial).
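A minimal sketch of the retrospective multinomial idea, assuming fully observed reporting triangles and a toy one-parameter delay model (the function names and the geometric-like PMF are hypothetical, not part of epinowcast). Conditioning on each reference date's final total means the likelihood only involves the delay distribution, with no model for the case counts themselves.

```python
import math
import numpy as np


def discretised_delay_pmf(mean_delay, max_delay):
    """Toy delay model: geometric-like PMF truncated at max_delay."""
    d = np.arange(max_delay + 1)
    w = np.exp(-d / mean_delay)
    return w / w.sum()


def multinomial_loglik(counts, pmf):
    """Log-likelihood of counts[t, d] given each row's final total.

    counts[t, d] is the number of cases with reference date t reported at
    delay d. Row totals are treated as fixed, so they need no model.
    """
    counts = np.asarray(counts)
    totals = counts.sum(axis=1)
    log_coef = sum(math.lgamma(n + 1) for n in totals) - sum(
        math.lgamma(c + 1) for c in counts.ravel()
    )
    return log_coef + float((counts * np.log(pmf)).sum())


# Simulate complete data for four reference dates and recover the delay.
rng = np.random.default_rng(0)
true_pmf = discretised_delay_pmf(2.0, max_delay=10)
counts = np.array([rng.multinomial(n, true_pmf) for n in [200, 150, 300, 250]])

grid = np.linspace(1.0, 4.0, 31)
logliks = [multinomial_loglik(counts, discretised_delay_pmf(m, 10)) for m in grid]
best = grid[int(np.argmax(logliks))]
```

Because it needs the final counts, this only applies once reporting is complete, which matches the "retrospective only" caveat.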

I didn’t fully get what you mean by priors for the infection process here and how this is different from joint modeling.

Here I mean not having an infection process model at all, i.e. like the work we have been doing in epidist. There we can inform the prior on the first event with a growth rate etc., and I think this makes sense as a first-pass generation time estimation model.
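For readers unfamiliar with the idea, a minimal sketch of a growth-rate-informed prior on the first (primary) event time within its censoring window. This is an illustration under stated assumptions, not epidist's API: the function names are hypothetical, and incidence is assumed to grow exponentially at a known rate r, so later times within the window are more likely when r > 0.

```python
import numpy as np


def expgrowth_pdf(t, r, window=1.0):
    """Density of the primary event time on [0, window] under growth rate r.

    Proportional to exp(r * t); reduces to a uniform density as r -> 0.
    """
    t = np.asarray(t, dtype=float)
    if abs(r) < 1e-10:
        dens = np.full_like(t, 1.0 / window)
    else:
        dens = r * np.exp(r * t) / np.expm1(r * window)
    return np.where((t >= 0) & (t <= window), dens, 0.0)


def expgrowth_sample(n, r, window=1.0, rng=None):
    """Draw primary event times by inverting the CDF of the same density."""
    if rng is None:
        rng = np.random.default_rng()
    u = rng.uniform(size=n)
    if abs(r) < 1e-10:
        return u * window
    return np.log1p(u * np.expm1(r * window)) / r


# Example: a daily censoring window during a growing outbreak (r = 0.2 per day).
draws = expgrowth_sample(10_000, r=0.2, rng=np.random.default_rng(42))
```

The growth rate enters only through this prior, so no infection process has to be fitted jointly. Uncertainty in r could be propagated by drawing it from its own distribution.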

I guess guidance on how to decide what a good fit is and what to report will be really crucial.

Yes agree.

and the correlation between those parameters is often not reported

Yes, this is a good point. I would like to make some tooling around this for epidist to make it a bit easier.

But where do all these modelers come from

haha I think a lot of it is CI calls!
