Minimum viable model for generation time estimation

Hi All,

@samabbott and I have been discussing creating a tool that features the minimum viable model for generation time (GT) estimation that can be a component of a larger modelling framework.

The main idea is that, rather than adding GT estimation to an existing complex tool (one that also does forecasting, nowcasting, or similar), we build the smallest model that correctly handles interval censoring, right truncation, and the serial interval / GT conflation, and then make the output drop cleanly into downstream Rt, nowcasting, and forecasting pipelines.
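For readers without the PDF to hand, here is a minimal sketch of what handling interval censoring and right truncation together can look like. It is not the model in the attached proposal: the lognormal delay, the uniform primary event time within its day, and the names `lognormal_cdf`, `primary_censored_cdf`, and `log_lik` are all illustrative assumptions.

```python
import math


def lognormal_cdf(x, meanlog, sdlog):
    """CDF of an (assumed) lognormal delay; zero for non-positive delays."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - meanlog) / (sdlog * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def primary_censored_cdf(q, meanlog, sdlog, n_grid=200):
    """P(secondary event before time q), with time measured from the start
    of the primary event's day and the primary event uniform within that
    day (assumption). Midpoint rule over the primary time u."""
    total = 0.0
    for i in range(n_grid):
        u = (i + 0.5) / n_grid
        total += lognormal_cdf(q - u, meanlog, sdlog)
    return total / n_grid


def log_lik(delays, horizons, meanlog, sdlog):
    """Log-likelihood of daily-censored, right-truncated delays.

    delays[i]   -- whole days between the two recorded dates
    horizons[i] -- days from the start of the primary day to the
                   observation cut-off (must be at least delays[i] + 1)
    """
    ll = 0.0
    for d, horizon in zip(delays, horizons):
        if horizon < d + 1:
            raise ValueError("delay observed after the truncation horizon")
        # Interval censoring: mass of the recorded day [d, d + 1).
        num = (primary_censored_cdf(d + 1, meanlog, sdlog)
               - primary_censored_cdf(d, meanlog, sdlog))
        # Right truncation: condition on being observed before the cut-off.
        den = primary_censored_cdf(horizon, meanlog, sdlog)
        ll += math.log(num) - math.log(den)
    return ll


# Same three delays, with and without a short observation window.
ll_truncated = log_lik([2, 4, 5], [8, 8, 8], meanlog=1.5, sdlog=0.5)
ll_untruncated = log_lik([2, 4, 5], [10**6] * 3, meanlog=1.5, sdlog=0.5)
```

Ignoring the denominator is what biases estimates towards short delays early in an outbreak, since long delays have not yet had time to be observed.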

I’ve attached the current project proposal write-up and would be very keen for feedback. Thanks!

Minimum_viable_GT_model.pdf (223.0 KB)

This looks a really clear document!

The only thing I’d add is that, to make this minimum viable, I think you do need to specify a distribution of primary infection times in order to create a full generative model and be explicit about the marginalisation choices. I think it’s legitimate to just say it’s a flat distribution as a minimum viable approximation, but it should be stated somewhere.
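A sketch of what stating this could look like (the notation is assumed here, not taken from the proposal): let $F$ be the delay CDF, let the primary event fall at time $u$ within a window of width $w$ with density $g(u)$, and let the secondary event be recorded in the day $[d, d+1)$ measured from the start of that window.

```latex
P(D = d) = \int_0^{w} g(u)\,\bigl[F(d + 1 - u) - F(d - u)\bigr]\,du,
\qquad
g(u) = \frac{1}{w} \quad \text{(flat primary distribution)}.
```

One commonly used non-flat alternative is $g(u) \propto e^{ru}$ on the window for an epidemic growing at rate $r$, which already requires some information about the infection process.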

This looks great! Really interesting idea. A few initial comments:

  • I think it might be worth calling beta(t) the unnormalised infectiousness profile (i.e., including R) in your Euler-Lotka equation - maybe just me but I was confused looking for R.
  • “In practice, surveillance systems capture symptom onset dates for transmission pairs identified through contact investigation” - they do but exposure times (or time windows) are usually also part of contact tracing efforts. However:
  • Exposure times are often given as multi-day ranges (e.g. when the suspected exposure was in the household) so I think interval censoring could be more complicated than you state.
  • If your data includes any contacts in the household (which can be overrepresented in the identification of transmission pairs as they are the easiest to identify) you’ll also have to consider competition for and depletion of susceptibles, I think.
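On the first bullet, one common way to write the Euler-Lotka relation with the unnormalised profile is given below. The symbols $r$ (growth rate), $R$ (reproduction number), and $g$ (normalised generation time density) are assumed notation and may differ from the proposal.

```latex
1 = \int_0^{\infty} \beta(\tau)\, e^{-r\tau}\, d\tau,
\qquad
\beta(\tau) = R\, g(\tau),
\qquad
\int_0^{\infty} g(\tau)\, d\tau = 1
\quad\Longrightarrow\quad
\frac{1}{R} = \int_0^{\infty} g(\tau)\, e^{-r\tau}\, d\tau .
```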

Anyway I think this would be a great starting point and perhaps just needs to be phrased slightly more narrowly as relating to a particular type of data (which I understand to be: transmission pairs and their symptom onset dates identified with 100% confidence, excluding households and without information on exposure windows) that future work can build on. That in itself would be really cool and immensely useful!

Thanks for sharing this @kylieainslie. I really like the idea of more open early project sharing, as even within an institution or group it’s not uncommon to be surprised by work that is quite far along. It definitely takes some practice to feel confident sharing openly though (at least for me!).

For others: there are some connections between this and Addressing Critical Gaps in Generation Time Estimation During Outbreaks - grant application - #5 by samabbott, especially the first work package.

Good comments @sambrand @sbfnk; I have put some responses below. My general meta point is to reiterate that we are aiming to create a composable MVP here that can then be extended to cover the many connected considerations. In particular, we want something that is independent of the infection generating process (GP), so that the latter can be considered as a plug-in at a later date.

Something that I think it would be great to add here is a section on current simplifications and potential future extensions/enhancements (i.e. tracking the comments here etc.).

I also have some other work being spec’d out (more on this soon) that connects to this, looking at deriving a specialised distribution for generation times. I think that this could pair up very nicely.

Also the model in the new composable world we will soon be living in. I think this is a key point in how I am thinking about it, especially as to why it being an MVP is fine, as it means the model can always be enhanced rather than us having to live with MVP estimates.

This is the assumption and yes I agree we should flag it.

I think this is a feature to add in extension work vs for the MVP as both of those considerations add the need for some kind of infection generating process which should be a composable extension.

I agree we should generalise this to multi-day ranges, as model-wise it makes no difference.
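As a rough illustration of why the model structure is unchanged, the sketch below treats the exposure window width as just another input. Everything in it is an assumption for illustration: the lognormal delay, the uniform exposure time within the window, and the hypothetical helper names `lognormal_cdf` and `window_censored_pmf`.

```python
import math


def lognormal_cdf(x, meanlog, sdlog):
    """CDF of an (assumed) lognormal delay; zero for non-positive delays."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - meanlog) / (sdlog * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def window_censored_pmf(day, window, meanlog, sdlog, n_grid=400):
    """P(secondary event in [day, day + 1)) when the primary event is
    uniform on [0, window); time is measured from the window start."""
    total = 0.0
    for i in range(n_grid):
        u = window * (i + 0.5) / n_grid  # midpoint rule over the window
        total += (lognormal_cdf(day + 1 - u, meanlog, sdlog)
                  - lognormal_cdf(day - u, meanlog, sdlog))
    return total / n_grid


# Same delay distribution, single-day versus three-day exposure window.
one_day = [window_censored_pmf(d, 1.0, 1.5, 0.5) for d in range(300)]
three_day = [window_censored_pmf(d, 3.0, 1.5, 0.5) for d in range(300)]

mean_one_day = sum(d * p for d, p in enumerate(one_day))
mean_three_day = sum(d * p for d, p in enumerate(three_day))
```

Only the integration limits change with the window width; the wider window shifts the recorded secondary day later by the extra expected exposure time (about one day in this example).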

@sbfnk and @sambrand thanks for the feedback! As @samabbott has already indicated, you make some good points both for clarification and for extension, which we can be more specific about in the proposal.

@samabbott thanks for pointing me to this.

This proposal looks fantastic.

My only comment is about the simulation study.

When an epidemic grows exponentially, we expect forward serial intervals to be long because infectors will have shorter incubation periods than infectees. So even in the absence of truncation, fitting a convolution model that assumes an identical incubation period distribution for infectors and infectees could introduce some bias. So it seems like accounting for truncation won’t be sufficient to get rid of this dynamical effect…

On the other hand, if we group serial intervals based on the exposure dates of the infectors, then I think the dynamical effect goes away. And so accounting for truncation would be sufficient…

I’m having trouble articulating the problem in more detail because these two scenarios are basically re-orderings of the same data, yet they somehow give different answers/intuitions (but I think it’s an important distinction that needs to be made and addressed in practice)… I suspect that the difference is in the amount of truncation that needs to be taken care of. In the first case, the backward incubation period distributions are not truncated but are dynamically biased, and so the truncation runs from symptom onset to observation time. In the second scenario, there is no dynamical effect but the truncation runs from infection to observation time. This difference suggests that, depending on how the amount of truncation is implemented in the estimation framework, the dynamical effect on incubation periods could unintentionally bias the estimates of the generation interval. I think it might be important to think about this more carefully…
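A small numerical sketch of the dynamical effect in the first scenario, under assumptions that are not in the post: incidence grows exponentially at rate `r`, incubation periods are lognormal, and the cohort is defined by a shared symptom onset time, in which case the backward incubation period density is proportional to `f(x) * exp(-r * x)`. The function names and parameter values are illustrative only.

```python
import math


def lognormal_pdf(x, meanlog, sdlog):
    """Density of an (assumed) lognormal incubation period."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - meanlog) / sdlog
    return math.exp(-0.5 * z * z) / (x * sdlog * math.sqrt(2.0 * math.pi))


def backward_mean(r, meanlog, sdlog, upper=60.0, n_grid=6000):
    """Mean incubation period among people with the same onset time when
    infections grow at rate r: the forward density is tilted by exp(-r x).
    r = 0 recovers the forward (intrinsic) mean. Midpoint rule on (0, upper)."""
    step = upper / n_grid
    num = 0.0
    den = 0.0
    for i in range(n_grid):
        x = (i + 0.5) * step
        weight = lognormal_pdf(x, meanlog, sdlog) * math.exp(-r * x)
        num += x * weight
        den += weight
    return num / den


forward = backward_mean(0.0, meanlog=1.6, sdlog=0.4)
growing = backward_mean(0.1, meanlog=1.6, sdlog=0.4)   # shorter than forward
declining = backward_mean(-0.1, meanlog=1.6, sdlog=0.4)  # longer than forward
```

With no truncation anywhere in this calculation, the backward mean still moves with `r`, which is the sense in which a truncation correction alone would not remove the effect.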