Forecasting with variants

I was wondering about the community's thoughts on epi forecast models with variants. Obviously, this was (and is) very topical in COVID modelling whenever new variants emerge.

However, I was wondering about the extent to which new variant emergence makes sense as part of a generative model for forecasting purposes.

For example, one could model new variants of some pathogen emerging according to a Poisson process, with each variant carrying some kind of (conditionally random) epidemiological trait (e.g. partial avoidance of the immune response in people recovered from previous forms of the pathogen). You can do inference on this kind of model, and therefore produce some kind of forecast with it.
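As a minimal sketch of that generative idea (all rates, priors, and names here are illustrative assumptions, not fitted quantities): variants emerge as a homogeneous Poisson process, and each emergence draws an immune-escape trait from a prior.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed values for illustration only
emergence_rate = 0.02   # expected new variants per week
horizon_weeks = 52      # simulation horizon

def simulate_variants(rate, horizon, rng):
    """Return emergence times and immune-escape traits over the horizon.

    A homogeneous Poisson process: draw the number of emergences, then
    place them uniformly in time; each variant gets a partial immune-escape
    fraction in [0, 1] drawn from a Beta prior (assumed shape).
    """
    n = rng.poisson(rate * horizon)                    # number of emergences
    times = np.sort(rng.uniform(0, horizon, size=n))   # uniform given n
    escape = rng.beta(2, 8, size=n)                    # immune-escape traits
    return times, escape

times, escape = simulate_variants(emergence_rate, horizon_weeks, rng)
print(len(times), "variant(s) emerged; escape fractions:", np.round(escape, 2))
```

In a full forecasting model this simulator would sit inside the transmission dynamics, with the escape trait modulating susceptibility of the recovered population.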

However, I can't decide how valid this approach is from a practical point of view: over any reasonable time horizon, the probability of a new variant emerging would be deep in the tail of the predictive distribution, and poorly sampled by a usual run of a few thousand MCMC samples.
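A back-of-envelope version of that concern, with assumed numbers: if emergence is rare over the forecast horizon, only a small handful of posterior predictive draws land in the "new variant" branch, which is far too few to characterise a qualitatively different trajectory.

```python
import math

# All numbers assumed for illustration
rate = 0.005     # emergences per week
horizon = 4      # forecast horizon in weeks
n_draws = 4000   # typical number of MCMC draws

# P(at least one emergence in the horizon) under a Poisson process
p_variant = 1 - math.exp(-rate * horizon)
expected_draws = n_draws * p_variant
print(f"P(variant in horizon) = {p_variant:.3f}; "
      f"about {expected_draws:.0f} of {n_draws} draws in that branch")
```

With these numbers, well under a hundred draws carry the variant scenario, so the variant-conditional forecast is dominated by Monte Carlo noise.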

Also, the predictions conditional on a new variant would be so different from those conditional on no new variant that the visualisation of the forecast would be rather wild.

I was wondering about the community's thoughts on this.

My thought has been that this should be something you can easily adapt a forecast model to capture, i.e. when there is some evidence of a new variant, rather than something to build into the model at all times.

So the workflow would be: most of the time you use your standard model, and then, when domain expertise indicates there is some (even very small) evidence for a variant, you switch the model composition to include a variant/novel stratum with priors centred on the original.

It's less elegant than your joint-model approach, but I think more practical given the issues with the event sitting in the tail of the distribution.

This and other similar problems have been one of my motivations for easy model composition and decomposition.

I think this is a good point.

I think this is a really interesting question. I am getting flashbacks to the BA.2/BA.5 times when this was salient.

The approach we broadly settled on was to stay qualitatively aware of the global and UK variant situation. Typically, by the time we were confident a variant was driving a new wave, admissions had already started to rise, due to the large lags in genotyping (even with a great variant nowcast). With a novel variant there were so many unknowns that it always felt like the only thing we could be half-confident about was "there is likely to be another wave".

A simpler version of the problem might be influenza types rather than novel variants. We forecast A and B combined as a single target, but there's a good case for separate A and B models. They emerge at different times in the season each year, so there is a drift from one being prevalent to the other, and there are different subtypes whose proportions are unknown at the start of the season.

Perhaps the answer is allowing your overall epi parameters to vary over time as the relative abundance of each strain changes, i.e. start with type-A parameters and drift toward type B.
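One simple way to sketch that drift (the logistic abundance curve and the per-type reproduction numbers below are assumptions for illustration) is an abundance-weighted mixture of the type-specific parameters:

```python
import numpy as np

weeks = np.arange(30)

# Assumed per-type effective reproduction numbers
R_A, R_B = 1.3, 1.1

# Assumed logistic takeover of type B through the season
frac_B = 1 / (1 + np.exp(-0.4 * (weeks - 15)))

# Overall parameter drifts from the type-A value toward the type-B value
# as B's share of cases grows
R_eff = (1 - frac_B) * R_A + frac_B * R_B

print("start:", round(R_eff[0], 3), " end:", round(R_eff[-1], 3))
```

In practice `frac_B` would itself be estimated (e.g. from subtyped surveillance samples) rather than fixed, but the weighted-mixture structure stays the same.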

The wider question I'm maybe getting at is: how "broad" should forecast targets be? Disease level (COVID, flu, etc.) seems to be the most widely used and the key metric, but is that making the problem harder to model from the start? Should targets be more granular? Quality of surveillance is the big blocker there, but it's nice to think about.

Thinking about this makes a nice case study as well, since it should be broadly applicable to the variant case.

"Should targets be more granular? Quality of surveillance is the big blocker there but it's nice to think about."

Yes, I think so, and I agree.