New paper looking at joint truncation adjustment and Rt estimation

Thanks @samabbott for the summary! I think the model in this paper is actually very similar to what we have been doing. The reason is that the two points

  • treating final counts as known and so using a binomial or beta-binomial observation model
  • treating I as a discrete parameter

are actually directly related. The paper models a sequential Binomial sampling of reports, with success probabilities given by the reporting hazard. My understanding is that this is equivalent to a multinomial distribution of reports where the number of trials is I(t) (total number of cases) and the reporting probabilities can be derived from the hazards (as we do in epinowcast).
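To make the hazard-to-probability step concrete, here is a minimal sketch in Python. The function names and the example hazards are made up for illustration and are not taken from the paper or from epinowcast. It assumes discrete delays with a final hazard of 1, so that every case is eventually reported.

```python
import numpy as np


def hazards_to_probs(hazards):
    """Convert discrete reporting hazards h_d into reporting probabilities p_d.

    p_d = h_d * prod_{k < d} (1 - h_k): the case is still unreported
    before delay d and is then reported at delay d.
    """
    h = np.asarray(hazards, dtype=float)
    # Probability of still being unreported at the start of each delay.
    still_unreported = np.concatenate(([1.0], np.cumprod(1.0 - h)[:-1]))
    return h * still_unreported


def sample_sequential_binomial(n_cases, hazards, rng):
    """Draw reports per delay by sequential binomial sampling with the hazards."""
    remaining = n_cases
    reports = []
    for h in hazards:
        reported_now = rng.binomial(remaining, h)
        reports.append(reported_now)
        remaining -= reported_now
    return np.array(reports)


# Illustrative hazards (assumption); the last hazard is 1 so nothing stays unreported.
hazards = np.array([0.5, 0.5, 1.0])
probs = hazards_to_probs(hazards)  # -> [0.5, 0.25, 0.25]

rng = np.random.default_rng(1)
reports = sample_sequential_binomial(100, hazards, rng)
```

The sequential draws always add up to the total number of cases, and each case ends up at delay d with probability p_d, which is the multinomial view described above.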

Now if you assume that I(t) is Poisson distributed with mean E[I(t)], then by the properties of the Multinomial (Poisson splitting), the reports (C(t' | t'') - C(t | t') in the paper notation) are independent and individually Poisson distributed with mean E[I(t)] * p (where p is the reporting probability, not the hazard), which is equivalent to what we are modeling in epinowcast. This also generalizes to the negative binomial, see an earlier post by @johannes (Some thoughts on parameterization of negative binomials).
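For completeness, the Poisson case can be written out in a few lines. This is only a sketch in my own shorthand, not the paper's notation: \lambda stands for E[I(t)], p_d for the reporting probability at delay d (with the p_d summing to 1), r_d for the reports at delay d, and n for their sum.

```latex
% Assumed shorthand (not the paper's notation):
% \lambda = E[I(t)], p_d = reporting probability at delay d with \sum_d p_d = 1,
% r_d = reports at delay d, n = \sum_d r_d.
\begin{align*}
P(r_1, \dots, r_D)
  &= \underbrace{\frac{e^{-\lambda} \lambda^{n}}{n!}}_{I(t) \sim \mathrm{Poisson}(\lambda)}
     \cdot
     \underbrace{\frac{n!}{\prod_{d} r_d!} \prod_{d} p_d^{r_d}}_{\text{reports} \mid I(t) = n \,\sim\, \mathrm{Multinomial}(n, p)} \\
  &= e^{-\lambda \sum_d p_d} \prod_{d} \frac{(\lambda p_d)^{r_d}}{r_d!} \\
  &= \prod_{d} \frac{e^{-\lambda p_d} (\lambda p_d)^{r_d}}{r_d!}
\end{align*}
```

The joint probability factorizes into independent Poisson terms with means \lambda p_d, using n = \sum_d r_d and \sum_d p_d = 1.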

This means that given the E[I(t)], and assuming independent Poisson noise, the reporting model in the paper is equivalent to what we are doing. The difference depends on how we interpret the infection model in epinowcast:

  • If we interpret our model as having a deterministic infection process, then the Poisson noise in case numbers must come from reporting errors, so the difference is indeed that we are modeling reporting errors but, in turn, no infection noise.
  • If we interpret our model as approximating a stochastic infection process, i.e. the number of infections is itself Poisson or Negative Binomial distributed, then we, like the paper, are not modeling reporting errors. In this case, the only difference between epinowcast and the model in the paper is that by explicitly sampling the I(t), they take into account the dependence in infection noise over time, which is something we ignore when only sampling E[I(t)].

In the implementation for Generative Bayesian modeling to nowcast the effective reproduction number from line list data with missing symptom onset dates, I tried to approximate the infection noise using a continuous distribution but still used Poisson distributed cell counts. This basically means that I assumed both infection noise and reporting errors, i.e. even more variation. I am not sure if this is justified, but it was the only viable option I could implement in Stan. My experience with the models I have tried is that the differences in estimates are typically quite small, especially when case numbers are not extremely low, but that the sampling speed of HMC can differ a lot (I think this has to do with the autocorrelation in the infection process).
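To illustrate the "even more variation" point, here is a rough simulation sketch. It is not the actual Stan implementation: the normal approximation of the infection noise, the mean of 50, and the reporting probabilities are all assumptions chosen for illustration. By the law of total variance, a continuous I(t) with variance equal to its mean, combined with Poisson cell counts, gives a cell variance of roughly E[I(t)] * p * (1 + p) instead of E[I(t)] * p.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sim = 200_000

lam = 50.0                     # assumed E[I(t)], illustrative value
p = np.array([0.5, 0.3, 0.2])  # assumed reporting probabilities, illustrative

# Assumption: continuous approximation of Poisson infection noise by a normal
# with matching mean and variance, clipped at zero to keep the rate valid.
infections = np.clip(rng.normal(lam, np.sqrt(lam), size=n_sim), 0.0, None)

# Poisson cell counts on top of the noisy infections (infection + reporting noise).
cells = rng.poisson(infections[:, None] * p)

empirical_var = cells.var(axis=0)
poisson_only_var = lam * p           # variance with a single Poisson noise source
both_noise_var = lam * p * (1 + p)   # law of total variance: lam*p + p^2 * lam
```

In this toy setting the empirical variances sit close to `both_noise_var` and clearly above `poisson_only_var`, and the relative excess is largest for cells with a high reporting probability.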