This is just a note so I don’t forget - will circle back.
All based on work with @sangwoopark with any good bits being theirs and any fever dreams being mine.
The problem
In current discrete time joint primary incidence and delay models (like epinowcast) there is a fundamental reliance on the primary event being accurately reported within a day. This is rarely the case and is a particular issue for things like the date of symptom onset.
In individual level models the problem is fairly simple as we can give priors for each primary observation over the censored window. This approach is hard to fit into the population level approach taken in most/all joint nowcast models.
Solutions
Retooling to individual level likelihood
This would involve retooling to individually fit data points but continuing to use information from the discrete time incidence model to inform the priors on the primary incidence dates.
This seems like a good option but would require a substantial refactor. It may actually be easier to start with current individual level approaches and add methods to inform the priors. Given that, at least for now I don’t see a pathway in epinowcast (but this is well suited to epidist).
A mixture of the current likelihood
Instead of taking the direct approach we could fit a mixture for each entry in the reporting triangle across multiple primary event times. This would be like aggregating our uncertainty about individuals’ primary event times into a population level aggregate. The mixture weights would represent the combined censoring window. In theory these weights would need to be based on the underlying expectation model (as they depend on the growth rate), but in the first instance they could be assumed to be uniform. This would be nice as we could use static mixture weights, making this more tractable.
You could in theory build more complex priors or let them vary by observation to account for differences such as weekend reporting.
For each observation this would look something like the following over a 3 day censoring period.
N_{td} \sim \mathcal{Poisson}\left( w_1 \lambda_{t-1} p_{d+1} + w_2 \lambda_t p_d + w_3 \lambda_{t+1} p_{d-1} \right)
Where
\sum_{i = 1}^3 w_i = 1,
and a uniform censoring period would imply,
w_i = \frac{1}{3}
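As a rough sketch of the expectation above, the mixture mean for a single cell of the reporting triangle could be computed as follows. The function names, the toy values for λ and p, and the choice to drop terms that fall outside the available days or delays are all assumptions made for illustration. None of this is epinowcast code.

```python
import math


def mixture_mean(lam, p, t, d, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Mixture expectation for cell (t, d) over a 3 day censoring window.

    lam: expected primary incidence by day (lambda_t).
    p: probability of being reported at each delay (p_d).
    Terms that index outside lam or p contribute zero. This boundary
    handling is an assumption, not something specified in the note.
    """
    total = 0.0
    # offset k = -1, 0, 1 pairs lambda_{t+k} with p_{d-k}
    for w, k in zip(weights, (-1, 0, 1)):
        tt, dd = t + k, d - k
        if 0 <= tt < len(lam) and 0 <= dd < len(p):
            total += w * lam[tt] * p[dd]
    return total


def poisson_logpmf(n, mu):
    """Log probability of observing n under a Poisson with mean mu."""
    return n * math.log(mu) - mu - math.lgamma(n + 1)


# Toy inputs (hypothetical values, chosen only to exercise the function)
lam = [10.0, 20.0, 40.0]
p = [0.5, 0.3, 0.2]

mu = mixture_mean(lam, p, t=1, d=1)
print(mu)
print(poisson_logpmf(8, mu))
```

With uniform weights these are static, so the only change from the current likelihood is that each cell's mean becomes a fixed linear combination of neighbouring cells' means.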
If you wanted incidence-based weights (i.e. a growth rate-based prior with some reporting model), I guess you would decompose w into an incidence weighting and a reporting weighting, potentially with normalisation to enforce the sum-to-1 constraint.
This is based on the model formulation and notation from the epinowcast documentation.
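One possible way to build the non-uniform weights discussed above is to multiply an incidence term by a reporting term and then normalise. In this sketch the incidence term is assumed to be exponential in the growth rate and the reporting term is an arbitrary per-day multiplier. Both forms, and the function name, are assumptions for illustration rather than anything worked out in the note.

```python
import math


def censoring_weights(growth_rate, reporting=(1.0, 1.0, 1.0), offsets=(-1, 0, 1)):
    """Normalised mixture weights over the censoring window.

    growth_rate: assumed daily exponential growth rate of primary incidence.
    reporting: hypothetical relative reporting weight for each day in the
        window (e.g. lower on weekends).
    """
    # incidence weighting exp(r * k) times reporting weighting s_k
    raw = [math.exp(growth_rate * k) * s for k, s in zip(offsets, reporting)]
    total = sum(raw)
    # normalise so the weights sum to 1
    return [x / total for x in raw]


print(censoring_weights(0.0))  # zero growth, flat reporting: uniform weights
print(censoring_weights(0.2))  # growth shifts weight to later days
print(censoring_weights(0.2, reporting=(1.0, 1.0, 0.5)))
```

Note that weights of this kind depend on the growth rate, so they would no longer be static during fitting, which is the tractability cost mentioned earlier.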
It’s not entirely clear to me that the second proposal really makes sense. It would be nice to find some kind of reformulation that does though.