I’ve been involved in a few collaborative nowcasting/forecasting projects (e.g. the CDC COVID-19 forecasting hub, the Germany/Poland COVID-19 forecasting/nowcasting hubs, the European COVID-19 forecasting hub, and SPI-M-O short-term forecasts plus reproduction number estimation) as a contributor, and have stood near people as they ran these projects.
I was wondering if people had any thoughts about them, both in terms of their experience contributing and in terms of their limitations and how they could be improved.
My take on the main aims of these projects is generally:
1. Provide improved forecasts to stakeholders by ensembling a range of approaches and evaluating forecasts (including ones that already exist) in a single framework (see the sketch after this list)
2. Drive iterative improvement for the forecasting task at hand through feedback and model selection
3. Improve the practice of forecasting more generally by highlighting what works well and what doesn’t
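To make aim 1 a bit more concrete, here is a minimal sketch of the kind of quantile-median ensemble I have in mind; the data layout and column names are illustrative assumptions, not any particular hub’s actual submission format.

```python
# Minimal sketch of a quantile-median ensemble (aim 1), assuming forecasts are
# submitted in a long "quantile" format. Column names are illustrative only,
# not any specific hub's schema.
import pandas as pd

# Toy submissions from three models for one location/date.
submissions = pd.DataFrame({
    "model":       ["A"] * 3 + ["B"] * 3 + ["C"] * 3,
    "location":    ["DE"] * 9,
    "target_date": ["2022-11-05"] * 9,
    "quantile":    [0.05, 0.5, 0.95] * 3,
    "value":       [80, 100, 130,
                    90, 110, 150,
                    70,  95, 120],
})

# The ensemble is built per quantile level: here the median of the member
# models' predicted values at each level (a common, robust choice).
ensemble = (
    submissions
    .groupby(["location", "target_date", "quantile"], as_index=False)["value"]
    .median()
    .assign(model="ensemble")
)

print(ensemble)
```

The real hubs obviously deal with many more targets, quantile levels, and inclusion criteria, but the core ensembling step is roughly this simple, which is part of why it works so reliably.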
I think we are currently doing 1 pretty well here, with 2 happening a little bit but perhaps mostly via model selection (i.e. models dropping out) rather than model improvement (I would love to be wrong about this). I’m not sure we have made much progress on 3, and it’s not clear that just doing more of the same will get us there.
The main issue I have found personally as a contributor is prioritising these kinds of initiatives given how much work they take. We have tried to do that in the past by attaching research projects to submissions (for example https://www.medrxiv.org/content/10.1101/2022.10.12.22280917v1.full.pdf), but in all but one instance it has been so much work that the research part of these projects ended up getting dropped simply because there was no time/resource left. The incentive is enough to do them (especially if it’s your first time) but maybe not to keep iterating/innovating?
That then relies on those doing secondary data analysis to pull findings out of the hub results themselves, but that is such an overwhelming job that it seems a real challenge to dig deep enough (and through all the noise of realistic data and model variation) to get insights that drive iterative improvement. Maybe the hubs will start being used as a data source for others to do secondary analysis, and that will help with this resource issue? I haven’t seen much pure secondary data analysis coming out that wasn’t by hub organisers, but perhaps I am looking in the wrong places. People definitely do love to evaluate against hub ensembles as a baseline, which does seem like a very strong win for the community and for the organisers (but again not really for those contributing).
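For anyone less familiar with what that baseline comparison looks like in practice, here is a hedged sketch using the quantile (pinball) score; the quantile levels, numbers, and names are all made up for illustration.

```python
# Sketch of evaluating a model against a hub ensemble baseline using the
# quantile (pinball) score, one common choice for quantile-format forecasts.
# All names and numbers here are illustrative.
import numpy as np

def quantile_score(q_level: float, predicted: float, observed: float) -> float:
    """Pinball loss for a single predicted quantile."""
    if observed >= predicted:
        return q_level * (observed - predicted)
    return (1 - q_level) * (predicted - observed)

levels = [0.05, 0.5, 0.95]
observed = 105.0

my_model = [85.0, 102.0, 125.0]      # predicted quantiles from "my" model
hub_ensemble = [80.0, 100.0, 130.0]  # predicted quantiles from the hub ensemble

my_score = np.mean([quantile_score(l, p, observed) for l, p in zip(levels, my_model)])
baseline = np.mean([quantile_score(l, p, observed) for l, p in zip(levels, hub_ensemble)])

# A relative skill below 1 means the model beat the ensemble baseline on this target.
print(f"relative skill vs ensemble: {my_score / baseline:.2f}")
```

In practice people use established scoring packages, many more quantile levels, and many targets, but the underlying comparison is essentially this ratio of average scores.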
I haven’t seen a huge amount on how we can do better here, but perhaps I just haven’t been looking in the right places? Does anyone know if there is an initiative underway from the various stakeholders (i.e. hub organisers, contributors, funders, and those consuming the forecasts) to think about what the next generation of these projects should look like?
The biggest success for me has been @dwolffram and @johannes’ nowcasting hub. I don’t think there is a clear reason why exactly, apart from the statistical problem being a little clearer/better phrased and most people using fairly similar models (making it a bit easier to learn what works and what doesn’t). I wonder if that suggests a more limited scope could be helpful? It could also have been @dwolffram’s charm of course, but there were very charming people involved in the other projects I contributed to as well.
I’ll circle back to this and make it less of a ramble in a bit!