I really enjoyed this recent preprint on hurricane forecasting from the Google DeepMind folks: https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/how-we-re-supporting-better-tropical-cyclone-prediction-with-ai/skillful-joint-probabilistic-weather-forecasting-from-marginals.pdf
Aside from the interesting modelling, what really struck me was the set of similar but distinct probabilistic forecasting metrics they use to evaluate their model. This relates to Do your evaluations have enough power? - #4 by samabbott in that it is clear other fields have different standards of practice.
In particular, I liked their different approaches to pooling CRPS and their calibration measure (centred on 1 for a well-calibrated ensemble).
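For anyone who hasn't run into these metrics, here's a rough numpy sketch of how I read the two ideas. To be clear, this is my reconstruction rather than their code: the fair CRPS estimator, the average-pooling window, and the spread/skill formulation of calibration are all assumptions on my part.

```python
import numpy as np

def crps_ensemble(ens, obs):
    """Fair CRPS estimator for an m-member ensemble.

    ens: array of shape (m, ...), obs: array of shape (...).
    Returns the CRPS at each grid point.
    """
    m = ens.shape[0]
    skill = np.abs(ens - obs).mean(axis=0)
    # pairwise term; i == j pairs contribute zero, so summing all pairs is fine
    spread = np.abs(ens[:, None] - ens[None, :]).sum(axis=(0, 1)) / (2 * m * (m - 1))
    return skill - spread

def avg_pool(field, k):
    """Average-pool the trailing two (spatial) dims with a k x k window."""
    h, w = field.shape[-2] // k, field.shape[-1] // k
    return (field[..., : h * k, : w * k]
            .reshape(*field.shape[:-2], h, k, w, k)
            .mean(axis=(-3, -1)))

def spread_skill_ratio(ens, obs):
    """~1 for a calibrated ensemble (with the small-m correction of Fortin et al. 2014)."""
    m = ens.shape[0]
    rmse = np.sqrt(np.mean((ens.mean(axis=0) - obs) ** 2))
    spread = np.sqrt(np.mean(ens.var(axis=0, ddof=1)))
    return np.sqrt((m + 1) / m) * spread / rmse

# toy check: a calibrated 8-member ensemble over a 32 x 32 grid
rng = np.random.default_rng(1)
truth = rng.normal(size=(32, 32))
obs = truth + rng.normal(size=(32, 32))
ens = truth + rng.normal(size=(8, 32, 32))

print(crps_ensemble(ens, obs).mean())                            # grid-point CRPS
print(crps_ensemble(avg_pool(ens, 4), avg_pool(obs, 4)).mean())  # pooled CRPS
print(spread_skill_ratio(ens, obs))                              # ~1 when calibrated
```

Running this on the toy calibrated ensemble gives a spread/skill ratio of roughly 1, which is what (I think) their calibration plots are centred on. The pooling step is the part I found most interesting: as I read it, scoring spatially averaged fields rewards getting the joint structure right, which per-grid-point (marginal) CRPS can't see.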
They also talk about a “scorecard”, which is very ML and reminded me of this: Baseball Stats, Model Cards, and Forecasting Performance
I was wondering about seeing if they might like to speak somewhere about this. epinowcast might not be a goer, but perhaps we could get them to come and give a talk at LSHTM, given they are just up the road at King's Cross?
