Baseball Stats, Model Cards, and Forecasting Performance

I just skimmed: https://www.medrxiv.org/content/medrxiv/early/2026/02/18/2026.02.12.26346156.1.full.pdf

by @jack @mariatang @jonathon.mellor and others, which uses a performance-above-replacement approach to assess model contribution to an ensemble. They argue that this is justified by a functional ANOVA decomposition (I haven’t found more detail on this in the paper yet, but I am reading the ref; interested in hearing more).

There isn’t a justification in here for why this approach was chosen over the model importance approach of @nickreich, Kim, etc. I wonder if it is the same justification as we have been following (i.e. ensemble size confounding).
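
To make the ensemble size point concrete, here is a toy sketch of the distinction as I understand it. It is not taken from either paper: the function names, the equally weighted mean ensemble of point forecasts, the MAE scoring, and the persistence-style replacement forecast are all assumptions for illustration. Leave-one-out drops a model, so the ensemble shrinks by one; the replacement version swaps the model for a baseline, so the ensemble size stays fixed.

```python
from statistics import mean


def ensemble_mae(forecasts, truth):
    """MAE of an equally weighted mean ensemble of point forecasts."""
    ens = [mean(f[t] for f in forecasts) for t in range(len(truth))]
    return mean(abs(e - y) for e, y in zip(ens, truth))


def leave_one_out_importance(models, truth):
    """Change in MAE when each model is dropped (ensemble size n -> n - 1)."""
    full = ensemble_mae(list(models.values()), truth)
    return {
        name: ensemble_mae([f for n, f in models.items() if n != name], truth) - full
        for name in models
    }


def above_replacement(models, replacement, truth):
    """Change in MAE when each model is swapped for a replacement-level
    forecast (ensemble size stays at n)."""
    full = ensemble_mae(list(models.values()), truth)
    return {
        name: ensemble_mae(
            [replacement if n == name else f for n, f in models.items()], truth
        ) - full
        for name in models
    }


# Hypothetical toy data: three models, four forecast dates.
truth = [10.0, 12.0, 15.0, 13.0]
models = {
    "a": [9.0, 12.0, 16.0, 13.0],
    "b": [12.0, 13.0, 13.0, 14.0],
    "c": [10.0, 10.0, 15.0, 11.0],
}
# Hypothetical replacement-level forecast (carries the previous value forward).
replacement = [10.0, 10.0, 12.0, 15.0]

print(leave_one_out_importance(models, truth))
print(above_replacement(models, replacement, truth))
```

In both functions a positive value means the full ensemble scores better with the model than without it. The only difference is what fills the gap, which is where any ensemble size effect would show up.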

As a note, I much prefer this being talked about as ensemble contribution rather than model contribution, as the latter just makes me think: contribution to what?

This is all closely related to our proposal, so perhaps there is interest in having a chat about it?

Will circle back when I have read it in more detail.

Also @jack the code link 404s: https://github.com/jcken95/subensemble-evaluation.

Update: I did some git stalking and this is just a name typo. The repo lives here: GitHub - jcken95/sub-ensemble-evaluation: Code supporting the manuscript "Evaluation of short-term multi-target respiratory forecasts over winter 2024-25 in England using sub-ensemble contribution analyses"

Great file name here: sub-ensemble-evaluation/src/R/prj/nowcast/whooping.R at 8591e534fb237654dfe0a11604143a86ceebd7a5 · jcken95/sub-ensemble-evaluation · GitHub