Lots of GP models + nowcasting

Automatic kernel selection with GPs

I’m a really big fan of the package GitHub - probsys/AutoGP.jl: Automated Bayesian model discovery for time series data, which does automated Gaussian process kernel selection in a particularly clever way.

Way back in the day I was interested in particle filters/SMC methods and their interface with parameter inference (e.g. pMCMC). AutoGP uses a really clever multi-stage inference scheme powered by the Gen.jl PPL:

  • The outer layer is an ensemble of particles. Each particle represents a GP model with a particular kernel structure and parameters. As you pass the model more data you reweight and resample the ensemble (the standard particle filter technique).
  • In between new data ingestions you can propose new kernel structures via a really neat walk on a graph of kernel structures. It’s not as simple as proposing a move from, say, SE → Linear; you can make those proposals, but also combination moves of kernel * new_kernel, kernel + new_kernel and CP(kernel, new_kernel), which proposes a change point where one auto-covariance structure hands over to another (there’s a toy sketch of this grammar below). These get proposed using a specialised MH step which wraps HMC steps for the continuous parameters in the proposed discrete structures.
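
To make the structure moves concrete, here’s a toy Julia sketch of the proposal grammar (illustrative types of my own, not AutoGP’s internal representation; in AutoGP each proposed structure is accepted or rejected by the MH step described above, with HMC handling the continuous parameters):

```julia
# Toy types for illustration only - not AutoGP's internal kernel representation.
abstract type Kernel end
struct SE <: Kernel end          # squared-exponential base kernel
struct Linear <: Kernel end      # linear base kernel
struct Sum <: Kernel; left::Kernel; right::Kernel; end          # kernel + new_kernel
struct Product <: Kernel; left::Kernel; right::Kernel; end      # kernel * new_kernel
struct ChangePoint <: Kernel; left::Kernel; right::Kernel; end  # CP(kernel, new_kernel)

base_kernels() = Kernel[SE(), Linear()]

# One random structure move: either swap in a fresh base kernel, or combine the
# current kernel with a fresh base kernel via +, * or a change point.
function propose_structure(k::Kernel)
    new_k = rand(base_kernels())
    move = rand([:replace, :sum, :product, :changepoint])
    move == :replace && return new_k
    move == :sum     && return Sum(k, new_k)
    move == :product && return Product(k, new_k)
    return ChangePoint(k, new_k)
end

k1 = SE()
k2 = propose_structure(k1)   # e.g. Product(SE(), Linear())
k3 = propose_structure(k2)   # e.g. ChangePoint(Product(SE(), Linear()), SE())
```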

AutoGP has a really nice API for doing the above either on one big chunk of data or in small sequential chunks.
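
To give a sense of what that looks like, here’s a rough sketch of both workflows; the call names and keyword arguments are from my reading of the AutoGP.jl docs and may not be exact, and the data variables are placeholders, so treat this as illustrative rather than authoritative:

```julia
using AutoGP, Dates

# Placeholder data: weekly values (illustrative, not from the package docs).
ds_train  = collect(Date(2023, 1, 1):Week(1):Date(2023, 12, 31))
y_train   = 100 .* rand(length(ds_train))
ds_new    = collect(Date(2024, 1, 7):Week(1):Date(2024, 3, 31))
y_new     = 100 .* rand(length(ds_new))
ds_future = collect(Date(2024, 4, 7):Week(1):Date(2024, 6, 30))

# Batch: fit the particle ensemble on one big chunk of data.
model = AutoGP.GPModel(ds_train, y_train; n_particles=8)
AutoGP.fit_smc!(model;
    schedule=AutoGP.Schedule.linear_schedule(length(ds_train), 0.10),
    n_mcmc=75, n_hmc=10)

# Sequential: ingest a new chunk, then rejuvenate structures/parameters.
AutoGP.add_data!(model, ds_new, y_new)
AutoGP.mcmc_structure!(model, 75, 10)   # MH over structures wrapping HMC on parameters

# Forecast with quantiles pooled across the weighted particle ensemble.
forecast = AutoGP.predict(model, ds_future; quantiles=[0.025, 0.5, 0.975])
```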

Nowcasting

The big problem from an epi-application point of view is that it doesn’t have a “natural” way to include nowcast modelling. AutoGP is heavily specialised towards being the best it can be in the restricted domain of pure time series modelling, i.e. no covariates and no multidimensional inputs like x = (reference_date, report_date).

To get around this I’ve developed NowcastAutoGP with my Center for Forecasting and Outbreak Analytics hat on (ok, I actually only have one hat): GitHub - CDCgov/NowcastAutoGP: Combining AutoGP (Gaussian process ensembles with kernel structure discovery) with data revisions.

The idea is pretty simple: NowcastAutoGP ingests nowcast samples from any model that can generate them, and uses the AutoGP sequential data ingestion API to batch forecasts over the set of sampled nowcasts.
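
A minimal sketch of that batching loop in Julia, to show the idea (this is my own illustration, not NowcastAutoGP’s actual API; it assumes the AutoGP calls sketched above plus that a fitted GPModel can be deep-copied):

```julia
using AutoGP

# ds_stable/y_stable: fully reported history; ds_recent: dates still subject to revision;
# nowcast_samples: vector of sampled completions of y on ds_recent from your nowcast model.
function forecast_over_nowcasts(ds_stable, y_stable, ds_recent, nowcast_samples, ds_future;
                                n_particles=8, draws_per_nowcast=20)
    # Fit the GP ensemble once on the fully reported history.
    base = AutoGP.GPModel(ds_stable, y_stable; n_particles=n_particles)
    AutoGP.fit_smc!(base;
        schedule=AutoGP.Schedule.linear_schedule(length(ds_stable), 0.10),
        n_mcmc=75, n_hmc=10)

    pooled = Vector{Vector{Float64}}()
    for y_recent in nowcast_samples
        model = deepcopy(base)                        # assumption: the fitted model can be copied
        AutoGP.add_data!(model, ds_recent, y_recent)  # sequentially ingest this nowcast draw
        mvn = AutoGP.predict_mvn(model, ds_future)    # posterior predictive over the horizon
        append!(pooled, [rand(mvn) for _ in 1:draws_per_nowcast])
    end
    return pooled  # pooling draws across nowcasts marginalises the nowcast uncertainty
end
```

Fitting once on the stable history and copying the fitted ensemble per nowcast draw is a compute-saving shortcut over refitting from scratch for every sample; either way the forecast distribution ends up being a mixture over the sampled nowcasts.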

The upside is that this is very flexible: you can choose your favourite nowcasting model to generate nowcasts and pipe them into AutoGP’s handy forecast tooling. The downside is that, because it’s not a joint nowcast/forecast model, the posterior distribution of the nowcasts is not influenced by the likely trajectory of the GP models… which opens you up to misspecification in extreme cases.