Automatic kernel selection with GPs
I’m a really big fan of the AutoGP.jl package (GitHub - probsys/AutoGP.jl: Automated Bayesian model discovery for time series data), which does automated Gaussian process kernel selection in a particularly clever way.
Way back in the day I was interested in particle filters/SMC methods and their interface to parameter inference (e.g. pMCMC). AutoGP does a really clever multi-stage inference scheme powered by the Gen.jl PPL:
- The outer layer is an ensemble of particles. Each particle represents a GP model with a particular kernel structure and parameters. As you pass it more data you reweight and resample the ensemble (the standard particle filter technique).
- In between new data ingestions you can propose new kernel structures according to a really neat walk on a graph of kernel structures. It’s not as simple as only proposing to swap one base kernel for another (say SE → Linear); you can make those proposals, but also combination moves of kernel * new_kernel, kernel + new_kernel, and CP(kernel, new_kernel), which proposes a change point in the validity of one auto-covariance structure in favour of another. These get proposed using a specialised MH step which wraps HMC steps for the continuous parameters in the proposed discrete structures (see the sketch below).
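To make the structure moves concrete, here’s a minimal sketch of the kind of kernel-expression grammar being walked. The type names and the propose_structure helper are mine for illustration, not AutoGP’s internals:

```julia
# Hypothetical mini-DSL for GP kernel expressions (illustrative only).
abstract type Kernel end
struct SE <: Kernel; lengthscale::Float64 end
struct Linear <: Kernel; offset::Float64 end
struct Plus <: Kernel; left::Kernel; right::Kernel end    # kernel + new_kernel
struct Times <: Kernel; left::Kernel; right::Kernel end   # kernel * new_kernel
struct CP <: Kernel; left::Kernel; right::Kernel; loc::Float64 end  # change point

# One random structure move on an existing kernel expression.
function propose_structure(k::Kernel)
    new_k = rand() < 0.5 ? SE(rand()) : Linear(rand())
    move = rand((:swap, :plus, :times, :cp))
    move === :swap  && return new_k
    move === :plus  && return Plus(k, new_k)
    move === :times && return Times(k, new_k)
    return CP(k, new_k, rand())  # the change-point location is a new continuous parameter
end
```

In the real algorithm each discrete proposal like this is accepted or rejected with MH, with HMC refreshing the continuous parameters (lengthscales, change-point locations, etc.) inside the proposed structure.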
AutoGP has a really nice API to do the above on a big chunk of data or in small sequential chunks.
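For flavour, the two modes look roughly like this. This is a sketch following the shape of the package tutorial; the exact keyword arguments may differ across versions, and ds_train, y_train, ds_new, y_new, ds_future are assumed user-supplied vectors:

```julia
using AutoGP

# Fit the particle ensemble to one big chunk of data with SMC,
# rejuvenating structures (MCMC) and parameters (HMC) along the way.
model = AutoGP.GPModel(ds_train, y_train; n_particles=8)
AutoGP.fit_smc!(model;
    schedule=AutoGP.Schedule.linear_schedule(length(ds_train), 0.10),
    n_mcmc=75, n_hmc=10)

# Or, later: ingest a small sequential chunk, reweighting and resampling.
AutoGP.add_data!(model, ds_new, y_new)

# Predictive quantiles pooled across the particle ensemble.
forecast = AutoGP.predict(model, ds_future; quantiles=[0.025, 0.5, 0.975])
```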
Nowcasting
The big problem from an epi application point of view is that it doesn’t have a “natural” way to include nowcast modelling. AutoGP is heavily specialised towards being the best it can be in the restricted domain of pure time series modelling, i.e. no covariates and no multidimensional inputs like x = (reference_date, report_date).
To get around this I’ve developed NowcastAutoGP with my Center for Forecasting and Outbreak Analytics hat on (ok, I actually only have one hat): GitHub - CDCgov/NowcastAutoGP: Combining AutoGP (Gaussian process ensembles with kernel structure discovery) with data revisions.
The idea is pretty simple: NowcastAutoGP ingests nowcast samples from any model that can generate them, and uses the AutoGP sequential data ingestion API to batch forecasts over the set of sampled nowcasts.
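In other words (my notation, not the package’s): writing $D$ for the stable reports and $\tilde{n}_s$ for the $s$-th nowcast draw, the forecast is a Monte Carlo mixture over nowcasts,

$$
p(y_{t+h} \mid D) \approx \frac{1}{S} \sum_{s=1}^{S} p_{\mathrm{GP}}\left(y_{t+h} \mid D, \tilde{n}_s\right), \qquad \tilde{n}_s \sim p_{\mathrm{nowcast}}.
$$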
The upside is that this is very flexible: you can choose your favourite nowcasting model to generate nowcasts and pipe them into AutoGP’s handy forecast tooling. The downside is that, because it’s not a joint nowcast/forecast model, the posterior distribution of the nowcasts is not influenced by the likely trajectory of the GP models… which opens you up to misspecification in extreme examples.
EDIT:
Since @samabbott wanted a bit more context: the basic idea of SMC is to evolve a distribution towards a target distribution. This can take many forms, but the most common are things like Kalman filters (where the distribution is known to be invariantly Normal, so you only need to update mean vectors and covariance matrices) and particle filters (which use the same idea as samples in MCMC, except you increment all the particles rather than sampling a chain of them).
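For intuition, the reweight/resample step of a bog-standard particle filter looks like this (a generic sketch, nothing AutoGP-specific; loglik is a stand-in for your model’s log-likelihood):

```julia
using StatsBase

# Reweight each particle by the likelihood of the new observation,
# then resample so the ensemble concentrates on well-fitting particles.
function pf_step!(particles, logweights, y_new, loglik)
    for i in eachindex(particles)
        logweights[i] += loglik(particles[i], y_new)  # "update your priors"
    end
    w = exp.(logweights .- maximum(logweights))       # stabilise before exponentiating
    idx = sample(eachindex(particles), Weights(w), length(particles))
    particles .= particles[idx]                       # resample with replacement
    fill!(logweights, 0.0)                            # equal weights after resampling
    return particles
end
```

AutoGP layers the structure-move and HMC rejuvenation described above on top of this basic loop.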
The upside of SMC is that a common way of incrementing the particles is by observing new data, which corresponds to the vibey term of “updating your priors”. So in the case where your nowcast data is short compared to your longer time series of stable reports, this is very convenient. Think:
- Learn everything you can about the stable reporting past.
- Cache that.
- Batch a set of new possible learnings over your nowcasting ensemble that increment your cached learning (sketched below).
- In each batch, do a forecast.
This corresponds to the usual posterior predictive modelling.
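Concretely, under the same API assumptions as the sketch above (and with make_nowcast_draws standing in for whatever nowcast model you use), the loop is roughly:

```julia
import AutoGP

# Learn once on the stable past and cache the fitted ensemble.
cached = AutoGP.GPModel(ds_stable, y_stable; n_particles=8)
AutoGP.fit_smc!(cached;
    schedule=AutoGP.Schedule.linear_schedule(length(ds_stable), 0.10),
    n_mcmc=75, n_hmc=10)

# Batch over the nowcast ensemble, forecasting in each batch.
forecasts = map(make_nowcast_draws(2000)) do nowcast
    m = deepcopy(cached)                     # reuse the cached learning
    AutoGP.add_data!(m, ds_recent, nowcast)  # increment with one nowcast draw
    AutoGP.predict(m, ds_future; quantiles=[0.025, 0.5, 0.975])
end
# Pooling these per-draw forecasts gives the posterior predictive mixture.
```

(The per-draw deepcopy of the cached model is, incidentally, where the deepcopy issues mentioned below tend to bite.)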
In an MCMC approach you could do this, but it would be very computationally painful depending on your model, since 2000 nowcasts would imply running the MCMC over 2000 effective datasets (each including all the stable past data points). Using MCMC you’d be much better off working harder to create a proper joint model of the latent process, eventual reports and current reports… but that is such a hard challenge you’d need an advanced code base, a community, maybe some seminars and a forum to discuss such a difficult modelling challenge.
Significant downsides to the NowcastAutoGP approach
Unfortunately, the convenience above comes with plenty of costs:
- The nowcasts arrive as part of a “pipeline” analysis, which is convenient but isn’t a joint model of process and reporting. I can easily imagine cases where this goes wrong.
- This is really fast when you only do the “outer loop” of particle reweighting, and gets successively slower as you add more particle refresh steps.
- It relies on Gen, and I’m having a few esoteric issues with deepcopy over various model objects.
Full SMC?
The above kind of suggests that some bright spark should do a full model with SMC inference rather than this kind of fast glue project… back to you @samabbott …