My first MSc was in neural networks back in the mists of time (2006), so I’ve always had an interest, whilst not really going down an ML-maxing path in my research.
An aspect of epidemiological renewal models is that the underlying infection-generating process
I(t) = R(t) \sum_{s\geq 1} g(s) I(t-s)
can be thought of as a sequence of two-step operations:
1. A weighted sum over a “hidden” or carried state of recent infections
2. Multiplication by the $t$th “input” value, R(t)
Generating this process for T time steps from an initial hidden state and some input vector R(t) involves a T-fold recursion. This is precisely the kind of modelling implemented by a recurrent layer in a neural network (see Recurrent neural network - Wikipedia).
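To make the recurrence concrete, here is a minimal NumPy sketch (not the Lux.jl code mentioned below — the function name and signature are illustrative) of rolling the renewal equation forward from a carried state:

```python
import numpy as np

def renewal_recursion(R, g, I_init):
    """Roll I(t) = R(t) * sum_{s>=1} g(s) I(t-s) forward for len(R) steps.

    R      : sequence of reproduction numbers, one per time step (the "input")
    g      : generation-interval weights g(1), ..., g(S)
    I_init : the S most recent infections, oldest first (the "hidden" state)
    """
    state = list(I_init)
    S = len(g)
    out = []
    for Rt in R:
        # step 1: weighted sum over the carried state (g[s-1] weights I(t-s))
        force = sum(g[s - 1] * state[-s] for s in range(1, S + 1))
        # step 2: multiply by the t-th input value R(t)
        I_t = Rt * force
        out.append(I_t)
        state.append(I_t)   # newest infection joins the hidden state
    return np.array(out)
```

Each loop iteration is exactly one recurrence step: update the hidden state, emit an output — the same shape of computation an RNN cell performs.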
Going from latent infections to the expected number of observations in a simple model is just a one-dimensional convolution with a delay kernel f:
O(t) = \left( I \ast f \right)(t)
Convolution layers are also standard in NN libraries.
Neural networks are designed to be the composition of input-output layers (functional composition), hence the simple renewal model is just a chain of commonly used NN layers with a Poisson link at the end.
The main potential wins here are (IMO):
Neural networks are fun, have plenty of libraries, and naturally promote compositional modelling in terms of function composition (i.e. layers)
NNs are popular and come with lots of compute tricks that people have spent a lot of time on (e.g. efficient adjoints, GPU kernels, and optimised compute/AD libraries like Reactant.jl)
I have some really scrappy code I put together quickly by shouting at Claude, which I’m not yet going to put on the main branch of even a “try things out” repo. It fuses the ideas I’ve outlined above using the NN library Lux.jl (my current favourite NN library in Julia).
This looks great Sam, and I agree there is a lot of potential here. As we threw back and forth in the composability work, Lux.jl and other NN packages also offer a ready-made solution to composability, and so are attractive in those terms too.
Something else to note is that it is quite common to represent AR processes as NNs with time-varying parameters on the AR coefficients. As the renewal model is a constrained AR, that also suggests this is a natural thing to do (@OvertonC and I were chatting about this earlier today). You could also make it so that Rt is modelled as a time-varying AR, with the renewal process as an AR whose generation-time weights also evolve in time.
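To spell out the constraint: a general time-varying AR(p) has a free coefficient matrix a[t, s], whereas the renewal model factorises it as a[t, s] = R(t) g(s). A small sketch (values purely illustrative):

```python
import numpy as np

# General time-varying AR(p): I(t) = sum_s a[t, s] * I(t - s),
# with a free (T, p) coefficient matrix a.
# The renewal model is the constrained special case
#   a[t, s] = R[t] * g[s],   g >= 0,  sum(g) = 1,
# i.e. all p coefficients at time t share one scale R(t), and the
# shape g is fixed (or, as suggested above, itself slowly evolving).
R = np.array([1.5, 1.2, 0.9])        # reproduction numbers over T = 3 steps
g = np.array([0.6, 0.4])             # generation-interval weights, sum to 1
a = R[:, None] * g[None, :]          # rank-1 (T, p) AR coefficient matrix

# Because g sums to 1, the row sums recover R(t) exactly:
row_sums = a.sum(axis=1)
```

The renewal model is therefore a rank-1, sign-constrained slice of the general time-varying AR family.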
I think you should be able to represent ascertainment using this approach fairly easily as well, which is pretty neat.
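One way this could look (a guess at the simplest version, with an assumed function name): ascertainment is just another elementwise layer, scaling expected observations by a possibly time-varying reporting probability before the Poisson link.

```python
import numpy as np

def ascertained(O, rho):
    """Scale expected observations O(t) by an ascertainment probability
    rho(t) -- an elementwise (broadcasting) layer, nothing more."""
    return np.asarray(O) * np.asarray(rho)
```

Because it is elementwise, it composes with the recurrence and convolution layers like any other link in the chain.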
Nice links. I’d need more than a glance to be sure, but at a glance, putting the renewal-equation solve in looks like a simple special case of the mechanistic layer in the first linked paper.
The way I’m thinking about this is:
One route is to make the dynamics “NN-ified”; that’s this toy model and the first link. The upsides are unified, familiar compute frameworks plus potential under-the-hood optimisations.
Another route is to take an NN and “dynamic-ify” it. The upside is the huge amount of adaptive-solver work that has gone into ODE solvers and their adjoint systems. This was the original motivation of Neural Ordinary Differential Equations, i.e. why have a recurrent layer and try to tune the number of recurrence steps when you can outsource this to an adaptive solver?
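To illustrate the "outsource the step count" point without pulling in any solver library: here is a toy adaptive-step integrator (step-doubling Euler, purely pedagogical — real Neural ODE work uses proper RK/adjoint machinery) applied to an assumed continuous-time simplification of the renewal dynamics, dI/dt = (R(t) - 1) I / tau:

```python
import math

def solve_adaptive(f, t0, t1, y0, tol=1e-6):
    """Toy adaptive-step Euler: compare one full step against two half
    steps, accept when they agree to tol, otherwise shrink the step.
    The number of steps adapts to the dynamics -- it is not a fixed
    recurrence length chosen by hand."""
    t, y, h = t0, y0, (t1 - t0) / 10
    steps = 0
    while t < t1:
        h = min(h, t1 - t)
        full = y + h * f(t, y)                          # one Euler step
        half = y + (h / 2) * f(t, y)
        two_half = half + (h / 2) * f(t + h / 2, half)  # two half steps
        if abs(two_half - full) < tol:
            t, y = t + h, two_half                      # accept, grow step
            steps += 1
            h *= 1.5
        else:
            h /= 2                                      # reject, shrink step
    return y, steps

# dI/dt = (R(t) - 1) * I / tau, a hand-wavy continuous-time analogue
tau = 2.0
R = lambda t: 1.5
I_final, n_steps = solve_adaptive(lambda t, y: (R(t) - 1.0) * y / tau, 0.0, 4.0, 1.0)
```

With constant R = 1.5 this is exponential growth at rate 0.25, so I(4) should land near e ≈ 2.718, and `n_steps` is chosen by the error control rather than by the modeller.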