Epinowcast - Filtering by earliest observed report date - separate function

enw_add_incidence()

I am working on this “good first issue”. One line of existing code is in question:

reports <- reports[,
  .SD[reference_date >= min(report_date) | is.na(reference_date)],
  by = by
]

I’m reading to get up to speed, but various dates have me a little confused.
My understanding is that the reference date is the date of the first positive test for a specific individual. Wouldn’t it be a data error if report_date came BEFORE the reference date?

Likewise, if is.na(reference_date) is TRUE, this too seems like a data problem.
Can you refer me to sample data examples so I can understand the issues better?
Thx.

Thanks.


Thanks for this @jimrothstein, and sorry it took so long to get back to you! For others following along, this is now being addressed in Pull Request #430 on epinowcast/epinowcast ("ISSUE 305: first attempt to put reference_date >= min(report_date) into separate…" by jimrothstein).

My understanding is that the reference date is the date of the first positive test for a specific individual. Wouldn’t it be a data error if report_date came BEFORE the reference date?

This is possible in retrospective aggregate count datasets, which many users will have. I think the filter in this instance is a correction to catch that.

Likewise, if is.na(reference_date) is TRUE, this too seems like a data problem.
Can you refer me to sample data examples so I can understand the issues better?
Thx.

This is because we are using aggregate counts rather than individual-level data, so some people may be missing reference dates (and modelling those reports is something we support). To get a handle on this, I suggest looking at the package vignettes or the example scripts in inst/examples.
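
As a rough illustration of what the filter in question does, here is a minimal sketch on made-up data. The values, the region grouping column, and the confirm count column are assumptions for this example only and are not taken from the package's data; the by argument is written out as a literal column name rather than passed in as it is inside the package function.

```r
library(data.table)

# Toy aggregate counts; all values are invented for illustration.
reports <- data.table(
  region         = "A",
  reference_date = as.Date(c("2021-09-28", "2021-10-01", "2021-10-02", NA)),
  report_date    = as.Date(c("2021-10-01", "2021-10-01", "2021-10-03", "2021-10-03")),
  confirm        = c(5L, 3L, 4L, 2L)
)

# Same expression as the line under discussion, grouped by the
# hypothetical region column.
filtered <- reports[,
  .SD[reference_date >= min(report_date) | is.na(reference_date)],
  by = "region"
]

print(filtered)
```

In this sketch the row with reference date 2021-09-28 is dropped because it falls before the earliest report date in its group (2021-10-01), while the row with a missing reference date is kept because of the is.na() clause.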

I defer to the others. But if there is something here or elsewhere I can do, please let me know.
jim
