I am working on this “good first issue”. One line of existing code is in question:
    reports <- reports[,
      .SD[reference_date >= min(report_date) | is.na(reference_date)],
      by = by
    ]
I’m reading to get up to speed, but various dates have me a little confused.
My understanding is that the reference date is the date of the first positive test for a specific individual. Wouldn’t it be a data error if report_date came BEFORE the reference date?
Likewise, if is.na(reference_date) is TRUE, this too seems like a data problem.
Can you refer me to sample data examples so I can understand the issues better?
Thx.
> My understanding is that the reference date is the date of the first positive test for a specific individual. Wouldn’t it be a data error if report_date came BEFORE the reference date?
This is possible in retrospective aggregate count datasets, which many users will have. I think the filter in this line is a correction to catch that case.
> Likewise, if is.na(reference_date) is TRUE, this too seems like a data problem.
> Can you refer me to sample data examples so I can understand the issues better?
> Thx.
This is because we are using aggregate counts rather than individual-level data, so some counts can be missing a reference date (and modelling these is something we support). To get a handle on this, I suggest looking at the package vignettes or the example scripts in inst/examples.
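As a rough illustration of what the line in question does, here is a minimal sketch on toy data. The values, the `location` column, and the `group_cols` variable (standing in for `by`) are made up for the example and are not taken from the package. It assumes `data.table` is installed.

```r
library(data.table)

# Toy aggregate counts, invented purely for illustration.
reports <- data.table(
  location = c("A", "A", "A", "A"),
  reference_date = as.Date(c("2021-01-01", "2021-01-03", "2021-01-04", NA)),
  report_date = as.Date(c("2021-01-05", "2021-01-03", "2021-01-05", "2021-01-04")),
  confirm = c(3L, 5L, 2L, 1L)
)

# Hypothetical grouping columns, playing the role of `by` in the original line.
group_cols <- "location"

# Within each group, keep rows whose reference date is on or after the
# earliest report date, plus rows with a missing reference date.
filtered <- reports[,
  .SD[reference_date >= min(report_date) | is.na(reference_date)],
  by = group_cols
]

print(filtered)
```

In this toy case the earliest report date is 2021-01-03, so the row with reference date 2021-01-01 is dropped, while the row with a missing reference date is kept because of the `is.na(reference_date)` condition.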