To update on this: we now have a draft analysis plan, which we are circulating, and the first working version of the package. We are currently doing a bit more refactoring to help with modularity and will then start thinking about some simplified user interfaces, so if anyone has thoughts on that, they would be very welcome!
Package docs are here: https://baselinenowcast.epinowcast.org/
The package has had a lot of updates since the last check-in, with the final issues being to improve the interface and do some bulk renaming. Please weigh in if you have any thoughts.
The preprint is also nearly in place (thanks all!), so we should be able to share that soon.
We also have some cool plans cooking to work with folks at the Mass PH department in the US to create a user-friendly vignette for a data source, NSSP, which state PH over there have access to. From that we should be able to do a validation study comparing the baseline package with a method they use in production, which is a pretty neat proof of concept.
The main issues we need thoughts on are:
opened 02:03PM - 28 May 25 UTC
first release
# Function Naming Consistency in baselinenowcast
This is a @seabbs monitored LLM review
## Summary
The current function naming conventions in baselinenowcast show inconsistencies, particularly between `get_` and `generate_` prefixes, and overloading of the term "nowcast" for different operations.
This issue proposes a systematic naming scheme that aligns with the planned high-level interface design discussed in issue #86.
## Current Function Inventory
From the package namespace, we have 14 exported functions:
- `apply_delay`
- `combine_obs_with_pred`
- `estimate_dispersion`
- `generate_pt_nowcast_mat`
- `generate_pt_nowcast_mat_list`
- `generate_triangle`
- `generate_triangles`
- `get_delay_estimate`
- `get_nowcast_draw`
- `get_nowcast_draws`
- `get_nowcast_pred_draw`
- `get_nowcast_pred_draws`
- `truncate_triangle`
- `truncate_triangles`
## Key Issues Identified
### 1. Semantic confusion with `get_` prefix
`get_` functions perform different types of operations:
- `get_delay_estimate()` - Calculates/estimates delay distribution (deterministic)
- `get_nowcast_draw()` - Samples/generates new random draws (stochastic)
- `get_nowcast_pred_draw()` - Samples predicted values only (stochastic)
### 2. Overloading of "nowcast" terminology
The term "nowcast" is used for three distinct concepts:
- **Full nowcast**: Combined observed + predicted values (`get_nowcast_draw`)
- **Predictions only**: Just the unobserved/predicted portion (`get_nowcast_pred_draw`)
- **Point nowcast matrix**: Filled/completed reporting triangle (`generate_pt_nowcast_mat`)
### 3. Verbose and unclear naming
- `generate_pt_nowcast_mat_list` is unnecessarily verbose
- The "pt" abbreviation is not immediately clear (point nowcast)
## Proposed Naming Scheme
Based on the interface design in issue #86 and the principle of clear verb semantics, I recommend:
### Verb Prefixes
- `estimate_*` - Statistical estimation procedures (deterministic)
- `sample_*` - Random draws/sampling (stochastic)
- `construct_*` - Building data structures
- `fill_*` - Filling in missing values (preferred over `complete_*` for brevity)
- `apply_*` - Transformations (keep as is)
- `combine_*` - Merging operations (keep as is)
- `truncate_*` - Truncation operations (keep as is)
- `split_*` - Separating by strata (for future interface)
- `as_*` - Conversion functions (S3 generic pattern)
### Core Function Mappings
```r
# Estimation functions
get_delay_estimate() → estimate_delay()
estimate_dispersion() → estimate_dispersion() # already good
# Sampling functions (disambiguated)
get_nowcast_draw() → sample_nowcast() # full nowcast
get_nowcast_draws() → sample_nowcasts() # multiple full nowcasts
get_nowcast_pred_draw() → sample_predictions() # predictions only
get_nowcast_pred_draws() → sample_predictions() # with n_draws parameter
# Matrix completion functions
generate_pt_nowcast_mat() → fill_triangle()
generate_pt_nowcast_mat_list() → fill_triangles()
# Structure creation functions
generate_triangle() → construct_triangle() # creates NA structure
generate_triangles() → construct_triangles() # creates multiple
# Keep as-is (already clear)
apply_delay() → apply_delay()
combine_obs_with_pred() → combine_obs_with_pred()
truncate_triangle() → truncate_triangle()
truncate_triangles() → truncate_triangles()
```
### Additional Naming Scheme Options Considered
#### Option 1: Action-based naming
```r
get_delay_estimate() → compute_delay_distribution()
get_nowcast_draw() → draw_nowcast()
generate_pt_nowcast_mat() → impute_triangle()
```
#### Option 2: R tidyverse-inspired naming
```r
get_delay_estimate() → calc_delay()
get_nowcast_draw() → draw_nowcast()
generate_pt_nowcast_mat() → fill_na_triangle()
```
#### Option 3: Statistical operation focus
```r
get_delay_estimate() → fit_delay_distribution()
get_nowcast_draw() → sample_posterior_nowcast()
generate_pt_nowcast_mat() → expectation_fill_triangle()
```
## Integration with Planned Interface (Issue #86)
The proposed high-level `baselinenowcast()` function and supporting infrastructure introduce new naming needs:
### New S3 Classes and Methods
- `reporting_triangle` class with `as_reporting_triangle()` generic
- `baselinenowcast` class for results
- Supporting print, plot, and summary methods
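
To make the S3 pattern above concrete, here is a minimal sketch of what an `as_reporting_triangle()` generic with a matrix method and a print method might look like. The method bodies, the validation, and the class layout are assumptions for illustration only; the actual design in issue #86 may differ.

```r
# Hypothetical sketch: the real class layout and arguments may differ.

# S3 generic for converting inputs to a reporting triangle
as_reporting_triangle <- function(x, ...) {
  UseMethod("as_reporting_triangle")
}

# Method for a plain matrix (assumed: rows = reference times, columns = delays)
as_reporting_triangle.matrix <- function(x, ...) {
  stopifnot(is.numeric(x))
  structure(x, class = c("reporting_triangle", class(x)))
}

# Fallback with an informative error for unsupported inputs
as_reporting_triangle.default <- function(x, ...) {
  stop(
    "No as_reporting_triangle() method for class ", class(x)[1],
    call. = FALSE
  )
}

# Minimal print method for the new class
print.reporting_triangle <- function(x, ...) {
  cat("<reporting_triangle>", nrow(x), "reference times x", ncol(x), "delays\n")
  invisible(x)
}

tri <- as_reporting_triangle(matrix(c(5, 3, 4, 2, 1, NA), nrow = 3))
print(tri)
```

Keeping the conversion behind a generic means new input types (for example a long-format data frame) could be supported later by adding a method, without changing the functions that consume a `reporting_triangle`.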
### Internal Workflow Functions
Following the interface design, these internal functions would use our naming conventions:
```r
# Data handling
as_reporting_triangle() # S3 generic for conversion
validate_reporting_triangle() # Validation helper
split_reporting_triangle() # Split by strata
# Multi-step operations
estimate_delay_and_uncertainty() # Combined workflow
process_strata() # Apply operations to each stratum
```
### Multi-Action Functions and Pipeline Operations
The interface design emphasizes modular composition. Our naming should support this:
#### For list operations:
Use simple plural forms:
```r
construct_triangle() → construct_triangles()
fill_triangle() → fill_triangles()
sample_nowcast() → sample_nowcasts()
```
#### For optional multi-behavior:
Use parameters rather than separate functions:
```r
# Instead of get_nowcast_draw() and get_nowcast_draws()
sample_nowcast(n_draws = 1) # Single draw
sample_nowcast(n_draws = 100) # Multiple draws
# Instead of separate _pred_ variants
sample_nowcast(type = "full") # Full nowcast (observed + predicted)
sample_nowcast(type = "predictions") # Predictions only
```
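
As a rough illustration of how a single function could cover both behaviours, the sketch below uses `match.arg()` to validate a `type` argument alongside `n_draws`. The argument names `observed` and `pred_mean` and the Poisson sampling step are placeholders invented for this example. They are not the package's actual inputs or uncertainty model.

```r
# Toy sketch only: Poisson noise is a placeholder, not the package's model.
sample_nowcast <- function(observed, pred_mean, n_draws = 1,
                           type = c("full", "predictions")) {
  type <- match.arg(type)
  stopifnot(length(observed) == length(pred_mean), n_draws >= 1)
  # One row per draw, one column per reference time
  pred <- matrix(
    rpois(
      n_draws * length(pred_mean),
      lambda = rep(pred_mean, each = n_draws)
    ),
    nrow = n_draws
  )
  if (type == "predictions") {
    return(pred)
  }
  # Full nowcast = observed counts plus sampled not-yet-reported counts
  sweep(pred, 2, observed, `+`)
}

set.seed(1)
full <- sample_nowcast(c(10, 8, 3), c(0, 2, 6), n_draws = 100)
preds <- sample_nowcast(c(10, 8, 3), c(0, 2, 6), n_draws = 100,
                        type = "predictions")
```

With `match.arg()`, a misspelled `type` fails early with a clear error, which is one argument for a parameter over separate `_pred_` functions.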
#### For pipeline functions:
The high-level `baselinenowcast()` will internally compose our modular functions:
```r
baselinenowcast <- function(...) {
  # Uses: as_reporting_triangle(), estimate_delay(),
  # fill_triangle(), estimate_dispersion(), sample_nowcast()
}
```
## Benefits of This Scheme
1. **Clear verb semantics**: `estimate_` vs `sample_` immediately distinguishes deterministic from stochastic operations
2. **Interface alignment**: Works naturally with the planned S3 class system and high-level wrapper
3. **Reduced redundancy**: Parameters replace function proliferation (e.g., `n_draws` instead of separate functions)
4. **R ecosystem consistency**: Follows established patterns (`as_*` for conversion, `sample_*` for random draws)
5. **Future extensibility**: New verbs can be added systematically
## Migration Strategy
1. **Phase 1**: Add new names as aliases alongside old ones
2. **Phase 2**: Update all internal usage and documentation
3. **Phase 3**: Add deprecation warnings to old names
4. **Phase 4**: Remove old names after 2-3 release cycles
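
The phases above could look roughly like this in code, using base R's `.Deprecated()`. The body of `estimate_delay()` is a placeholder for illustration, not the real implementation. A package might prefer `lifecycle::deprecate_warn()` instead, but that is a design choice this sketch does not assume.

```r
# New name holds the implementation (placeholder body for illustration)
estimate_delay <- function(...) {
  "delay estimate"
}

# Phase 1: the old name is a plain alias of the new one
# get_delay_estimate <- estimate_delay

# Phase 3: the old name warns, then forwards all arguments to the new name
get_delay_estimate <- function(...) {
  .Deprecated("estimate_delay")
  estimate_delay(...)
}

# Calling the old name still works but signals a deprecation warning
res <- suppressWarnings(get_delay_estimate())
```

Forwarding with `...` keeps the wrapper independent of the new function's signature, so the two cannot drift apart during the transition.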
## Questions for Discussion
1. Should we consolidate `sample_nowcast()` and `sample_predictions()` into one function with a `type` parameter?
2. How to handle the transition period where both naming schemes exist?
3. Should internal helpers (`.function_name`) follow the same conventions?
## Recommendation
I strongly recommend adopting the **verb-prefix scheme with `sample_*` for stochastic operations**. This approach:
1. **Clarifies intent**: Users immediately understand what each function does
2. **Supports the interface vision**: The naming naturally fits with the planned `baselinenowcast()` wrapper and S3 infrastructure
3. **Reduces API surface**: Using parameters instead of separate functions for variations
4. **Maintains flexibility**: The modular functions remain available for advanced users
The complete recommended mapping:
```r
# Core transformations
get_delay_estimate() → estimate_delay()
get_nowcast_draw(s)() → sample_nowcast(n_draws = )
get_nowcast_pred_draw(s)() → sample_predictions(n_draws = )
generate_pt_nowcast_mat(_list)() → fill_triangle(s)()
generate_triangle(s)() → construct_triangle(s)()
# Keep as-is
apply_delay(), combine_obs_with_pred(),
truncate_triangle(s)(), estimate_dispersion()
```
This naming scheme will provide a solid foundation for both the current modular interface and the planned high-level user interface, while maintaining backward compatibility during the transition.
and
opened 04:33PM - 10 Apr 25 UTC
first release
**Goal**
We want a default behavior for the wrapper function, with the number of historical observations for delay estimates and the number of triangles for the uncertainty estimate set