add data wrangling challenge to get required input columns for {cfr} #119

Open
avallecam opened this issue Sep 29, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@avallecam
Member

avallecam commented Sep 29, 2024

We need to isolate deaths from the outcome variable, which mixes deaths and recoveries in a single column.

library(simulist)
library(linelist)
library(incidence2)
library(cfr)
library(tidyverse)

# run this together --------------------
set.seed(111)

dat <- sim_linelist() %>% 
  as_tibble()
# run this together --------------------

challenge (or assessment)
from a tidy {linelist} to {incidence2} and {cfr}
learning outcome: recognize the “tidy data” framework in a linelist

  • known outcome (recovered, death) in one column only
  • temporal states (quarantine, ICU, hospitalization) with admission and discharge dates
dat %>% 
  dplyr::select(id, date_onset, outcome, date_outcome) %>% 
  # [key] data wrangling step ---------------------------------------------------
  # isolate deaths from outcomes
  dplyr::mutate(
    date_death = dplyr::case_when(
      outcome == "died" ~ date_outcome,
      TRUE ~ NA_Date_
    )
  ) %>% 
  # [end] of key data step ------------------------------------------------------
  linelist::make_linelist(
    id = "id",
    date_onset = "date_onset",
    date_death = "date_death"
  ) %>% 
  linelist::validate_linelist() %>% 
  linelist::tags_df() %>% 
  incidence2::incidence(date_index = c("date_onset","date_death")) %>% 
  cfr::prepare_data(
    cases_variable = "date_onset",
    deaths_variable = "date_death"
  ) %>% 
  dplyr::as_tibble()
#> NAs in cases and deaths are being replaced with 0s: Set `fill_NA = FALSE` to prevent this.
#> # A tibble: 294 × 3
#>    date       deaths cases
#>    <date>      <int> <int>
#>  1 2023-01-01      0     1
#>  2 2023-01-02      0     0
#>  3 2023-01-03      0     0
#>  4 2023-01-04      0     0
#>  5 2023-01-05      0     0
#>  6 2023-01-06      0     0
#>  7 2023-01-07      0     0
#>  8 2023-01-08      0     0
#>  9 2023-01-09      0     0
#> 10 2023-01-10      0     0
#> # ℹ 284 more rows

Created on 2024-09-29 with reprex v2.1.0
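
To check the key wrangling step by eye, here is a minimal sketch on a hand-made toy linelist rather than on the simulated one. The object names `toy` and `toy_deaths` and all the dates are made up for illustration. It also swaps `case_when()` with `NA_Date_` for `dplyr::if_else()` with `as.Date(NA)`, which avoids the dependency on {lubridate}; this is one possible variant, not necessarily the form the challenge should use.

```r
library(dplyr)

# Hypothetical toy linelist: a single `outcome` column holds both
# deaths and recoveries, with one shared `date_outcome` column
toy <- tibble::tibble(
  id = 1:4,
  date_onset = as.Date(c("2023-01-01", "2023-01-03", "2023-01-04", "2023-01-06")),
  outcome = c("recovered", "died", "recovered", "died"),
  date_outcome = as.Date(c("2023-01-10", "2023-01-09", "2023-01-15", "2023-01-12"))
)

# Isolate deaths: keep the outcome date only for rows where the case died,
# and set it to a missing Date everywhere else
toy_deaths <- toy %>%
  mutate(
    date_death = if_else(outcome == "died", date_outcome, as.Date(NA))
  )

toy_deaths
```

Only the rows with `outcome == "died"` should carry a `date_death`; recoveries get `NA`, so a later count of deaths by date ignores them.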

@avallecam avallecam added the enhancement New feature or request label Sep 29, 2024