Numeric Blur Example

Not all sensitive data is recorded as strings: features such as age, date of birth, or income can also make parts of a data set personally identifiable. To help with this, we include the ‘numeric blur’ method, the numeric counterpart to ‘blur’ for categorical data. Just as the ‘blur’ transform aggregates categorical features according to a new taxonomy, ‘numeric blur’ aggregates numeric features into intervals.

At present, the method requires pre-defined cut points at which to divide the data.

library(deident)

# Use the quartiles of daily pay as the cut points
quantile_cuts <- quantile(ShiftsWorked$`Daily Pay`, c(0.25, 0.5, 0.75))

numeric_blur_pipe <- ShiftsWorked |>
  add_numeric_blur(`Daily Pay`, cuts = quantile_cuts)

apply_deident(ShiftsWorked, numeric_blur_pipe)
#> # A tibble: 3,100 × 7
#>    `Record ID` Employee   Date       Shift `Shift Start` `Shift End` `Daily Pay`
#>          <int> <chr>      <date>     <chr> <chr>         <chr>       <fct>      
#>  1           1 Maria Cook 2015-01-01 Night 17:01         00:01       (70.9,144] 
#>  2           2 Stephen C… 2015-01-01 Day   08:01         16:01       (144,208]  
#>  3           3 Kimberly … 2015-01-01 Day   08:01         16:01       (70.9,144] 
#>  4           4 Nathan Al… 2015-01-01 Day   08:01         15:01       (144,208]  
#>  5           5 Samuel Pa… 2015-01-01 Night 16:01         23:01       (208, Inf] 
#>  6           6 Scott Mor… 2015-01-01 Night 17:01         00:01       (70.9,144] 
#>  7           7 Nathan Sa… 2015-01-01 Rest  <NA>          <NA>        (-Inf,70.9]
#>  8           8 Jose Lopez 2015-01-01 Night 17:01         00:01       (208, Inf] 
#>  9           9 Donna Bro… 2015-01-01 Night 16:01         00:01       (208, Inf] 
#> 10          10 George Ki… 2015-01-01 Night 16:01         00:01       (208, Inf] 
#> # ℹ 3,090 more rows
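
The interval labels in the output above have the same shape as those produced by base R's `cut()`. The sketch below illustrates that kind of binning on a small made-up vector. It is not the internal implementation of `add_numeric_blur()`, which may differ; the `pay` values are synthetic and the `cuts` are simply the rounded quartiles read off the printed output.

```r
# Synthetic pay values for illustration only (not taken from ShiftsWorked).
pay <- c(55, 70.9, 100, 144, 180, 208, 250)

# Pre-defined cut points, copied from the rounded labels in the output above.
cuts <- c(70.9, 144, 208)

# Pad the cut points with -Inf and Inf so that every value lands in an interval.
# cut() returns a factor whose levels are the intervals.
blurred <- cut(pay, breaks = c(-Inf, cuts, Inf))

# Inspect the resulting intervals and how many values fall into each.
levels(blurred)
table(blurred)
```

With the default `right = TRUE`, each interval is closed on the right, so a value equal to a cut point (such as 144 here) falls into the lower of the two neighbouring intervals. Three cut points therefore give four intervals.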