The most direct method of removing identifiability is ‘encryption’, which comprises two processes:
This approach is implemented via the Encrypter methods (which use the openssl::sha256 implementation by default):
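library(deident)
library(dplyr)  # provides the starwars data
# Define an encryption step for `name`, then apply the pipeline
name_pipe <- starwars |>
  add_encrypt(name, hash_key = "hash123", seed = "seed456")
apply_deident(starwars, name_pipe)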
#> # A tibble: 87 × 14
#>    name     height  mass hair_color  skin_color  eye_color birth_year sex   gender
#>    <hash>    <int> <dbl> <chr>       <chr>       <chr>          <dbl> <chr> <chr>
#>  1 23520cf…    172    77 blond       fair        blue            19   male  mascu…
#>  2 2e23636…    167    75 <NA>        gold        yellow         112   none  mascu…
#>  3 bd14c29…     96    32 <NA>        white, bl…  red             33   none  mascu…
#>  4 54afb56…    202   136 none        white       yellow          41.9 male  mascu…
#>  5 d3273ca…    150    49 brown       light       brown           19   fema… femin…
#>  6 89b0c58…    178   120 brown, gr…  light       blue            52   male  mascu…
#>  7 7ae0e4d…    165    75 brown       light       blue            47   fema… femin…
#>  8 1ccc4a0…     97    32 <NA>        white, red  red             NA   none  mascu…
#>  9 3ddc072…    183    84 black       light       brown           24   male  mascu…
#> 10 9f22169…    182    77 auburn, w…  fair        blue-gray       57   male  mascu…
#> # ℹ 77 more rows
#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,
#> # vehicles <list>, starships <list>
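To make the idea behind the hashed `name` column more concrete, the following is a minimal sketch of keyed hashing with `openssl::sha256` called directly. It is an illustration only: `pseudonymise()` is a hypothetical helper that is not part of deident, and the way it uses a single key is an assumption. It does not reproduce how the Encrypter combines `hash_key` and `seed`, so its digests will not match the output above.

```r
library(openssl)

# Hypothetical helper (not part of deident): keyed SHA-256 digest of each value.
# Supplying `key` makes openssl::sha256 compute an HMAC instead of a plain hash.
pseudonymise <- function(x, key) {
  as.character(sha256(as.character(x), key = key))
}

names_in <- c("Luke Skywalker", "C-3PO", "Luke Skywalker")
hashed <- pseudonymise(names_in, key = "hash123")

# Identical inputs give identical digests, so the hashed column can
# still be used to group or join records.
same_person <- hashed[1] == hashed[3]

# A different key gives a different digest for the same input, so the
# mapping cannot be rebuilt without knowing the key.
other_key <- pseudonymise("C-3PO", key = "another-key")
key_matters <- other_key != hashed[2]
```

The property to note is that the mapping is deterministic for a fixed key, which preserves linkage between rows, while anyone without the key cannot recompute the digests from a list of candidate names.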