This package provides utilities for setting common contrast coding
schemes for use with regression models. Detailed usage is available in
the contrasts vignette, which you can open with
vignette('contrasts', 'contrastable').
You can install from CRAN with:
install.packages("contrastable")
You can install the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("tsostarics/contrastable", build_vignettes = TRUE)
To cite contrastable in publications, please use
Sostarics, T. (2024). contrastable: Contrast Coding Utilities in R. R package version 1.0.1.
A BibTeX entry for LaTeX users is
@Manual{,
author = {Thomas Sostarics},
title = {{contrastable}: Contrast Coding Utilities in {R}},
year = {2024},
note = {R package version 1.0.1},
url = {https://github.com/tsostarics/contrastable},
doi = {10.5281/zenodo.11869427},
}
See the Citation Examples section in the contrasts vignette for suggestions and examples of how to cite this package in a paper.
Here is a simple example showing how to set particular factors to a specific contrast scheme.
library(contrastable)
my_data <- mtcars
my_data$gear <- ordered(my_data$gear) # Set as ordered factor in dataframe
set_contrasts sets the contrasts on the dataframe itself, which is
needed when a modeling function lacks a contrasts argument.
# Specify the contrast schemes we want, factor conversion done automatically
# Set reference level with + and intercept with *
my_data <- set_contrasts(my_data,
                         cyl ~ scaled_sum_code + 6,
                         carb ~ helmert_code,
                         vs ~ treatment_code + 1,
                         print_contrasts = TRUE)
#> Converting to factors: cyl carb vs
#> Expect contr.treatment or contr.poly for unset factors: gear
#> $cyl
#> 4 8
#> 4 2/3 -1/3
#> 6 -1/3 -1/3
#> 8 -1/3 2/3
#>
#> $carb
#> <2 <3 <4 <6 <8
#> 1 -1/2 -1/3 -1/4 -1/5 -1/6
#> 2 1/2 -1/3 -1/4 -1/5 -1/6
#> 3 0 2/3 -1/4 -1/5 -1/6
#> 4 0 0 3/4 -1/5 -1/6
#> 6 0 0 0 4/5 -1/6
#> 8 0 0 0 0 5/6
#>
#> $vs
#> 0
#> 0 1
#> 1 0
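As a rough illustration of what it means for contrasts to live on the dataframe, here is a base R sketch that does not use contrastable at all. It sets a treatment contrast with reference level 1 on vs by hand, then fits lm() without a contrasts argument. The names dat and fit_vs, and the choice of mpg as the response, are assumptions made for this example only.

```r
# Base R only: a hand-rolled stand-in for the vs ~ treatment_code + 1 line above
dat <- mtcars
dat$vs <- factor(dat$vs)

# Level "1" (the second level) is the reference, so the single column is named "0"
contrasts(dat$vs) <- contr.treatment(levels(dat$vs), base = 2)
contrasts(dat$vs)

# lm() picks up the contrasts stored on the factor; no contrasts argument is needed
fit_vs <- lm(mpg ~ vs, data = dat)
coef(fit_vs)
```

With this coding the intercept should equal the mean mpg for vs = 1, and the vs0 coefficient should equal the difference between the vs = 0 mean and the vs = 1 mean.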
We can use glimpse_contrasts to get information about the factors and
diagnostics about the scheme we have set.
# Create a reusable list to use with multiple functions
contrast_schemes <- list(
  cyl ~ scaled_sum_code + 6,
  carb ~ helmert_code,
  vs ~ treatment_code + 1
)
# Get information about our contrasts, even those we didn't explicitly set
# (gear is ordered, and so uses contr.poly by default)
glimpse_contrasts(my_data,
                  contrast_schemes,
                  add_namespace = TRUE,
                  show_all_factors = TRUE) |>
  knitr::kable()
factor | n | level_names | scheme | reference | intercept |
---|---|---|---|---|---|
cyl | 3 | 4, 6, 8 | contrastable::scaled_sum_code | 6 | grand mean |
carb | 6 | 1, 2, 3,…. | contrastable::helmert_code | NA | grand mean |
vs | 2 | 0, 1 | contrastable::treatment_code | 1 | mean(1) |
gear | 3 | 3, 4, 5 | stats::contr.poly | NA | grand mean |
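The gear row reflects base R behaviour rather than anything set explicitly: an ordered factor with no contrasts attribute falls back to the second entry of options("contrasts"), which is contr.poly in a default R session. A small base R check, assuming default options (gear_ord is a name made up for this sketch):

```r
# Assumes a default R session, where options("contrasts") is
# c(unordered = "contr.treatment", ordered = "contr.poly")
gear_ord <- ordered(mtcars$gear)

getOption("contrasts")
contrasts(gear_ord)  # polynomial contrasts with columns .L and .Q

# Compare against contr.poly() for a 3-level factor, ignoring dimnames
all.equal(contrasts(gear_ord), contr.poly(3), check.attributes = FALSE)
```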
enlist_contrasts generates a named list of contrasts that can be passed
to the contrasts argument of various modeling functions.
# Get a list of the contrasts we've explicitly set
enlist_contrasts(mtcars, contrast_schemes)
#> $cyl
#> 4 8
#> 4 2/3 -1/3
#> 6 -1/3 -1/3
#> 8 -1/3 2/3
#>
#> $carb
#> <2 <3 <4 <6 <8
#> 1 -1/2 -1/3 -1/4 -1/5 -1/6
#> 2 1/2 -1/3 -1/4 -1/5 -1/6
#> 3 0 2/3 -1/4 -1/5 -1/6
#> 4 0 0 3/4 -1/5 -1/6
#> 6 0 0 0 4/5 -1/6
#> 8 0 0 0 0 5/6
#>
#> $vs
#> 0
#> 0 1
#> 1 0
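To show where such a list goes, the sketch below passes a named list to the contrasts argument of lm(). To stay self-contained it does not call enlist_contrasts; instead the cyl matrix is typed by hand from the $cyl output above, and dat, cyl_contrasts, and fit are names invented for the example. cyl is converted to a factor first, since lm() applies contrasts only to factor columns.

```r
# Base R only; the matrix is copied by hand from the $cyl output above
dat <- mtcars
dat$cyl <- factor(dat$cyl)

cyl_contrasts <- matrix(
  c( 2/3, -1/3, -1/3,   # column "4": level 4 vs. reference level 6
    -1/3, -1/3,  2/3),  # column "8": level 8 vs. reference level 6
  nrow = 3,
  dimnames = list(c("4", "6", "8"), c("4", "8"))
)

# The list is named by factor, the same shape as the list printed above
fit <- lm(mpg ~ cyl, data = dat, contrasts = list(cyl = cyl_contrasts))
coef(fit)
```

Under this coding, cyl4 and cyl8 should be the differences in mean mpg between each of those levels and level 6, and the intercept should be the unweighted mean of the three level means.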
You can also set multiple contrasts at once using {tidyselect}
functionality.
# Create a new dataframe with a bunch of factors
my_data2 <-
  data.frame(a = gl(2,10),
             b = gl(5,2, ordered = TRUE),
             c = gl(5,2),
             d = 1:10,
             e = 11:20)
enlist_contrasts(my_data2,
                 where(is.ordered) ~ polynomial_code,
                 where(is.unordered) ~ helmert_code,
                 d + e ~ sum_code)
#> $b
#> .L .Q .C ^4
#> 1 -6.324555e-01 0.5345225 -3.162278e-01 0.1195229
#> 2 -3.162278e-01 -0.2672612 6.324555e-01 -0.4780914
#> 3 -3.510833e-17 -0.5345225 1.755417e-16 0.7171372
#> 4 3.162278e-01 -0.2672612 -6.324555e-01 -0.4780914
#> 5 6.324555e-01 0.5345225 3.162278e-01 0.1195229
#>
#> $a
#> <2
#> 1 -0.5
#> 2 0.5
#>
#> $c
#> <2 <3 <4 <5
#> 1 -0.5 -0.3333333 -0.25 -0.2
#> 2 0.5 -0.3333333 -0.25 -0.2
#> 3 0.0 0.6666667 -0.25 -0.2
#> 4 0.0 0.0000000 0.75 -0.2
#> 5 0.0 0.0000000 0.00 0.8
#>
#> $d
#> 2 3 4 5 6 7 8 9 10
#> 1 -1 -1 -1 -1 -1 -1 -1 -1 -1
#> 2 1 0 0 0 0 0 0 0 0
#> 3 0 1 0 0 0 0 0 0 0
#> 4 0 0 1 0 0 0 0 0 0
#> 5 0 0 0 1 0 0 0 0 0
#> 6 0 0 0 0 1 0 0 0 0
#> 7 0 0 0 0 0 1 0 0 0
#> 8 0 0 0 0 0 0 1 0 0
#> 9 0 0 0 0 0 0 0 1 0
#> 10 0 0 0 0 0 0 0 0 1
#>
#> $e
#> 12 13 14 15 16 17 18 19 20
#> 11 -1 -1 -1 -1 -1 -1 -1 -1 -1
#> 12 1 0 0 0 0 0 0 0 0
#> 13 0 1 0 0 0 0 0 0 0
#> 14 0 0 1 0 0 0 0 0 0
#> 15 0 0 0 1 0 0 0 0 0
#> 16 0 0 0 0 1 0 0 0 0
#> 17 0 0 0 0 0 1 0 0 0
#> 18 0 0 0 0 0 0 1 0 0
#> 19 0 0 0 0 0 0 0 1 0
#> 20 0 0 0 0 0 0 0 0 1
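For readers comparing against base R: the $c matrix printed above coincides numerically with the columns of stats::contr.helmert(5) after dividing column k by k + 1. The sketch below checks this with base R only. The expected values are typed from the printed output, the names scaled_helmert and printed_c are invented here, and nothing in it should be read as a description of how helmert_code is implemented internally.

```r
# Columns of contr.helmert(5) divided by 2, 3, 4, and 5 respectively
scaled_helmert <- sweep(contr.helmert(5), 2, 2:5, "/")

# Values copied from the $c output above (filled column by column)
printed_c <- matrix(
  c(-1/2,  1/2,    0,    0,   0,
    -1/3, -1/3,  2/3,    0,   0,
    -1/4, -1/4, -1/4,  3/4,   0,
    -1/5, -1/5, -1/5, -1/5, 4/5),
  nrow = 5
)

# Compare the numeric values only, ignoring row and column names
all.equal(scaled_helmert, printed_c, check.attributes = FALSE)
```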
The functions in this package aim to be helpful when potential
mistakes are made and transparent when things happen behind the scenes
(e.g., automatic factor coercion). You can check out descriptions of
various messages and warnings in the warnings vignette with
vignette('warnings', 'contrastable').