This R package calculates the outcome weights of Knaus (2024). Its use is illustrated in the average effects R notebook and the heterogeneous effects R notebook as supplementary material to the paper.
The core functionality is the get_outcome_weights()
method, which implements the theoretical result in Proposition 1 of the
paper. The proposition shows that the outcome weights vector can be obtained in the
general form \(\boldsymbol{\omega'} =
(\boldsymbol{\tilde{Z}'\tilde{D}})^{-1}
\boldsymbol{\tilde{Z}'T}\), where \(\boldsymbol{\tilde{Z}}\), \(\boldsymbol{\tilde{D}}\) and \(\boldsymbol{T}\) are the pseudo-instrument,
the pseudo-treatment and the transformation matrix, respectively.
The goal is for get_outcome_weights() to become compatible with as many estimated R objects as possible.
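The general form of Proposition 1 can be illustrated with a minimal base R sketch that does not use the package. It rests on one illustrative assumption that is not taken from this README: the transformation matrix is the OLS residual maker, and pseudo-instrument and pseudo-treatment are both the treatment residualized on the covariates. Under that assumption, the Frisch-Waugh-Lovell theorem implies that the weighted sum of outcomes equals the OLS coefficient on the treatment. All variable names (T_mat, D_tilde, Z_tilde) are hypothetical.

```r
set.seed(1)
n = 200
X = cbind(1, matrix(rnorm(n * 2), n, 2))  # covariates including an intercept
D = X[, 2] + rnorm(n)                     # treatment
Y = 0.5 * D + X[, 3] + rnorm(n)           # outcome

# Assumption for illustration: T is the OLS residual maker I - X (X'X)^{-1} X'
H = X %*% solve(crossprod(X)) %*% t(X)
T_mat = diag(n) - H

# Assumption for illustration: pseudo-treatment and pseudo-instrument
# are both the residualized treatment
D_tilde = as.numeric(T_mat %*% D)
Z_tilde = D_tilde

# omega' = (Z_tilde' D_tilde)^{-1} Z_tilde' T
omega = as.numeric(crossprod(Z_tilde, T_mat)) / sum(Z_tilde * D_tilde)

# The weighted sum of outcomes reproduces the OLS coefficient on D
tau_weights = sum(omega * Y)
tau_ols = unname(coef(lm(Y ~ D + X[, -1]))["D"])
all.equal(tau_weights, tau_ols)
```

The vector omega has one entry per observation, so it can be inspected directly, for example by summing the weights within subgroups.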
The package can be downloaded from CRAN:
install.packages("OutcomeWeights")
The package is a work in progress. The current state is listed below (suggestions welcome):
- grf package
    - causal_forest(): outcome weights for CATE
    - instrumental_forest(): outcome weights for CLATE
    - causal_forest(): outcome weights for ATE from average_treatment_effect()
    - average_treatment_effect()
- regression_forest() of grf package
- dml_with_smoother() function runs for PLR, PLR-IV, AIPW-ATE, and Wald_AIPW and is compatible with get_outcome_weights()
- DoubleML (this is a non-trivial task as the mlr3 environment it builds on does not provide smoother matrices)
    - mlr3 available, where possible
    - mlr3 accessible within DoubleML
    - get_outcome_weights() method for DoubleML estimators

The following code creates synthetic data to showcase how causal forest weights are extracted and that they perfectly replicate the original output:
if (!require("OutcomeWeights")) install.packages("OutcomeWeights", dependencies = TRUE)
library(OutcomeWeights)
# Sample from DGP borrowed from grf documentation
n = 500
p = 10
X = matrix(rnorm(n * p), n, p)
W = rbinom(n, 1, 0.5)
Y = pmax(X[, 1], 0) * W + X[, 2] + pmin(X[, 3], 0) + rnorm(n)

# Run outcome regression and extract smoother matrix
forest.Y = grf::regression_forest(X, Y)
Y.hat = predict(forest.Y)$predictions
outcome_smoother = grf::get_forest_weights(forest.Y)

# Run causal forest with external Y.hats
c.forest = grf::causal_forest(X, Y, W, Y.hat = Y.hat)

# Predict on out-of-bag training samples.
cate.oob = predict(c.forest)$predictions

# Predict using the forest.
X.test = matrix(0, 101, p)
X.test[, 1] = seq(-2, 2, length.out = 101)
cate.test = predict(c.forest, X.test)$predictions

# Calculate outcome weights
omega_oob = get_outcome_weights(c.forest, S = outcome_smoother)
omega_test = get_outcome_weights(c.forest, S = outcome_smoother, newdata = X.test)

# Observe that they perfectly replicate the original CATEs
all.equal(as.numeric(omega_oob$omega %*% Y),
          as.numeric(cate.oob))
all.equal(as.numeric(omega_test$omega %*% Y),
          as.numeric(cate.test))

# Also the ATE estimates are perfectly replicated
omega_ate = get_outcome_weights(c.forest, target = "ATE", S = outcome_smoother, S.tau = omega_oob$omega)
all.equal(as.numeric(omega_ate$omega %*% Y),
          as.numeric(grf::average_treatment_effect(c.forest, target.sample = "all")[1]))
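The S argument in the example above is an outcome smoother matrix. As a package-independent sketch of that concept, assuming the usual definition of a linear smoother (fitted values are obtained as S %*% Y), the following base R code builds the smoother matrix of a Gaussian kernel regression. The kernel, the bandwidth h and all variable names are illustrative choices and are not part of OutcomeWeights or grf.

```r
set.seed(42)
n = 100
x = sort(runif(n, -2, 2))
Y = sin(x) + rnorm(n, sd = 0.3)

# Gaussian kernel weights between all pairs of observations
# (the bandwidth h is an arbitrary illustrative choice)
h = 0.3
K = dnorm(outer(x, x, "-") / h)

# Normalize each row so that the weights behind every fitted value sum to one
S = K / rowSums(K)

# Fitted values are a linear combination of the outcomes
Y.hat = as.numeric(S %*% Y)

# The same fitted values computed observation by observation
Y.hat.loop = sapply(seq_len(n), function(i) weighted.mean(Y, w = K[i, ]))
all.equal(Y.hat, Y.hat.loop)
```

Row i of S contains the weights that each outcome receives in the fitted value of observation i, which is the kind of information get_forest_weights() provides for the regression forest in the example above.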
The development version is available using the devtools
package:
library(devtools)
install_github(repo="MCKnaus/OutcomeWeights")
Knaus, M. C. (2024). Treatment effect estimators as weighted outcomes. arXiv:2411.11559.