This package provides various statistical methods for designing and analyzing two-stage randomized controlled trials. Two-stage randomized controlled trials can be used to estimate spillover effects as well as direct treatment effects.
The methods in this package address situations where some control units decide to take the treatment while some units in the treatment group refuse to receive it. Researchers often cannot force experimental subjects to adhere to the protocol; the methods in this package allow the analysis of two-stage randomized experiments with both interference and noncompliance.
RSBY provides access to an insurance plan that covers all pre-existing diseases and imposes no age limit on beneficiaries. The data were collected through a randomized trial to determine whether RSBY increases access to hospitalization (and health) and reduces impoverishment due to high medical expenses. The Indian government announced a new scheme to build on RSBY and provide coverage for almost 500 million Indians, but has not yet decided on its design or how much to fund it. Spillover effects are of concern because formal insurance may crowd out informal insurance; the enrollment in RSBY by one household may depend on the treatment assignment of other households. Additionally, we must address noncompliance because some households in the treatment group decided not to enroll in RSBY while some in the control group were able to join the insurance program.
The evaluation study is based on a total of 11,089 above-poverty-line households in two districts of Karnataka State with no pre-existing health insurance coverage, living within 25 km of an RSBY-empaneled hospital. A two-stage randomized design was employed to study both the direct and spillover effects of RSBY. In the first stage, 219 randomly selected villages were assigned to the “High” treatment assignment mechanism and the rest were assigned to the “Low” treatment assignment mechanism. In the second stage, under the “High” assignment mechanism, 80% of the households within a cluster were completely randomly assigned to the treatment condition, while the rest of the households were assigned to the control group. In contrast, under the “Low” assignment mechanism, 40% of the households within a cluster were completely randomly assigned to the treatment condition.
The households in the treatment group were given RSBY for free, whereas some households in the control group could buy RSBY for around INR 200. Upon being informed of their assigned treatment conditions, households were given the opportunity to enroll in RSBY from April to May 2015. Eighteen months later, the post-treatment survey was carried out, in which a variety of outcomes were measured.
Mechanism (village level) | Number of villages | Treatment (household level) | Control (household level) | Number of households | Enrollment rate
---|---|---|---|---|---
High | 219 | 80% | 20% | 5,714 | 67.0%
Low | 216 | 40% | 60% | 5,373 | 46.2%
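The two-stage design summarized in the table can be sketched in base R. This is a minimal illustration on simulated data, not the procedure used in the study: the helper name `assign_two_stage`, the column name `village`, the number of households per village (25) and the seed are all assumptions made for the example. Only the 219/216 village split and the 80%/40% treatment shares come from the text.

```r
# Sketch of a two-stage randomization on simulated data (hypothetical helper).
# Stage 1: villages are assigned to the "High" (A = 1) or "Low" (A = 0) mechanism.
# Stage 2: within each village, a fixed share of households is treated (Z = 1).
assign_two_stage <- function(n_villages, n_high, n_per_village,
                             p_high = 0.8, p_low = 0.4) {
  # Stage 1: complete randomization of villages to mechanisms
  A_village <- sample(rep(c(1, 0), c(n_high, n_villages - n_high)))
  out <- lapply(seq_len(n_villages), function(v) {
    p <- if (A_village[v] == 1) p_high else p_low
    n_treat <- round(p * n_per_village)
    # Stage 2: complete randomization of households within the village
    Z <- sample(rep(c(1, 0), c(n_treat, n_per_village - n_treat)))
    data.frame(village = v, A = A_village[v], Z = Z)
  })
  do.call(rbind, out)
}

set.seed(1)  # arbitrary seed, used only to make the illustration reproducible
sim <- assign_two_stage(n_villages = 435, n_high = 219, n_per_village = 25)

# Share of treated households under each mechanism (A = 0 is Low, A = 1 is High)
tapply(sim$Z, sim$A, mean)
```

Because the simulated villages all have the same size, the treated shares are exactly 40% and 80%; with unequal village sizes, as in the real data, the shares would only be approximately equal to the targets.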
The data set is a subset of the data from the randomized evaluation of India’s National Health Insurance Program (RSBY). The data initially contained six variables; after processing for the purposes of the package, four variables of interest are retained for the analysis, as listed below:
Z
: treatment status
A
: treatment assignment mechanism
D
: enrolled in RSBY
Y
: hospital expenditure (the outcome variable).
There are three functions in this package:
CADErand
: computes the point estimates and variance
estimates of the complier average direct effect (CADE) and the complier
average spillover effect (CASE). The estimators calculated using this
function are either individual-weighted or cluster-weighted. The point
estimates and variances of ITT effects are also included.
CADEreg
: computes the point estimates of the
complier average direct effect (CADE) and four different variance
estimates: the HC2 variance, the cluster-robust variance, the
cluster-robust HC2 variance and the variance proposed in the
reference.
CADEparamreg
: computes the point estimates of the
complier average direct effect (CADE) and the complier average spillover
effect (CASE) following the model-based approach presented in the
appendix.
Before we begin, let's load the library and our example data set into R.
library(RCT2)
data(india)
india$id <- factor(india$id)
To run the CADErand command, simply type in the following:
rand <- CADErand(india, 0.95)
print(rand)
## names reps2 estimate variance stds lcis rcis
## 1 CADE 0 1984.42477 1474406.32319 1214.25134 1908.283 2060.567
## 2 CADE 1 -1648.53065 1128010.63709 1062.07845 -1715.13 -1581.931
## 3 CASE 0 6568.04971 335327387.08457 18311.94657 5419.767 7716.333
## 4 CASE 1 -15900.39663 237301575.94459 15404.59594 -16866.369 -14934.424
## 5 DEY 0 875.43729 280649.14689 529.76329 842.218 908.657
## 6 DEY 1 -795.24119 263884.36586 513.69676 -827.453 -763.029
## 7 DED 0 0.44115 0.00044 0.02099 242.86 350.527
## 8 DED 1 0.48239 0.00052 0.02277 -1425.617 -1322.353
## 9 SEY 0 296.69351 737020.66452 858.49908 0.44 0.442
## 10 SEY 1 -1373.98496 677958.83661 823.38256 0.481 0.484
## 11 SED 0 0.04517 0.00077 0.02778 0.043 0.047
## 12 SED 1 0.08641 0.00281 0.05298 0.083 0.09
Note that you can specify the confidence level of your choosing with the parameter ci in the CADErand function. You can access any specific value with the $ operator. For example:
rand$CADE
## A_cluster0 A_cluster1
## [1,] 1984.425 -1648.531
allows you to access just the CADE estimates.
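The printed estimates are consistent with a ratio (instrumental-variables style) construction: each CADE value matches the ITT direct effect on the outcome (DEY) divided by the ITT direct effect on enrollment (DED), and each CASE value matches SEY divided by SED. The check below uses only numbers copied from the output above. Reading the estimator this way is an inference from those numbers, not a description of the package internals.

```r
# Values copied from the printed CADErand output (first entry: reps2 = 0,
# second entry: reps2 = 1)
DEY <- c(875.43729, -795.24119)    # ITT direct effect on the outcome Y
DED <- c(0.44115, 0.48239)         # ITT direct effect on enrollment D
SEY <- c(296.69351, -1373.98496)   # ITT spillover effect on Y
SED <- c(0.04517, 0.08641)         # ITT spillover effect on D

cade_check <- DEY / DED   # compare with the printed CADE rows
case_check <- SEY / SED   # compare with the printed CASE rows

cade_check
case_check
```

The ratios agree with the printed CADE and CASE rows up to the rounding of the displayed digits, which matters most for CASE because SED is small.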
To analyze our data using a regression-based method, we use the CADEreg function.
reg <- CADEreg(india, ci.level = 0.90)
print(reg)
## [[1]]
## name estimate left CI right CI
## 1 CADE1 -485.205567558982 -2604427.23867043 2603456.82753532
## 2 CADE0 3751.62334625516 -4491562.03320184 4490591.62206672
##
## [[2]]
## var(CADE1) var(CADE0)
## cluster_robust_variance 1844654 2692774
## HC2_variance 1759371 3036458
## cluster_robust_HC2_variance 1853609 2705597
## individual_variance 1307332 2754695
## proposed_variance 1583084 2730381
This gives us the point estimates of CADE1 and CADE0 with their confidence intervals, along with several types of variance estimates for CADE1 and CADE0. We can again access these using the dollar sign notation. Note that we can use the ci.level parameter to specify the confidence level (e.g., 0.95 or 0.90).
reg$CADE1
## M
## -485.2056
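When interpreting the intervals, it can help to recompute them from the variance estimates. The sketch below assumes a normal approximation, estimate ± z × sqrt(variance), and uses the `proposed_variance` entry for CADE1 copied from the output above. This is one common way to form an interval and is an assumption here, not a description of what CADEreg does internally. For comparison, the last line multiplies z by the variance itself, which reproduces the printed left bound for CADE1 and suggests the printed bounds are on the variance scale.

```r
# Values copied from the printed CADEreg output above
est_cade1 <- -485.2056
var_cade1 <- 1583084        # proposed_variance for CADE1
ci_level  <- 0.90

se_cade1 <- sqrt(var_cade1)               # standard error, same units as Y
z <- qnorm(1 - (1 - ci_level) / 2)        # two-sided normal critical value
ci_normal <- est_cade1 + c(-1, 1) * z * se_cade1

se_cade1
ci_normal

# For comparison only: using the variance in place of the standard error
left_bound_variance_scale <- est_cade1 - z * var_cade1
left_bound_variance_scale
```

Under the normal approximation the 90% interval is a few thousand rupees wide on each side, far narrower than the printed bounds of roughly ±2.6 million.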
CADEparamreg offers a regression-based method for computing the ITT effects as well as the average direct and spillover effects.
paramreg <- CADEparamreg(india, assign.prob = 0.8, ci.level = 0.95)
print(paramreg)
## [[1]]
## Method Treatment Control Treatment CI Control CI
## 1 ITT DE -1253 1447 -2646, 139 -94, 2988
## 2 IV DE -6013 8724 -4631, -1131 -1991, 1630
## 3 ITT SE -2881 -180 -11872, 139 407, 2988
## 4 IV SE -11715 3022 -19445, -1131 -4927, 1630
##
## [[2]]
## ITT pvalues ITT tstat IV pvalues IV tstat
## (Intercept) 6.731101e-31 11.5966578 0.11434527 1.5790968
## Z 2.917099e-02 2.1814799 0.02241845 2.2835546
## A 1.148445e+00 -0.1871396 0.47074057 0.7213018
## Z:A 1.972694e+00 -2.2074303 1.97260512 -2.2061660
print(paramreg[[1]])
## Method Treatment Control Treatment CI Control CI
## 1 ITT DE -1253 1447 -2646, 139 -94, 2988
## 2 IV DE -6013 8724 -4631, -1131 -1991, 1630
## 3 ITT SE -2881 -180 -11872, 139 407, 2988
## 4 IV SE -11715 3022 -19445, -1131 -4927, 1630
Note how we use assign.prob to specify the assignment probability for the different assignment mechanisms. We again use ci.level to specify the confidence level of the intervals.
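Some entries in the p-value columns of the output exceed 1, which a probability cannot do; they occur in the rows with negative t-statistics. A two-sided p-value based on the absolute value of the statistic can be recomputed as sketched below. The sketch uses a standard normal reference distribution (an assumption; the package may use a t distribution) and the ITT t-statistics copied from the output.

```r
# ITT t-statistics copied from the printed CADEparamreg output above
tstat <- c("(Intercept)" = 11.5966578, Z = 2.1814799,
           A = -0.1871396, "Z:A" = -2.2074303)

# Two-sided p-value under a normal approximation; abs() keeps it in [0, 1]
p_two_sided <- 2 * pnorm(-abs(tstat))
signif(p_two_sided, 3)
```

For the rows with positive t-statistics the recomputed values are close to the printed ones; for the rows with negative t-statistics they are close to 2 minus the printed value.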