Function to balance the covariate distributions using covariate balancing propensity score CBPS
Source: R/joint_calib_cbps.R
joint_calib_cbps.Rd
joint_calib_cbps
allows quantile or mean and quantile balancing of the covariate distributions of the control and treatment groups using the covariate balancing propensity score method (Imai & Ratkovic (2014)). CBPS::CBPS()
and CBPS::hdCBPS()
are used a backend for estimating the parameters.
This function works in a similar way to the joint_calib_att()
function, i.e. the user can specify variables for the balancing means as well as the quantiles.
Usage
joint_calib_cbps(
formula_means = NULL,
formula_quantiles = NULL,
treatment = NULL,
data,
probs = c(0.25, 0.5, 0.75),
control = control_calib(),
standardize = FALSE,
method = "exact",
variable_selection = FALSE,
target = NULL,
...
)
Arguments
- formula_means
a formula with variables to be balanced at means,
- formula_quantiles
a formula with variables to be balanced at quantiles,
- treatment
a formula with a treatment indicator,
- data
a data.frame with variables,
- probs
a vector or a named list of quantiles to be balanced (default is
c(0.25, 0.5, 0.75)
),- control
a control list of parameters for creation of X_q matrix based on
formula_quantiles
andprobs
(seejoint_calib_create_matrix()
),- standardize
default is FALSE, which normalizes weights to sum to 1 within each treatment group (passed to
CBPS()
function),- method
default is "exact". Choose "over" to fit an over-identified model that combines the propensity score and covariate balancing conditions; choose "exact" to fit a model that only contains the covariate balancing conditions (passed to
CBPS()
function)- variable_selection
default is FALSE. Set to TRUE to select high dimension CBPS via
CBPS::hdCBPS()
,- target
specify target (y) variable for
hdCBPS
function,- ...
other parameters passed to
CBPS
orhdCBPS
functions.
References
Imai, K., and Ratkovic, M. (2014). Covariate balancing propensity score. Journal of the Royal Statistical Society Series B: Statistical Methodology, 76(1), 243-263.
Fong C, Ratkovic M, and Imai K (2022). CBPS: Covariate Balancing Propensity Score. R package version 0.23, https://CRAN.R-project.org/package=CBPS.
Examples
## generate data as in the hbal package (see [hbal::hbal()])
set.seed(123)
N <- 1500
X1 <- rnorm(N)
X2 <- rnorm(N)
X3 <- rbinom(N, size = 1, prob = .5)
X1X3 <- X1*X3
D_star <- 0.5*X1 + 0.3*X2 + 0.2*X1*X2 - 0.5*X1*X3 - 1
D <- ifelse(D_star > rnorm(N), 1, 0) # Treatment indicator
y <- 0.5*D + X1 + X2 + X2*X3 + rnorm(N) # Outcome
dat <- data.frame(D = D, X1 = X1, X2 = X2, X3 = X3, X1X3 = X1X3, Y = y)
head(dat)
#> D X1 X2 X3 X1X3 Y
#> 1 0 -0.56047565 -0.8209867 0 0.00000000 -1.83035284
#> 2 0 -0.23017749 -0.3072572 0 0.00000000 -1.24711079
#> 3 0 1.55870831 -0.9020980 0 0.00000000 -0.80059376
#> 4 0 0.07050839 0.6270687 1 0.07050839 -0.09295057
#> 5 0 0.12928774 1.1203550 0 0.00000000 0.98562207
#> 6 0 1.71506499 2.1272136 1 1.71506499 7.22227890
## Balancing means of X1, X2 and X3 and quartiles (0.25, 0.5, 0.75) of X1 and X2.
result <- joint_calib_cbps(formula_means = ~ X1 + X2 + X3,
formula_quantiles = ~ X1 + X2,
treatment = ~ D,
data = dat)
## CBPS output is presented
result
#>
#> Call: CBPS::CBPS(formula = stats::as.formula(paste(names(Xs)[1], "~ .")),
#> data = Xs, ATT = 0, standardize = standardize, method = method)
#>
#> Coefficients:
#> Treated
#> (Intercept) -3.33937
#> X1 0.96194
#> X2 1.16056
#> X3 -0.04035
#> X1.25 151.98431
#> X1.50 24.80205
#> X1.75 129.06492
#> X2.25 111.06168
#> X2.50 78.79001
#> X2.75 161.70110
#> Residual Deviance: 1274
#> J-Statistic: 0.0171219
#> Log-Likelihood: -636.7567
## calculate ATE by hand
w_1 <- dat$D/fitted(result)
w_1 <- w_1/mean(w_1)
w_0 <- (1-dat$D)/(1-fitted(result))
w_0 <- w_0/mean(w_0)
mean((w_1-w_0)*dat$Y)
#> [1] 0.3557789
## Compare with standard CBPS using only means
result2 <- CBPS::CBPS(D ~ X1 + X2 + X3, data = dat, method = "exact", standardize = FALSE, ATT = 0)
## calculate ATE by hand
w_1a <- dat$D/fitted(result2)
w_1a <- w_1a/mean(w_1a)
w_0a <- (1-dat$D)/(1-fitted(result2))
w_0a <- w_0a/mean(w_0a)
mean((w_1a-w_0a)*dat$Y)
#> [1] 0.3348222