Function to balance the covariate distributions of a control and treatment group using joint_calib

joint_calib_att balances the covariate distributions of the control and treatment groups either at quantiles only or at both means and quantiles. It provides a user-friendly interface for specifying the variables and quantiles to be balanced. joint_calib_att uses the joint_calib function, so the user can apply different methods to find the weights that balance the control and treatment groups. For more details, see joint_calib() and the Beręsewicz and Szymkowiak (2023) working paper.

Usage

joint_calib_att(
  formula_means = NULL,
  formula_quantiles = NULL,
  treatment = NULL,
  data,
  probs = c(0.25, 0.5, 0.75),
  ...
)

Arguments

formula_means: a formula with variables to be balanced at means,
formula_quantiles: a formula with variables to be balanced at quantiles,
treatment: a formula with a treatment indicator,
data: a data.frame with variables,
probs: a vector or a named list of quantiles to be balanced (default is c(0.25, 0.5, 0.75)),
...: other parameters passed to joint_calib function.

Value

Returns a list containing:

g -- g-weight that sums up to treatment group size,
Xs -- matrix used for balancing (i.e. Intercept, X based on formula_means and X_q transformed for balancing of quantiles based on formula_quantiles and probs),
totals -- a vector of treatment reference size (N), means (pop_totals) and order of quantiles (based on formula_quantiles and probs),
method -- selected method,
backend -- selected backend.

References

Beręsewicz, M. and Szymkowiak, M. (2023). A note on joint calibration estimators for totals and quantiles. arXiv preprint. https://arxiv.org/abs/2308.13281

Greifer N (2023). WeightIt: Weighting for Covariate Balance in Observational Studies. R package version 0.14.2, https://CRAN.R-project.org/package=WeightIt.

Greifer N (2023). cobalt: Covariate Balance Tables and Plots. R package version 4.5.1, https://CRAN.R-project.org/package=cobalt.

Ho, D., Imai, K., King, G., and Stuart, E. A. (2011). MatchIt: Nonparametric Preprocessing for Parametric Causal Inference. Journal of Statistical Software, 42(8), 1–28. https://doi.org/10.18637/jss.v042.i08

Xu, Y., and Yang, E. (2023). Hierarchically Regularized Entropy Balancing. Political Analysis, 31(3), 457-464. https://doi.org/10.1017/pan.2022.12

Author

Maciej Beręsewicz

Examples


## generate data as in the hbal package
set.seed(123)
N <- 1500
X1 <- rnorm(N)
X2 <- rnorm(N)
X3 <- rbinom(N, size = 1, prob = .5)
X1X3 <- X1*X3
D_star <- 0.5*X1 + 0.3*X2 + 0.2*X1*X2 - 0.5*X1*X3 -1
D <- ifelse(D_star > rnorm(N), 1, 0)
y <- 0.5*D + X1 + X2 + X2*X3 + rnorm(N)
dat <- data.frame(D = D, X1 = X1, X2 = X2, X3 = X3, X1X3=X1X3, Y = y)
head(dat)
#>   D          X1         X2 X3       X1X3           Y
#> 1 0 -0.56047565 -0.8209867  0 0.00000000 -1.83035284
#> 2 0 -0.23017749 -0.3072572  0 0.00000000 -1.24711079
#> 3 0  1.55870831 -0.9020980  0 0.00000000 -0.80059376
#> 4 0  0.07050839  0.6270687  1 0.07050839 -0.09295057
#> 5 0  0.12928774  1.1203550  0 0.00000000  0.98562207
#> 6 0  1.71506499  2.1272136  1 1.71506499  7.22227890

## Balancing means of X1, X2 and X3 and quartiles (0.25, 0.5, 0.75) of X1 and X2
## sampling::raking is used
results <- joint_calib_att(
  formula_means = ~ X1 + X2 + X3,
  formula_quantiles = ~ X1 + X2,
  treatment = ~ D,
  data = dat,
  method = "raking"
)

## Results are presented with summary statistics of balance weights (g-weights)
## and information on the accuracy of reproducing reference treatment distributions
results
#> Weights calibrated using: raking with sampling backend.
#> Summary statistics for g-weights:
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#> 0.004337 0.078634 0.151831 0.221498 0.293544 1.957529 
#> Totals and precision (abs diff: 9.097514e-06)
#>           totals     precision
#> N       272.0000  4.706215e-06
#> X1 0.25   0.2500  9.885191e-09
#> X1 0.50   0.5000  1.235785e-08
#> X1 0.75   0.7500  1.436596e-08
#> X2 0.25   0.2500  9.738133e-09
#> X2 0.50   0.5000  1.221405e-08
#> X2 0.75   0.7500  1.402932e-08
#> X1      123.2252 -9.227300e-07
#> X2      130.8188 -1.066397e-06
#> X3      126.0000  2.329582e-06
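
## A quick manual check of the balance (a sketch, not part of the package output).
## Assumption: results$g holds one weight per control unit (D == 0),
## in the same row order as dat[dat$D == 0, ].
ctrl <- dat[dat$D == 0, ]
trt  <- dat[dat$D == 1, ]
g <- as.numeric(results$g)
stopifnot(length(g) == nrow(ctrl))  # guards the assumption above
## weighted control totals should be close to the treated totals
rbind(
  weighted_control = colSums(g * as.matrix(ctrl[, c("X1", "X2", "X3")])),
  treated          = colSums(as.matrix(trt[, c("X1", "X2", "X3")]))
)
## weighted share of controls at or below the treated Q1 of X1 (should be near 0.25)
sum(g * (ctrl$X1 <= quantile(trt$X1, 0.25))) / sum(g)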

## An interaction between X1 and X3 is added to means
results2 <- joint_calib_att(
  formula_means = ~ X1 + X2 + X3 + X1*X3,
  formula_quantiles = ~ X1 + X2,
  treatment = ~ D,
  data = dat,
  method = "raking"
)

## Results with interaction are presented below
results2
#> Weights calibrated using: raking with sampling backend.
#> Summary statistics for g-weights:
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#> 0.002461 0.073372 0.145635 0.221498 0.283639 2.411691 
#> Totals and precision (abs diff: 2.003603e-05)
#>            totals     precision
#> N       272.00000  1.115847e-05
#> X1 0.25   0.25000  2.759414e-08
#> X1 0.50   0.50000  3.246100e-08
#> X1 0.75   0.75000  3.614879e-08
#> X2 0.25   0.25000  2.113723e-08
#> X2 0.50   0.50000  2.837084e-08
#> X2 0.75   0.75000  3.410142e-08
#> X1      123.22524 -3.321686e-06
#> X2      130.81875 -1.204567e-06
#> X3      126.00000  3.198817e-06
#> X1:X3    23.06458 -9.726806e-07
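
## Adding constraints can make the weights more variable. One common summary
## is Kish's effective sample size, (sum of w)^2 / (sum of w^2).
## ess() is a helper defined here for illustration; it is not part of the package.
ess <- function(w) sum(w)^2 / sum(w^2)
c(means_quantiles  = ess(as.numeric(results$g)),
  with_interaction = ess(as.numeric(results2$g)))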

## As noted in the documentation, the probs argument can be a named list
## with different quantile orders for each variable.
## In this example, we specify that X1 should be balanced at the median,
## while X2 should be balanced at Q1 and Q3
results3 <- joint_calib_att(
  formula_means = ~ X1 + X2 + X3 + X1*X3,
  formula_quantiles = ~ X1 + X2,
  treatment = ~ D,
  data = dat,
  method = "raking",
  probs = list(X1 = 0.5, X2 = c(0.25, 0.75))
)

## Results with different orders are presented below
results3
#> Weights calibrated using: raking with sampling backend.
#> Summary statistics for g-weights:
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#> 0.003475 0.076911 0.145924 0.221498 0.278003 2.779197 
#> Totals and precision (abs diff: 5.695273e-06)
#>            totals     precision
#> N       272.00000  3.214904e-06
#> X1 0.50   0.50000  7.916963e-09
#> X2 0.25   0.25000  8.242246e-09
#> X2 0.75   0.75000  1.067702e-08
#> X1      123.22524  3.830925e-08
#> X2      130.81875 -1.362999e-06
#> X3      126.00000  1.025260e-06
#> X1:X3    23.06458 -2.696433e-08

## Finally, we specify a quantile order (Q3) for the interaction X1:X3
results4 <- joint_calib_att(
  formula_means = ~ X1 + X2 + X3,
  formula_quantiles = ~ X1 + X2 + X1:X3,
  treatment = ~ D,
  data = dat,
  probs = list(X1 = 0.5, X2 = c(0.25, 0.5), `X1:X3` = 0.75),
  method = "raking"
)

## Results with Q3 balancing for interaction are presented below
results4
#> Weights calibrated using: raking with sampling backend.
#> Summary statistics for g-weights:
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#> 0.008147 0.083688 0.149140 0.221498 0.276006 2.340755 
#> Totals and precision (abs diff: 1.554366e-06)
#>              totals     precision
#> N          272.0000  7.408541e-07
#> X1 0.50      0.5000  1.721413e-09
#> X2 0.25      0.2500  2.127525e-09
#> X2 0.50      0.5000  2.367036e-09
#> X1:X3 0.75   0.7500  2.147864e-09
#> X1         123.2252  2.995701e-08
#> X2         130.8188 -3.915528e-07
#> X3         126.0000  3.836380e-07
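
## A weighted difference in means as a simple ATT estimate (a sketch only).
## Assumption: results4$g is ordered like the control rows dat[dat$D == 0, ].
## The data above were generated with a treatment effect of 0.5 on Y.
g_att  <- as.numeric(results4$g)
y_trt  <- dat$Y[dat$D == 1]
y_ctrl <- dat$Y[dat$D == 0]
stopifnot(length(g_att) == length(y_ctrl))  # guards the assumption above
mean(y_trt) - weighted.mean(y_ctrl, w = g_att)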