Generates a synthetic dataset based on different GPS (generalized propensity score) models and covariates.

generate_syn_data_covs(sample_size = 10000, gps_spec = 1)

Arguments

sample_size

Desired size of the generated synthetic dataset.

gps_spec

A flag that determines the level of complexity in the generated synthetic data. Available options:

  • gps_spec == 1: There is no confounding, meaning the treatment variable is independent of the covariates. The mean mu of the truncated normal distribution from which treatment values are drawn is set to 3.

  • gps_spec == 2: Confounding is included. The treatment is not independent of the covariates; it is influenced by the variables cf[, 1], cf[, 2], cf[, 3], cf[, 4], cf5, and cf6. These variables enter the computation of mu, which is then used to generate the treatment variable.

  • gps_spec == 3: Similar to gps_spec == 2, with additional complexity. In addition to cf[, 1], cf[, 2], cf[, 3], cf[, 4], cf5, and cf6, the effect modifiers em1 and em2 are also included in the computation of mu.
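
The three settings above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the function's actual implementation: the helper names sketch_mu and rtruncnorm_sketch, the covariate distributions, every coefficient, the standard deviation, and the truncation bounds are hypothetical. Only the structure (a constant mu of 3, mu depending on the six confounders, and mu additionally depending on em1 and em2) is taken from the descriptions above.

```r
set.seed(1)
n <- 1000

# Hypothetical covariates: four confounders stored in a matrix, two further
# confounders, and two effect modifiers. All distributions are assumed.
cf  <- matrix(rnorm(n * 4), nrow = n, ncol = 4)
cf5 <- sample(c(-2, -1, 0, 1, 2), n, replace = TRUE)
cf6 <- runif(n, min = -3, max = 3)
em1 <- rbinom(n, size = 1, prob = 0.5)
em2 <- rbinom(n, size = 1, prob = 0.5)

# Mean of the treatment distribution for each gps_spec value.
# Every coefficient below is made up for illustration.
sketch_mu <- function(gps_spec) {
  confounded <- 3 + 0.4 * cf[, 1] - 0.3 * cf[, 2] + 0.2 * cf[, 3] +
    0.1 * cf[, 4] + 0.2 * cf5 + 0.1 * cf6
  switch(as.character(gps_spec),
    "1" = rep(3, n),                          # no confounding
    "2" = confounded,                         # confounders only
    "3" = confounded + 0.5 * em1 - 0.5 * em2, # confounders + effect modifiers
    stop("gps_spec must be 1, 2, or 3")
  )
}

# Draw from a normal distribution truncated to [lower, upper] using
# inverse-CDF sampling. The bounds and sd are assumptions.
rtruncnorm_sketch <- function(mu, sd = 1, lower = 0, upper = 20) {
  p_lo <- pnorm(lower, mean = mu, sd = sd)
  p_hi <- pnorm(upper, mean = mu, sd = sd)
  qnorm(runif(length(mu), min = p_lo, max = p_hi), mean = mu, sd = sd)
}

# One treatment vector per specification.
treat <- lapply(1:3, function(s) rtruncnorm_sketch(sketch_mu(s)))

# Correlation of treatment with the first confounder under each specification.
cors <- sapply(treat, function(w) cor(w, cf[, 1]))
```

In this sketch, the correlation between treatment and cf[, 1] should be close to zero for gps_spec == 1 and clearly positive for the other two, which is the practical difference between the unconfounded and confounded settings.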

Value

A data.frame containing the synthetic dataset, including covariates and treatment.

Examples


# Generate 500 observations without confounding (gps_spec = 1).
data <- generate_syn_data_covs(sample_size = 500, gps_spec = 1)