I think the following part of vim-factors.R needs to be removed. I'm not sure why I put it there (sorry); Chris, you correctly noted that. It causes errors, but the errors also persist when Y has more than 10 missing values, in which case this block is skipped (see the small simulation below).
# TODO (CK): don't do this, in order to use the delta missingness estimation.
# To avoid crashing TMLE function just drop obs missing A or Y if the
# total number of missing is < 10
if (sum(deltat == 0) < 10) {
  Yt = Yt[deltat == 1]
  At = At[deltat == 1]
  Wtsht = Wtsht[deltat == 1, , drop = FALSE]
  deltat = deltat[deltat == 1]
}
Simulation
set.seed(1, "L'Ecuyer-CMRG")
N <- 200
num_normal <- 4
X <- as.data.frame(matrix(rnorm(N * num_normal), N, num_normal))
Y <- rbinom(N, 1, plogis(.2 * X[, 1] + .1 * X[, 2] - .2 * X[, 3] + .1 * X[, 3] * X[, 4] - .2 * abs(X[, 4])))
# Add some missing data to X so we can test imputation.
for (i in 1:10) X[sample(nrow(X), 1), sample(ncol(X), 1)] <- NA
Y[c(4,6,7,8,11,15,20,21,28,32,72)] <- NA
####################################
# Basic example, fails to run with NA values in Y
vim <- varimpact(Y = Y, data = X)
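To make the link between the simulation and the `< 10` check explicit, here is a minimal sketch that counts the missing outcomes using the same NA indices. `Y_demo` and `delta_demo` are hypothetical names, and the definition of the indicator (1 = observed, 0 = missing, based on Y only) is an assumption; in vim-factors.R `deltat` may also reflect missingness in A.

```r
# Hypothetical stand-in for the outcome; only the NA pattern matters here.
Y_demo <- rep(c(0, 1), length.out = 200)
Y_demo[c(4, 6, 7, 8, 11, 15, 20, 21, 28, 32, 72)] <- NA

# Assumed indicator: 1 = outcome observed, 0 = outcome missing.
delta_demo <- as.numeric(!is.na(Y_demo))

# Number of observations with a missing outcome.
n_missing <- sum(delta_demo == 0)

# Same condition as the block quoted from vim-factors.R.
takes_drop_branch <- n_missing < 10

print(n_missing)
print(takes_drop_branch)
```

With these indices there are 11 missing outcomes, so the condition is FALSE and the drop block never runs. That suggests the failure in this simulation comes from somewhere other than that block, although removing it may still be the right call.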