RE: An approach for imputing missing independent variable (covariate)

From: Vladimir Piotrovskij Date: September 21, 2000 technical Source: cognigencorp.com
From: "Piotrovskij, Vladimir [JanBe]" <VPIOTROV@janbe.jnj.com>
Subject: RE: An approach for imputing missing independent variable (covariate)
Date: Thu, 21 Sep 2000 14:34:03 +0200

>First, let me ask Vladimir why he says his method operates "without assuming any
>explicit model for a covariate"? The inverted model for the DV is an explicit model
>for the covariate, is it not?

Sorry, my phrasing was indeed ambiguous. What I meant by "explicit model" was a model like THETA(.) + ETA(..), where we explicitly assume a normal distribution for the covariate.

>More importantly, however, Vladimir's approach has
>at least two problems: (i) it is non-convergent: each data imputation
>at step 4 generates a different data set, which will yield
>a different estimate at step 5. This will never stop. (ii) Even
>if it converges "well enough" to a "region",
>it will not yield correct standard errors.

I believe the algorithm will converge; however, I don't think I will have time to check this, or to assess the magnitude of the bias (unless I have to model data with missing covariate values myself; currently I do not have such a problem). The data set remains essentially unchanged, except that the missing IDV values are replaced by the estimates obtained at the previous iteration. I presume this will work nicely if the proportion of missing values is small (20% as in my example, or less). I believe "correct standard errors" is a kind of unachievable ideal even if there are no missing predictors at all.

>To see (ii), imagine the (absurd) situation that all but two data
>points from one individual were missing: the algorithm would wind up filling in
>all missing data points from the line defined by the two actual observations
>(without any error) and would eventually
>report perfect precision for the estimate of the slope
>and intercept defined by the two observations.
>This is not to say that anyone would try such an analysis; it merely
>points out that the method fails as it approaches a limit, which
>should make one suspect that it will have problems,
>perhaps of lesser severity, away from that limit. The reason for
>the problem is that uncertainty in the (posthoc) parameter
>estimates is ignored (more on this below).

In this absurd situation no imputation can be made at all. Multiple imputation will probably fail as well.

>The more difficult issue, though,
>is how to compute standard errors? The standard errors
>from the last step of Vladimir's last iteration
>can't be right, as these are conditional on the imputed data,
>treating them as known, when in fact they are unknown.

Missing values are unknown by definition, and I am not sure multiple imputation can change this.

>A simpler method, which doesn't require an invertible
>function such as Vladimir's, and which is
>theoretically sound (i.e. gives unbiased estimates and
>correct standard errors) is multiple imputation.
>This method requires the ability to draw samples of the missing
>data from their posterior distribution.

This is what I wanted to avoid: sampling covariates from an (unknown) distribution.

Best regards,
Vladimir
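
The standard-error argument quoted above can be illustrated with a small Python sketch. It is not the NONMEM procedure discussed in this thread: it assumes a plain straight-line regression, treats the missing items as responses rather than covariates, and uses made-up numbers throughout. The names `ols_slope_se` and `rubin_pool` are hypothetical helpers. The first part reproduces the limiting case (all but two points filled in from the line through the two observations); the second shows how the usual multiple-imputation pooling rule (Rubin's rules) adds a between-imputation variance term that a single imputed data set cannot provide.

```python
import math


def ols_slope_se(x, y):
    """Simple least squares: return (intercept, slope, SE of the slope)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, math.sqrt(rss / (n - 2) / sxx)


def rubin_pool(estimates, variances):
    """Pool m imputed-data estimates with Rubin's rules.

    Total variance = mean within-imputation variance
                     + (1 + 1/m) * between-imputation variance.
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    q_bar = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    return q_bar, math.sqrt(within + (1 + 1 / m) * between)


# Limiting case: six design points, only two actually observed (made-up values).
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
observed = {1.0: 2.0, 4.0: 8.0}
(t1, y1), (t2, y2) = sorted(observed.items())
line_slope = (y2 - y1) / (t2 - t1)
line_intercept = y1 - line_slope * t1

# Fill every missing point from the line through the two observations.
filled = [observed.get(t, line_intercept + line_slope * t) for t in times]
_, slope_single, se_single = ols_slope_se(times, filled)
print("single imputation: slope", slope_single, "SE", se_single)

# Hypothetical slope estimates from three imputed data sets, each with a
# within-imputation SE of 0.1 (variance 0.01). Illustrative numbers only.
slopes = [2.0, 2.2, 1.8]
within_vars = [0.01, 0.01, 0.01]
slope_mi, se_mi = rubin_pool(slopes, within_vars)
print("multiple imputation: slope", slope_mi, "SE", se_mi)
```

In the first part the fitted residuals are all zero, so the reported standard error collapses even though only two values were ever measured. In the second part the pooled standard error exceeds the per-data-set value of 0.1, because the spread of the estimates across imputations is counted as uncertainty.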
Sep 11, 2000 Paul S. Collier missing data items
Sep 11, 2000 Lewis B. Sheiner Re: missing data items
Sep 11, 2000 Mats Karlsson Re: missing data items
Sep 11, 2000 Nick Holford Missing data values
Sep 11, 2000 Lewis B. Sheiner Re: missing data items
Sep 20, 2000 Vladimir Piotrovskij An approach for imputing missing independent variable (covariate)
Sep 20, 2000 Leonid Gibiansky RE: An approach for imputing missing independent variable (covariate)
Sep 20, 2000 Lewis B. Sheiner Re: An approach for imputing missing independent variable (covariate)
Sep 20, 2000 Lewis B. Sheiner Re: An approach for imputing missing independent variable (covariate)
Sep 21, 2000 Vladimir Piotrovskij RE: An approach for imputing missing independent variable (covariate)
Sep 21, 2000 Vladimir Piotrovskij RE: An approach for imputing missing independent variable (covariate)
Sep 21, 2000 Lewis B. Sheiner Re: An approach for imputing missing independent variable (covariate)
Sep 22, 2000 Vladimir Piotrovskij RE: An approach for imputing missing independent variable (covariate)