RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
From: lgibiansky@emmes.com
Subject: RE: [NMusers] order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Date: 9/29/2003 11:16 AM
Ken,
Your message seems to imply that
(i) A large number (e.g., 30) of covariates arises ONLY in problems where we
have a lot of collinear covariates, and
(ii) By thinking through the list of covariates, we can ALWAYS reduce it to a
manageable number (and sort them out between different random effects).
I would disagree with both statements.
If you cannot create a manageable full model starting from the base model, you need to
screen covariates, which leads us to the forward-addition algorithm (unless you work
for Mark Sale, with access to the genetic algorithm and a cluster of 1000+ computers
waiting for your input). The unspoken assumption (a sufficient condition for convergence)
behind the forward-addition algorithm is the convexity of the -log(likelihood) as a
function of covariates. One can easily create an artificial example where this assumption
is violated, but it would be interesting to see any real example where it fails.
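To make the "artificial example" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the covariates A, B and C, the table of objective function values (-2 log-likelihood), the 3.84 inclusion threshold (chi-square, 1 df, p < 0.05), and the per-covariate penalty used in the exhaustive comparison are all made up and do not come from any real analysis. In this toy table A and B help only when included together, so greedy forward addition picks C, stops, and never reaches the much better model {A, B}.

```python
from itertools import combinations

# Hypothetical objective function values (-2 log-likelihood) for every subset
# of three made-up covariates. All numbers are invented for illustration.
OFV = {
    frozenset(): 100.0,
    frozenset("A"): 99.0,
    frozenset("B"): 99.5,
    frozenset("C"): 95.0,
    frozenset("AB"): 60.0,   # A and B matter only jointly
    frozenset("AC"): 94.5,
    frozenset("BC"): 94.0,
    frozenset("ABC"): 59.0,
}


def forward_addition(candidates, ofv, threshold=3.84):
    """Greedy forward addition: at each step add the single covariate that
    lowers the objective most, and stop when the best drop is below threshold."""
    selected = frozenset()
    current = ofv(selected)
    while True:
        trials = [(ofv(selected | {c}), c)
                  for c in sorted(candidates) if c not in selected]
        if not trials:
            break
        best_value, best_cov = min(trials)
        if current - best_value < threshold:
            break
        selected = selected | {best_cov}
        current = best_value
    return selected, current


def exhaustive_search(candidates, ofv, penalty=3.84):
    """Check every subset and return the one with the lowest penalized
    objective (objective + penalty * number of covariates), with its raw value."""
    items = sorted(candidates)
    subsets = [frozenset(combo)
               for size in range(len(items) + 1)
               for combo in combinations(items, size)]
    best = min(subsets, key=lambda s: ofv(s) + penalty * len(s))
    return best, ofv(best)


if __name__ == "__main__":
    print("forward addition:", forward_addition("ABC", OFV.__getitem__))
    print("exhaustive search:", exhaustive_search("ABC", OFV.__getitem__))
```

With this table, forward addition ends at {C} with a value of 95.0, while the exhaustive search finds {A, B} at 60.0. Whether real data sets ever behave this way is exactly the open question.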
Sure, I agree that collinearity is an issue that needs to be thought through ahead of time, but
it is not the only source of a large covariate list. As to pruning the list, I think
it is dangerous to do it too aggressively: this can be very subjective, and it can prevent us from
uncovering new, totally unexpected dependencies. I would prefer to do a more or less formal
search for the explanatory variables, and then interpret them in clinically relevant terms,
rather than look for clinically relevant dependencies only. This may reduce the subjectivity
of the analysis, restricting it to the subjective explanation of uncovered dependencies
instead of the subjective choice of alternatives to investigate.
Leonid