RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?

From: Kenneth Kowalski Date: September 29, 2003 technical Source: cognigencorp.com
From: Ken.Kowalski@pfizer.com
Subject: RE: [NMusers] order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Date: 9/29/2003 2:37 PM
Leonid,
I think you are missing my point. I have nothing against investigating a large number of covariates (e.g., >=30) if they truly provide scientifically relevant and independent information. However, what typically happens when we have a long list of covariates is that many of them are collinear and hence redundant. Redundancy from nuisance covariates that are correlated with the important mechanistic covariates can wreak havoc with any model-building procedure, masking our ability to discern the true covariate effects.
Forward selection procedures are particularly vulnerable because they can be blind to the collinearity issue: they can often find a good-fitting model without running into the stability/over-parameterization issues that a full model would confront. However, as I've said previously, just because forward selection can find a good-fitting model doesn't mean it found the right one. For example, a nuisance covariate that is highly correlated with the true covariate may, by chance, be selected first by a forward selection procedure. Because of the order of testing, the true covariate may then never be evaluated further and hence will be excluded in favor of the nuisance covariate. I can't tell you how many times a modeler has told me they had a difficult time interpreting the covariate effects in the final model selected by a stepwise procedure because they felt certain excluded covariates were more scientifically plausible. I'm merely suggesting that we be a bit more discriminating in developing our list of plausible covariates to investigate, so that we have a set that is the most scientifically plausible and independent.
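The ordering problem described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a NONMEM analysis: it uses ordinary linear regression as a stand-in, and the sample size, correlation (rho), effect size, and function names are all hypothetical choices made for illustration. For a single-covariate linear model, the covariate with the larger absolute correlation with the response gives the larger drop in residual sum of squares, so it is the one a first forward step would pick.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def nuisance_first_fraction(n_subjects=50, rho=0.95, effect=0.5,
                            n_trials=1000, seed=1):
    """Fraction of simulated data sets in which the first forward step
    picks the nuisance covariate instead of the true one.

    All defaults are illustrative assumptions, not values from the post.
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho ** 2)
    nuisance_wins = 0
    for _ in range(n_trials):
        true_cov = [rng.gauss(0, 1) for _ in range(n_subjects)]
        # The nuisance covariate has no effect of its own; it is only
        # correlated (about rho) with the true covariate.
        nuisance = [rho * t + scale * rng.gauss(0, 1) for t in true_cov]
        # The response depends on the true covariate only.
        response = [effect * t + rng.gauss(0, 1) for t in true_cov]
        # First forward step: larger |r| means larger reduction in the
        # residual sum of squares for a one-covariate linear model.
        if abs(pearson(nuisance, response)) > abs(pearson(true_cov, response)):
            nuisance_wins += 1
    return nuisance_wins / n_trials


if __name__ == "__main__":
    print(nuisance_first_fraction())
```

Under these assumptions the true covariate should win more often than not, but the nuisance covariate can still be selected first in a noticeable share of data sets, which is the situation where the true covariate may never be revisited.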
The main reason we use a combination of forward selection/backward elimination with a higher alpha level for inclusion (e.g., alpha = 0.05) is precisely to help mitigate the problem with forward selection alone. By increasing the alpha level for inclusion, we allow "bigger" models to be tested before pruning to a parsimonious model using backward elimination with a smaller alpha level for exclusion (e.g., alpha = 0.01 or 0.001). Of course, if we increase the alpha level for inclusion towards 1.0, the combined forward selection/backward elimination procedure collapses to a purely backward elimination procedure. Moreover, it may only take setting the alpha level for inclusion to 0.20 for forward selection to begin building bigger models that encounter the same ill-conditioning problems due to collinearity that we observe with the full model (if we don't become a bit more discriminating in our choice of covariates). I agree that we could err on the other side as well and eliminate important explanatory covariates if we are too discriminating. Nevertheless, we need to use our best scientific judgement as well as good statistical principles to really uncover the important covariates. Blindly ignoring the limitations of forward selection and just turning the crank to let the algorithm identify covariate effects is risky. That is not to say that I'm against forward selection, just that we need to know when it is appropriate to use it and when it's not. I still maintain it is better to identify where the redundancies are and eliminate the least plausible covariates when redundancies exist. Certainly, if a less plausible covariate is fairly independent of the other covariates, then we don't need to eliminate it.
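To make the alpha levels above concrete, the sketch below converts an alpha level into the drop in objective function value (OFV) needed to pass a test. It assumes the comparison between nested models is a likelihood ratio test with one degree of freedom per added covariate parameter; the post does not state this, and the function name is hypothetical. For one degree of freedom the chi-square critical value equals the square of the two-sided normal quantile, so the standard library is enough.

```python
from statistics import NormalDist


def critical_delta_ofv(alpha):
    """Chi-square critical value with 1 degree of freedom for a given alpha.

    Assumption: nested models are compared with a likelihood ratio test,
    one extra parameter per covariate. Uses the identity
    chi2(1 df) quantile = (two-sided standard normal quantile) ** 2.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * z


if __name__ == "__main__":
    # Inclusion and exclusion levels mentioned in the text, plus alpha = 1.0.
    for alpha in (0.20, 0.05, 0.01, 0.001, 1.0):
        print(f"alpha = {alpha:<5}  required drop in OFV = "
              f"{critical_delta_ofv(alpha):.2f}")
```

As alpha for inclusion approaches 1.0, the required drop approaches zero, so every covariate enters and the procedure behaves like backward elimination from the full model, which is the collapse described above.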
I have no problem using forward selection to identify a parsimonious model once we've streamlined the list to those covariates that provide the most independent information, with redundancies resolved by removing the least plausible covariates. The problem is knowing when we are in a situation where there is a lot of redundancy. Building a full model and looking at the diagnostics from the full model fit (e.g., the COV step) will help us recognize when we need to deal with the collinearity.
Ken
_______________________________________________________
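One simple way to see where redundancies lie, alongside the full-model diagnostics recommended above, is a pairwise correlation screen of the candidate covariates. This is a sketch only: the covariate names, the six made-up subjects, the 0.9 cutoff, and the function names are all hypothetical, and a pairwise screen cannot detect collinearity that involves three or more covariates jointly, which is one reason the full-model diagnostics remain useful.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def redundant_pairs(covariates, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    The threshold is an arbitrary illustrative cutoff, not a recommendation.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(covariates.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Made-up illustrative data for six subjects (not from any real study).
covariates = {
    "weight": [50, 60, 70, 80, 90, 100],
    "bsa": [1.50, 1.62, 1.75, 1.86, 1.98, 2.10],
    "age": [30, 65, 40, 55, 25, 60],
}

flagged = redundant_pairs(covariates)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} look redundant (r = {r})")
```

For each flagged pair, the choice of which covariate to drop is the scientific judgement call discussed above: keep the more plausible one and remove the other before any stepwise search.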
Sep 25, 2003 Peter Bonate order of covariate inclusion
Sep 25, 2003 Leonid Gibiansky Re: order of covariate inclusion
Sep 25, 2003 Peter Bonate Re: order of covariate inclusion
Sep 25, 2003 Leonid Gibiansky Re: order of covariate inclusion
Sep 25, 2003 Harry Mager Hm Re: order of covariate inclusion
Sep 25, 2003 William Bachman RE: Re: order of covariate inclusion
Sep 25, 2003 Leonid Gibiansky RE: Re: order of covariate inclusion
Sep 25, 2003 Peter Bonate RE: Re: order of covariate inclusion
Sep 25, 2003 Kenneth Kowalski RE: order of covariate inclusion
Sep 25, 2003 Alan Xiao RE: Re: order of covariate inclusion
Sep 25, 2003 Leonid Gibiansky Re: order of covariate inclusion
Sep 25, 2003 Kenneth Kowalski RE: order of covariate inclusion
Sep 25, 2003 Sduffull RE: order of covariate inclusion
Sep 25, 2003 Marc Gastonguay RE: order of covariate inclusion -> avoiding stepwise approaches
Sep 26, 2003 Gary Maier RE: order of covariate inclusion -> avoiding stepwise approaches
Sep 26, 2003 Marc Gastonguay RE: order of covariate inclusion -> avoiding stepwise approaches
Sep 26, 2003 William Bachman RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 Jakob Ribbing RE: order of covariate inclusion -> avoiding stepwise approaches
Sep 26, 2003 Mark Sale RE: order of covariate inclusion -> avoiding stepwise approaches
Sep 26, 2003 Chuanpu 2 Hu RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 Kenneth Kowalski RE: order of covariate inclusion -> avoiding stepwise approaches
Sep 26, 2003 Kenneth Kowalski RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 William Bachman RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 Marc Gastonguay RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 Kenneth Kowalski RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 William Bachman RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 Mark Sale RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 Marc Gastonguay RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 David Garbutt RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 Leonid Gibiansky RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 Mark Sale RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 Marc Gastonguay RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 26, 2003 Sduffull RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 27, 2003 Marc Gastonguay RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 29, 2003 Mark Sale RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 29, 2003 Harry Mager Hm RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 29, 2003 Kenneth Kowalski RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 29, 2003 Tgordi RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 29, 2003 Leonid Gibiansky RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 29, 2003 Guzy RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 29, 2003 Mark Sale RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?
Sep 29, 2003 Kenneth Kowalski RE: order of covariate inclusion -> avoiding stepwise approaches -> abandoning exploratory analysis?