RE: Centering (was Re: Missing covariates)
From: "KOWALSKI, KENNETH G. [PHR/1825]" <kenneth.g.kowalski@pharmacia.com>
Subject: RE: Centering (was Re: Missing covariates)
Date: Thu, 5 Jul 2001 09:48:17 -0500
Alan/NMUSERS:
Who really cares if INTERCEPT(no centering) has a different standard error
than INTERCEPT(centering)? They are not estimating the same thing! Note
that INTERCEPT(no centering)+SLOPE(no centering)*CENTEREDVALUE is estimating
the same thing as INTERCEPT(centering). If you calculated the standard error
for INTERCEPT(no centering)+SLOPE(no centering)*CENTEREDVALUE from the
covariance matrix of the estimates of the thetas you will get the same
standard error as that reported for INTERCEPT(centering). The really
important issue is the slope for the covariate effect. If a global minimum
is achieved then SLOPE(no centering) and SLOPE(centering) will have the same
estimates and standard errors because they are estimating the same thing.
Again, the benefits of centering are purely numerical, not statistical. I
have no problems with not centering if the model is numerically stable and
you achieve the global minimum. Not centering just takes the question of how
far from the center of your data you can center to its extreme (i.e.,
centering about zero). Obviously, the answer depends on the numerical
stability of the model that you are fitting to your data. Since one doesn't
always know at the outset how stable the model is going to be, it is good
practice to do centering. Whether one uses some standard value (such as
Nick suggests) or the mean or median doesn't really matter provided they all
achieve the global minimum...pick the one that is most convenient to you.
If achieving the global minimum is very sensitive to the choice of
centering then you may need to look at the model more closely
anyway...perhaps it is overparameterized.
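
The standard-error argument above can be checked numerically. What follows is only a minimal sketch, assuming an ordinary linear regression on simulated data rather than a NONMEM fit; the ages, the true parameter values, the centering value of 65, and the helper name ols are all made up for illustration. It fits the same data with and without centering, then computes the standard error of INTERCEPT(no centering)+SLOPE(no centering)*CENTEREDVALUE from the covariance matrix of the estimates.

```python
import numpy as np

def ols(x, y):
    """Least-squares fit of y = intercept + slope * x.
    Returns the estimates and their covariance matrix."""
    X = np.column_stack([np.ones_like(x), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)   # residual variance
    return beta, sigma2 * XtX_inv

# Simulated data (assumption: ages between 55 and 75, linear effect).
rng = np.random.default_rng(0)
age = rng.uniform(55, 75, size=200)
y = 10.0 - 0.05 * (age - 65) + rng.normal(0, 0.5, size=200)

center = 65.0                                # hypothetical centering value
b_raw, cov_raw = ols(age, y)                 # no centering
b_cen, cov_cen = ols(age - center, y)        # centering

se_raw = np.sqrt(np.diag(cov_raw))
se_cen = np.sqrt(np.diag(cov_cen))

# INTERCEPT(no centering) + SLOPE(no centering) * center, with its
# standard error taken from the covariance matrix of the estimates.
w = np.array([1.0, center])
combo_est = w @ b_raw
combo_se = np.sqrt(w @ cov_raw @ w)

print("slope      :", b_raw[1], b_cen[1])
print("slope SE   :", se_raw[1], se_cen[1])
print("intercept  :", combo_est, b_cen[0])
print("intcpt SE  :", combo_se, se_cen[0])
print("raw intercept SE (a different quantity):", se_raw[0])
```

In this linear setting the slope and its standard error agree between the two fits, and the combination reproduces the centered intercept and its standard error, while the uncentered intercept has a larger standard error because it estimates the response at age zero.
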
I agree with Nick that it is convenient to choose standards. Nevertheless,
I probably wouldn't use an age of 40 if the range of ages in the study were
between 55 and 75...not that there is anything wrong with that (to quote
Jerry Seinfeld) if a global minimum is achieved. I would probably use
something like 60 or 65 regardless of what the mean or median of the
distribution of ages was in the study. If the model were so sensitive to
whether I used 60 or 65 versus the mean or the median then I would worry
about the appropriateness of the model more so than what value I should use
for centering.
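
As a rough illustration of why the choice between 60, 65, and the mean should matter little while centering far outside the data can hurt numerically, here is a small sketch. It assumes a plain linear model and uses the condition number of X'X as a stand-in for numerical stability, which is only a proxy for what happens in a nonlinear mixed-effects fit; the age grid and the helper name condition_number are hypothetical.

```python
import numpy as np

# Hypothetical ages spanning the 55 to 75 range mentioned above.
ages = np.linspace(55, 75, 41)

def condition_number(center):
    """Condition number of X'X for the design [1, age - center]."""
    X = np.column_stack([np.ones_like(ages), ages - center])
    return np.linalg.cond(X.T @ X)

# Centering values: none (0), a standard outside the data (40),
# and two convenient values inside the data (60 and 65).
conds = {c: condition_number(c) for c in (0.0, 40.0, 60.0, 65.0)}

for c, k in conds.items():
    print(f"center = {c:5.1f}  cond(X'X) = {k:.3g}")
```

Under these assumptions the condition number grows as the centering value moves away from the data, and the difference between two values inside the data range is small compared with the difference between centering and not centering.
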
Ken