centering variables
From: LSheiner <lewis@c255.ucsf.edu>
Date: Wed, 03 Mar 1999 11:58:57 -0800
Subject: centering variables
As I recall there was considerable discussion on this topic sometime in the past, and perhaps someone can find that discussion in the archives at http://gaps.cpb.uokhsc.edu/nm/ although I was unsuccessful ...
Anyway, the reasons are primarily these:
Example:
(A) Cl = theta(1) + theta(2)*age
vs.
(B) Cl = theta(1) + theta(2)*(age - mean_age)
Observations:
1. Numerical: with (B) the parameters are less correlated, which makes the search easier, less likely to terminate with rounding errors, etc. With (A), since most of the data will be at ages around mean_age, a small change in theta(2) must be offset by a large change in theta(1) to keep fitting the data - hence high correlation of parameter errors. On the other hand, with (B) the "intercept" (theta(1)) is the CL at the mean age and the slope pivots around that point - hence slope and intercept estimation errors are essentially uncorrelated.
2. Meaningful parameters: With (A), theta(1) is clearance at a theoretical age of zero. This is not a parameter that has much biological meaning. With (B), theta(1) is mean CL at age = mean_age, a very meaningful parameter, and one that may be of considerable importance in its own right. With (B), the covariance step gives you the SE of a meaningful parameter; with (A), it does not.
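Both observations can be checked numerically with a small sketch. This is not NONMEM: it uses ordinary least squares on simulated data as a stand-in for the estimation step, and the sample size, age distribution, true parameter values and the helper name fit() are all assumptions made up for illustration. It fits (A) and (B) to the same data and reports the estimates, their standard errors, and the correlation between the intercept and slope estimation errors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data; every value below is an illustrative assumption.
n = 200
age = rng.normal(50.0, 10.0, size=n)  # ages clustered around 50
cl = 10.0 - 0.05 * (age - 50.0) + rng.normal(0.0, 1.0, size=n)


def fit(covariate, response):
    """OLS fit of response = theta1 + theta2 * covariate.

    Returns the estimates, their standard errors, and the correlation
    between the two estimation errors.
    """
    X = np.column_stack([np.ones_like(covariate), covariate])
    theta, *_ = np.linalg.lstsq(X, response, rcond=None)
    resid = response - X @ theta
    sigma2 = resid @ resid / (len(response) - 2)  # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)         # covariance of estimates
    se = np.sqrt(np.diag(cov))
    corr = cov[0, 1] / (se[0] * se[1])
    return theta, se, corr


theta_a, se_a, corr_a = fit(age, cl)               # model (A), uncentered
theta_b, se_b, corr_b = fit(age - age.mean(), cl)  # model (B), centered

print(f"(A) theta = {theta_a}, SE = {se_a}, corr = {corr_a:.3f}")
print(f"(B) theta = {theta_b}, SE = {se_b}, corr = {corr_b:.3f}")
```

Under these assumptions the slope estimate and its SE are the same in both fits, since centering only reparameterizes the intercept. With (A) the intercept and slope errors are strongly negatively correlated and the intercept SE is large; with (B) the correlation is zero up to rounding and the intercept SE is that of CL at the mean age.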
Bottom line: There is nothing to lose with centering, and much to gain. Hence, ALWAYS do it!
LBS.
--
Lewis B Sheiner, MD
Professor: Lab. Med., Biopharm. Sci., Med.
Box 0626
UCSF, SF, CA
94143-0626
voice: 415 476 1965
fax: 415 476 2796
email: lewis@c255.ucsf.edu