RE: Centering (was Re: Missing covariates)

From: Juan Jose Perez Ruixo Date: July 12, 2001 technical Source: cognigencorp.com
From: "Perez Ruixo, Juan Jose [JanBe]" <JPEREZRU@janbe.jnj.com>
Subject: RE: Centering (was Re: Missing covariates)
Date: Thu, 12 Jul 2001 10:17:54 +0200

Dear all,

Regarding standard errors when the centering approach is used, I would like to add some comments.

For a simple linear model without centering the independent variable (y = a + bx), the variance of the predicted y (as a function of x) is equal to:

    S**2 * { (1 / N) + (x - X)**2 / Sx }        eq. 1

where S is the residual standard error; N is the number of (x, y) pairs; X is the mean of x; and Sx is the sum of squares of x about its mean.

When x = X, the variance of y is at its minimum, S**2 / N. This is also the variance of the intercept under the centering approach, where x - X = 0: it follows from applying eq. 1 to the model y = a' + b (x - X), where a' = a + bX. In both cases, the variance of y at the mean of the independent variable is lower than the variance of the uncentered intercept a; the two are equal only when the mean of x is 0. For these reasons, centering does not affect the standard error of the slope or of the predictions, and the intercept standard errors differ only because the two intercepts represent different quantities.

We must be careful with the centering approach for categorical data. In this case, the slope (b) is affected by the coding used. Following Matt's example (TVCL = THETA(1) + Xi * THETA(2)), the parametrization Xi = 1 for females and Xi = -1 for males gives the population average at Xi = 0 when there are equal proportions of males and females. In this case, the correlation between intercept and slope is zero, but the intercept SE is the same as the slope SE, both being equal to the residual standard error divided by SQRT(N). In other words, you do not directly obtain the population parameter for males or the difference from females. THETA(2) is half of the real difference between males and females, and its absolute standard error (not its relative standard error) is affected.
This means the t-test is the same, but building a confidence interval requires a prior transformation in order to get the precision of the real difference between males and females. I can show a simple example (with S+ code):

    # simulate weights for two groups of 50 subjects each
    weight <- c(rnorm(50, mean = 60, sd = 6), rnorm(50, mean = 50, sd = 5))
    gender0 <- c(rep(1, 50), rep(0, 50))     # 1/0 coding
    gender1 <- c(rep(1, 50), rep(-1, 50))    # 1/-1 coding
    G0 <- lm(weight ~ gender0)
    G1 <- lm(weight ~ gender1)
    summary(G0)
    summary(G1)

    .......

Coefficients G0:

                   Value  Std. Error  t value  Pr(>|t|)
    (Intercept)  50.7279      0.7918  64.0675    0.0000
    gender0       7.9107      1.1198   7.0647    0.0000

    Residual standard error: 5.599 on 98 degrees of freedom
    Multiple R-Squared: 0.3374
    F-statistic: 49.91 on 1 and 98 degrees of freedom, the p-value is 2.362e-010
    Correlation Intercept, gender: -0.7071

Coefficients G1:

                   Value  Std. Error  t value  Pr(>|t|)
    (Intercept)  54.6832      0.5599  97.6698    0.0000
    gender1       3.9554      0.5599   7.0647    0.0000

    Residual standard error: 5.599 on 98 degrees of freedom
    Multiple R-Squared: 0.3374
    F-statistic: 49.91 on 1 and 98 degrees of freedom, the p-value is 2.362e-010
    Correlation Intercept, gender: 0

For these reasons, I suggest not using the centering approach for categorical data (here, I don't include ordinal data). Without centering, the intercept already has a useful meaning, which makes centering unnecessary.

Thanks,

Juan Jose Perez Ruixo
Global Pharmacokinetics and Clinical Pharmacology Division.
Janssen Research Foundation
Turnhoutseweg, 30
B-2340 Beerse
Belgium
Tel: (+32) 14 60 75 08
Email: jperezru@janbe.jnj.com
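The relation between the two codings, and the transformation needed to build a confidence interval for the real male/female difference, can be checked numerically. The following is only a minimal sketch, written in Python (standard library only) rather than S+. The helper name `simple_ols`, the random seed, and the hard-coded t quantile (approximately 1.984 for 98 degrees of freedom) are assumptions made for illustration, and the simulated weights will not reproduce the S+ numbers in the post.

```python
import math
import random


def simple_ols(x, y):
    """Closed-form least squares fit of y = a + b*x (hypothetical helper)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)  # sum of squares of x about its mean
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))  # residual standard error
    # eq. 1 evaluated at x = 0 gives the variance of the intercept
    se_intercept = s * math.sqrt(1.0 / n + x_mean ** 2 / sxx)
    se_slope = s / math.sqrt(sxx)
    # correlation between the intercept and slope estimates
    corr = -x_mean / math.sqrt(sxx / n + x_mean ** 2)
    return {"intercept": intercept, "slope": slope, "s": s,
            "se_intercept": se_intercept, "se_slope": se_slope, "corr": corr}


random.seed(1)  # arbitrary seed, an assumption for reproducibility
weight = ([random.gauss(60, 6) for _ in range(50)]
          + [random.gauss(50, 5) for _ in range(50)])
gender0 = [1] * 50 + [0] * 50   # 1/0 coding
gender1 = [1] * 50 + [-1] * 50  # 1/-1 coding

g0 = simple_ols(gender0, weight)
g1 = simple_ols(gender1, weight)

T_CRIT = 1.984  # approximate 97.5% quantile of Student's t with 98 df (assumed)

# Direct interval for the group difference from the 1/0 coding
ci_g0 = (g0["slope"] - T_CRIT * g0["se_slope"],
         g0["slope"] + T_CRIT * g0["se_slope"])

# With the 1/-1 coding the slope is half the difference, so both the
# estimate and its standard error are doubled before building the interval
diff = 2.0 * g1["slope"]
se_diff = 2.0 * g1["se_slope"]
ci_g1 = (diff - T_CRIT * se_diff, diff + T_CRIT * se_diff)

print("1/0 coding :", g0)
print("1/-1 coding:", g1)
print("95% CI for the difference (1/0 coding)        :", ci_g0)
print("95% CI for the difference (1/-1, transformed) :", ci_g1)
```

Under these assumptions the two intervals coincide, the t value of the slope is identical in both fits, and with the 1/-1 coding the intercept and slope standard errors are both equal to the residual standard error divided by SQRT(N).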
Jul 02, 2001 Nick Holford Centering (was Re: Missing covariates)
Jul 02, 2001 William Bachman RE: Centering (was Re: Missing covariates)
Jul 02, 2001 Kenneth G. Kowalski RE: Centering (was Re: Missing covariates)
Jul 02, 2001 Lewis B. Sheiner Centering (was Re: Missing covariates)
Jul 03, 2001 Jogarao Gobburu Re: Centering (was Re: Missing covariates)
Jul 03, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 03, 2001 Nick Holford Re: Centering (was Re: Missing covariates)
Jul 03, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 03, 2001 Lewis B. Sheiner Re: Centering (was Re: Missing covariates)
Jul 03, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 03, 2001 Diane Mould Re: Centering (was Re: Missing covariates)
Jul 04, 2001 Nick Holford Re: Centering (was Re: Missing covariates)
Jul 04, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 04, 2001 Diane Mould Re: Centering (was Re: Missing covariates)
Jul 05, 2001 Nick Holford Re: Centering (was Re: Missing covariates)
Jul 05, 2001 Stephen Duffull RE: Centering (was Re: Missing covariates)
Jul 05, 2001 Nick Holford Re: Centering (was Re: Missing covariates)
Jul 05, 2001 Leon Aarons 70kg neonates
Jul 05, 2001 Nick Holford Re: 70kg neonates
Jul 05, 2001 Peter Bonate Centering
Jul 05, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 05, 2001 Leonid Gibiansky RE: Centering (was Re: Missing covariates)
Jul 05, 2001 Kenneth G. Kowalski RE: Centering (was Re: Missing covariates)
Jul 05, 2001 William Bachman RE: Centering (was Re: Missing covariates)
Jul 05, 2001 Diane Mould Re: Centering (was Re: Missing covariates)
Jul 05, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 05, 2001 Alan Xiao Question 2 about prediction and covariates
Jul 06, 2001 Matt Hutmacher RE: Centering (was Re: Missing covariates)
Jul 09, 2001 Vladimir Piotrovskij RE: Centering (Impact on SE)
Jul 09, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 09, 2001 Kenneth G. Kowalski RE: Centering (Impact on SE)
Jul 09, 2001 Vladimir Piotrovskij RE: Centering (Impact on SE)
Jul 09, 2001 Smith Brian P RE: Centering (Impact on SE)
Jul 09, 2001 Matt Hutmacher RE: Centering (was Re: Missing covariates)
Jul 12, 2001 Juan Jose Perez Ruixo RE: Centering (was Re: Missing covariates)
Jul 12, 2001 Juan Jose Perez Ruixo RE: Centering (was Re: Missing covariates)
Jul 12, 2001 Matt Hutmacher RE: Centering (was Re: Missing covariates)
Jul 12, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 30, 2001 Juan Jose Perez Ruixo Re: Centering (was Re: Missing covariates)
Jul 30, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 30, 2001 Leonid Gibiansky RE: Centering (was Re: Missing covariates)