Re: Centering (was Re: Missing covariates)

From: Alan Xiao Date: July 12, 2001 technical Source: cognigencorp.com
Date: Thu, 12 Jul 2001 19:12:58 -0400
From: Alan Xiao <Alan.Xiao@cognigencorp.com>
Subject: Re: Centering (was Re: Missing covariates)

Hi,

I'll throw my 2 cents in here.

About centering for dichotomous or ordinal covariates, how about the following thought experiment, which starts from continuous covariates, for which centering seems the least controversial here.

Take an ideal normal distribution of WEIGHT in a population with a mean of 40 and a standard deviation of 10, for example; probably no one will question centering here. Now assume that in a real population the sampled weights are still (overall) normally distributed around 40, but the samples fall only in the ranges 10-20, 25-35, 45-55, and 60-70. Do you have a problem with centering in this case? Further, assume the sampled data fall in the ranges 13-17, 27-33, 47-53, and 63-67. Do you have a problem with centering in this case? Repeating this narrowing reduces the distribution to an ordinal/dichotomous one.

Now, my question is: why can't ordinal/dichotomous covariates be treated as a special case of continuous covariates, or as continuous covariates with special sampling values? (Usually in science, a more general/complicated theory should be able to explain/predict a specific/simpler case, am I wrong?) Otherwise, what's the statistical basis for evaluating dichotomous covariates in PK/PD models?

Back to Perez's example: note that the scaling of gender0 (0 ~ 1) and gender1 (-1 ~ 1) is different. If you use the same scaling, for example gender0 (0 ~ 1) and gender1 (-0.5 ~ 0.5), you'll get a different story.

Attached are a SAS program and the corresponding results for 50 subjects (the data listing for WEIGHT and SEXF is included). Notice the comparisons of results in the following pairs: sexf0 (0 ~ 1) and sexf1 (-1 ~ 1) (compare with Perez's case); sexf0 (0 ~ 1) and sexf2 (-0.5 ~ 0.5); and sexf1 (-1 ~ 1) and sexf3 (0 ~ 2).

N = 50.
Dependent variable: WEIGHT

Parameter      Estimate   Standard Error   t Value   Pr > |t|
Intercept   50.55440307       0.94663607     53.40     <.0001
sexf0       10.85100036       1.39573874      7.77     <.0001

Intercept   55.97990325       0.69786937     80.22     <.0001
sexf1        5.42550018       0.69786937      7.77     <.0001

Intercept   55.97990325       0.69786937     80.22     <.0001
sexf2       10.85100036       1.39573874      7.77     <.0001

Intercept   50.55440307       0.94663607     53.40     <.0001
sexf3        5.42550018       0.69786937      7.77     <.0001

So, Perez's explanation seems to need some modification. .....

If you are interested, see the attached results for details or run the program yourself.

**************** SAS programs **************

%macro sss;
data aa1;
  %do i = 1 %to 50;
    /* dichotomous covariate coded 0/1, plus three linear recodings */
    sexf0 = round(0.5 + sqrt(0.027)*rannor(35179*&i), 1);
    sexf1 = 2*sexf0 - 1;    /* -1 ~ 1     */
    sexf2 = sexf0 - 0.5;    /* -0.5 ~ 0.5 */
    sexf3 = 2*sexf0;        /*  0 ~ 2     */
    /* group mean 50 (SD 5) when sexf0 = 0, group mean 60 (SD 6) when sexf0 = 1 */
    if sexf0 = 0 then weight = 50 + sqrt(25)*rannor(43820*&i);
    if sexf0 = 1 then weight = 60 + sqrt(36)*rannor(43820*&i);
    output;
  %end;
run;
%mend sss;
%sss;

proc print data = aa1;
run;

proc gplot data = aa1;
  plot weight * sexf0;
  plot weight * sexf1;
  plot weight * sexf2;
  plot weight * sexf3;
run;

/* one simple linear regression per coding of the covariate */
proc GLM data = aa1;
  model weight = sexf0;
  output out = bb1 p = pred r = res;
run;

proc GLM data = aa1;
  model weight = sexf1;
  output out = bb2 p = pred r = res;
run;

proc GLM data = aa1;
  model weight = sexf2;
  output out = bb3 p = pred r = res;
run;

proc GLM data = aa1;
  model weight = sexf3;
  output out = bb4 p = pred r = res;
run;
quit;

************** results - graphs not shown here ***********

The SAS System   94   16:32 Thursday, July 12, 2001

Obs   sexf0   sexf1   sexf2   sexf3    weight
  1       1       1     0.5       2   54.0675
  2       1       1     0.5       2   54.4837
  3       0      -1    -0.5       0   50.1830
  4       0      -1    -0.5       0   59.8383
  5       1       1     0.5       2   57.8778
  6       0      -1    -0.5       0   47.4324
  7       1       1     0.5       2   57.7886
  8       1       1     0.5       2   65.4018
  9       0      -1    -0.5       0   54.5489
 10       0      -1    -0.5       0   55.4752
 11       0      -1    -0.5       0   50.8191
 12       1       1     0.5       2   60.9912
 13       0      -1    -0.5       0   47.2735
 14       0      -1    -0.5       0   57.7363
 15       1       1     0.5       2   55.2785
 16       1       1     0.5       2   59.8003
 17       0      -1    -0.5       0   57.1171
 18       0      -1    -0.5       0   52.8775
 19       0      -1    -0.5       0   44.2806
 20       0      -1    -0.5       0   51.8335
 21       1       1     0.5       2   67.7795
 22       0      -1    -0.5       0   53.4585
 23       0      -1    -0.5       0   48.2589
 24       0      -1    -0.5       0   48.2766
 25       1       1     0.5       2   59.9475
 26       1       1     0.5       2   56.2993
 27       1       1     0.5       2   64.0133
 28       0      -1    -0.5       0   48.0437
 29       1       1     0.5       2   61.5021
 30       0      -1    -0.5       0   52.9423
 31       0      -1    -0.5       0   46.8322
 32       0      -1    -0.5       0   45.1282
 33       1       1     0.5       2   68.9803
 34       0      -1    -0.5       0   51.5586
 35       0      -1    -0.5       0   53.0975
 36       0      -1    -0.5       0   55.8479
 37       1       1     0.5       2   66.0053
 38       1       1     0.5       2   58.9312
 39       1       1     0.5       2   52.7544
 40       1       1     0.5       2   68.0063
 41       0      -1    -0.5       0   46.0756
 42       1       1     0.5       2   65.4956
 43       0      -1    -0.5       0   48.3888
 44       0      -1    -0.5       0   48.8654
 45       1       1     0.5       2   70.5623
 46       0      -1    -0.5       0   40.6463
 47       1       1     0.5       2   68.1454
 48       1       1     0.5       2   61.6652
 49       0      -1    -0.5       0   48.1331
 50       1       1     0.5       2   56.5472

The GLM Procedure
Number of observations   50

Dependent Variable: weight

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1      1462.383074   1462.383074     60.44   <.0001
Error             48      1161.371327     24.195236
Corrected Total   49      2623.754401

R-Square   Coeff Var   Root MSE   weight Mean
0.557363    8.855503   4.918865      55.54586

Source   DF     Type I SS   Mean Square   F Value   Pr > F
sexf0     1   1462.383074   1462.383074     60.44   <.0001

Source   DF   Type III SS   Mean Square   F Value   Pr > F
sexf0     1   1462.383074   1462.383074     60.44   <.0001

Parameter      Estimate   Standard Error   t Value   Pr > |t|
Intercept   50.55440307       0.94663607     53.40     <.0001
sexf0       10.85100036       1.39573874      7.77     <.0001

The GLM Procedure
Number of observations   50

Dependent Variable: weight

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1      1462.383074   1462.383074     60.44   <.0001
Error             48      1161.371327     24.195236
Corrected Total   49      2623.754401

R-Square   Coeff Var   Root MSE   weight Mean
0.557363    8.855503   4.918865      55.54586

Source   DF     Type I SS   Mean Square   F Value   Pr > F
sexf1     1   1462.383074   1462.383074     60.44   <.0001

Source   DF   Type III SS   Mean Square   F Value   Pr > F
sexf1     1   1462.383074   1462.383074     60.44   <.0001

Parameter      Estimate   Standard Error   t Value   Pr > |t|
Intercept   55.97990325       0.69786937     80.22     <.0001
sexf1        5.42550018       0.69786937      7.77     <.0001

The GLM Procedure
Number of observations   50

Dependent Variable: weight

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1      1462.383074   1462.383074     60.44   <.0001
Error             48      1161.371327     24.195236
Corrected Total   49      2623.754401

R-Square   Coeff Var   Root MSE   weight Mean
0.557363    8.855503   4.918865      55.54586

Source   DF     Type I SS   Mean Square   F Value   Pr > F
sexf2     1   1462.383074   1462.383074     60.44   <.0001

Source   DF   Type III SS   Mean Square   F Value   Pr > F
sexf2     1   1462.383074   1462.383074     60.44   <.0001

Parameter      Estimate   Standard Error   t Value   Pr > |t|
Intercept   55.97990325       0.69786937     80.22     <.0001
sexf2       10.85100036       1.39573874      7.77     <.0001

The GLM Procedure
Number of observations   50

Dependent Variable: weight

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1      1462.383074   1462.383074     60.44   <.0001
Error             48      1161.371327     24.195236
Corrected Total   49      2623.754401

R-Square   Coeff Var   Root MSE   weight Mean
0.557363    8.855503   4.918865      55.54586

Source   DF     Type I SS   Mean Square   F Value   Pr > F
sexf3     1   1462.383074   1462.383074     60.44   <.0001

Source   DF   Type III SS   Mean Square   F Value   Pr > F
sexf3     1   1462.383074   1462.383074     60.44   <.0001

Parameter      Estimate   Standard Error   t Value   Pr > |t|
Intercept   50.55440307       0.94663607     53.40     <.0001
sexf3        5.42550018       0.69786937      7.77     <.0001

--
***** Alan Xiao, Ph.D ***************
***** PK/PD Scientist ***************
***** Cognigen Corporation **********
***** Tel: 716-633-3463 ext 265 ******
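
A side illustration of the comparison made in the message above. For a linear recoding x' = c*x + d of a covariate (with c > 0), ordinary least squares gives slope' = slope/c and intercept' = intercept - slope*d/c, while the t value of the slope is unchanged. The following is a minimal Python sketch of that relationship, not part of the original SAS analysis. The helper name ols and the six-subject toy data set are assumptions made up for illustration; they are not the 50 subjects listed in the post.

```python
import math


def ols(x, y):
    """Simple least-squares fit of y = a + b*x.

    Returns (intercept a, slope b, standard error of b, t value of b).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # residual sum of squares, then mean squared error with n - 2 df
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(sse / (n - 2) / sxx)
    return a, b, se_b, b / se_b


# Hypothetical toy data: group means 50 (sexf0 = 0) and 60 (sexf0 = 1)
sexf0 = [0, 0, 0, 1, 1, 1]
weight = [48.0, 50.0, 52.0, 58.0, 60.0, 62.0]

# The same four codings as in the SAS program
codings = {
    "sexf0": sexf0,                        #  0 ~ 1
    "sexf1": [2 * s - 1 for s in sexf0],   # -1 ~ 1
    "sexf2": [s - 0.5 for s in sexf0],     # -0.5 ~ 0.5
    "sexf3": [2 * s for s in sexf0],       #  0 ~ 2
}

fits = {name: ols(x, weight) for name, x in codings.items()}
for name, (a, b, se_b, t) in fits.items():
    print(f"{name}: intercept={a:.3f} slope={b:.3f} se={se_b:.3f} t={t:.3f}")
```

With these toy values the slopes are 10, 5, 10, 5 and the intercepts 50, 55, 55, 50 for sexf0 to sexf3, the same pattern as in the SAS results: doubling the range halves the slope and its standard error, shifting the coding moves only the intercept, and the t value for the covariate is identical in all four fits.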
Jul 02, 2001 Nick Holford Centering (was Re: Missing covariates)
Jul 02, 2001 William Bachman RE: Centering (was Re: Missing covariates)
Jul 02, 2001 Kenneth G. Kowalski RE: Centering (was Re: Missing covariates)
Jul 02, 2001 Lewis B. Sheiner Centering (was Re: Missing covariates)
Jul 03, 2001 Jogarao Gobburu Re: Centering (was Re: Missing covariates)
Jul 03, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 03, 2001 Nick Holford Re: Centering (was Re: Missing covariates)
Jul 03, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 03, 2001 Lewis B. Sheiner Re: Centering (was Re: Missing covariates)
Jul 03, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 03, 2001 Diane Mould Re: Centering (was Re: Missing covariates)
Jul 04, 2001 Nick Holford Re: Centering (was Re: Missing covariates)
Jul 04, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 04, 2001 Diane Mould Re: Centering (was Re: Missing covariates)
Jul 05, 2001 Nick Holford Re: Centering (was Re: Missing covariates)
Jul 05, 2001 Stephen Duffull RE: Centering (was Re: Missing covariates)
Jul 05, 2001 Nick Holford Re: Centering (was Re: Missing covariates)
Jul 05, 2001 Leon Aarons 70kg neonates
Jul 05, 2001 Nick Holford Re: 70kg neonates
Jul 05, 2001 Peter Bonate Centering
Jul 05, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 05, 2001 Leonid Gibiansky RE: Centering (was Re: Missing covariates)
Jul 05, 2001 Kenneth G. Kowalski RE: Centering (was Re: Missing covariates)
Jul 05, 2001 William Bachman RE: Centering (was Re: Missing covariates)
Jul 05, 2001 Diane Mould Re: Centering (was Re: Missing covariates)
Jul 05, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 05, 2001 Alan Xiao Question 2 about prediction and covariates
Jul 06, 2001 Matt Hutmacher RE: Centering (was Re: Missing covariates)
Jul 09, 2001 Vladimir Piotrovskij RE: Centering (Impact on SE)
Jul 09, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 09, 2001 Kenneth G. Kowalski RE: Centering (Impact on SE)
Jul 09, 2001 Vladimir Piotrovskij RE: Centering (Impact on SE)
Jul 09, 2001 Smith Brian P RE: Centering (Impact on SE)
Jul 09, 2001 Matt Hutmacher RE: Centering (was Re: Missing covariates)
Jul 12, 2001 Juan Jose Perez Ruixo RE: Centering (was Re: Missing covariates)
Jul 12, 2001 Juan Jose Perez Ruixo RE: Centering (was Re: Missing covariates)
Jul 12, 2001 Matt Hutmacher RE: Centering (was Re: Missing covariates)
Jul 12, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 30, 2001 Juan Jose Perez Ruixo Re: Centering (was Re: Missing covariates)
Jul 30, 2001 Alan Xiao Re: Centering (was Re: Missing covariates)
Jul 30, 2001 Leonid Gibiansky RE: Centering (was Re: Missing covariates)