Re: Centering (was Re: Missing covariates)
Date: Thu, 12 Jul 2001 19:12:58 -0400
From: Alan Xiao <Alan.Xiao@cognigencorp.com>
Subject: Re: Centering (was Re: Missing covariates)
Hi,
I'll throw my 2 cents in here.
On centering dichotomous or ordinal covariates, consider the following
thought experiment. It starts from continuous covariates, for which
centering seems to be the least controversial here.
Take an ideal normal distribution of WEIGHT in a population, with a mean
of 40 and a standard deviation of 10, for example. Probably no one would
question centering here.
Now assume that in a real population the sampled weights are still
normally distributed overall around 40, but the samples fall only in the
ranges 10-20, 25-35, 45-55 and 60-70. Do you have a problem with
centering in this case?
Further, assume the sampled data fall only in the ranges 13-17, 27-33,
47-53 and 63-67. Do you have a problem with centering in this case?
Repeating this narrowing eventually reduces the distribution to an
ordinal or dichotomous one.
Now, my question is: why can't ordinal/dichotomous covariates be treated
as a special kind of continuous covariate, that is, as continuous
covariates with special sampling values? (Usually in science, a more
general/complicated theory should be able to explain/predict a
specific/simpler case. Am I wrong?) Otherwise, what is the statistical
basis for evaluating dichotomous covariates in PK/PD models?
Back to Perez's example: note that the scaling of gender0 (0 ~ 1) and
gender1 (-1 ~ 1) is different.
If you use the same scaling, for example gender0 (0 ~ 1) and gender1
(-0.5 ~ 0.5), you get a different story. Attached below are a SAS
program and the corresponding results for 50 subjects (the data listing
for WEIGHT and SEXF is included). Note the comparisons of results in the
following pairs: sexf0 (0 ~ 1) and sexf1 (-1 ~ 1) (comparable to Perez's
case); sexf0 (0 ~ 1) and sexf2 (-0.5 ~ 0.5); and sexf1 (-1 ~ 1) and
sexf3 (0 ~ 2).
N = 50. Dependent variable: WEIGHT

                               Standard
Parameter       Estimate          Error   t Value   Pr > |t|

Intercept    50.55440307     0.94663607     53.40     <.0001
sexf0        10.85100036     1.39573874      7.77     <.0001

Intercept    55.97990325     0.69786937     80.22     <.0001
sexf1         5.42550018     0.69786937      7.77     <.0001

Intercept    55.97990325     0.69786937     80.22     <.0001
sexf2        10.85100036     1.39573874      7.77     <.0001

Intercept    50.55440307     0.94663607     53.40     <.0001
sexf3         5.42550018     0.69786937      7.77     <.0001
So Perez's explanation seems to need some modification.
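One way to read the table above, sketched with standard least-squares algebra (the symbols a, b and the primed coefficients are notation introduced here for illustration, not taken from the original discussion): every recoding in the program is a linear map of sexf0, so the fitted line is the same and only the coefficients are re-expressed.

```latex
% Recoding of the covariate: x' = a + b x, with b \neq 0
\begin{align*}
y &= \beta_0 + \beta_1 x
   = \beta_0' + \beta_1' x'
   = (\beta_0' + a\,\beta_1') + b\,\beta_1'\, x \\
\beta_1' &= \beta_1 / b, \qquad
\beta_0' = \beta_0 - a\,\beta_1 / b \\
\mathrm{SE}(\hat\beta_1') &= \mathrm{SE}(\hat\beta_1) / |b|, \qquad
t' = \operatorname{sign}(b)\, t
\end{align*}
% sexf1 = 2*sexf0 - 1  (a = -1,   b = 2):
%   slope 10.8510 / 2 = 5.4255, intercept 50.5544 + 5.4255 = 55.9799
% sexf2 = sexf0 - 0.5  (a = -0.5, b = 1):
%   slope 10.8510,              intercept 50.5544 + 0.5 * 10.8510 = 55.9799
% sexf3 = 2*sexf0      (a = 0,    b = 2):
%   slope 5.4255,               intercept 50.5544
```

Under this reading, the slope changes only when the range of the coding changes (b), the intercept changes only when the zero point moves (a), and the t value for the covariate is the same in all four fits.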
If you are interested, see the attached results for details, or run the
program yourself.
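For readers without SAS at hand, the four fits can also be re-derived from the data listing further down in this post. The following is a minimal Python sketch, not part of the original message: the weights are copied from that listing and grouped by sexf0, `ols` is a hypothetical helper implementing the textbook simple-regression formulas, and the recodings mirror the sexf1 to sexf3 definitions in the SAS program.

```python
# Weights copied from the data listing in this post (4-decimal rounding).
# w0: subjects with sexf0 = 0, w1: subjects with sexf0 = 1.
w0 = [50.1830, 59.8383, 47.4324, 54.5489, 55.4752, 50.8191, 47.2735,
      57.7363, 57.1171, 52.8775, 44.2806, 51.8335, 53.4585, 48.2589,
      48.2766, 48.0437, 52.9423, 46.8322, 45.1282, 51.5586, 53.0975,
      55.8479, 46.0756, 48.3888, 48.8654, 40.6463, 48.1331]
w1 = [54.0675, 54.4837, 57.8778, 57.7886, 65.4018, 60.9912, 55.2785,
      59.8003, 67.7795, 59.9475, 56.2993, 64.0133, 61.5021, 68.9803,
      66.0053, 58.9312, 52.7544, 68.0063, 65.4956, 70.5623, 68.1454,
      61.6652, 56.5472]


def ols(x, y):
    """Simple linear regression y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


sexf0 = [0.0] * len(w0) + [1.0] * len(w1)
weight = w0 + w1

# The same linear recodings as in the SAS data step.
codings = {
    "sexf0": lambda s: s,            # 0 ~ 1
    "sexf1": lambda s: 2 * s - 1,    # -1 ~ 1
    "sexf2": lambda s: s - 0.5,      # -0.5 ~ 0.5
    "sexf3": lambda s: 2 * s,        # 0 ~ 2
}

fits = {name: ols([f(s) for s in sexf0], weight)
        for name, f in codings.items()}

for name, (b0, b1) in fits.items():
    print(f"{name}: intercept = {b0:.4f}, slope = {b1:.4f}")
```

Because the listing shows weights to four decimals only, the printed estimates should agree with the SAS estimates to about three decimals rather than exactly.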
**************** SAS programs **************
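%macro sss;
  /* Simulate 50 subjects: a 0/1 covariate and a weight that depends on it. */
  data aa1;
    %do i = 1 %to 50;
      /* sexf0: N(0.5, variance 0.027) rounded to the nearest integer,
         which in practice gives 0 or 1 */
      sexf0 = round(0.5 + sqrt(0.027)*rannor(35179*&i), 1);
      /* Linear recodings of the same covariate */
      sexf1 = 2*sexf0 - 1;    /* -1 ~ 1     */
      sexf2 = sexf0 - 0.5;    /* -0.5 ~ 0.5 */
      sexf3 = 2*sexf0;        /*  0 ~ 2     */
      /* weight: mean 50, variance 25 when sexf0 = 0;
                 mean 60, variance 36 when sexf0 = 1 */
      if sexf0 = 0 then
        weight = 50 + sqrt(25)*rannor(43820*&i);
      if sexf0 = 1 then
        weight = 60 + sqrt(36)*rannor(43820*&i);
      output;
    %end;
  run;
%mend sss;
%sss;

/* List the simulated data */
proc print data = aa1;
run;

/* Plot weight against each coding of the covariate */
proc gplot data = aa1;
  plot weight * sexf0;
  plot weight * sexf1;
  plot weight * sexf2;
  plot weight * sexf3;
run;

/* Fit the same linear model once per coding */
proc GLM data = aa1;
  model weight = sexf0;
  output out = bb1 p = pred r = res;
run;
proc GLM data = aa1;
  model weight = sexf1;
  output out = bb2 p = pred r = res;
run;
proc GLM data = aa1;
  model weight = sexf2;
  output out = bb3 p = pred r = res;
run;
proc GLM data = aa1;
  model weight = sexf3;
  /* was bb3 in the original, which overwrote the sexf2 output */
  output out = bb4 p = pred r = res;
run;
quit;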
**************results - graphs not shown here ***********
The SAS System                        16:32 Thursday, July 12, 2001
Obs sexf0 sexf1 sexf2 sexf3 weight
1 1 1 0.5 2 54.0675
2 1 1 0.5 2 54.4837
3 0 -1 -0.5 0 50.1830
4 0 -1 -0.5 0 59.8383
5 1 1 0.5 2 57.8778
6 0 -1 -0.5 0 47.4324
7 1 1 0.5 2 57.7886
8 1 1 0.5 2 65.4018
9 0 -1 -0.5 0 54.5489
10 0 -1 -0.5 0 55.4752
11 0 -1 -0.5 0 50.8191
12 1 1 0.5 2 60.9912
13 0 -1 -0.5 0 47.2735
14 0 -1 -0.5 0 57.7363
15 1 1 0.5 2 55.2785
16 1 1 0.5 2 59.8003
17 0 -1 -0.5 0 57.1171
18 0 -1 -0.5 0 52.8775
19 0 -1 -0.5 0 44.2806
20 0 -1 -0.5 0 51.8335
21 1 1 0.5 2 67.7795
22 0 -1 -0.5 0 53.4585
23 0 -1 -0.5 0 48.2589
24 0 -1 -0.5 0 48.2766
25 1 1 0.5 2 59.9475
26 1 1 0.5 2 56.2993
27 1 1 0.5 2 64.0133
28 0 -1 -0.5 0 48.0437
29 1 1 0.5 2 61.5021
30 0 -1 -0.5 0 52.9423
31 0 -1 -0.5 0 46.8322
32 0 -1 -0.5 0 45.1282
33 1 1 0.5 2 68.9803
34 0 -1 -0.5 0 51.5586
35 0 -1 -0.5 0 53.0975
36 0 -1 -0.5 0 55.8479
37 1 1 0.5 2 66.0053
38 1 1 0.5 2 58.9312
39 1 1 0.5 2 52.7544
40 1 1 0.5 2 68.0063
41 0 -1 -0.5 0 46.0756
42 1 1 0.5 2 65.4956
43 0 -1 -0.5 0 48.3888
44 0 -1 -0.5 0 48.8654
45 1 1 0.5 2 70.5623
46 0 -1 -0.5 0 40.6463
47 1 1 0.5 2 68.1454
48 1 1 0.5 2 61.6652
49 0 -1 -0.5 0 48.1331
50 1 1 0.5 2 56.5472

The GLM Procedure
Number of observations    50

Dependent Variable: weight

                              Sum of
Source             DF        Squares    Mean Square   F Value   Pr > F
Model               1    1462.383074    1462.383074     60.44   <.0001
Error              48    1161.371327      24.195236
Corrected Total    49    2623.754401

R-Square   Coeff Var   Root MSE   weight Mean
0.557363    8.855503   4.918865      55.54586

Source   DF      Type I SS    Mean Square   F Value   Pr > F
sexf0     1    1462.383074    1462.383074     60.44   <.0001

Source   DF    Type III SS    Mean Square   F Value   Pr > F
sexf0     1    1462.383074    1462.383074     60.44   <.0001

                               Standard
Parameter       Estimate          Error   t Value   Pr > |t|
Intercept    50.55440307     0.94663607     53.40     <.0001
sexf0        10.85100036     1.39573874      7.77     <.0001

The GLM Procedure
Number of observations    50

Dependent Variable: weight

                              Sum of
Source             DF        Squares    Mean Square   F Value   Pr > F
Model               1    1462.383074    1462.383074     60.44   <.0001
Error              48    1161.371327      24.195236
Corrected Total    49    2623.754401

R-Square   Coeff Var   Root MSE   weight Mean
0.557363    8.855503   4.918865      55.54586

Source   DF      Type I SS    Mean Square   F Value   Pr > F
sexf1     1    1462.383074    1462.383074     60.44   <.0001

Source   DF    Type III SS    Mean Square   F Value   Pr > F
sexf1     1    1462.383074    1462.383074     60.44   <.0001

                               Standard
Parameter       Estimate          Error   t Value   Pr > |t|
Intercept    55.97990325     0.69786937     80.22     <.0001
sexf1         5.42550018     0.69786937      7.77     <.0001

The GLM Procedure
Number of observations    50

Dependent Variable: weight

                              Sum of
Source             DF        Squares    Mean Square   F Value   Pr > F
Model               1    1462.383074    1462.383074     60.44   <.0001
Error              48    1161.371327      24.195236
Corrected Total    49    2623.754401

R-Square   Coeff Var   Root MSE   weight Mean
0.557363    8.855503   4.918865      55.54586

Source   DF      Type I SS    Mean Square   F Value   Pr > F
sexf2     1    1462.383074    1462.383074     60.44   <.0001

Source   DF    Type III SS    Mean Square   F Value   Pr > F
sexf2     1    1462.383074    1462.383074     60.44   <.0001

                               Standard
Parameter       Estimate          Error   t Value   Pr > |t|
Intercept    55.97990325     0.69786937     80.22     <.0001
sexf2        10.85100036     1.39573874      7.77     <.0001

The GLM Procedure
Number of observations    50

Dependent Variable: weight

                              Sum of
Source             DF        Squares    Mean Square   F Value   Pr > F
Model               1    1462.383074    1462.383074     60.44   <.0001
Error              48    1161.371327      24.195236
Corrected Total    49    2623.754401

R-Square   Coeff Var   Root MSE   weight Mean
0.557363    8.855503   4.918865      55.54586

Source   DF      Type I SS    Mean Square   F Value   Pr > F
sexf3     1    1462.383074    1462.383074     60.44   <.0001

Source   DF    Type III SS    Mean Square   F Value   Pr > F
sexf3     1    1462.383074    1462.383074     60.44   <.0001

                               Standard
Parameter       Estimate          Error   t Value   Pr > |t|
Intercept    50.55440307     0.94663607     53.40     <.0001
sexf3         5.42550018     0.69786937      7.77     <.0001
--
***** Alan Xiao, Ph.D ***************
***** PK/PD Scientist ***************
***** Cognigen Corporation **********
***** Tel: 716-633-3463 ext 265 ******