Re: Centering (was Re: Missing covariates)
From: "Perez Ruixo, Juan Jose [JanBe]" <JPEREZRU@janbe.jnj.com>
Subject: Re: Centering (was Re: Missing covariates)
Date: Mon, 30 Jul 2001 13:08:26 +0200
Dear Alan and all,
I think everybody now agrees with the centering approach
for quantitative covariates, even when they are sampled with different
strategies. However, I don't think the same is true for categorical covariates.
Here we can distinguish nominal covariates (for instance, pharmaceutical
form: solution, capsule or tablet; or sex) from ordinal covariates (for instance,
disease progression: grade I, II, III or IV; or the APGAR scale). These
types of covariates cannot be treated as continuous covariates with
special sampling values.
Quantitative covariates are measured on an interval (for
instance, temperature) or ratio (for instance, age) metric scale, which
allows additive operations, or both additive and multiplicative operations,
respectively. For this reason, the centering approach can be used regardless
of the sampling strategy. Categorical covariates are not measured on a metric
scale. For nominal covariates only equality operations are allowed, and for
ordinal covariates only equality and order operations are possible. By definition,
additive and multiplicative operations are not applicable. For this reason,
the centering approach must be avoided.
A special case is ordinal covariates with many
values (for instance, APGAR). If you assume that the "distance"
between scores of 6 and 7 is the same as between scores of 9 and 10, it is
usually possible to treat this covariate as quantitative with an interval
metric scale, and then the centering approach can be useful. But this
assumption hardly applies to covariates with only a few levels, such as
disease progression. Usually, the "distance" between grades I and
II is not the same as that between grades III and IV.
Alan's example shows how the slope of a linear
regression model with categorical data is affected by the coding used,
as I said in my last email. In that example, the real difference between
male and female weight is 10.85 (SE: 1.39). I agree we can get this value
from codings like sexf0 or sexf2, but not from sexf1 or sexf3. In the latter
cases, we need to multiply by 2 to get the confidence interval of
the real difference between male and female.
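The effect of the coding on the slope can be illustrated with a minimal sketch. The weights below are made up (they are not the data from the thread), and the two codings shown (0/1 and -1/+1) are assumptions chosen for illustration, since the definitions of sexf0 to sexf3 are not repeated in this message.

```python
from statistics import mean

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Made-up weights (kg), for illustration only.
male = [78.0, 82.0, 75.0, 85.0]
female = [64.0, 70.0, 66.0]
weight = male + female
is_male = [1] * len(male) + [0] * len(female)

# Difference in group means: the quantity we want the slope to estimate.
true_diff = mean(male) - mean(female)

# Hypothetical codings of sex (assumed, not taken from the thread).
code_01 = is_male                               # female = 0, male = 1
code_pm1 = [1 if m else -1 for m in is_male]    # female = -1, male = +1

_, slope_01 = simple_ols(code_01, weight)
_, slope_pm1 = simple_ols(code_pm1, weight)

print(f"difference in group means: {true_diff:.3f}")
print(f"slope with 0/1 coding:     {slope_01:.3f}")
print(f"slope with -1/+1 coding:   {slope_pm1:.3f} (times 2: {2 * slope_pm1:.3f})")
```

With a two-level covariate the fitted line passes through both group means, so the slope is the difference in means divided by the distance between the two codes. This is why a coding whose levels are 2 units apart returns half the difference.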
Sexf0 and sexf2 represent two different types of
coding. The sexf0 coding is called "reference cell coding" or the
"partial method", and the sexf2 coding is called "deviations from mean
coding" or the "marginal method". The choice of covariate coding
depends on the effect that you want to estimate. Now consider the
pharmaceutical form covariate. If I want to estimate the difference in
absorption rate constant between capsule and solution, and also the difference
between tablet and solution, I will use two dummy covariates with reference
cell coding (D1: solution = 0 and capsule = 1; D2: solution = 0 and tablet =
1). But if I wish to compare the capsule absorption rate with
the mean absorption rate of all pharmaceutical forms, I will use
deviations from mean coding. If I have the same number of observations for every
pharmaceutical form, I could code: solution = -1, capsule = 1 and tablet =
0. As the number of categories increases, the complexity of the coding
increases too.
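The two codings for pharmaceutical form can be compared in a small sketch. The absorption rate constants are invented, the design is balanced, and the second deviation-coded column (tablet = 1, solution = -1) is an assumption added to complete the usual two-column scheme. The helper names `solve`, `ols` and `fit` are hypothetical.

```python
def solve(a, b):
    """Solve a small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Invented absorption rate constants (1/h), three subjects per form.
forms = ["solution"] * 3 + ["capsule"] * 3 + ["tablet"] * 3
ka = [2.0, 2.2, 1.8, 1.1, 1.3, 1.2, 0.8, 1.0, 0.9]

# Reference cell coding: solution is the reference category.
reference = {"solution": (0, 0), "capsule": (1, 0), "tablet": (0, 1)}
# Deviations from mean coding: solution gets -1 in both columns.
deviation = {"solution": (-1, -1), "capsule": (1, 0), "tablet": (0, 1)}

def fit(coding):
    """Fit ka = b0 + b1*col1 + b2*col2 for the given coding table."""
    rows = [[1.0, *coding[f]] for f in forms]
    return ols(rows, ka)

b_ref = fit(reference)   # [solution mean, capsule - solution, tablet - solution]
b_dev = fit(deviation)   # [mean of form means, capsule - mean, tablet - mean]

print("reference cell coding:", [round(b, 4) for b in b_ref])
print("deviations from mean: ", [round(b, 4) for b in b_dev])
```

Both fits reproduce the same three group means. Only the meaning of the coefficients changes: differences from solution in the first case, differences from the mean of the three forms in the second.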
This complication does not arise with reference cell
coding. Moreover, in the health sciences, reference cell coding is the most
frequently used coding because the regression coefficients are very easy
to interpret. In the sexf0 example, the intercept has a direct meaning: it is
the average weight for category 0, and it is independent of the male to
female ratio. The same is not true of sexf2. Moreover, if the male to
female ratio is not equal to 1:1, it is necessary to modify the
values used in deviations from mean coding appropriately; otherwise the
intercept will be affected and won't represent the average weight of males
and females.
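The behaviour of the intercept with an unbalanced sample can be checked with a short sketch. The weights are made up, with a 3:1 male to female ratio, and the "centered 0/1" coding is only one possible way of adjusting the code values (an assumption, not necessarily the coding used in the thread).

```python
from statistics import mean

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Made-up weights (kg): 6 males and 2 females, i.e. a 3:1 ratio.
male = [78.0, 82.0, 75.0, 85.0, 80.0, 80.0]
female = [64.0, 70.0]
weight = male + female
is_male = [1] * len(male) + [0] * len(female)
p_male = mean(is_male)  # proportion of males in the sample

codings = {
    "reference cell (0/1)": is_male,
    "fixed -1/+1": [1 if m else -1 for m in is_male],
    # Code values shifted by the observed proportion of males.
    "centered 0/1": [m - p_male for m in is_male],
}

intercepts = {name: simple_ols(x, weight)[0] for name, x in codings.items()}

print(f"female mean:           {mean(female):.2f}")
print(f"midpoint of the means: {(mean(male) + mean(female)) / 2:.2f}")
print(f"overall sample mean:   {mean(weight):.2f}")
for name, a in intercepts.items():
    print(f"intercept, {name}: {a:.2f}")
```

With these numbers the 0/1 intercept is the female mean (67), the fixed -1/+1 intercept is the midpoint of the two group means (73.5), and only the coding adjusted for the 3:1 ratio returns the overall sample mean (76.75).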
Thanks,
Juan Jose Perez Ruixo
Global Clinical Pharmacokinetics and Clinical Pharmacodynamics
Department.
Janssen Research Foundation
Turnhoutsweg, 30
B-2340 Beerse
Belgium
Telephone: +032 / 14 60 75 08
Fax: +032 / 14 60 58 34
E-mail: jperezru@janbe.jnj.com