RE: WRES AND OUTLIER IDENTIFICATION/EXCLUSION
From: "Kowalski, Ken" Ken.Kowalski@pfizer.com
Subject: RE: [NMusers] WRES AND OUTLIER IDENTIFICATION/EXCLUSION
Date: Wed, 27 Sep 2006 15:27:55 -0400
Hi Mats, Nmusers,
Here are my two cents on this discussion.
1) For individual data-point outliers wouldn't the 'ETA on Epsilon'
residual error model you propose effectively down-weight all of the
observations within an individual and not just the suspected outlier
data point? I certainly see value in the 'ETA on Epsilon' residual
error model when the magnitude of the residual variation does not appear
to be the same across all subjects. However, in using this model I
would want to assess whether the apparent change in magnitude of the
residual variation across subjects is being unduly influenced by a
single observation within the subject's data. If it is, I don't think I
would use this approach. Of course, it may be a challenge to discriminate
among statistical outliers, misspecification of the residual error model
(e.g., non-homogeneous variation across subjects), and lack of fit of the
structural model. Note that changing the residual error model to
accommodate outliers, rather than excluding them, itself makes an
implicit set of assumptions, so I don't think we can 'side-step' the
issue of outlier assessment...we are just trading one set of assumptions
for another.
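One way to check whether a subject's apparent residual magnitude is driven by a single observation is a leave-one-out comparison. The following is only a minimal sketch in Python, not NONMEM code: the function names and the residual values are hypothetical, and the residuals are assumed to be proportional deviations from the individual prediction.

```python
import math

def rms(values):
    """Root-mean-square of a list of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def leave_largest_out(residuals):
    """Compare the subject's RMS residual with and without its largest point.

    Returns (full_rms, reduced_rms, index_of_largest). Dropping the residual
    with the largest absolute value gives the smallest leave-one-out RMS.
    """
    idx = max(range(len(residuals)), key=lambda i: abs(residuals[i]))
    reduced = residuals[:idx] + residuals[idx + 1:]
    return rms(residuals), rms(reduced), idx

# Hypothetical proportional residuals for one subject (illustrative only).
subject_residuals = [0.05, -0.08, 0.03, -0.02, 0.90]
full_rms, reduced_rms, largest_idx = leave_largest_out(subject_residuals)
```

If the reduced value is far smaller than the full value, as it is for these made-up numbers, the subject-level change in residual magnitude is probably being driven by that one point rather than by the subject as a whole.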
2) Matt Hutmacher and I have been toying with the following idea to
address individual data outliers. First, based on a prespecified set of
criteria, identify suspected individual data outliers. Second, create a
flag variable on the data set to identify these data outliers (i.e.,
FLAG=1 denotes outlier, FLAG=0 denotes non-outlier). Third, fit a
residual error model with different sigmas for outliers and
non-outliers. The following code for a constant CV error model might be
considered:
Y=F*(1+(1-FLAG)*EPS(1)+FLAG*EPS(2))
(If the outliers appear to be independent of F then one might postulate
EPS(2) as an additive effect.) With this model, sigma2 would be larger
than sigma1, effectively down-weighting the suspected outliers without
having to formally exclude them (i.e., giving zero weight to them). The
degree of down-weighting can be determined from the ratio of the
estimates of sigma2 to sigma1 and would increase as the magnitude of
outlier deviations increases. One could compare the parameter estimates
(thetas and omegas) from this model to that of the usual CV error model,
Y=F*(1+EPS(1)), to determine how much leverage these outliers
collectively have on the estimation. Any thoughts on this approach? We
don't have any direct experience in applying this approach so if anyone
would like to try it and report back their experiences we would
certainly be interested in hearing about it.
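To make the degree of down-weighting concrete, here is a minimal sketch in Python (not NONMEM code). It assumes that EPS(1) and EPS(2) are independent, that sigma1 and sigma2 are expressed as variances, and that each observation's weight is inversely proportional to its residual variance. The function names and numeric values are hypothetical.

```python
def residual_variance(f, flag, sigma1_var, sigma2_var):
    """Var(Y) for Y = F*(1 + (1-FLAG)*EPS(1) + FLAG*EPS(2)).

    sigma1_var and sigma2_var are assumed to be the variances of EPS(1)
    and EPS(2); FLAG is 0 (non-outlier) or 1 (suspected outlier).
    """
    return f * f * ((1 - flag) ** 2 * sigma1_var + flag ** 2 * sigma2_var)

def relative_weight(sigma1_var, sigma2_var):
    """Weight of a flagged point relative to an unflagged point at the same F."""
    f = 1.0  # F cancels in the ratio for the constant CV model
    return (residual_variance(f, 0, sigma1_var, sigma2_var)
            / residual_variance(f, 1, sigma1_var, sigma2_var))

# Hypothetical estimates: 20% CV for non-outliers, 60% CV for outliers.
weight = relative_weight(0.04, 0.36)
```

With these made-up estimates a flagged observation carries roughly one ninth of the weight of an unflagged one; formal exclusion corresponds to the limit where this ratio goes to zero.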
3) For detecting individual data-point outliers (as opposed to outlying
subjects) wouldn't the IWRES be a better diagnostic than WRES or CWRES?
It would seem to better fit with the sentiment that when assessing
individual data point outliers, they should be evaluated in context with
the other observations for that individual, presumably with respect to
their deviations from the IPRED.
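As an illustration of an IWRES-based screen, here is a minimal sketch in Python under a constant CV error model, where IWRES is taken as (DV - IPRED) / (IPRED * SD). The cutoff of 4, the function names, and the data values are assumptions made for illustration, not a recommended criterion.

```python
def iwres_proportional(dv, ipred, sigma_sd):
    """Individual weighted residual for a constant CV (proportional) error model.

    sigma_sd is the residual standard deviation on the proportional scale.
    """
    return (dv - ipred) / (ipred * sigma_sd)

def flag_outliers(records, sigma_sd, threshold=4.0):
    """Return FLAG values (1 = suspected outlier) for (DV, IPRED) pairs.

    The threshold is a hypothetical prespecified criterion.
    """
    return [
        1 if abs(iwres_proportional(dv, ipred, sigma_sd)) > threshold else 0
        for dv, ipred in records
    ]

# Hypothetical (DV, IPRED) pairs for one subject, with a 20% CV residual error.
records = [(10.5, 10.0), (4.1, 5.0), (30.0, 12.0)]
flags = flag_outliers(records, sigma_sd=0.2)
```

Only the third record exceeds the cutoff here, so it alone would receive FLAG=1 on the data set; the other observations for the subject are left unflagged.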
4) Outlier assessment is a very contextual thing. It is nearly
impossible to be completely objective in this assessment but at the same
time we should be systematic and use sound reasoning in evaluating
outliers and the actions we take. While we need to be cautious when
considering the impact of exclusion or down-weighting individual
outliers we also shouldn't take the position that we should never
exclude them. These outliers can unduly inflate the variance components
and mask our ability to detect important determinants (covariate
effects) of the PK and PD responses. We need to rigorously evaluate the
adequacy of our models with various diagnostic plots and rule out
(whenever possible) various forms of model (structural and statistical)
misspecification before proposing to exclude outliers. The totality of
our diagnostics should help inform our decision on the models we
postulate and any actions (including no action) we take regarding
outliers.
Ken