RE: WRES AND OUTLIER IDENTIFICATION/EXCLUSION
From: "Kowalski, Ken" Ken.Kowalski@pfizer.com
Subject: RE: [NMusers] WRES AND OUTLIER IDENTIFICATION/EXCLUSION
Date: Tue, 3 Oct 2006 17:38:09 -0400
Nick,
You wrote: "I am still not very comfortable about the fractional
likelihood method because it assumes a similar fraction of outlier and
non-outlier observations in each subject..."
The full likelihood approach using an observation-level mixture model
(i.e., residual error mixture model) does not make this assumption. The
original version without Mats' modification to use $MIX merely estimates
the proportion of outlier observations from the total number of
observations. With Mats' $MIX code modification, we estimate the
proportion of 'subjects with outliers'; thus, the mixing proportion in
the residual error mixture model is now conditional on the total number
of observations in the population of 'subjects with outliers' rather
than on the total number of observations. There is no constraint that
forces the observation-level mixing proportion to be the same within
each subject of the 'subjects with outliers' subpopulation.
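For concreteness, here is a rough sketch of the two likelihoods as I
have described them. It is plain Python rather than NONMEM code, it
conditions on the individual predictions (i.e., it ignores the
integration over the etas), and it assumes an additive normal residual
error with a larger SD for outliers. All function and argument names are
hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def obs_mixture_density(y, pred, sd_main, sd_outlier, p_outlier):
    """Residual error mixture for ONE observation.

    p_outlier is the observation-level mixing proportion.
    """
    return (p_outlier * normal_pdf(y, pred, sd_outlier)
            + (1.0 - p_outlier) * normal_pdf(y, pred, sd_main))


def subject_likelihood_obs_mixture(ys, preds, sd_main, sd_outlier, p_outlier):
    """Version without $MIX: every observation of every subject is drawn
    from the same two-component residual error mixture."""
    like = 1.0
    for y, pred in zip(ys, preds):
        like *= obs_mixture_density(y, pred, sd_main, sd_outlier, p_outlier)
    return like


def subject_likelihood_with_mix(ys, preds, sd_main, sd_outlier,
                                p_outlier_given_subpop, p_subject):
    """Version with a subject-level mixture on top (my reading of the
    $MIX modification, stated here as an assumption).

    p_subject is the proportion of 'subjects with outliers';
    p_outlier_given_subpop is the observation-level mixing proportion
    within that subpopulation only.
    """
    with_outliers = subject_likelihood_obs_mixture(
        ys, preds, sd_main, sd_outlier, p_outlier_given_subpop)
    # Subjects outside the subpopulation have no outlier component at all.
    without_outliers = subject_likelihood_obs_mixture(
        ys, preds, sd_main, sd_outlier, 0.0)
    return p_subject * with_outliers + (1.0 - p_subject) * without_outliers
```

Note that nothing in either function ties the realized fraction of
outliers within a subject to p_outlier; each observation enters the
mixture independently.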
You wrote: "...more importantly (for this thread) it doesn't
distinguish in a discrete way between outlier and non-outlier
observations."
True...but it doesn't mean we couldn't perform some post hoc
calculations to classify observations as outlier or non-outlier based on
this full likelihood approach. It is analogous to the situation with
$MIX and the post hoc calculations that MIXEST performs. If we did not
have the MIXEST capabilities built into NONMEM, we would have a harder
time with our diagnostic evaluation of subject-level mixture models
using the $MIX functionality. The same is true here with the
observation-level mixture models. To fully evaluate and advocate this
approach would require more work to determine the post-processing
calculations that would allow us to classify the observations into the two
populations (outliers vs. non-outliers) for diagnostic purposes. Note
that if we performed these post hoc calculations such that we could
classify outliers vs. non-outliers at the observation level, we would no
longer need to use the $MIX code to identify 'subjects with outliers'.
The population of 'subjects with outliers' could be determined directly
from these post hoc calculations of the observation-level classification
of outliers and non-outliers.
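One possible form for such a post hoc calculation, offered only as a
sketch and not as a worked-out or validated method, is the posterior
probability that each observation came from the outlier component, given
the final estimates and the individual predictions. The Python below
assumes an additive normal residual error, treats the individual
predictions as known, and uses a 0.5 cutoff by analogy with picking the
most probable subpopulation; the function names, the record layout, and
the cutoff are all hypothetical.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def outlier_posterior(y, pred, sd_main, sd_outlier, p_outlier):
    """Posterior probability that one observation is an outlier,
    by Bayes' rule applied to the two-component residual error mixture."""
    outlier_part = p_outlier * normal_pdf(y, pred, sd_outlier)
    main_part = (1.0 - p_outlier) * normal_pdf(y, pred, sd_main)
    return outlier_part / (outlier_part + main_part)


def classify_observations(records, sd_main, sd_outlier, p_outlier, cutoff=0.5):
    """records: iterable of (subject_id, observed, individual_prediction).

    Returns a list of (subject_id, observed, posterior, is_outlier).
    """
    results = []
    for subject_id, y, pred in records:
        post = outlier_posterior(y, pred, sd_main, sd_outlier, p_outlier)
        results.append((subject_id, y, post, post > cutoff))
    return results


def subjects_with_outliers(results):
    """Subjects having at least one observation classified as an outlier."""
    return sorted({sid for sid, _, _, is_outlier in results if is_outlier})
```

With observation-level flags like these in hand, the 'subjects with
outliers' set falls out of the last function directly, without a
subject-level mixture.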
Kind regards,
Ken