RE: Outliers and the FDA guideline
From: "Robert L. James" rjames@rhoworld.com
Subject: RE: [NMusers] Outliers and the FDA guideline
Date: Wed, August 18, 2004 8:33 am
Thomas,
I always classify outliers as either 1) "highly improbable" or 2) due to
"natural extremes" in variation. "Highly improbable" outliers strongly suggest
an error in carrying out the experimental protocol (for example, the lab
technician left out an important reagent when performing the assay, a laboratory
apparatus wasn't properly zeroed or "warmed up", or a drug was incompletely mixed
in the blood during the first minutes following a bolus arterial injection). "Highly
improbable" outliers are usually near the limit of biologic impossibility. "Natural
extremes" outliers, on the other hand, are unlucky but real. Biologic systems can
occasionally produce very extreme values.
For "highly improbable" outliers, I simply discard the outlier from all analyses and
note the exclusion in my results.
However, discarding "natural extreme" outliers is statistically problematic.
Discarding them outright will bias the results by shrinking the variance. Including
them may make it very difficult to fit a good model. For "natural extreme" outliers
I initially exclude them from the data during the model fitting. But then for my final
model run, I'll put "natural extreme" outliers back into my model so that the variance
structure reflects the natural (although extreme) variability. For this final run, I
may or may not fix the theta parameters to the estimates obtained by the earlier model
(without the outliers). I report model diagnostics from the final model fit, which is
based on the data that include the "natural extreme" outliers.
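
The two-stage approach described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not NONMEM code and not
necessarily how the runs were actually set up: the data are made up, the model is a
simple log-linear mono-exponential decline, and the names fit_loglinear,
residual_variance and is_extreme are hypothetical.

```python
import math

def fit_loglinear(times, concs):
    """Least-squares fit of ln(C) = ln(C0) - k*t; returns (ln_c0, k)."""
    n = len(times)
    y = [math.log(c) for c in concs]
    t_bar = sum(times) / n
    y_bar = sum(y) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (yi - y_bar) for t, yi in zip(times, y))
    slope = sxy / sxx
    return y_bar - slope * t_bar, -slope

def residual_variance(times, concs, ln_c0, k):
    """Mean squared log-scale residual with the structural parameters held fixed."""
    res = [math.log(c) - (ln_c0 - k * t) for t, c in zip(times, concs)]
    return sum(r * r for r in res) / len(res)

# Made-up data (assumption): the observation at t = 8 is flagged as a
# "natural extreme" outlier.
times = [0.5, 1.0, 2.0, 4.0, 6.0, 8.0, 12.0]
concs = [9.6, 9.1, 8.0, 6.9, 5.4, 9.5, 3.0]
is_extreme = [False, False, False, False, False, True, False]

# Stage 1: estimate the structural parameters without the flagged outlier.
kept_t = [t for t, flag in zip(times, is_extreme) if not flag]
kept_c = [c for c, flag in zip(concs, is_extreme) if not flag]
ln_c0, k = fit_loglinear(kept_t, kept_c)

# Stage 2: hold those estimates fixed and compute the residual variance on
# all of the data, so the extreme value contributes to the variance estimate.
var_kept = residual_variance(kept_t, kept_c, ln_c0, k)
var_all = residual_variance(times, concs, ln_c0, k)

print(f"k = {k:.3f}")
print(f"residual variance without outlier: {var_kept:.4f}")
print(f"residual variance with outlier:    {var_all:.4f}")
```

In this sketch the structural estimates come from the cleaner data, while the
reported variance comes from all observations. Whether to fix the structural
parameters in the final run, or re-estimate them, remains a judgment call.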
Robert James