Outliers and the FDA guideline
From: "TKT (Thomas Klitgaard)" tkt@novonordisk.com
Subject: [NMusers] Outliers and the FDA guideline
Date: Wed, August 18, 2004 3:21 am
Dear all,
The FDA Guidance for Industry "Population Pharmacokinetics" (February 1999),
section VII, C (p. 11 onwards), states the following about outliers:
"The statistical definition of an outlier is, to some extent, arbitrary. The
reasons for declaring a data point to be an outlier should be statistically convincing
and, if possible, prespecified in the protocol. 1* Any physiological or study-related event
that renders the data unusable should be explained in the study report. 2* A distinction
should be made between outlying individuals (intersubject variability) and
outlier data points (intrasubject variability). Because of the exploratory nature
of population analysis, the study protocol may not specify a procedure for dealing
with outliers. In such a situation, it would be possible to perform model building
on the reduced data set (i.e., the data set without outliers) and to 3* reanalyze
the entire data set (including the outliers) using the final population model,
and to discuss the difference in the results. Including extreme outliers is not
a good practice when using least-squares or normal-theory type estimation methods,
as such outliers 4* inevitably have a disproportionate effect on estimates. Also, it
is well known that for most biological phenomena, outlying observations are far
more frequent than suggested by the normal distribution (i.e., biological distributions
are heavy-tailed). Some robust methods of population analysis have recently been suggested,
and these may allow outliers to be retained without giving them undue weight (38-40).
Outliers should be specified in a separate appendix to the report, with all data available"
Our interpretation of this is the following: either the criteria are predefined, in a
statistically reasonable way, in the protocol (1*), or they are not, in which case (3*) model
building on the reduced data set could be performed, followed by a re-run of the final model
on the full data set. (Section 2* does not appear to be an outlier issue, as it pertains to
a non-result.)
My questions are:
1) What would be a statistically convincing criterion for the first approach in section 1*? Note
that for this approach, the criterion should be stated a priori in the protocol.
2) The procedure explained in 3* requires that the outliers be known before model
development, hence excluding the application of a model-based exclusion criterion applied on
the full dataset first (e.g. "exclude if WRES>4"), followed by a re-run on the reduced data
set and a discussion of the sensitivity to the applied censoring. How do you get to know your
outliers beforehand?
3) How much flexibility do you allow yourselves with the criteria - would you go with
a common-sense based "looks better - CV% are notably smaller" instead of strict rules? Would the FDA?
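
To make the model-based alternative mentioned in question 2 concrete, here is a minimal sketch of the "fit on the full data, exclude, re-run, compare" loop. It is only an illustration under loud assumptions: an ordinary least-squares line stands in for the population model, plain standardized residuals stand in for WRES, and the function names and the threshold of 4 (applied to the absolute value) are hypothetical choices, not anything taken from the guidance or from NONMEM.

```python
import numpy as np


def fit_line(x, y):
    """Ordinary least-squares straight line; returns array [intercept, slope].

    Assumption: a simple linear model stands in for the population model.
    """
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def standardized_residuals(x, y, coef):
    """Residuals scaled by their standard deviation (crude stand-in for WRES)."""
    res = y - (coef[0] + coef[1] * x)
    return res / res.std(ddof=2)  # two fitted parameters


def sensitivity_to_exclusion(x, y, threshold=4.0):
    """Fit on the full data, drop points with |residual| > threshold, refit.

    Returns both sets of estimates and the percent change, so the effect of
    the censoring can be discussed rather than hidden.
    """
    full = fit_line(x, y)
    wres = standardized_residuals(x, y, full)
    keep = np.abs(wres) <= threshold
    reduced = fit_line(x[keep], y[keep])
    return {
        "full": full,
        "reduced": reduced,
        "excluded_index": np.flatnonzero(~keep),
        "pct_change": 100.0 * (reduced - full) / full,
    }
```

Reporting the full-data estimates, the reduced-data estimates, and the indices of the excluded points side by side is one way to document the sensitivity to the applied censoring; whether such a post hoc criterion would be considered statistically convincing is exactly the open question.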
Thanks in advance.
Thomas Klitgaard, Pharmacometrics, Novo Nordisk Denmark