RE: $OMEGA blocks and log-likelihood profiling
From: "Kowalski, Ken" Ken.Kowalski@pfizer.com
Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling
Date: Tue, June 29, 2004 3:08 pm
Nick,
I'm sure we're all getting tired of this thread but I just can't leave it
where it last ended. Below you ask the rhetorical question why we should
consider it good statistical practice to get the COV step to run as a marker
for a stable model that is somehow more reliable. I don't consider it good
statistical practice to simply use success/failure of the COV step as such a
marker. What I and Matt have been saying is that it is good statistical
practice to develop models that have a successful COV step AND to inspect
the COV step output to assess the stability of the models. It is the
stability of the model that allows us to gauge our confidence (i.e., the
reliability) that the parameter estimates we obtain from NONMEM are optimal,
and hence suitable for making inferences via point and interval (i.e.,
bootstrap) estimates. You seem to want to rely solely on your confidence
that you have a correctly specified mechanistic model, and to treat that as
sufficient grounds for confidence in the estimates regardless of whether
NONMEM achieves successful convergence. Surely you would agree that if we
knew with 100% certainty that your mechanistic model was correctly
specified, and you had a dataset where every subject was sampled only once
at the same fixed time point, it would be ludicrous to fit these data in
NONMEM (we would expect the run not to converge) and to put any level of
trust in the estimates we obtain. The reliability of the parameter estimates
depends not only on our confidence in a correctly specified model (which we
can never really know with real data) but also on our confidence that the
data in hand contain rich enough information to estimate those parameters.
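To make that single-time-point case concrete (a small sketch of my own, not
taken from your example): suppose each subject contributes one observation
y_i at the same time t*, with a model of the form

  y_i = f(t*, \theta) \exp(\eta_i + \epsilon_i),
  \eta_i ~ N(0, \omega^2),  \epsilon_i ~ N(0, \sigma^2).

The data then identify only E[\log y_i] = \log f(t*, \theta) and
Var[\log y_i] = \omega^2 + \sigma^2: a single number for the structural
part, so the individual elements of \theta are not separately estimable
whenever \theta has more than one element, and only the sum
\omega^2 + \sigma^2, never the split between them, no matter how correct
the structural model is.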
Reliable estimates should not be equated with unbiased estimates. Unbiased
estimates speak to the accuracy of the estimates and the correctness of the
model, which cannot be assessed with real data sets. Even if you wanted to
assess the value of simple success/failure of the COV step in your bootstrap
runs as a marker of the validity of the results with real data, at best, you
can conclude that the distribution of the parameter estimates from the
failed COV step runs (and in your example the majority also fail to
converge) is similar to the distribution of the estimates from the
successful COV step runs. The problem with real data sets is we don't know
what the true distribution of the parameter estimates should be. If you
want to assess whether a successful COV step provides any value as a marker
in assessing the accuracy and precision of the estimates you need to do this
via simulations where you know the true values of the parameters. As you
vary the design to reduce the information content in the data (e.g., with
fewer data points), resulting in increased instability, I think you will find that
the accuracy of the parameter estimates and coverage probabilities of the
bootstrap CIs will get worse even though the model is correctly specified
(i.e., fitting the same model as you used to simulate the data) regardless
of COV step status.
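To make the mechanics of such a simulation concrete, here is a minimal
sketch in Python (not NONMEM); the one-compartment bolus model, the pooled
least-squares fit, the design, and every name in it are illustrative
assumptions on my part. With NONMEM the same loop would be run over control
streams and the coverage tabulated separately by convergence/COV step
status.

# Minimal sketch: simulate studies from known "true" parameters, refit each
# one, and tabulate coverage of bootstrap percentile CIs against the truth.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(2004)
DOSE, TRUE_CL, TRUE_V = 100.0, 2.0, 20.0
N_SUBJ = 20
TIMES = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 24.0])  # sparsen this to degrade the design

def conc_model(t, cl, v):
    # one-compartment IV bolus: C(t) = (Dose/V) * exp(-(CL/V) * t)
    return DOSE / v * np.exp(-cl / v * t)

def simulate_study():
    # proportional (log-normal) residual error around the true curve
    t = np.tile(TIMES, N_SUBJ)
    subj = np.repeat(np.arange(N_SUBJ), TIMES.size)
    y = conc_model(t, TRUE_CL, TRUE_V) * np.exp(rng.normal(0.0, 0.15, t.size))
    return subj, t, y

def fit_pooled(t, y):
    p, _ = curve_fit(conc_model, t, y, p0=[1.0, 10.0], maxfev=10000)
    return p  # [CL, V]

def bootstrap_ci_cl(subj, t, y, n_boot=100):
    est = []
    for _ in range(n_boot):
        picks = rng.integers(0, N_SUBJ, N_SUBJ)      # resample whole subjects with replacement
        rows = np.concatenate([np.where(subj == s)[0] for s in picks])
        try:
            est.append(fit_pooled(t[rows], y[rows])[0])
        except RuntimeError:
            pass                                     # crude analogue of a failed run
    return np.percentile(est, [2.5, 97.5])

n_sim, covered = 100, 0
for _ in range(n_sim):
    subj, t, y = simulate_study()
    lo, hi = bootstrap_ci_cl(subj, t, y)
    covered += (lo <= TRUE_CL <= hi)

print(f"Empirical coverage of nominal 95% bootstrap CI for CL: {covered / n_sim:.2f}")

Sparsening TIMES (or reducing N_SUBJ) is the analogue of lowering the
information content of the design; the coverage printed at the end is the
quantity I would expect to degrade, and it can only be checked at all
because the truth is known.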
Note that the nonparametric bootstrap relies on statistical theory and is
not a data-based result. In order for the bootstrap to give us valid
confidence intervals we need to rely on the randomness and optimality of
the estimates. Rounding errors, lack of convergence, and COV step failures
may indicate that the estimates are sub-optimal (not at the global
minimum) and could result in systematic biases that invalidate any inference
from the resulting empirical distribution of these sub-optimal estimates.
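To put the optimality requirement in symbols (this is just the standard
percentile construction, nothing specific to NONMEM): the nonparametric
bootstrap interval is

  [ \hat\theta^*_{(\alpha/2)}, \hat\theta^*_{(1-\alpha/2)} ],

the empirical quantiles of the resampled estimates, where each
\hat\theta^*_b is supposed to be the minimizer of the objective function for
the b-th resampled data set. If a run stops short of the true minimum
because of rounding errors or a terminated search, that \hat\theta^*_b is a
realization of some other statistic, and the empirical distribution we
tabulate is no longer the one the theory says approximates the sampling
distribution of the maximum likelihood estimator.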
Finally, I'd encourage you to read up more on statistical theory for
nonlinear models. There is a wealth of evidence in the statistical
literature, dating back to the 1960s and 70s, that Wald-based symmetric
confidence intervals for nonlinear models often do not have proper coverage
and that the distributions of many parameter estimates from nonlinear models
are asymmetric. A lot of this is covered in standard texts on nonlinear
regression models; Bates & Watts, Nonlinear Regression Analysis and Its
Applications (Wiley, New York, 1988) is a good text. It is true that maximum
likelihood theory states that, asymptotically, maximum likelihood estimates
have a multivariate normal distribution; however, for population models,
Vonesh (Biometrika
1996;83:447-452) has shown that the asymptotics require not only a large
number of subjects (N) but also a large number of observations per subject
(n). For certain intrinsically nonlinear parameters the magnitude of n
necessary to achieve these asymptotics may never be realistically achieved.
Furthermore, with regard to NONMEM, we are not doing exact maximum
likelihood estimation but approximate maximum likelihood estimation, due to
the first-order approximations that are employed, and this also plays into
the problem. Mats Karlsson and his colleagues have shown situations where
the first-order approximations (especially the FO method) do not perform
well in maintaining the nominal type I error rate for likelihood ratio
tests. However, they have also shown situations where the type I error rate
is maintained based on the chi-square distribution. So, situations where
the likelihood ratio test does not follow a chi-square distribution are not
evidence of a failure of statistical theory but rather an indication that
our approximations and asymptotics may not be working to our advantage.
Application of statistical theory requires us to have an understanding of
the properties and limitations of the estimation methods that we employ. We
(the PK/PD modeling community) are continuing to contribute to this
statistical theory as it applies to NONMEM and its estimation methods.
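For reference, the two interval constructions at issue here are standard
(nothing NONMEM-specific in the formulas). The Wald interval is
\hat\theta_k \pm z_{1-\alpha/2} SE(\hat\theta_k), with the standard error
taken from the COV step, and is symmetric by construction. The
profile-likelihood (log-likelihood profiling) interval is
{ \theta_k : OFV_p(\theta_k) - OFV(\hat\theta) \le \chi^2_{1,1-\alpha} },
where OFV_p(\theta_k) is the objective function minimized over all the
remaining parameters with \theta_k held fixed, and \chi^2_{1,0.95} = 3.84
for a nominal 95% interval; this interval can be asymmetric. Both, of
course, inherit the caveat above that the approximate likelihoods produced
by the first-order methods need not follow the chi-square reference
distribution exactly.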
Regards,
Ken