Re: Simulation vs. actual data
From: "Nick Holford" n.holford@auckland.ac.nz
Subject: Re: [NMusers] Simulation vs. actual data
Date: Tue, July 12, 2005 1:50 pm
Ken,
Thanks for your further elaboration. Your boiled-down example is perhaps somewhat
too simple for nmusers, because I think we need to consider three kinds of random
effects when constructing intervals for model qualification and prediction
(illustrated in the sketch following the list):
1. Population Parameter Variability (PPV). The sum of between-subject (BSV) and
within-subject (WSV) variability in model parameters such as Emax and EC50.
2. Residual Unidentified Variability (RUV). The residual error describing the
difference between individual subject predictions and observations.
3. Parameter Uncertainty. This is often described by the parameter standard error
but might be more reliably described by an empirical distribution of parameter
estimates obtained by a non-parametric bootstrap.
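To make these three levels concrete, here is a minimal Python sketch (the Emax
model and all numbers are hypothetical, not taken from any real analysis) showing
where each level enters a simulated response:

    import numpy as np

    rng = np.random.default_rng(42)

    # Level 3: parameter uncertainty - one draw from an assumed sampling
    # distribution of the population estimates (e.g. one bootstrap run)
    emax_pop = rng.normal(100.0, 5.0)             # hypothetical SE of 5
    ec50_pop = rng.lognormal(np.log(10.0), 0.10)  # ~10% uncertainty

    # Level 1: PPV - between-subject variability in the parameters
    n_subjects = 50
    emax_i = emax_pop * np.exp(rng.normal(0.0, 0.3, n_subjects))  # 30% BSV
    ec50_i = ec50_pop * np.exp(rng.normal(0.0, 0.5, n_subjects))  # 50% BSV

    # Individual predictions at a single concentration
    conc = 20.0
    ipred = emax_i * conc / (ec50_i + conc)

    # Level 2: RUV - residual error between prediction and observation
    obs = ipred + rng.normal(0.0, 5.0, n_subjects)  # additive residual SD 5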
Your example includes only one level of random effect (which could be either PPV or
RUV depending on the context) and derives intervals under the commonly used
asymptotic assumption.
The kind of parametric simulation procedure that nmusers might use has been
described by Yano et al. (2001) in their investigation of the posterior predictive
check (PPC). As in your own example, this study was limited to only one level of
random effect (they used RUV) and examined what might be learned from including
parameter uncertainty. They concluded that "No clear advantage for one or another
method of approximating the posterior distribution on model parameters is found",
i.e. including uncertainty in the parameters did not improve the power of the PPC to
aid in evaluating model performance. Dropping the 'posterior', i.e. the uncertainty
in the parameter distribution, and just sampling from what they call the 'degenerate
posterior' may be adequate. Yano et al. discuss why this might be the typical case
for many NONMEM-type analyses.
On the basis of this rather limited study, one should be aware that in other
situations the inclusion of uncertainty could be quite important, but its relevance
to typical NONMEM analyses is currently unclear. At least in some settings the
simple predictive check (SPC; Gobburu et al. 2000), which uses the final point
estimates of the parameters (the degenerate posterior distribution) for simulation
without including uncertainty, can give useful diagnostic information.
In trying to use consistent terminology for the various intervals used to describe
the time course of response, I wonder if you would accept the following:
Confidence Interval: Describes the uncertainty in the mean response. It could be
constructed by a non-parametric bootstrap, using the parameter estimates from each
bootstrap run to predict the population response at each time point. The
distribution of these population responses obtained from, say, 1000 bootstrap runs
can then be used to define the confidence interval. The confidence interval says
nothing about PPV or RUV but reflects only the estimation uncertainty in the
population parameters. I am not aware of any published use of this kind of interval
applied to NONMEM analyses but would be interested to hear of this application.
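As a sketch of that procedure, assuming the bootstrap parameter sets are already in
hand (in practice from 1000 NONMEM runs on resampled datasets; here a placeholder
array) and using a hypothetical Emax time course for the population prediction:

    import numpy as np

    rng = np.random.default_rng(1)
    times = np.linspace(0.5, 24.0, 25)

    def population_response(theta, times):
        # Population (typical-value) prediction; hypothetical Emax model
        emax, ec50 = theta
        return emax * times / (ec50 + times)

    # One row of (Emax, EC50) estimates per non-parametric bootstrap run;
    # this placeholder stands in for 1000 re-estimated NONMEM runs
    boot_thetas = rng.normal([100.0, 10.0], [5.0, 1.0], (1000, 2))

    # Population response for each bootstrap parameter set at each time
    responses = np.array([population_response(th, times) for th in boot_thetas])

    # 90% confidence interval for the population response: no PPV or RUV
    # enters here, only estimation uncertainty in the parameters
    ci_lo, ci_hi = np.percentile(responses, [5.0, 95.0], axis=0)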
Prediction Interval: Describes the variation in individual response which is
attributable to PPV and RUV. It may be obtained by a parametric bootstrap (e.g.
using $SIM with NSUBPROBLEMS=1000 in NONMEM) based on the final parameter estimates
for the model that is being evaluated. A 90% interval constructed from the empirical
distribution of individual predicted responses (with residual error) at each time
point should contain 90% of the observed responses at that time. This is the
interval and method used for the visual predictive check (VPC; Holford 2005). The
procedure is the same as the SPC and the degenerate PPC. It has frequently been
referred to as a posterior predictive check (e.g. Duffull et al. 2000).
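The same parametric simulation can be sketched outside NONMEM; the Emax model,
OMEGA and SIGMA values below are hypothetical stand-ins for the final estimates of
whatever model is being evaluated:

    import numpy as np

    rng = np.random.default_rng(2)
    times = np.linspace(0.5, 24.0, 25)
    n_rep = 1000                       # analogous to NSUBPROBLEMS=1000

    # Final point estimates (the degenerate posterior): THETA, OMEGA, SIGMA
    emax, ec50 = 100.0, 10.0           # hypothetical THETAs
    ppv_emax, ppv_ec50 = 0.3, 0.5      # sqrt(OMEGA), approximate CVs (PPV)
    sigma = 5.0                        # additive residual SD (RUV)

    sims = np.empty((n_rep, times.size))
    for k in range(n_rep):
        emax_i = emax * np.exp(rng.normal(0.0, ppv_emax))     # PPV draws
        ec50_i = ec50 * np.exp(rng.normal(0.0, ppv_ec50))
        ipred = emax_i * times / (ec50_i + times)
        sims[k] = ipred + rng.normal(0.0, sigma, times.size)  # add RUV

    # 90% prediction interval at each time point; about 90% of observed
    # responses should fall inside it if the model is adequate
    pi_lo, pi_hi = np.percentile(sims, [5.0, 95.0], axis=0)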
Tolerance Interval: Describes the uncertainty in the prediction interval by
including uncertainty in the parameter estimates. This could be done using the same
procedure as the SPC but also sampling the parameters from the variance-covariance
matrix of the estimates, in addition to sampling the random effects from the OMEGA
and SIGMA matrices. I am not aware of anyone who has done this with NONMEM with both
PPV and RUV but would be interested if someone could report any such experience.
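A sketch of that extra sampling step, assuming a hypothetical variance-covariance
matrix of the estimates (as would come from the NONMEM covariance step) and reusing
the hypothetical model and random effects from the prediction interval sketch above:

    import numpy as np

    rng = np.random.default_rng(3)
    times = np.linspace(0.5, 24.0, 25)
    n_rep = 1000

    theta_hat = np.array([100.0, 10.0])  # final estimates (hypothetical)
    cov_hat = np.array([[25.0, 0.5],     # hypothetical variance-covariance
                        [0.5, 1.0]])     # matrix of the estimates ($COV)

    sims = np.empty((n_rep, times.size))
    for k in range(n_rep):
        # Extra level relative to the prediction interval: first sample the
        # population parameters from their uncertainty distribution ...
        emax, ec50 = rng.multivariate_normal(theta_hat, cov_hat)
        # ... then simulate PPV and RUV around the sampled values
        emax_i = emax * np.exp(rng.normal(0.0, 0.3))
        ec50_i = ec50 * np.exp(rng.normal(0.0, 0.5))
        ipred = emax_i * times / (ec50_i + times)
        sims[k] = ipred + rng.normal(0.0, 5.0, times.size)

    # 90% tolerance interval: wider than the prediction interval because
    # parameter uncertainty is propagated as well
    ti_lo, ti_hi = np.percentile(sims, [5.0, 95.0], axis=0)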
In their definition of PPC, Yano et al. did not include the generation of an interval:
"The PPC compares a statistic (T) computed on the observed data to the distribution
of that statistic under a candidate model fitted to the data to derive a p value,
which we denote by pPPC."
However, an interval is implicit in their methodology for calculating the
probability of a response (pPPC).
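Once a statistic T has been chosen and evaluated on each simulated replicate, the
pPPC is simply the fraction of simulated statistics at least as extreme as the
observed one; a minimal sketch with placeholder data and a hypothetical choice of T:

    import numpy as np

    rng = np.random.default_rng(4)

    def T(y):
        # Hypothetical test statistic; here the maximum response
        return np.max(y)

    # Placeholder data: in practice y_obs is the real dataset and y_sim
    # holds replicates simulated under the fitted (candidate) model
    y_obs = rng.normal(50.0, 10.0, 40)
    y_sim = rng.normal(50.0, 10.0, (1000, 40))

    t_obs = T(y_obs)
    t_sim = np.array([T(y) for y in y_sim])

    # pPPC: fraction of simulated statistics at least as extreme as observed
    p_ppc = np.mean(t_sim >= t_obs)
    print(f"pPPC = {p_ppc:.3f}")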
I would suggest that it might be better if the term PPC were restricted to the case
where parameter uncertainty is included in the simulation process, because this
explicitly recognizes the role of the non-degenerate posterior distribution.
I think that an interval which describes the variability in individual responses
(prediction interval) is more commonly of interest than variability in the
population response (confidence interval). A tolerance interval has some theoretical
advantage over the prediction interval by being a bit more conservative (i.e. wider
intervals), but most of the merits of this kind of model qualification approach will
be seen with the computationally more convenient prediction interval. It is directly
applicable for evaluating how well a model describes existing observations and for
illustrating to non-pharmacometricians what might be expected in a typical patient
population. The nomenclature could stand some improvement so that we can use these
terms more precisely.
Nick
Duffull SB, Chabaud S, Nony P, Laveille C, Girard P, Aarons L. A pharmacokinetic
simulation model for ivabradine in healthy volunteers. Eur J Pharm Sci
2000;10(4):285-94.
Gobburu J, Holford N, Ko H, Peck C. Two-step model evaluation (TSME) for model
qualification. In: American Society for Clinical Pharmacology and Therapeutics
Annual Meeting; 2000; Los Angeles, CA, USA. Abstract.
Holford NHG. The Visual Predictive Check: Superiority to Standard Diagnostic
(Rorschach) Plots. PAGE 2005.
http://www.page-meeting.org/default.asp?id=26&keuze=abstract-view&goto=abstracts&orderby=author&abstract_id=738
Yano Y, Beal SL, Sheiner LB. Evaluating pharmacokinetic/pharmacodynamic models using
the posterior predictive check. J Pharmacokinet Pharmacodyn 2001;28(2):171-92.