From: "Kowalski, Ken" <Ken.Kowalski@pfizer.com>
Subject: RE: [NMusers] Re: FO vs FOCE vs LAPLACIAN
Date: Tue, 22 Jul 2003 12:50:46 -0400
Nick,
Wow!...Talk about proceeding cautiously. See below for my response to your
comments.
Ken
-----Original Message-----
From: Nick Holford [mailto:n.holford@auckland.ac.nz]
Sent: Monday, July 21, 2003 11:46 PM
To: nmusers@globomaxnm.com
Subject: Re: [NMusers] Re: FO vs FOCE vs LAPLACIAN
Ken,
I am primarily interested in avoiding local minima (so that I can test model
building hypotheses) and obtaining minimally biased and imprecise parameter
estimates. I agree with you that success or failure of $COV probably does
not help diagnose a local minimum problem. I have no evidence to support
this. But what about bias and imprecision?
[Ken Kowalski] Actually, what I said was that a successful $COV does not
guarantee that one has converged to a global minimum. However, a failure of
$COV should raise a concern that we might not have converged to a global
minimum. That is, conditions may be ripe for converging to a local minimum.
I have encountered many instances (as I'm sure you have as well) where for
two hierarchical models, the bigger model converges to a higher OFV than the
smaller model (i.e., delta-OFV is negative). In these instances the bigger
model has clearly converged to a local minimum and often the $COV will fail
for the bigger model. However, I have also seen instances where the $COV
runs successfully for the bigger model but inspection of the output
indicates large SEs for one or more parameters (as well as high pairwise
correlations of the estimates) and this is the diagnostic information that
leads one to be concerned that the bigger model is over-parameterized,
leading to an ill-conditioning of the R-matrix (e.g., nearly singular) and
instability in the estimation. The bigger model is unstable in the sense
that if I change the starting values, I may converge to a different set of
final estimates. When the likelihood surface for the bigger model is very
flat, there may be many sets of solutions leading to very similar values of
the minimum OFV. In this setting we may get a minimization failure, the
$COV may fail due to a (nearly) singular R-matrix, or, if $COV does run,
some of the parameters will have very large SEs. In this sense the $COV does
provide information on the precision or lack thereof (imprecision) in the
estimation. One can't use the $COV to assess bias if the true model is
unknown, however, if the model is unstable we should be concerned that we
might have biased estimates.
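The large-SE/high-correlation diagnostic described above can be sketched numerically. This is a rough illustration, not NONMEM output: the variance-covariance matrix below is made up, as are the parameter positions.

```python
import numpy as np

# Hypothetical variance-covariance matrix of three parameter estimates
# (the kind of matrix $COV reports); the numbers are illustrative only.
cov = np.array([
    [0.04,  0.058, 0.001],
    [0.058, 0.09,  0.002],
    [0.001, 0.002, 0.25 ],
])

se = np.sqrt(np.diag(cov))         # standard errors of the estimates
corr = cov / np.outer(se, se)      # pairwise correlations of the estimates

print("SEs:", se)
print("corr(1,2) =", corr[0, 1])   # near 1 => a nearly redundant pair
```

A correlation near 1 between two estimates, or an SE that dwarfs its estimate, is the kind of over-parameterization signal discussed above, even when $COV itself runs to completion.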
I have a somewhat anecdotal but nevertheless evidence-based comment on this.
I recently completed a PK model analysis using WT, AGE, SCR and SEX as
covariates (697 subjects, 2567 concs). The model did not run $COV, in fact
it didn't even minimize successfully. Other evidence convinced me it was not
far away from an appropriate minimum and because it had a more biologically
sound basis than its more successful neighbours I preferred this model. I
bootstrapped the original data set using the preferred model and found 28%
of 1055 bootstrap runs minimized successfully and 7.1% ran the $COV step.
The mean of the parameters obtained from all bootstrap runs and the mean
from those which ran the $COV step were all within 2%. I conclude that $COV
does not indicate lower bias compared with runs that do not minimize.
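The comparison described in the preceding paragraph — bootstrap parameter means from all runs versus the $COV-successful subset — can be sketched as follows. The numbers here are simulated stand-ins (not the actual 1055-run results), with the ~7% $COV success rate borrowed from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bootstrap results: 1000 replicates of one parameter (say CL),
# plus a flag for whether the $COV step ran; values simulated for illustration.
cl = rng.normal(10.0, 1.0, size=1000)
cov_ok = rng.random(1000) < 0.07          # ~7% $COV success, as in the post

mean_all = cl.mean()
mean_cov = cl[cov_ok].mean()
pct_diff = 100 * (mean_cov - mean_all) / mean_all
print(f"all runs: {mean_all:.3f}  $COV runs: {mean_cov:.3f}  diff: {pct_diff:.1f}%")
```

When $COV success is unrelated to where the estimates land, the subset mean tracks the overall mean closely — which is consistent with the within-2% agreement reported above.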
[Ken Kowalski] It concerns me that such a low proportion minimized
successfully. What happens if you substantially change your starting
values? You might find that one or more of the bootstrap parameter means
are substantially different. We can't assess bias from bootstrapping a real
data set where the true model for the data is unknown. Just because the
$COV step ran that does not mean your model is not over-parameterized.
Those runs where the $COV was successful may be numerically nonsingular but
still nearly singular (output from $COV can help assess this). Thus, it
doesn't surprise me that the mean estimates aren't different between those
where the $COV was successful and all the runs. The value of the $COV goes
well beyond a simple success/failure flag. When the $COV is successful we
still need to inspect the output from the $COV to assess the stability.
When the $COV fails we don't have this luxury, but we get warning messages
regarding invertibility problems with the Hessian that indicate we have a
stability problem. Again, if we have a stability problem we may continue to
proceed with the over-parameterized model but we should do so cautiously.
To assess imprecision I computed the ratio of the mean standard error from
the $COV successful runs to the bootstrap standard error obtained from all
runs. For THETA:se estimates the $COV SE was on average 3% smaller but for
OMEGA:se the $COV SE was 58% larger than the overall bootstrap SE. I
conclude from this that the imprecision of THETA:se was negligibly different
when the $COV step was successful. The difference in the OMEGA:se may
reflect the intrinsic difficulty in obtaining estimates of OMEGA and
OMEGA:se. Perhaps the asymptotic assumptions involved in $COV produce an
upward bias.
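The SE-ratio computation just described can be sketched like this. The values are simulated for illustration (roughly mimicking the reported 3% difference for THETA), not the actual run output.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical comparison of asymptotic ($COV) SEs with the empirical
# bootstrap SE (the SD of the bootstrap estimates) for one parameter.
boot_estimates = rng.normal(10.0, 1.0, size=1000)   # all bootstrap fits
asym_se = rng.normal(1.03, 0.05, size=75)           # SEs from $COV-successful runs

boot_se = boot_estimates.std(ddof=1)                # empirical bootstrap SE
ratio = asym_se.mean() / boot_se
print(f"bootstrap SE: {boot_se:.3f}  mean $COV SE: {asym_se.mean():.3f}  "
      f"ratio: {ratio:.2f}")
```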
[Ken Kowalski] I believe Mats Karlsson has shown that asymptotic SEs for
elements of Omega are not very good.
95% confidence intervals obtained from all the bootstrap runs were very
similar to those obtained from minimization successful and $COV successful
runs. The 95% CI predicted from the asymptotic SE was on average 21% larger
(range 15-35%) than the bootstrap CI.
In order to explore the issue a bit further I simulated a data set using the
mean bootstrap parameter estimates from all runs. I then bootstrapped this
simulated data set (1772 runs). The minimization success rate was double
(56%) that of the original real data bootstrap runs and 12.5% ran $COV.
Because the true parameter values for the simulation are known the absolute
bias can be computed. Only 3 out of 29 parameters had an absolute bias
larger than 10%. There were negligible differences between the absolute bias
using estimates from all runs, minimization successful runs or $COV
successful runs.
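Because the simulation truth is known, the absolute-bias calculation above is direct. A minimal sketch, with made-up true values and bootstrap means (not the actual 29-parameter results):

```python
import numpy as np

# When data are simulated from known parameters, percent absolute bias of
# the bootstrap means can be computed directly; values are illustrative.
true = np.array([10.0, 2.0, 0.09])        # e.g. a CL, a V, an omega element
boot_mean = np.array([10.2, 2.3, 0.088])  # means over all bootstrap fits

bias_pct = 100 * np.abs(boot_mean - true) / true
print("absolute bias (%):", np.round(bias_pct, 1))
flagged = (bias_pct > 10).sum()
print(f"{flagged} of {true.size} parameters exceed 10% bias")
```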
[Ken Kowalski] The fact that you get 44% minimization failures and 87.5%
$COV failures when you bootstrapped your simulated data set, based on the
model developed from your original data set, provides evidence that your
model is unstable. This is what I claim the $COV step failure from your
original model fit was diagnosing. You indicate that 3 out of 29 parameters
had an absolute bias greater than 10%. Perhaps the instability in your
model is related to the estimates of these parameters. If they are not
important parameters then perhaps I wouldn't be concerned. To illustrate my
point consider the simple example where we have very little sampling in the
absorption phase. Perhaps in each individual the observed Tmax corresponds
to the first observation. In this setting we can have convergence and/or
$COV failures and wildly biased estimates for ka (e.g., an estimate of ka
>>100 hr^-1). Fortunately, we may find that even though ka is not well
estimated we can still get relatively accurate estimates of CL/F which we
may be more interested in. Thus, the instability in the model is related to
the estimation of ka and the limitations of the design at early time points.
I would be inclined to fix ka at some prior estimate (if I had one) to
remove the instability and obtain successful minimization and $COV and
acknowledge the limitations of the design/model to estimate ka.
Alternatively, we could use simulation/bootstrapping to verify that poor
(biased) estimation of ka is not likely to unduly bias CL/F. Further
evaluation perhaps using simulation and bootstrapping is a cautious way to
proceed when considering over-parameterized models. For me, I like to fit
alternative models (often reduced hierarchical models) as a set of
diagnostic runs and inspect the $COV output so that I can understand where
the limitations are with the design/data/model. Both approaches can help us
stay out of trouble.
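The sparse-absorption example above can be made concrete with a toy profile (this is not NONMEM; the one-compartment model and all parameter values are hypothetical). With no samples before Tmax, the objective surface is nearly flat in ka, so estimation can wander off to wild values.

```python
import numpy as np

# One-compartment model with first-order absorption; parameters hypothetical.
def conc(t, ka, ke=0.1, dose=100.0, v=10.0):
    return dose / v * ka / (ka - ke) * (np.exp(-ke * t) - np.exp(-ka * t))

t_obs = np.array([2.0, 4.0, 8.0, 12.0, 24.0])   # all samples fall after Tmax
y_obs = conc(t_obs, ka=1.5)                      # "true" ka = 1.5 /h

def sse(ka):
    # residual sum of squares with the other parameters held fixed
    return float(np.sum((y_obs - conc(t_obs, ka)) ** 2))

# The profile barely changes once ka is large: the data cannot distinguish
# ka = 50 from ka = 500, which is how estimates like ka >> 100 /h arise.
for ka in (1.5, 5.0, 50.0, 500.0):
    print(f"ka={ka:7.1f}  SSE={sse(ka):.4f}")
```

The near-flat tail of this profile is exactly the kind of surface on which minimization and $COV failures, and starting-value sensitivity, show up.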
This means the $COV step is not a guide to reduced bias.
[Ken Kowalski] The $COV should not be used as a guide to reduce bias. One
can fit a misspecified reduced (smaller) model that is quite stable with a
successful $COV and get biased estimates just as one can fit the true model
and get biased estimates if the design/data doesn't support fitting all the
parameters of the true model (i.e., the true model may be over-parameterized
resulting in a failed $COV). If the reduced model is not very plausible we
may discard it or recognize its limitations particularly for extrapolation.
However, it is naive to simply trust an over-parameterized model fit simply
because the model is more plausible and proceed without caution. A true but
over-parameterized model fit may have difficulty in estimation due to the
limitations of the design to support the model. If the fit converges to a
local minimum we still need to be concerned about bias even though we are
fitting the true model. If you have a strong belief that the
over-parameterized model is more plausible, this is where explicitly
incorporating prior information on parameters that may be difficult to
estimate from the existing design/data may be helpful.
The imprecision pattern was similar with the simulated data but the
magnitude of differences between the mean $COV SE and the mean bootstrap SE
were larger than those seen with the original real dataset. For $COV SE the
THETA:se estimates were about 50% smaller while OMEGA:se were 400% larger
than the bootstrap SE. There were no real differences depending on whether
all runs, minimization successful or $COV successful runs were used ($COV
successful runs tended to be a bit larger).
95% confidence intervals obtained from all the bootstrap runs on the
simulated dataset were very similar to those obtained from minimization
successful and $COV successful runs. The 95% CI predicted from the
asymptotic SE was on average 22% larger (range 14-46%) than the bootstrap
CI.
My conclusion from this empirical exploration of one data set and model is
that a successful $COV is of no value for selecting models with improved
bias or imprecision. It is a quicker way of obtaining some idea of
the parameter 95% confidence interval but it is upwardly biased compared
with the bootstrap estimate. I am not typically interested in parameter CIs
for every model I run. I am happy to leave that until I have finished model
building and prefer to rely on bootstrap CIs.
I think we are in agreement on almost all issues that you raise except for
the diagnostic value of the $COV in relation to the thing you call
"stability". I dont know what stability means so perhaps you would like to
offer a definition and some evidence for your assertion.
[Ken Kowalski] Hopefully the above responses give you some sense of what I
mean by stability. Over-parameterization of the model (too many parameters
estimated relative to the information content of the data/design),
ill-conditioning of the hessian (R-matrix) which can lead to numerically
unstable inversion of the R-matrix ($COV failures) and instability of the
parameter estimation (overly sensitive to the starting values) are all
symptoms of a stability problem with the model.
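The ill-conditioning symptom listed above can be quantified by the condition number of the R-matrix (ratio of largest to smallest eigenvalue). A minimal sketch with a made-up, nearly singular R-matrix:

```python
import numpy as np

# Hypothetical R-matrix whose first two rows are nearly collinear; a very
# large condition number signals that the inversion $COV needs is unstable.
r = np.array([
    [1.0,   0.999, 0.1],
    [0.999, 1.0,   0.1],
    [0.1,   0.1,   1.0],
])

eig = np.linalg.eigvalsh(r)        # eigenvalues in ascending order
cond = eig.max() / eig.min()
print(f"eigenvalues: {np.round(eig, 4)}  condition number: {cond:.0f}")
```

A condition number in the thousands, as here, is the numerical signature of the over-parameterization Ken describes: the matrix is invertible in exact arithmetic but barely so in practice.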