RE: $OMEGA blocks and log-likelihood profiling
From: "Kowalski, Ken" Ken.Kowalski@pfizer.com
Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling
Date: Tue, June 8, 2004 5:23 pm
Mats, Nick, and all,
I have not forgotten Nick's original premise that the parameter estimates
were the same regardless of successful convergence or successful COV step.
However, my preference is to understand why he has such a high convergence
failure rate. In my opinion, Nick has two choices: 1) he can try to
understand why the failure rate is so high, perhaps leading to selection of a
more stable model (if his model is over-parameterized) that resolves the high
failure rate, or 2) he can verify that the empirical marginal distributions
of the parameter estimates are the same between the successful and failed
convergence runs. I don't believe Nick has provided sufficient information
to assess the latter and I will respond directly to his message with what I
think would provide compelling evidence that his bootstrap sample
distribution is indeed independent of convergence status. With sufficient
data, I could be convinced that pooling bootstrap estimates from both the
successful and failed convergence runs is OK for his particular example.
However, Nick wants to generalize from this one example to conclude that, as
long as we are bootstrapping, we never have to worry about convergence and/or
COV step failures, and this is where I take exception.
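To make concrete what I mean by comparing the empirical marginal
distributions, here is a minimal sketch of the kind of check one could run on
the bootstrap results. The file layout, column names, and the choice of a
two-sample Kolmogorov-Smirnov test are my own assumptions for illustration,
not Nick's actual workflow:

    # Compare the marginal distribution of each parameter estimate between
    # bootstrap runs that converged and runs that failed.  The CSV layout
    # (one row per run, a 'converged' flag, one column per parameter) is a
    # hypothetical placeholder.
    import pandas as pd
    from scipy import stats

    boot = pd.read_csv("bootstrap_estimates.csv")
    ok = boot[boot["converged"] == 1]
    failed = boot[boot["converged"] == 0]

    for p in [c for c in boot.columns if c != "converged"]:
        # Large p-values are consistent with the two marginals being the same.
        d, pval = stats.ks_2samp(ok[p].dropna(), failed[p].dropna())
        q_ok = ok[p].quantile([0.025, 0.5, 0.975]).round(3).tolist()
        q_fail = failed[p].quantile([0.025, 0.5, 0.975]).round(3).tolist()
        print(f"{p}: KS D={d:.3f}, p={pval:.3f}; "
              f"converged 2.5/50/97.5% = {q_ok}; failed = {q_fail}")
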
Suppose that in another example with a high convergence failure rate, the
bootstrap distributions for the successful and failed runs are different. In
this setting the analyst has no choice but to go back and try
to figure out why he/she has such a high failure rate. This is where the
COV step provides diagnostic information that may be helpful. In Nick's
example, I wanted to try to understand why he has such a high convergence
failure rate. The 7% of runs that had a successful COV step do have additional
information for assessing the possible instability of the model, at least with
respect to these particular bootstrap datasets. It is in this regard that
they contain more information than the 93% where the COV step failed. In a
previous message Nick provided information from the COV step output that
suggests the bootstrap runs for these 7% are indeed stable. That didn't
have to be the case as the COV step can be successful and the model can
still be unstable. It is for this reason that I agree with Nick that simple
success or failure of the COV step alone is a poor indicator of the
reliability/stability of the model. If the COV step output for these 7% had
diagnostic information to suggest that these particular fits were unstable,
then it would have been of interest to postulate alternative, more stable
models that resolve this instability and see if that also resolves the high
convergence failure rate. But in Nick's example the 7% appear to be stable,
which means diagnosing the reason for the high
convergence failure rate is going to be more difficult. It could still be
related to instability/over-parameterization of the model for the remaining
93% of bootstrap datasets but we would need more information from Nick
regarding the design of his dataset and bootstrap sampling scheme to assess
this.
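As an aside on what "diagnostic information" from the COV step can mean in
practice, one widely used stability check is the condition number of the
correlation matrix of the estimates (the ratio of its largest to smallest
eigenvalue), with very large values often taken as a sign of an
ill-conditioned, possibly over-parameterized model. The sketch below
illustrates the computation on a made-up matrix; the numbers and the >1000
rule of thumb are illustrative assumptions, not something from Nick's runs:

    # Condition number of a (hypothetical) correlation matrix of the
    # parameter estimates, as reported by the COV step.
    import numpy as np

    corr = np.array([
        [1.00, 0.30, 0.10],
        [0.30, 1.00, 0.95],   # a near-collinear pair of estimates
        [0.10, 0.95, 1.00],
    ])

    eigvals = np.linalg.eigvalsh(corr)   # eigenvalues in ascending order
    cond = eigvals[-1] / eigvals[0]      # largest / smallest eigenvalue
    print(f"condition number = {cond:.1f}")  # >1000 would suggest instability
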
Ideally, one would review the COV step output throughout the model-building
process, before getting to the bootstrap phase, to have a better shot at not
encountering such a high convergence failure rate when performing
bootstrapping. In fact, I often don't perform the COV step during
bootstrapping, as it can be impractical, especially for models/data with long
run-times. On the other hand, I don't often encounter such a high
convergence failure rate when I perform bootstrapping either. I believe
this is in part due to the conscious effort I make to avoid instability in my
model building. I agree with Mats that the COV step provides imperfect
diagnostic information but that is the case with other diagnostics such as
empirical Bayes estimation as well. Imperfect as the COV step information
may be, it still provides valuable diagnostic information. That is not to
say I would use the COV step output to make formal inference via confidence
intervals because I generally don't, but I do think we should be reviewing
the COV step output routinely to help guide model development.
Ken