Dear all,
I have a naive question regarding the model-building process in NONMEM.
As more and more covariates are added to the model, I often come across
an error message saying "ERROR 134", or R MATRIX SINGULAR.
After searching the internet, I learned that changing NSIG in
$ESTIMATION and setting MATRIX=S in $COV can help with these two
problems, respectively. And from my own experience, it does help with
model building.
However, my concern is that I used different NSIG and MATRIX settings in
the previous steps. Is it proper to use different NSIGs and matrices within
a single model build? If not, could you please explain this a little bit?
Thank you in advance!
Best Regards
--
Xinting
Wang
Change of NSIG or R matrix
11 messages
5 people
Latest: Oct 25, 2013
Xinting,
I ignore rounding errors and failure of the covariance step when model building. Indeed, you will speed up your model building by not using $COV. I build my models based on 1) change in OFV, 2) plausibility of parameter estimates, and 3) VPCs.
I would not trust parameter estimates obtained with NSIG<3.
I use non-parametric bootstraps to help in model building as well, but would never rely on confidence intervals derived from $COV SEs.
Nick
--
Nick Holford, Professor Clinical Pharmacology
Dept Pharmacology & Clinical Pharmacology, Bldg 503 Room 302A
University of Auckland,85 Park Rd,Private Bag 92019,Auckland,New Zealand
office:+64(9)923-6730 mobile:NZ +64(21)46 23 53
email: [email protected]
http://holford.fmhs.auckland.ac.nz/
Holford NHG. Disease progression and neuroscience. Journal of Pharmacokinetics
and Pharmacodynamics. 2013;40:369-76
http://link.springer.com/article/10.1007/s10928-013-9316-2
Holford N, Heo Y-A, Anderson B. A pharmacokinetic standard for babies and
adults. J Pharm Sci. 2013:
http://onlinelibrary.wiley.com/doi/10.1002/jps.23574/abstract
Holford N. A time to event tutorial for pharmacometricians. CPT:PSP. 2013;2:
http://www.nature.com/psp/journal/v2/n5/full/psp201318a.html
Holford NHG. Clinical pharmacology = disease progression + drug action. British
Journal of Clinical Pharmacology. 2013:
http://onlinelibrary.wiley.com/doi/10.1111/bcp.12170/abstract
Yes, it should be fine to use the S matrix if you cannot get the default to run, and to use an NSIG larger or smaller than the default value of 3 (although this is not guaranteed, usually NSIG does not change the OF value or parameter estimates in any significant way). Note that the NONMEM manual recommends SIGL >= 3*NSIG and TOL >= SIGL. A separate SIGL can be set on the COV step, and there it is recommended that SIGL >= 4*NSIG. In real life I have seen many examples where larger NSIG and SIGL resulted in a successful COV step, and also many examples where the default values were better (in getting the COV step to run). UNCONDITIONAL on the COV step allows you to run COV even when minimization ended with an error.
Contrary to Nick's experience, I have found that the COV step is useful, as it reveals which of the model parameters are poorly estimated, and that CIs based on SEs are usually quite good and in general agreement with bootstrap CIs, though this may depend on the problem.
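For concreteness, the options discussed above might be combined along these lines (an illustrative fragment only; the estimation method and the specific NSIG/SIGL values are assumptions chosen to satisfy the quoted rules of thumb, not a recommendation):

```
$ESTIMATION METHOD=1 INTERACTION MAXEVAL=9999 NSIG=3 SIGL=9
$COV MATRIX=S UNCONDITIONAL SIGL=12
```

Here SIGL=9 on $ESTIMATION follows SIGL >= 3*NSIG, SIGL=12 on $COV follows SIGL >= 4*NSIG, MATRIX=S requests the S matrix instead of the default sandwich estimate, and UNCONDITIONAL requests the covariance step even if minimization terminates with an error.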
Leonid
--------------------------------------
Leonid Gibiansky, Ph.D.
President, QuantPharm LLC
web: www.quantpharm.com
e-mail: LGibiansky at quantpharm.com
tel: (301) 767 5566
Dear Nick,
Thank you very much for your suggestion. Could you explain a little more
about the statement regarding NSIG < 3? I seem to remember that many people
suggested using a smaller NSIG to get a successful minimization.
Dear Leonid,
I have read about the recommendations for SIGL, NSIG and TOL, but I am not
quite familiar with the use of these options with subroutine ADVAN4. If I set
SIGL to a fixed value, say 12, and NSIG to 3, does this mean I also have to
specify a value for TOL in $SUBROUTINE? I appreciate your help very much.
Thank you both.
Regards
--
Xinting
Xinting,
First of all, 'successful minimization' has nothing to do with a good model. NONMEM's internal decision to declare success or termination is often a pseudo-random choice. If you look at the sigdigs of the estimates you will typically find that the lowest value is 2.9 and many others are greater than 5. This gives you a clue as to which parameters are well determined and which are less well known. It is not a YES/NO decision.
Second, NSIG determines the number of significant digits in the parameter estimates. If you choose a number less than 3, it means you don't care whether the answer is 10.1 or 10.9. They both have 2 sig digs, but the estimates differ by nearly 10%. There is a large body of empirical literature that has relied on NSIG=3 (or more). I do not see any reason to ignore this in order to get a meaningless "minimization successful" message from a random number generator.
I look forward to hearing from "many" to understand why they believe that "minimization successful" indicates that the model results are somehow better even though the parameter estimates have hardly any significant digits.
Nick
Nick -
a) The usual definition of 'number of significant digits' is -log10(relative
precision). Thus a sigdig of 3 is a precision of 1 part in 1000, and a sigdig
of 2 corresponds to 1% precision, not 10% as in your example.
b) That being said, the sigdigs in the parameters reported by NONMEM need to be
taken with a grain of salt - they probably represent best-case, 'speed of
light' type numbers, where the real precision may be considerably worse. I do
not know specifically how they are computed, but my guess is that it is based
on the fact that in a converged problem the relative gradient has been driven
below some specified tolerance. One can then infer precision from the
condition number of the Hessian of the overall objective function and the
actual relative gradients. But NONMEM uses a quasi-Newton method -
there is no Hessian available to the method, only a stand-in accumulated
curvature matrix (a 'pseudo-Hessian') that is usually much better conditioned
than the actual Hessian. The only thing that can really be concluded is that,
at the moment the top-level iteration is stopped and convergence declared based
on the relative gradient, the next iteration, if it were done, would not
change the parameter estimates by more than the reported sigdig value. That is
quite a different conclusion from saying that the reported parameter estimates
agree with the 'true' values to the reported number of significant digits.
c) I know you have often argued that the failure of a covariance step has
little or no evidential value for determining whether the minimization step was
'successful', and I generally agree with you.
But failure of the covariance step does mean that the Hessian could not be
numerically estimated at all (it failed the positive-definiteness test). This
provides some additional evidence that one should be even more skeptical of
the reported sigdig values of the parameter estimates.
TOL is used only for the numerical-integration ADVANs.
As to NSIG, I would not use NSIG=2 only to get a successful minimization step, but in our paper
( http://link.springer.com/article/10.1007%2Fs10928-011-9228-y)
we found that setting NSIG to 2 in long-running numerical-integration problems allowed us to obtain the FOCEI solution 5 times faster (than the NSIG=3 version) without any changes in the OF, parameter estimates, or SEs. I am sure one can construct counter-examples, so use it at your own risk.
As for the use of the NONMEM convergence flag as a random number generator, one can find a long discussion in the archive, so I think it makes no sense to repeat it one more time, but there are competing views on whether and/or when it is important to have convergence, whether and how to use NONMEM SEs, etc. It is safe to say, though, that Nick's view is at the extreme of the observed distribution of points of view on this subject :)
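To make the TOL distinction concrete, the two cases might look like this (illustrative fragments only; the ADVAN/TRANS choices and option values are assumptions, not recommendations):

```
; Closed-form two-compartment model: no ODE solver, so no TOL is needed
$SUBROUTINES ADVAN4 TRANS4
$ESTIMATION METHOD=1 INTERACTION NSIG=3 SIGL=9

; General differential-equation model: TOL controls the ODE solver's local error
$SUBROUTINES ADVAN6 TOL=9
$ESTIMATION METHOD=1 INTERACTION NSIG=2 SIGL=6
```

The NSIG=2 in the second fragment mirrors the speed-up described above and carries the same caveat: check that the OF, estimates, and SEs are unchanged before trusting it.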
Leonid
--------------------------------------
Leonid Gibiansky, Ph.D.
President, QuantPharm LLC
web: www.quantpharm.com
e-mail: LGibiansky at quantpharm.com
tel: (301) 767 5566
Bob,
NM-Help defines what it means by the SIGDIG estimation option.
SIGDIGITS=n
Number of significant digits required in the final parameter
estimate. SIGDIGITS is not used by the Monte-Carlo methods.
Default: 3. May also be coded NSIGDIGITS.
SIGL=n
n is used to calculate the step-size for finite difference deriv-
atives independent of the SIGDIGITS value. If n=0 or n=100 then
SIGL is ignored and SIGDIGITS is used as in versions prior to
NONMEM 7. SIGL should usually be 2 to 3 times the value of NSIG.
The number of significant digits reported is the
number of significant digits in the least-well-determined element.
The report "MINIMIZATION SUCCESSFUL" is issued when this number is no
less than the number of significant digits requested using the SIGDIG-
ITS option of the $ESTIMATION record.
NONMEM 7 has an additional estimation option (SIGL) that is used to provide additional control over finite-difference derivatives. Unless Bob Bauer can clarify the meaning further, I still believe that SIGDIG, when used as a convergence criterion, refers to the number of significant digits in the parameter value. What you describe is more like the meaning of TOL (which is used to control the local error of the DEQ solver).
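The distinction between the three options can be sketched in control-record form (the values here are illustrative assumptions, not recommendations):

```
$SUBROUTINES ADVAN6 TOL=9          ; TOL: local error control for the DEQ solver
$ESTIMATION METHOD=1 NSIG=3 SIGL=9 ; NSIG: convergence criterion; SIGL: finite-difference step size
$COV SIGL=12                       ; a separate SIGL may be given for the covariance step
```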
I have no particular interest in the accuracy of the calculation of the number of significant digits in the parameter estimate, but there is a large body of empirical experience using a SIGDIG of 3 (or more). Furthermore, NONMEM reports parameter estimates with 3 significant digits, which I am more prepared to believe if convergence was achieved with a SIGDIG of 3.
As noted many times before on this list, there is no empirical evidence to support the idea that successful calculation of the asymptotic covariance matrix is associated with a more reliable OFV or parameter estimates. What you say may be true as a mathematical description of the properties of the variance-covariance matrix, but it does not mean that the OFV and parameter estimates are correlated with this description, and I know of no evidence to support such a correlation.
Leonid,
Thanks for telling me of your experience with SIGDIG=2, suggesting that it does not change the OFV or parameter estimates, at least for some kinds of problem. I think this empirical knowledge is valuable, but more examples are needed. Quicker run times because of fewer function evaluations are nice to have, but only if the OFV and parameter estimates are not inferior. There are no free lunches, so there must be some point where doing less work means the OFV and parameter estimates are less reliable.
I tried a non-parametric bootstrap of a published model describing tumour growth (Tham LS, Wang L, Soo RA, Lee SC, Lee HS, Yong WP, et al. A pharmacodynamic model for the time course of tumor shrinkage by gemcitabine + carboplatin in non-small cell lung cancer patients. Clin Cancer Res. 2008;14(13):4213-8). The model has 2 differential equations describing the amount of drug and the size of the tumour. I used the bootstrap average parameter estimates from SIGDIG=3 as the reference and calculated the bias in the estimates obtained with SIGDIG=2. The absolute bias ranged from 2.5% to 67% for the fixed-effect parameters and from 4.7% to 100.8% for the random effects. So I would conclude that using SIGDIG=2 may be problematic in terms of parameter estimates. On the other hand, the OFV values were very similar, with only 2 cases where the SIGDIG=2 OFV was more than 0.05 units worse (1.001 and 1.34 units worse).
You note that I hold an extreme opinion in the distribution of those who have offered opinions about the importance/non-importance of successful convergence and execution of the $COV step: "But it is safe to mention that Nick's view is on the extreme of the observed distribution of point of views on this subject "
A key point is that my extreme view is supported by experimental data (which has been independently confirmed by Marc Gastonguay and colleagues) and is not based on asymptotically derived speculations :-P.
Nick
Quoted reply history
On 23/10/2013 2:15 a.m., Bob Leary wrote:
> Nick -
> a) The usual definition of 'number of significant digits' is -log10(relative
> precision). Thus a sigdig of 3 is a precision of 1 part in 1000, and a sigdig
> of 2 corresponds to 1% precision, not 10% as in your example.
> b) that being said, the sigdigs in the parameters reported by NONMEM need to be
> taken with a grain of salt - they probably represent best case, 'speed of
> light' type numbers where the real precision may be considerably worse. I do
> not know specifically how they are computed, but my guess is that it is based
> on the fact that in a converged problem, the relative gradient has been driven
> below some specified tolerance. One can then infer precision from the
> condition number of the Hessian of the overall objective function and the
> actual relative gradients. But NONMEM uses a quasi-Newton method -
> there is no Hessian available to the method, but only a stand-in accumulated
> curvature matrix (a 'pseudo Hessian') that is usually much better conditioned
> than the actual Hessian. The only thing that can really be concluded is that,
> at the moment the top level iteration is stopped and convergence declared based
> on the relative gradient, the next iteration , if it were done, would not
> change the parameter estimates by more than the reported sigdig value. This is
> quite a different conclusion than reported parameter estimates are with
> sigdigits of the 'true' values.
>
> c) I know you have often argued that the failure of a covariance step has
> little or no evidential value for determining whether the minimization step was
> 'successful', and I generally agree with you.
> But the failure of the covariance step does mean that the Hessian could not be
> numerically estimated at all (failed the positive definiteness test). This
> does provide some additional evidence that one should be even more skeptical of
> the reported sigdig values of the parameter estimates.
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On
> Behalf Of Nick Holford
> Sent: Tuesday, October 22, 2013 4:45 AM
> To: [email protected]
> Subject: Re: [NMusers] Change of NSIG or R matrix
>
> Xinting,
>
> First of all 'successful minimization' has nothing to do with a good model.
> NONMEM's internal decision to declare success or termination is often a
> pseudo-random choice. If you look at the sigdigs of the estimate you will
> typically find that the lowest value is 2.9 and many others are greater than 5.
> This gives you a clue to which parameters are well determined and which are
> less well known. It is a not a YES/NO decision.
>
> Second, NSIG determines the number of significant digits in the parameter estimates. If
> you choose a number less than 3 then it means you don't care if the answer is 10.1 or
> 10.9. They both have 2 sig digs but the estimates differ by nearly 10%. There is a large
> body of empirical literature that has relied on NSIG=3 (or more). I do not see any reason
> to ignore this in order to get a meaningless "minimization successful" message
> from a random number generator.
>
> I look forward to hearing from "many" to understand why they believe that
> "minimization successful" indicates that the model results are somehow better even though
> the parameter estimates have hardly any significant digits.
>
> Nick
>
> On 22/10/2013 9:17 p.m., Xinting Wang wrote:
>
> > Dear Nick,
> >
> > Thank you very much for your suggestion. Could you explain a little
> > bit about the statement regarding NSIG < 3? I seem to remember that
> > many suggested to use a smaller NSIG to get a successful minimization.
> >
> > Dear Leonid,
> >
> > I read about the recommendation of SIGL, NSIG and TOL, but I am not
> > quite familiar with the use of these options in subroutine ADVAN4. If
> > I set SIGL a fixed value, let's say 12, and NSIG 3, does this mean I
> > also have to identify a value for TOL in $subroutine? I appreciate
> > your help very much.
> >
> > Thank you both.
> >
> > Regards
> >
> > On 8 October 2013 21:59, Leonid Gibiansky <[email protected]
> > <mailto:[email protected]>> wrote:
> >
> > Yes, it should be fine to use the S matrix if you cannot get the default
> > to run, and to use NSIG larger or smaller than the default value of 3
> > (although this is not guaranteed, usually NSIG does not change the
> > OF value or parameter estimates in any significant way). Note that
> > the NONMEM manual recommends that SIGL >= 3*NSIG, TOL >= SIGL.
> > A separate SIGL can be set on the COV step, and it is recommended that
> > SIGL >= 4*NSIG on the COV step. In real life I've seen many examples
> > where larger NSIG and SIGL resulted in a successful COV step, and
> > also many examples where the default values were better (in getting the
> > COV step). UNCONDITIONAL on the COV step allows you to run COV even when
> > minimization ended with some error.
> >
> > Contrary to Nick's experience, I have found that the COV step is useful as
> > it reveals which of the model parameters are poorly estimated, and
> > that CIs based on SEs are usually quite good and in general
> > agreement with the bootstrap CIs, but it may depend on the problem.
> >
> > Leonid
> >
> > --------------------------------------
> > Leonid Gibiansky, Ph.D.
> > President, QuantPharm LLC
> > web: www.quantpharm.com
> > e-mail: LGibiansky at quantpharm.com
> > tel: (301) 767 5566
> >
> > On 10/8/2013 3:57 AM, Xinting Wang wrote:
> >
> > Dear all,
> >
> > I have a naive question regarding the model building process in
> > NONMEM. With more and more covariates added to the model, I often
> > come across an error message saying "ERROR 134", or R MATRIX
> > SINGULAR.
> >
> > After searching the internet, I learned that changing NSIG in
> > $ESTIMATION and using MATRIX=S in $COV would be helpful for these
> > problems respectively. And from my own experience, it does help
> > with the model building.
> >
> > However, my concern is, I used different NSIG and MATRIX values in
> > the previous steps. Is it proper to use different NSIGs and
> > matrices in a single model building exercise? If not, could you
> > please explain this a little bit?
> >
> > Thank you in advance!
> >
> > Best Regards
> > --
> > Xinting
> > Wang
> >
> > --
> > Xinting
>
> --
> Nick Holford, Professor Clinical Pharmacology Dept Pharmacology & Clinical
> Pharmacology, Bldg 503 Room 302A University of Auckland,85 Park Rd,Private Bag
> 92019,Auckland,New Zealand
> office:+64(9)923-6730 mobile:NZ +64(21)46 23 53
> email: [email protected]
> http://holford.fmhs.auckland.ac.nz/
>
> Holford NHG. Disease progression and neuroscience. Journal of Pharmacokinetics
> and Pharmacodynamics. 2013;40:369-76
> http://link.springer.com/article/10.1007/s10928-013-9316-2
> Holford N, Heo Y-A, Anderson B. A pharmacokinetic standard for babies and
> adults. J Pharm Sci. 2013:
> http://onlinelibrary.wiley.com/doi/10.1002/jps.23574/abstract
> Holford N. A time to event tutorial for pharmacometricians. CPT:PSP. 2013;2:
> http://www.nature.com/psp/journal/v2/n5/full/psp201318a.html
> Holford NHG. Clinical pharmacology = disease progression + drug action. British
> Journal of Clinical Pharmacology. 2013:
> http://onlinelibrary.wiley.com/doi/10.1111/bcp.12170/abstract
>
>
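The NSIG/SIGL/TOL recommendations quoted from Leonid above (SIGL >= 3*NSIG, TOL >= SIGL, SIGL >= 4*NSIG on the COV step, MATRIX=S, UNCONDITIONAL) can be collected into a control stream sketch. This is only an illustration, not a recommendation: the ADVAN/TRANS choice, the estimation method, and all record options other than the ones discussed in this thread are placeholders.

```
$SUBROUTINES ADVAN4 TRANS4 TOL=9                        ; TOL >= SIGL
$ESTIMATION  METHOD=1 INTER MAXEVAL=9999 NSIG=3 SIGL=9  ; SIGL >= 3*NSIG
$COVARIANCE  MATRIX=S SIGL=12 UNCONDITIONAL             ; SIGL >= 4*NSIG on $COV
```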
Nick,

As you point out, several people (including myself) have confirmed that the bootstrap samples that have a successful covariance step do not differ from those that fail the covariance step. But that addresses the question of whether covariance step success is an important predictor of "goodness" with the same model across different data sets. I don't think that is the question. The question (I think) is whether covariance step success is a useful predictor of "goodness" with the same data set across different models. That is, if I have 2 models that otherwise seem equally useful, should I prefer the one that has a successful covariance step?

The only way to address that question is to do a trial. Take a data set and model it requiring a covariance step and again (objectively of course, without learning anything from the first modeling effort) without requiring a covariance step, and see which gives you a "better" model. We did this, using the automated model selection algorithm (genetic algorithm, now called "Darwin"). We also similarly compared different p values for the LRT (AIC vs p<0.05 vs p<0.01), and using cross validation or NPDE as the model selection criterion. The metric of "goodness" was -2LL on a validation data set (without parameter estimation). Results were:

  model selection criteria   -2LL on validation data set   # of covariates in model
  AIC                        648.16                        26
  AIC + covariance           651.80                        24
  LRT with p<0.05            648.47                        13
  LRT with p<0.01            652.84                         6
  cross validation           623.74                        32
  NPDE                       785.80                        21

Bottom line: AIC without the requirement for a covariance step gave the "best" (most predictive) model among the conventional endpoints. But cross validation gave a much better model (not surprising that a model selected for its ability to predict an independent data set was best at predicting an independent data set). Note that you do not correct for parsimony on a validation data set; you only compare the goodness of fit.
Mark Sale MD President, Next Level Solutions, LLC www.NextLevelSolns.com 919-846-9185 A carbon-neutral company See our real time solar energy production at: http://enlighten.enphaseenergy.com/public/systems/aSDz2458
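The cross-validation selection Mark describes can be sketched outside NONMEM. The sketch below is not Darwin and not a population PK model: it is a hypothetical linear example showing the mechanic of scoring candidate models by held-out -2LL (lower is better) with parameters re-estimated on each training fold, as opposed to a likelihood ratio test on the fit data.

```python
# Illustrative sketch (not Darwin, not NONMEM): compare two candidate
# models by cross-validated -2LL. All data and model names are hypothetical.
import math
import random

random.seed(1)

# Synthetic data: y depends on x1 but NOT on x2 (x2 is a spurious covariate).
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 + 1.5 * a + random.gauss(0, 1) for a in x1]

def fit_ols(xs_cols, ys):
    """Least-squares fit of y = b0 + sum(bi * xi) via the normal equations
    (tiny Gaussian elimination; fine for a handful of parameters)."""
    rows = [[1.0] + [col[i] for col in xs_cols] for i in range(len(ys))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yv for r, yv in zip(rows, ys)) for i in range(p)]
    for i in range(p):                      # forward elimination
        for j in range(i + 1, p):
            f = xtx[j][i] / xtx[i][i]
            for k in range(p):
                xtx[j][k] -= f * xtx[i][k]
            xty[j] -= f * xty[i]
    beta = [0.0] * p
    for i in reversed(range(p)):            # back substitution
        beta[i] = (xty[i] - sum(xtx[i][k] * beta[k]
                                for k in range(i + 1, p))) / xtx[i][i]
    return beta

def neg2ll(beta, xs_cols, ys, sigma=1.0):
    """-2 log likelihood under a normal residual model with fixed sigma."""
    total = 0.0
    for i, yv in enumerate(ys):
        pred = beta[0] + sum(b * col[i] for b, col in zip(beta[1:], xs_cols))
        total += math.log(2 * math.pi * sigma ** 2) + ((yv - pred) / sigma) ** 2
    return total

def cv_score(xs_cols, ys, folds=5):
    """Sum of held-out -2LL over K folds; parameters are re-fit per fold."""
    score = 0.0
    for f in range(folds):
        train = [i for i in range(len(ys)) if i % folds != f]
        test = [i for i in range(len(ys)) if i % folds == f]
        beta = fit_ols([[c[i] for i in train] for c in xs_cols],
                       [ys[i] for i in train])
        score += neg2ll(beta, [[c[i] for i in test] for c in xs_cols],
                        [ys[i] for i in test])
    return score

score_base = cv_score([x1], y)      # model with the true covariate only
score_full = cv_score([x1, x2], y)  # model that adds the spurious covariate
print(score_base, score_full)
```

As in Mark's point about spurious effects, the extra covariate cannot buy the full model anything on the held-out folds beyond chance, which is the parsimony control that cross validation provides.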
Quoted reply history
-------- Original Message --------
Subject: Re: [NMusers] Change of NSIG or R matrix
From: Nick Holford < [email protected] >
Date: Wed, October 23, 2013 1:50 am
To: nmusers < [email protected] >
Bob,
NM-Help defines what it means by the SIGDIG estimation option.
SIGDIGITS=n
Number of significant digits required in the final parameter
estimate. SIGDIGITS is not used by the Monte-Carlo methods.
Default: 3. May also be coded NSIGDIGITS.
SIGL=n
n is used to calculate the step-size for finite difference deriv-
atives independent of the SIGDIGITS value. If n=0 or n=100 then
SIGL is ignored and SIGDIGITS is used as in versions prior to
NONMEM 7. SIGL should usually be 2 to 3 times the value of NSIG.
The number of significant digits reported is the
number of significant digits in the least-well-determined element.
The report "MINIMIZATION SUCCESSFUL" is issued when this number is no
less than the number of significant digits requested using the SIGDIG-
ITS option of the $ESTIMATION record.
NONMEM 7 has an additional estimation option (SIGL) that is used to
provide additional control for finite-difference derivatives.
Unless Bob Bauer can clarify the meaning further I still believe that
the meaning of SIGDIG when used as a convergence criterion refers to the
number of significant digits in the parameter value. What you describe
is more like the meaning of TOL (which is used to control the local
error for the DEQ solver).
I have no particular interest in the accuracy of the calculation of the
number of significant digits in the parameter estimate but there is a
large body of empirical experience using SIGDIG of 3 (or more).
Furthermore, NONMEM reports parameter estimates with 3 significant
digits which I am prepared to believe more if convergence was achieved
with SIGDIG of 3.
As noted many times before on this list there is no empirical evidence
to support the idea that the calculation of the asymptotic covariance
matrix is associated with more reliable OFV or parameter estimates. What
you say may be true as a mathematical description of the
variance-covariance matrix properties but it does not mean the OFV and
parameter estimates are correlated with this description and there is no
evidence to support a correlation that I know of.
Leonid,
Thanks for telling me of your experience with SIGDIG=2 suggesting that
this does not change the OFV or parameter estimates at least for some
kinds of problem. I think this empirical knowledge is valuable but more
examples are needed. Quicker run times because of fewer function
evaluations are nice to have but only if the OFV and parameter estimates
are not inferior. There are no free lunches so there must be some point
where doing less work means that the OFV and parameter estimates are
less reliable.
I tried a non-parametric bootstrap of a published model describing
tumour growth (Tham LS, Wang L, Soo RA, Lee SC, Lee HS, Yong WP, et al.
A pharmacodynamic model for the time course of tumor shrinkage by
gemcitabine + carboplatin in non-small cell lung cancer patients. Clin
Cancer Res. 2008;14(13):4213-8). The model has 2 differential equations
to describe the amount of drug and the size of the tumour. I used the
bootstrap average parameter estimates from SIGDIG=3 as the reference and
calculated the bias in estimates obtained from SIGDIG=2. The absolute
bias ranged from 2.5% to 67% for the fixed effect parameters and 4.7% to
100.8% for the random effects. So I would conclude that perhaps using
SIGDIG=2 is problematic in terms of parameter estimates. On the other
hand the OFV values were very similar with only 2 cases where the
SIGDIG=2 OFV was more than 0.05 units worse (1.001 and 1.34 units worse).
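The bias comparison described above can be sketched in a few lines: bootstrap-average estimates from the SIGDIG=3 run serve as the reference, and the percent bias of the matching SIGDIG=2 estimates is computed relative to it. All parameter names and values below are made up for illustration.

```python
# Hypothetical reference (SIGDIG=3 bootstrap averages) and comparison run.
ref_nsig3 = {"CL": 10.0, "V": 50.0, "EMAX": 0.80}
est_nsig2 = {"CL": 10.3, "V": 48.0, "EMAX": 0.55}

def percent_bias(ref, est):
    """Percent bias of each estimate relative to the reference value."""
    return {k: 100.0 * (est[k] - ref[k]) / ref[k] for k in ref}

bias = percent_bias(ref_nsig3, est_nsig2)
print(bias)  # CL biased by +3%, V by -4%, EMAX by -31.25%
```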
You note that I hold an extreme opinion in the distribution of those who
have offered opinions about the importance/non-importance of successful
convergence and execution of the $COV step:
"But it is safe to mention that Nick's view is on the extreme of the
observed distribution of point of views on this subject "
A key point is that my extreme view is supported by experimental data
(which has been independently confirmed by Marc Gastonguay and
colleagues) and is not based on asymptotically derived speculations :-P.
Nick
On 23/10/2013 2:15 a.m., Bob Leary wrote:
> Nick -
> a) The usual definition of 'number of significant digits' is -log10(relative precision). Thus a sigdig of 3 is a precision of 1 part in 1000, and a sigdig of 2 corresponds to 1% precision, not 10% as in your example.
> b) that being said, the sigdigs in the parameters reported by NONMEM need to be taken with a grain of salt - they probably represent best case, 'speed of light' type numbers where the real precision may be considerably worse. I do not know specifically how they are computed, but my guess is that it is based on the fact that in a converged problem, the relative gradient has been driven below some specified tolerance. One can then infer precision from the condition number of the Hessian of the overall objective function and the actual relative gradients. But NONMEM uses a quasi-Newton method -
> there is no Hessian available to the method, but only a stand-in accumulated curvature matrix (a 'pseudo Hessian') that is usually much better conditioned than the actual Hessian. The only thing that can really be concluded is that, at the moment the top level iteration is stopped and convergence declared based on the relative gradient, the next iteration, if it were done, would not change the parameter estimates by more than the reported sigdig value. This is quite a different conclusion from saying that the reported parameter estimates agree with the 'true' values to the reported number of sigdigits.
>
> c) I know you have often argued that the failure of a covariance step has little or no evidential value for determining whether the minimization step was 'successful', and I generally agree with you.
> But the failure of the covariance step does mean that the Hessian could not be numerically estimated at all (failed the positive definiteness test). This does provide some additional evidence that one should be even more skeptical of the reported sigdig values of the parameter estimates.
>
>
>
> -----Original Message-----
> From: [email protected] [ mailto: [email protected] ] On Behalf Of Nick Holford
> Sent: Tuesday, October 22, 2013 4:45 AM
> To: [email protected]
> Subject: Re: [NMusers] Change of NSIG or R matrix
>
> Xinting,
>
> First of all 'successful minimization' has nothing to do with a good model. NONMEM's internal decision to declare success or termination is often a pseudo-random choice. If you look at the sigdigs of the estimate you will typically find that the lowest value is 2.9 and many others are greater than 5. This gives you a clue to which parameters are well determined and which are less well known. It is not a YES/NO decision.
>
> Second, NSIG determines the number of significant digits in the parameter estimates. If you choose a number less than 3 then it means you don't care if the answer is 10.1 or 10.9. They both have 2 sig digs but the estimates differ by nearly 10%. There is a large body of empirical literature that has relied on NSIG=3 (or more). I do not see any reason to ignore this in order to get a meaningless "minimization successful" message from a random number generator.
>
> I look forward to hearing from "many" to understand why they believe that "minimization successful" indicates that the model results are somehow better even though the parameter estimates have hardly any significant digits.
>
> Nick
Nick -

I think you either mis-read or mis-understood my last message.

1) You seem to be thinking of sigdigits (and NSIG for the least precise parameter) as an integer related to the number of significant decimal digits in the parameter estimate. Loosely speaking, this is the general idea, but it leaves open the problem in your example, where you claimed that at a sigdigit level of 2 a user would be somewhat indifferent between a parameter value of 1.0 and 1.1, a 10% difference, since these only differ by 1 in the second significant decimal digit.

The simple interpretation of sigdigits as the (integer) number of significant decimal digits is too vague - under it, sigdigit=2 implies a 10% precision for estimates like 1.0, but a 1% precision for estimates like 9.9. The use of sigdigit = -log10(parameter precision) preserves the general idea of number of significant decimal digits, but makes this idea much more precise.

I disagree (except as amplified in part 2 below) that what I described in my last message "is more like the meaning of TOL (which is used to control the local error for the DEQ solver)". What I described was in fact exactly the most usual (in the numerical analysis community) definition of sigdigits as it relates to precision of the parameter estimate, namely that sigdigits is -log10(precision of parameter estimates). Note that under this interpretation sigdigits is a floating point number, not an integer. NONMEM indeed reports a floating point number (not an integer) for SIGDIGITS in its MINIMIZATION SUCCESSFUL message, e.g.

NO. OF SIG. DIGITS IN FINAL EST.: 3.4

But lurking under NSIG is in fact a tolerance, since the specification of NSIG as a convergence criterion must somewhere be implemented as a tolerance test involving the underlying quantities used to estimate NSIG, so

2) I provided what I believe to be the most reasonable method of estimating precision - i.e. inferring it from the magnitude of the objective function gradient and approximate Hessian values at the converged parameter estimate. So here is the connection between NSIG and a tolerance - specifying an NSIG value in fact puts a limit on how large the quantity inverse(approximate Hessian) * gradient can be at the converged value.

Again, this is highly consistent with what HELP suggests when it notes that higher SIGLs (and hence more precise gradient estimates) should be used with higher values of NSIG - you need a more precise gradient value if you want to use it in connection with showing that a parameter precision meets the NSIG criterion. (As I noted in my original message, the Hessian is more problematical - the best a quasi-Newton method can do is substitute the accumulated curvature matrix as an approximation for the Hessian. This matrix is constructed out of sums of the tensor products of the gradient vectors from all iterations up to the current one, and thus has roughly the same precision as the gradient. But it is not really the same as the Hessian - if it were, there would be no need for a COVR step to try to estimate the real Hessian.)

If the SIGL value is too low, the estimate of the parameter precision in terms of first and second derivatives of the objective function itself becomes too imprecise; e.g. you may never be able to guarantee that a gradient computed with low precision has a magnitude low enough to meet a high NSIG precision criterion. The HELP-suggested use of a SIGL "2 or 3 times" that of the NSIG value is consistent with the well-known square root (first derivatives) or cube root (second derivatives) connection between the optimal step size for computing numerical difference-based derivative values and the precision of the evaluations of the function that is being differentiated.

Finally, note that HELP also mentions "SIGDIGITS is not used by the Monte-Carlo methods." The Monte Carlo methods are precisely the ones which do not use explicit gradients of the objective function, so they are unable to compute a gradient-based SIGDIGITS precision estimate.
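Bob's definition of sigdigits as the floating-point quantity -log10(relative precision), rather than an integer count of matching digits, is easy to illustrate. The uncertainty values below are hypothetical; the point is that two estimates which "agree to 2 significant digits" in the loose integer sense can have very different relative precision.

```python
# sigdigits as a continuous quantity: -log10(relative precision).
import math

def sigdigits(estimate, uncertainty):
    """Number of significant digits = -log10(|uncertainty| / |estimate|)."""
    return -math.log10(abs(uncertainty) / abs(estimate))

# An absolute uncertainty of 0.1 means very different things at 1.0 and 9.9:
print(round(sigdigits(1.0, 0.1), 2))  # 10% relative precision -> 1.0 sigdigits
print(round(sigdigits(9.9, 0.1), 2))  # ~1% relative precision -> ~2.0 sigdigits
```

This matches NONMEM reporting a non-integer value such as "NO. OF SIG. DIGITS IN FINAL EST.: 3.4".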
Katya,

If you want a p<0.05 model selection criterion then yes, if the penalty is set to 2 (for AIC) then you will get effects that don't pass at p<0.05 (LRT delta of 3.84). BTW, the OBJ given before is from the validation data set (which is different from the cross-validation data sets), WITHOUT parameter estimation, not the model development data set, so the data set is smaller, and the whole basis for hypothesis testing is invalid since no estimation was done.

But I'd like to suggest that we back up a little and ask why we are doing the model. If your goal is to test some hypothesis, and you're worried about inflated alpha error (which you probably should be, at least a little), then a Bonferroni-like correction on the model development data set is fine. But if you're interested in predicting an independent data set, perhaps your approach to overmodeling/inflated alpha error should be different. Cross validation is an alternative to the hypothesis-test-with-Bonferroni-like-correction (henceforth HTWBLC) approach to overfitting. Like a Bonferroni-like correction, cross validation isn't a guarantee that all effects you find are "real", in this case "real" defined as improving the predictive performance in an independent data set, and if you look at all possible combinations of 112 different effects (as we did in this data set) you still stand some chance of identifying spurious effects. But, to your point, many of the effects found in the cross validation had a very small effect (<0.1 points) on the -2LL, even in the model development data set, and would be rejected in a hypothesis testing world. They made it into the model because they improved the predictive performance (in the cross validation), if only a little. Note that ANY improvement in the cross validation approach is accepted; 0.001 points better is a better prediction.

Note also that, usually, a spurious effect should increase the OBJ in cross validation: if you put into the model that astrological sign predicts volume (when it doesn't), and fix that parameter, the OBJ in the validation set should increase; whereas if you estimate the parameter, the OBJ should decrease, distributed as a chi-square. What we're thinking about at this point is what other parsimony-management methods should be added to cross validation, because we are getting very large models. I'm leaning toward clinical significance, which would have eliminated many of the 32 covariate effects that we found. Clinical significance is defined as the upper or lower limits of the predicted parameter (e.g., volume) differing by more than 20% over the 95% CI of the covariates. That is, take the lower 2.5% of the weight, calculate the predicted volume, take the upper 97.5% of the weight, calculate the predicted volume; if those differ by > 20%, the effect of weight on volume is clinically significant. This clinical significance criterion is built into Darwin, our automated model selection method, and we're looking at that, in combination with cross validation, to use as a model selection criterion when simulation is the objective.

Mark
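The clinical-significance screen described above can be sketched as follows: evaluate the predicted parameter at the 2.5th and 97.5th percentiles of the covariate and flag the effect if the predictions differ by more than 20%. The covariate model form (a simple power model on weight) and all numbers below are hypothetical, not part of Darwin.

```python
# Hypothetical covariate model: V = V_ref * (WT / WT_ref) ** exponent.
def predicted_volume(weight, v_ref=50.0, wt_ref=70.0, exponent=1.0):
    return v_ref * (weight / wt_ref) ** exponent

def clinically_significant(cov_lo, cov_hi, threshold=0.20):
    """True if predictions at the covariate's 2.5% and 97.5% limits differ
    by more than `threshold` relative to the smaller prediction."""
    v_lo = predicted_volume(cov_lo)
    v_hi = predicted_volume(cov_hi)
    return abs(v_hi - v_lo) / min(v_lo, v_hi) > threshold

print(clinically_significant(50.0, 95.0))  # wide weight range -> True
print(clinically_significant(68.0, 72.0))  # narrow weight range -> False
```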
Quoted reply history
-------- Original Message --------
Subject: Re: [NMusers] Change of NSIG or R matrix
From: Ekaterina Gibiansky < [email protected] >
Date: Wed, October 23, 2013 11:04 am
To: Mark Sale - Next Level Solutions < [email protected] >, nmusers
< [email protected] >
Hi Mark,

This is a somewhat different topic, but judging from your example, all the model selection methods that you tested give you overly parameterized covariate models compared to LRT with p<0.01. Cross validation, which gave you 29 points lower OF in the validation set, has 26 more covariates. The rest of the methods give <5 points of OF improvement for an additional 7-21 covariates. All the methods are supposed to correct for parsimony during model development, so your example shows that they actually do not do it well. Have you tried LRT with p<0.001?

Regards,
Katya

Ekaterina Gibiansky, Ph.D.
CEO&CSO, QuantPharm LLC
Web: www.quantpharm.com
Email: EGibiansky at quantpharm.com
Tel: (301)-717-7032
Mark Sale MD President, Next Level Solutions, LLC www.NextLevelSolns.com 919-846-9185 A carbon-neutral company See our real time solar energy production at: http://enlighten.enphaseenergy.com/public/systems/aSDz2458 -------- Original Message -------- Subject: Re: [NMusers] Change of NSIG or R matrix From: Nick Holford < [email protected] > Date: Wed, October 23, 2013 1:50 am To: nmusers < [email protected] > Bob, NM-Help defines what it means by the SIGDIG estimation option. SIGDIGITS=n Number of significant digits required in the final parameter estimate. SIGDIGITS is not used by the Monte-Carlo methods. Default: 3. May also be coded NSIGDIGITS. SIGL=n n is used to calculate the step-size for finite difference deriv- atives independent of the SIGDIGITS value. If n=0 or n=100 then SIGL is ignored and SIGDIGITS is used as in versions prior to NONMEM 7. SIGL should usually be 2 to 3 times the value of NSIG. The number of significant digits reported is the number of significant digits in the least-well-determined element. The report "MINIMIZATION SUCCESSFUL" is issued when this number is no less than the number of significant digits requested using the SIGDIG- ITS option of the $ESTIMATION record. NONMEM 7 has an additional estimation option (SIGL) that it used to provide additional control for finite-difference derivatives. Unless Bob Bauer can clarify the meaning further I still believe that the meaning of SIGDIG when used as a convergence criterion refers to the number of significant digits in the parameter value. What you describe is more like the meaning of TOL (which is used to control the local error for the DEQ solver). I have no particular interest in the accuracy of the calculation of the number of significant digits in the parameter estimate but there is a large body of empirical experience using SIGDIG of 3 (or more). 
Furthermore, NONMEM reports parameter estimates with 3 significant digits which I am prepared to believe more if convergence was achieved with SIGDIG of 3. As noted many times before on this list there is no empirical evidence to support the idea that the calculation of the asymptotic covariance matrix is associated with more reliable OFV or parameter estimates. What you say may be true as a mathematical description of the variance-covariance matrix properties but it does not mean the OFV and parameter estimates are correlated with this description and there is no evidence to support a correlation that I know of. Leonid, Thanks for telling me of your experience with SIGDIG=2 suggesting that this does not change the OFV or parameter estimates at least for some kinds of problem. I think this empirical knowledge is valuable but more examples are needed. Quicker run times because of fewer function evaluations are nice to have but only if the OFV and parameter estimates are not inferior. There are no free lunches so there must be some point where doing less work means that the OFV and parameter estimates are less reliable. I tried a non-parametric bootstrap of a published model describing tumour growth (Tham LS, Wang L, Soo RA, Lee SC, Lee HS, Yong WP, et al. A pharmacodynamic model for the time course of tumor shrinkage by gemcitabine + carboplatin in non-small cell lung cancer patients. Clin Cancer Res. 2008;14(13):4213-8). The model has 2 differential equations to describe the amount of drug and the size of the tumour. I used the bootstrap average parameter estimates from SIGDIG=3 as the reference and calculated the bias in estimates obtained from SIGDIG=2. The absolute bias ranged from 2.5 to +67% for the fixed effect parameters and 4.7% to 100.8% for the random effects. So I would conclude that perhaps using SIGDIG=2 is problematic in terms of parameter estimates. 
On the other hand, the OFV values were very similar, with only 2 cases where the SIGDIG=2 OFV was more than 0.05 units worse (1.001 and 1.34 units worse).

You note that I hold an extreme opinion in the distribution of those who have offered opinions about the importance/non-importance of successful convergence and execution of the $COV step:

"But it is safe to mention that Nick's view is on the extreme of the observed distribution of points of view on this subject"

A key point is that my extreme view is supported by experimental data (which has been independently confirmed by Marc Gastonguay and colleagues) and is not based on asymptotically derived speculations :-P.

Nick

On 23/10/2013 2:15 a.m., Bob Leary wrote:
> Nick -
> a) The usual definition of 'number of significant digits' is -log10(relative precision). Thus a sigdig of 3 is a precision of 1 part in 1000, and a sigdig of 2 corresponds to 1% precision, not 10% as in your example.
> b) That being said, the sigdigs in the parameters reported by NONMEM need to be taken with a grain of salt - they probably represent best case, 'speed of light' type numbers where the real precision may be considerably worse. I do not know specifically how they are computed, but my guess is that it is based on the fact that in a converged problem, the relative gradient has been driven below some specified tolerance. One can then infer precision from the condition number of the Hessian of the overall objective function and the actual relative gradients. But NONMEM uses a quasi-Newton method - there is no Hessian available to the method, only a stand-in accumulated curvature matrix (a 'pseudo Hessian') that is usually much better conditioned than the actual Hessian.
> The only thing that can really be concluded is that, at the moment the top level iteration is stopped and convergence declared based on the relative gradient, the next iteration, if it were done, would not change the parameter estimates by more than the reported sigdig value. This is quite a different conclusion than that the reported parameter estimates are within the reported sigdigs of the 'true' values.
>
> c) I know you have often argued that the failure of a covariance step has little or no evidential value for determining whether the minimization step was 'successful', and I generally agree with you. But the failure of the covariance step does mean that the Hessian could not be numerically estimated at all (it failed the positive definiteness test). This does provide some additional evidence that one should be even more skeptical of the reported sigdig values of the parameter estimates.
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Holford
> Sent: Tuesday, October 22, 2013 4:45 AM
> To: [email protected]
> Subject: Re: [NMusers] Change of NSIG or R matrix
>
> Xinting,
>
> First of all, 'successful minimization' has nothing to do with a good model. NONMEM's internal decision to declare success or termination is often a pseudo-random choice. If you look at the sigdigs of the estimates you will typically find that the lowest value is 2.9 and many others are greater than 5. This gives you a clue to which parameters are well determined and which are less well known. It is not a YES/NO decision.
>
> Second, NSIG determines the number of significant digits in the parameter estimates. If you choose a number less than 3 then it means you don't care if the answer is 10.1 or 10.9. They both have 2 sig digs but the estimates differ by nearly 10%. There is a large body of empirical literature that has relied on NSIG=3 (or more).
> I do not see any reason to ignore this in order to get a meaningless "minimization successful" message from a random number generator.
>
> I look forward to hearing from "many" to understand why they believe that "minimization successful" indicates that the model results are somehow better even though the parameter estimates have hardly any significant digits.
>
> Nick
>
> On 22/10/2013 9:17 p.m., Xinting Wang wrote:
>> Dear Nick,
>>
>> Thank you very much for your suggestion. Could you explain a little bit about the statement regarding NSIG < 3? I seem to remember that many suggested using a smaller NSIG to get a successful minimization.
>>
>> Dear Leonid,
>>
>> I read about the recommendation of SIGL, NSIG and TOL, but I am not quite familiar with the use of these options in subroutine ADVAN4. If I set SIGL to a fixed value, let's say 12, and NSIG to 3, does this mean I also have to identify a value for TOL in $SUBROUTINE? I appreciate your help very much.
>>
>> Thank you both.
>>
>> Regards
>>
>> On 8 October 2013 21:59, Leonid Gibiansky <[email protected]> wrote:
>>
>> Yes, it should be fine to use the S matrix if you cannot get the default to run, and to use NSIG larger or smaller than the default value of 3 (although this is not guaranteed, usually NSIG does not change the OF value or parameter estimates in any significant way). Note that the NONMEM manual recommends that SIGL >= 3*NSIG and TOL >= SIGL. A separate SIGL can be set on the COV step, and it is recommended that SIGL >= 4*NSIG on the COV step. In real life I've seen many examples where larger NSIG and SIGL resulted in a successful COV step, and also many examples where the default values were better (in getting the COV step). UNCONDITIONAL on the COV step allows you to run COV even when minimization ended with some error.
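The rules of thumb discussed in this reply can be collected into one hedged control-stream sketch. The numbers are illustrative only (NSIG=3; SIGL=9, i.e. >= 3*NSIG; TOL=9, i.e. >= SIGL; $COV SIGL=12, i.e. >= 4*NSIG) and are not taken from any model in this thread. Note that TOL applies to the ODE solver, so it is only meaningful with a differential-equation ADVAN; with an analytical routine such as ADVAN4 (the one asked about above) no TOL is needed.

```
; illustrative values only: NSIG=3, SIGL=9 (>= 3*NSIG), TOL=9 (>= SIGL)
; TOL matters for ODE ADVANs (e.g. ADVAN6/ADVAN13); analytical ADVANs
; such as ADVAN4 do not use it
$SUBROUTINES ADVAN13 TOL=9
$ESTIMATION METHOD=1 INTERACTION MAXEVAL=9999 NSIG=3 SIGL=9
; separate SIGL on the $COV step (>= 4*NSIG), with MATRIX=S and
; UNCONDITIONAL as discussed in this thread
$COVARIANCE MATRIX=S SIGL=12 UNCONDITIONAL
```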
>> Contrary to Nick's experience, I found that the COV step is useful as it reveals which of the model parameters are poorly estimated, and that CIs based on SEs are usually quite good and in general agreement with the bootstrap CIs, but it may depend on the problem.
>>
>> Leonid
>>
>> --------------------------------------
>> Leonid Gibiansky, Ph.D.
>> President, QuantPharm LLC
>> web: www.quantpharm.com
>> e-mail: LGibiansky at quantpharm.com
>> tel: (301) 767 5566
>>
>> On 10/8/2013 3:57 AM, Xinting Wang wrote:
>>
>> Dear all,
>>
>> I have a naive question regarding the model building process in NONMEM. With more and more covariates added in the model, I often come across an error message saying "ERROR 134", or R MATRIX SINGULAR.
>>
>> After searching the internet, I learned that changing NSIG in $ESTIMATION and MATRIX=S in $COV would be helpful for these two problems respectively. And from my own experience, it does help with the model building.
>>
>> However, my concern is, I used different NSIG and MATRIX settings in the previous steps. Is it proper to use different NSIGs and matrices in a single model-building exercise? If not, could you please explain this a little bit?
>>
>> Thank you in advance!
>>
>> Best Regards
>> --
>> Xinting Wang
>>
>> --
>> Xinting
>
> --
> Nick Holford, Professor Clinical Pharmacology
> Dept Pharmacology & Clinical Pharmacology, Bldg 503 Room 302A
> University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand
> office:+64(9)923-6730 mobile:NZ +64(21)46 23 53
> email: [email protected]
> http://holford.fmhs.auckland.ac.nz/
>
> Holford NHG. Disease progression and neuroscience. Journal of Pharmacokinetics and Pharmacodynamics. 2013;40:369-76. http://link.springer.com/article/10.1007/s10928-013-9316-2
> Holford N, Heo Y-A, Anderson B. A pharmacokinetic standard for babies and adults. J Pharm Sci. 2013. http://onlinelibrary.wiley.com/doi/10.1002/jps.23574/abstract
> Holford N. A time to event tutorial for pharmacometricians. CPT:PSP. 2013;2. http://www.nature.com/psp/journal/v2/n5/full/psp201318a.html
> Holford NHG. Clinical pharmacology = disease progression + drug action. British Journal of Clinical Pharmacology. 2013. http://onlinelibrary.wiley.com/doi/10.1111/bcp.12170/abstract