Dear all,
I am wondering if someone can provide references for the condition number
thresholds we often see (e.g., <1000). Another rule I encountered in
graduate school is that a condition number <10^n (n = number of parameters)
is OK. Personally, I rely on the correlation matrix rather than the
condition number, and I have seen cases where the condition number is large
(according to the 1000 rule, though below the 10^n rule) but the correlation
matrix is fine.
I want to provide these references for my teaching purposes, and any help is
greatly appreciated.
Regards,
Ayyappa
Condition number
19 messages
10 people
Latest: Dec 02, 2022
Hi Ayyappa,
I think the condition number was first proposed as a statistic to diagnose
multicollinearity in multiple linear regression analyses based on an
eigenvalue analysis of the X'X matrix. You can probably search the
statistical literature and multiple linear regression textbooks to find
various rules for the condition number as well as other statistics related
to the eigenvalue analysis. For the CN<1000 rule I typically reference the
following textbook:
Montgomery and Peck (1982). Introduction to Linear Regression Analysis.
Wiley, NY (pp. 301-302).
The condition number is good at detecting model instability but it is not
very good for identifying the source. Inspecting the correlation matrix for
extreme pairwise correlations is better suited for identifying the source of
the instability when it only involves a couple of parameters. It becomes
more challenging to identify the source of the instability
(multicollinearity) when CN>1000 but none of the pairwise correlations
are extreme (|corr|>0.95). When CN>1000 we will often find several
pairwise correlations that are moderately high (|corr|>0.7), but it may be
hard to uncover a pattern or source of the instability without trying
alternative models that eliminate one or more of the parameters associated
with these moderate-to-high correlations.
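To illustrate the eigenvalue analysis described above, here is a minimal sketch (Python/NumPy, with invented data; the 1000 threshold and the 0.95/0.7 cutoffs are the rules of thumb quoted in this thread, and all variable names are illustrative):

```python
import numpy as np

# Invented illustration: two nearly collinear predictors and one
# independent one. The condition number flags the instability; the
# pairwise correlation matrix points at its source (x1 vs x2).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.05, size=200)  # almost a copy of x1
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

corr = np.corrcoef(X, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)
cn = eigvals.max() / eigvals.min()   # ratio of largest to smallest eigenvalue

print(f"condition number: {cn:.0f}")  # large: instability detected
print(np.round(corr, 3))              # |corr(x1, x2)| near 1: source found
```

Here the condition number only says "something is unstable"; it is the extreme entry of the correlation matrix that names the offending pair, which is exactly the division of labor described above.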
Best,
Ken
Kenneth G. Kowalski
Kowalski PMetrics Consulting, LLC
Email: [email protected]
Cell: 248-207-5082
This is also discussed in my book on page 70.
The first definition is simply the ratio of the largest to
smallest eigenvalue
K = L1/Lp (51)
where L1 and Lp are the largest and smallest eigenvalues of
the correlation matrix (Jackson 1991). The second way is to
define K as
K = sqrt(L1/Lp) (52)
The latter method is often used simply because the
condition numbers are smaller. The user should be aware
how a software package computes a condition number. For
instance, SAS uses (52). For this book (51) will be used as
the definition of the condition number. Condition numbers
range from 1, which indicates perfect stability, to infinity,
which indicates perfect instability. As a rule of thumb,
Log10(K) using (51) indicates the number of decimal places
lost by a computer due to round-off errors due to matrix
inversion. Most computers have about 16 decimal digits of
accuracy and if the condition number is 10^4, then the result
will be accurate to at most 12 (calculated as 16 - 4) decimal
places of accuracy.
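The two definitions and the digits-lost rule of thumb above can be checked directly (a sketch; the eigenvalues L1 and Lp are invented for illustration):

```python
import numpy as np

# Eq. (51): K = L1/Lp   vs.   Eq. (52): K = sqrt(L1/Lp) (as in SAS),
# with invented largest/smallest eigenvalues.
L1, Lp = 2.5, 2.5e-4
k_ratio = L1 / Lp             # Eq. (51): 10^4
k_sqrt = (L1 / Lp) ** 0.5     # Eq. (52): 10^2

digits_lost = np.log10(k_ratio)  # rule of thumb: ~4 digits lost to round-off
digits_left = 16 - digits_lost   # of ~16 decimal digits in double precision
print(k_ratio, k_sqrt, digits_lost, digits_left)
```

Note how the same matrix yields a condition number of 10^4 under (51) but only 100 under (52), which is why it matters which definition a software package reports.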
It is difficult to find useful yardsticks in the literature
about what constitutes a large condition number because
many books have drastically different cut-offs. For this
book, the following guidelines will be used. For a linear
model, when the condition number is less than 10^4, no
serious collinearity is present. When the condition number
is between 10^4 and 10^6, moderate collinearity is present,
and when the condition number exceeds 10^6, severe
collinearity is present and the values of the parameter
estimates are not to be trusted. The difficulty with the use
of the condition number is that it fails to identify which
columns are collinear and simply indicates that collinearity
is present. If multicollinearity is present wherein a function
of one or more columns is collinear with a function of one
or more other columns, then the condition number will fail
to identify that collinearity. See Belsley et al. (1980) for
details on how to detect collinearity among sets of
covariates.
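To make the last point concrete, here is a sketch (invented data) of a set-wise multicollinearity: three columns that are jointly collinear even though no single pairwise correlation is extreme, so neither statistic alone identifies the sets involved:

```python
import numpy as np

# x3 is (almost) x1 + x2: a three-way collinearity. No pairwise |corr|
# exceeds roughly 0.71, yet the condition number of the correlation
# matrix is enormous.
rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x3 = x1 + x2 + rng.normal(scale=0.02, size=500)

corr = np.corrcoef(np.column_stack([x1, x2, x3]), rowvar=False)
eigvals = np.linalg.eigvalsh(corr)
cn = eigvals.max() / eigvals.min()

print(np.round(np.abs(corr), 2))   # pairwise correlations look moderate
print(f"condition number: {cn:.0f}")
```

This is the situation Belsley et al. (1980) address with variance-decomposition proportions rather than pairwise correlations.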
I also found this on stack exchange
https://math.stackexchange.com/questions/2392992/matrix-condition-number-and-loss-of-accuracy
pete
Peter Bonate, PhD
Executive Director
Pharmacokinetics, Modeling, and Simulation (PKMS)
Clinical Pharmacology and Exploratory Development (CPED)
Astellas
1 Astellas Way
Northbrook, IL 60062
[email protected]
(224) 619-4901
Quote of the week –
“Dancing with the Stars” is not owned by Astellas.
-----Original Message-----
From: Ayyappa Chaturvedula
Sent: Tuesday, November 29, 2022 9:20 AM
To: Ken Kowalski
Cc: [email protected]
Subject: Re: [NMusers] Condition number
Thank you, Ken. It is very reassuring.
I have also seen discussion on other forums of the condition number as a
function of the dimension of the problem (n). I see a contradiction between
the 10^n rule and a static 1000 threshold. I am curious whether someone can
also comment on this and the 10^n rule.
Regards,
Ayyappa
Dear Ayyappa,
A nice discussion!
It may be worthwhile to inspect further collinearity statistics, see e.g. https://cran.r-project.org/web/packages/olsrr/vignettes/regression_diagnostics.html , with for example VIF and CI being sometimes useful to detect problematic parameters in my experience.
As is already clear from the preceding discussion, indeed please do not rely on just applying "rules" but try to think through what these properties mean for your model.
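The variance inflation factors mentioned above can also be computed by hand if olsrr is not at hand; a minimal sketch in plain NumPy (invented data; VIF_j = 1/(1 - R_j^2), where R_j^2 comes from regressing column j on the remaining columns):

```python
import numpy as np

def vif(X):
    # VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j
    # on an intercept plus the remaining columns.
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        Z = np.column_stack([np.ones(len(y)), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        r2 = 1.0 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(2)
x1 = rng.normal(size=300)
x2 = x1 + rng.normal(scale=0.1, size=300)  # collinear pair
x3 = rng.normal(size=300)                  # independent
V = vif(np.column_stack([x1, x2, x3]))
print(np.round(V, 1))  # VIFs for x1 and x2 are large; x3 stays near 1
```

Unlike the condition number, the VIF is per-parameter, which is what makes it useful for locating the problematic columns.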
Hope this helps,
Jeroen
http://pd-value.com
[email protected]
@PD_value
+31 6 23118438
-- More value out of your data!
On 29-11-2022 17:25, Ayyappa Chaturvedula wrote:
> Hi Ken,
>
> You are correct, the 10^n rule is in the context of individual level modeling.
>
> Thank you Pete for chiming in, I learned the difference you mention from your book too.
>
> Regards,
> Ayyappa
>
> On Tue, Nov 29, 2022 at 10:19 AM Ken Kowalski < [email protected] > wrote:
>
> I have seen models with a successful COV step and CN > 10^5, but I
> certainly have not seen COV steps run with a CN > 10^20. Thus,
> the CN < 10^n rule has got to break down when n is large. Do
> Gabrielsson and Weiner discuss this rule in the context of simple
> nonlinear regression of individual subject (or animal) curves, or
> do they also propose it in the context of population models
> with nonlinear mixed effects? I suspect it was only proposed for
> the former.
>
> Not to rehash old ground: a successful COV step does not imply
> that a model is stable, even if none of the pairwise correlations
> are extreme, when the CN is very large.
>
> *From:*Ayyappa Chaturvedula [mailto:[email protected]]
> *Sent:* Tuesday, November 29, 2022 11:07 AM
> *To:* Ken Kowalski <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: [NMusers] Condition number
>
> Hi Ken,
>
> Thank you again. But I have seen models with CN of 10^5 and above
> with no issues in the covariance step, and with correlations not
> reaching 0.95 (though some at moderate levels). It will be
> interesting to hear other experiences.
>
> The 10^n rule is from PK-PD Data Analysis, Gabrielsson and
> Weiner, 3rd edition, page 313. I read this book through most of
> my grad school days.
>
> Regards,
>
> Ayyappa
>
> On Tue, Nov 29, 2022 at 9:35 AM Ken Kowalski
> <[email protected]> wrote:
>
> Hi Ayyappa,
>
> I have not seen this rule but it strikes me as being too
> liberal to apply in pharmacometrics where n can be very large
> for the models we fit. If we have a structural model with say
> n=4 or 5 parameters and then also investigate covariate
> effects on these parameters it would not be unusual to have a
> covariate model with n=20+ fixed effects parameters. I doubt
> we can get the COV step to run such that we can observe a CN
> >10^20.
>
> I have not seen CN criteria indexed by n. The classifications
> of collinearity that I've seen based on CN are:
>
> Moderate: 100 <= CN < 1000
> High: 1000 <= CN < 10,000
> Extreme: CN >= 10,000
>
> Ken
Dear All,
I would like to devalue the condition number and multi-collinearity in
nonlinear regression.
The reason we consider the condition number (or multi-collinearity) is that
it may cause the following fitting (estimation) problems:
1. Fitting failure (fails to converge, fails to minimize)
2. Unrealistic point estimates
3. Too-wide standard errors
If you do not see the above problems (i.e., no estimation problem and
modest standard errors), you do not need to give attention to the condition
number.
I think I saw the 10^n (n = number of parameters) criterion in an old
version of Gabrielsson’s book many years ago (but not in the latest version).
Best regards,
Kyun-Seop Bae
Hi Ken & Kyun-Seop,
I agree it should be taught, since it is prevalent in the industry, and it
should be looked at as something to investigate further, but no hard and
fast rule should be applied to decide whether the model is reasonable and
fit for purpose. That should be done in conjunction with other diagnostic
plots.
One thing that has always bothered me about the condition number is that it
is calculated based on the final parameter estimates, not the scaled
parameter estimates. The scaling is supposed to put the gradients on a
comparable scale and fix many numerical problems. Hence, if the scaling
works as it is supposed to, small changes may not affect the collinearity
as strongly as the calculated condition number suggests.
This is mainly why I see it as a number to keep in mind instead of a hard
and fast rule.
Matt
On Tue, Nov 29, 2022 at 5:09 PM Ken Kowalski <[email protected]> wrote:
> Hi Kyun-Seop,
>
>
>
> I would state things a little differently: rather than saying “devalue
> condition number and multi-collinearity,” we should treat CN as a diagnostic,
> and rules such as CN>1000 should NOT be used as a hard and fast rule to
> reject a model. I agree with Jeroen that we should understand the
> implications of a high CN and the impact multi-collinearity may have on the
> model estimation, and that there are other diagnostics such as correlations,
> variance inflation factors (VIF), standard errors, CIs, etc. that can also
> help with our understanding of the effects of multi-collinearity and its
> implications for model development.
>
>
>
> That being said, if you have a model with a high CN and the model
> converges with realistic point estimates and reasonable standard errors
> then it may still be reasonable to accept that model. However, in this
> setting I would probably still want to re-run the model with different
> starting values and make sure it converges to the same OFV and set of point
> estimates.
>
>
>
> As the smallest eigenvalue goes to 0 and the CN goes to infinity we end up
> with a singular Hessian matrix (R matrix) so we know that at some point a
> high enough CN will result in convergence and COV step failures. Thus, you
> shouldn’t simply dismiss CN as not having any diagnostic value, just don’t
> apply it in a rule such as CN>1000 to blindly reject a model. The CN>1000
> rule should only be used to call your attention to the potential for an
> issue that warrants further investigation before accepting the model or
> deciding how to alter the model to improve stability in the estimation.
>
>
>
> Best,
>
>
>
> Ken
>
>
>
> Kenneth G. Kowalski
>
> Kowalski PMetrics Consulting, LLC
>
> Email: [email protected]
>
> Cell: 248-207-5082
Hi Matt,
Correct me if I’m wrong, but I thought NONMEM calculates the condition
number based on the correlation matrix of the parameter estimates, so it is
scaled by the standard errors of the estimates.
Ken
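If the correlation matrix is indeed what is used, the distinction matters numerically: the condition number of a raw covariance matrix mixes parameter scale with correlation, while the correlation matrix removes the scale effect. A sketch with an invented 2x2 covariance matrix (variances 1e4 and 1e-2, correlation 0.5):

```python
import numpy as np

# Invented covariance matrix of two estimates on very different scales.
cov = np.array([[1.0e4, 5.0],
                [5.0, 1.0e-2]])
sd = np.sqrt(np.diag(cov))
corr = cov / np.outer(sd, sd)   # off-diagonal becomes 0.5

cn_cov = np.linalg.cond(cov)    # inflated by the 10^6 scale difference
cn_corr = np.linalg.cond(corr)  # reflects the correlation alone (= 3 here)
print(f"{cn_cov:.3g} vs {cn_corr:.3g}")
```

So a condition number computed from the covariance matrix can look alarming purely because parameters live on different scales, which speaks to Matt's scaling concern.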
from the manual:
Iteration -1000000003 indicates that this line contains the condition number, lowest, highest, eigenvalues of the correlation matrix of the variances of the final parameters.
Leonid Gibiansky
On 11/29/2022 7:59 PM, Ken Kowalski wrote:
> Hi Matt,
>
> I’m pretty sure Stu Beal told me many years ago that NONMEM calculates the eigenvalues from the correlation matrix. Maybe Bob Bauer can chime in here?
>
> Ken
>
> *From:*Matthew Fidler [mailto:[email protected]]
> *Sent:* Tuesday, November 29, 2022 7:56 PM
> *To:* Ken Kowalski <[email protected]>
>
> *Cc:* Kyun-Seop Bae <[email protected]>; [email protected]; Jeroen Elassaiss-Schaap (PD-value B.V.) <[email protected]>
>
> *Subject:* Re: [NMusers] Condition number
>
> Hi Ken,
>
> I am unsure, since I don't have my NONMEM manual handy.
>
> I based my understanding on reading about condition numbers in numerical analysis, which seemed to use the parameter estimates:
>
> https://en.wikipedia.org/wiki/Condition_number
>
> If it uses the correlation matrix, it could be less sensitive.
>
> Matt
This is great. Just like the glory days of NMusers. Any moment now Nick Holford is going to chime in.
I'm not an expert in matrix algebra, but is the correlation matrix the right one to be using? We are concerned about inversion of the Hessian. That instability is what affects our parameter estimates and standard errors. Doesn't that depend on the Jacobian? Shouldn't we be looking at the eigenvalues of the Jacobian matrix instead?
And to echo what was already said: never use the condition number as an absolute. It's a yardstick. FYI, one time I got a negative eigenvalue from NONMEM and would not have known how unstable the model was unless I looked at the eigenvalue.
Pete.
Quoted reply history
> On Nov 29, 2022, at 7:17 PM, Leonid Gibiansky <[email protected]>
> wrote:
>
> from the manual:
>
> Iteration -1000000003 indicates that this line contains the condition number
> , lowest, highest, Eigen values of the correlation matrix of the variances of
> the final parameters.
>
>
>
>> On 11/29/2022 7:59 PM, Ken Kowalski wrote:
>> Hi Matt,
>> I’m pretty sure Stu Beal told me many years ago that NONMEM calculates the
>> eigenvalues from the correlation matrix. Maybe Bob Bauer can chime in here?
>> Ken
>> *From:*Matthew Fidler [mailto:[email protected]]
>> *Sent:* Tuesday, November 29, 2022 7:56 PM
>> *To:* Ken Kowalski <[email protected]>
>> *Cc:* Kyun-Seop Bae <[email protected]>; [email protected]; Jeroen
>> Elassaiss-Schaap (PD-value B.V.) <[email protected]>
>> *Subject:* Re: [NMusers] Condition number
>> Hi Ken,
>> I am unsure, since I don't have my NONMEM manual handy.
>> I based my understanding on reading about condition numbers in numerical
>> analysis, which seemed to use the parameter estimates:
>> https://en.wikipedia.org/wiki/Condition_number
>> If it uses the correlation matrix, it could be less sensitive.
>> Matt
>> On Tue, Nov 29, 2022 at 6:11 PM Ken Kowalski <[email protected]> wrote:
>> Hi Matt,
>> Correct me if I’m wrong but I thought NONMEM calculates the
>> condition number based on the correlation matrix of the parameter
>> estimates so it is scaled based on the standard errors of the estimates.
>> Ken
>> *From:* Matthew Fidler [mailto:[email protected]]
>> *Sent:* Tuesday, November 29, 2022 7:04 PM
>> *To:* Ken Kowalski <[email protected]>
>> *Cc:* Kyun-Seop Bae <[email protected]>; [email protected]; Jeroen Elassaiss-Schaap (PD-value B.V.) <[email protected]>
>> *Subject:* Re: [NMusers] Condition number
>> Hi Ken & Kyun-Seop,
>> I agree it should be taught, since it is prevalent in the industry,
>> and it should be looked at as something to investigate further, but no
>> hard and fast rule should be applied to whether the model is reasonable
>> and fit for purpose. That should be done in conjunction with other
>> diagnostic plots.
>> One thing that has always bothered me about the condition number is
>> that it is calculated based on the final parameter estimates, but
>> not the scaled parameter estimates. The scaling is supposed to help
>> put the gradients on a comparable scale and fix many numerical
>> problems here. Hence, if the scaling works as it is supposed to,
>> small changes may not affect the collinearity as strongly as the
>> calculated condition number suggests.
>> This is mainly why I see it as a number to keep in mind instead of a
>> hard and fast rule.
>> Matt
>> On Tue, Nov 29, 2022 at 5:09 PM Ken Kowalski <[email protected]> wrote:
>> Hi Kyun-Seop,
>> I would state things a little differently rather than say
>> “devalue condition number and multi-collinearity” we should
>> treat CN as a diagnostic and rules such as CN>1000 should NOT be
>> used as a hard and fast rule to reject a model. I agree with
>> Jeroen that we should understand the implications of a high CN
>> and the impact multi-collinearity may have on the model
>> estimation and that there are other diagnostics such as
>> correlations, variance inflation factors (VIF), standard errors,
>> CIs, etc. that can also help with our understanding of the
>> effects of multi-collinearity and its implications for model
>> development.
>> That being said, if you have a model with a high CN and the
>> model converges with realistic point estimates and reasonable
>> standard errors then it may still be reasonable to accept that
>> model. However, in this setting I would probably still want to
>> re-run the model with different starting values and make sure it
>> converges to the same OFV and set of point estimates.
>> As the smallest eigenvalue goes to 0 and the CN goes to infinity
>> we end up with a singular Hessian matrix (R matrix) so we know
>> that at some point a high enough CN will result in convergence
>> and COV step failures. Thus, you shouldn’t simply dismiss CN as
>> not having any diagnostic value, just don’t apply it in a rule
>> such as CN>1000 to blindly reject a model. The CN>1000 rule
>> should only be used to call your attention to the potential for
>> an issue that warrants further investigation before accepting
>> the model or deciding how to alter the model to improve
>> stability in the estimation.
>> Best,
>> Ken
>> Kenneth G. Kowalski
>> Kowalski PMetrics Consulting, LLC
>> Email: [email protected]
>> Cell: 248-207-5082
>> *From:* [email protected] [mailto:[email protected]] *On Behalf Of *Kyun-Seop Bae
>> *Sent:* Tuesday, November 29, 2022 5:10 PM
>> *To:* [email protected]
>> *Subject:* Fwd: [NMusers] Condition number
>> Dear All,
>> I would like to devalue condition number and multi-collinearity
>> in nonlinear regression.
>> The reason we consider condition number (or multi-collinearity)
>> is that this may cause the following fitting (estimation) problems;
>> 1. Fitting failure (fail to converge, fail to minimize)
>> 2. Unrealistic point estimates
>> 3. Too wide standard errors
>> If you do not see the above problems (i.e., no estimation
>> problem with modest standard error), you do not need to give
>> attention to the condition number.
>> I think I saw 10^(n – parameters) criterion in an old version of
>> Gabrielsson’s book many years ago (but not in the latest version).
>> Best regards,
>> Kyun-Seop Bae
>> On Tue, 29 Nov 2022 at 22:59, Ayyappa Chaturvedula <[email protected]> wrote:
>> Dear all,
>> I am wondering if someone can provide references for the
>> condition number thresholds we are seeing (<1000) etc. Also,
>> the other way I have seen when I was in graduate school that
>> condition number <10^n (n- number of parameters) is OK.
>> Personally, I am depending on correlation matrix rather than
>> condition number and have seen cases where condition number
>> is large (according to 1000 rule but less than 10^n rule)
>> but correlation matrix is fine.
>> I want to provide these for my teaching purposes and any
>> help is greatly appreciated.
>> Regards,
>> Ayyappa
Hello all:
Non-positive definiteness or negative eigenvalues are reported during the analysis of the R matrix (decomposition and inversion), which occurs before the correlation matrix is constructed. Often this is caused by numerical imprecision. If the R matrix step fails, the $COV step fails to produce a final variance-covariance matrix and, of course, does not produce a correlation matrix. If the R matrix inversion step succeeds, the variance-covariance matrix and its correlation matrix are produced, and the correlation matrix is then assessed for its eigenvalues. So both the R matrix (first step) and the correlation matrix (second step) are decomposed and assessed.
Robert J. Bauer, Ph.D.
Senior Director
Pharmacometrics R&D
ICON Early Phase
731 Arbor way, suite 100
Blue Bell, PA 19422
Office: (215) 616-6428
Mobile: (925) 286-0769
[email protected]
http://www.iconplc.com/
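[Editor's note] Bauer's two-step flow can be sketched in miniature. The following is a hypothetical illustration (a made-up 2x2 R matrix, written in Python rather than the thread's R purely for convenience; this is not NONMEM's actual code): step 1 checks the R matrix for positive definiteness and inverts it; step 2 scales the resulting covariance matrix to a correlation matrix and assesses its eigenvalues.

```python
import math

def cov_step(R):
    """Sketch of the two-step $COV assessment described above, for a symmetric
    2x2 R matrix given as (a, b, c) = [[a, b], [b, c]]. Hypothetical illustration."""
    a, b, c = R
    det = a * c - b * b
    # Step 1: decompose/invert R; fail unless it is positive definite
    if a <= 0 or det <= 0:
        return None  # analogous to an R-matrix non-positive-semidefinite failure
    cov = (c / det, -b / det, a / det)     # variance-covariance matrix = inverse of R
    sa, sc = math.sqrt(cov[0]), math.sqrt(cov[2])
    corr = (1.0, cov[1] / (sa * sc), 1.0)  # scale to the correlation matrix
    # Step 2: eigenvalues of the 2x2 correlation matrix -> reported condition number
    d = abs(corr[1])                       # eigenvalues are 1 + d and 1 - d
    return (1.0 + d) / (1.0 - d)

r_ok = cov_step((4.0, 2.0, 3.0))   # well-behaved R matrix: finite CN
r_bad = cov_step((4.0, 4.0, 4.0))  # singular R matrix: COV-step failure
print(r_ok, r_bad)
```

In this sketch a singular R matrix aborts before any correlation matrix exists, matching the ordering Bob describes.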
Hi everyone,
This has been a great discussion!
Bob: I’d like to clarify something that Pete, Matt, Ken, and Leonid were
discussing about how the covariance matrix is calculated. I believe that
NONMEM rescales the values for estimation and then reverses the rescaling
for reporting. Is the covariance matrix calculated on the rescaled values
or on the final parameter estimate values?
Thanks,
Bill
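[Editor's note] On the rescaling question: whichever values NONMEM uses internally, it is easy to see why parameter scale matters for a covariance-based CN but not for a correlation-based one. A minimal sketch with made-up 2x2 numbers (in Python rather than the thread's R, purely for illustration): inflating one variance, as rescaling a parameter would, blows up the covariance-matrix CN while the correlation-matrix CN is untouched.

```python
import math

def cn2(a, b, c):
    """Condition number (eigenvalue ratio) of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    d = math.hypot((a - c) / 2.0, b)
    return (mean + d) / (mean - d)

def cov_and_corr_cn(v1, v2, rho):
    """CN of a 2x2 covariance matrix with variances v1, v2 and correlation rho,
    paired with the CN of the corresponding correlation matrix."""
    b = rho * math.sqrt(v1 * v2)
    return cn2(v1, b, v2), cn2(1.0, rho, 1.0)

same_scale = cov_and_corr_cn(1.0, 1.0, 0.6)   # comparable standard errors
mixed_scale = cov_and_corr_cn(1e6, 1.0, 0.6)  # one SE 1000-fold larger
print(same_scale)   # both CNs are 4 here
print(mixed_scale)  # covariance CN explodes; correlation CN is still 4
```

This is the same effect Matt raised about scaled versus unscaled estimates: the correlation matrix is invariant to the rescaling, so its CN isolates collinearity from scale.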
Dear Bill,
You can see how NONMEM calculates these quantities in R with the nmw package.

install.packages("nmw")
library(nmw)

DataAll = Theoph
colnames(DataAll) = c("ID", "BWT", "DOSE", "TIME", "DV")
DataAll[,"ID"] = as.numeric(as.character(DataAll[,"ID"]))

nTheta = 3
nEta = 3
nEps = 2

THETAinit = c(2, 50, 0.1)
OMinit = matrix(c(0.2, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.2), nrow=nEta, ncol=nEta)
SGinit = diag(c(0.1, 0.1))
LB = rep(0, nTheta)       # Lower bound
UB = rep(1000000, nTheta) # Upper bound

FGD = deriv(~DOSE/(TH2*exp(ETA2))*TH1*exp(ETA1)/(TH1*exp(ETA1) - TH3*exp(ETA3))*
              (exp(-TH3*exp(ETA3)*TIME) - exp(-TH1*exp(ETA1)*TIME)),
            c("ETA1", "ETA2", "ETA3"),
            function.arg=c("TH1", "TH2", "TH3", "ETA1", "ETA2", "ETA3", "DOSE", "TIME"),
            func=TRUE, hessian=TRUE)
H = deriv(~F + F*EPS1 + EPS2, c("EPS1", "EPS2"),
          function.arg=c("F", "EPS1", "EPS2"), func=TRUE)

PRED = function(THETA, ETA, DATAi)
{
  FGDres = FGD(THETA[1], THETA[2], THETA[3], ETA[1], ETA[2], ETA[3], DOSE=320, DATAi[,"TIME"])
  Gres = attr(FGDres, "gradient")
  Hres = attr(H(FGDres, 0, 0), "gradient")
  if (e$METHOD == "LAPL") {
    Dres = attr(FGDres, "hessian")
    Res = cbind(FGDres, Gres, Hres, Dres[,1,1], Dres[,2,1], Dres[,2,2], Dres[,3,])
    colnames(Res) = c("F", "G1", "G2", "G3", "H1", "H2", "D11", "D21", "D22", "D31", "D32", "D33")
  } else {
    Res = cbind(FGDres, Gres, Hres)
    colnames(Res) = c("F", "G1", "G2", "G3", "H1", "H2")
  }
  return(Res)
}

####### First Order Approximation Method
InitStep(DataAll, THETAinit=THETAinit, OMinit=OMinit, SGinit=SGinit, LB=LB, UB=UB,
         Pred=PRED, METHOD="ZERO")
(EstRes = EstStep()) # 4 sec
(CovRes = CovStep()) # 2 sec
PostHocEta() # Uses e$FinalPara from EstStep()
TabStep()

####### First Order Conditional Estimation with Interaction Method
InitStep(DataAll, THETAinit=THETAinit, OMinit=OMinit, SGinit=SGinit, LB=LB, UB=UB,
         Pred=PRED, METHOD="COND")
(EstRes = EstStep()) # 2 min
(CovRes = CovStep()) # 1 min
get("EBE", envir=e)
TabStep()

####### Laplacian Approximation with Interaction Method
InitStep(DataAll, THETAinit=THETAinit, OMinit=OMinit, SGinit=SGinit, LB=LB, UB=UB,
         Pred=PRED, METHOD="LAPL")
(EstRes = EstStep()) # 4 min
(CovRes = CovStep()) # 1 min
get("EBE", envir=e)
TabStep()
Best regards,
Kyun-Seop Bae
Quoted reply history
Thanks Ken and Al. I miss these discussions, while others in NMusers are
probably thinking “how can there be that many emails on this”.
I think we are conflating different things. The reason we look at the CN is that during the optimization process NONMEM has to invert a matrix (it's either the gradient or the Hessian, I am not sure), and again in the calculation of the standard errors. The log10 of the CN is roughly how many digits are lost in that inversion process. You can have matrix instability for many reasons, collinearity being one of them, but it is not the only reason. As you said, you can have parameters that are widely different in scale; that can also cause instability in that inversion process. Thus, a high CN does not always imply you have collinearity.
So if NONMEM is reporting the eigenvalues of the correlation matrix, then this has a couple of consequences:
• The CN no longer means how many digits are lost during inversion
• The CN no longer indicates how stable that matrix inversion is
• The only thing it is good for now is detecting collinearity
• We use a cutoff value of 1000 because that implies we lose 3 digits of accuracy during the inversion; this value may not be applicable to the eigenvalues of a correlation matrix
This is why using the correlation matrix makes no sense to me. Still doesn't.
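[Editor's note] Pete's reading of log10(CN) as digits lost can be demonstrated directly. A minimal sketch (a made-up, nearly singular 2x2 system; Python rather than R purely for convenience): a tiny perturbation of the right-hand side of a linear system is amplified in the solution by roughly the condition number.

```python
import math

def eig2(a, b, c):
    """Eigenvalues (largest, smallest) of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    d = math.hypot((a - c) / 2.0, b)
    return mean + d, mean - d

def solve2(a, b, c, y1, y2):
    """Solve [[a, b], [b, c]] x = y by Cramer's rule."""
    det = a * c - b * b
    return ((y1 * c - b * y2) / det, (a * y2 - b * y1) / det)

r = 1 - 1e-6                       # off-diagonal element close to 1: near-collinear
lam_max, lam_min = eig2(1.0, r, 1.0)
cn = lam_max / lam_min             # about 2e6, i.e. log10(CN) of about 6.3
x = solve2(1.0, r, 1.0, 1.0, 1.0)
xp = solve2(1.0, r, 1.0, 1.0, 1.0 + 1e-10)  # nudge the right-hand side
shift = max(abs(xp[0] - x[0]), abs(xp[1] - x[1]))
print(cn, shift)
```

A right-hand side known to 10 digits yields a solution good to only about 4 in this example, consistent with losing roughly log10(CN) ≈ 6 digits in the inversion.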
Ayyappa – see what a can of worms you opened. Lol.
pete
Peter Bonate, PhD
Executive Director
Pharmacokinetics, Modeling, and Simulation (PKMS)
Clinical Pharmacology and Exploratory Development (CPED)
Astellas
1 Astellas Way
Northbrook, IL 60062
[email protected]
(224) 619-4901
Quote of the week –
“Dancing with the Stars” is not owned by Astellas.
Quoted reply history
-----Original Message-----
From: Bonate, Peter
Sent: Wednesday, November 30, 2022 9:13 AM
To: Ken Kowalski <[email protected]>; 'Leonid Gibiansky'
<[email protected]>
Cc: 'Matthew Fidler' <[email protected]>; 'Kyun-Seop Bae'
<[email protected]>; [email protected]; 'Jeroen Elassaiss-Schaap
(PD-value B.V.)' <[email protected]>
Subject: RE: [NMusers] Condition number
Just wanted to follow up on a few things.
First, Nick, glad to hear from you again.
I gave up trying to understand how NONMEM works years ago. I don't need to know how the engine in my car works to drive it, and knowing wouldn't make me a better driver.
But I did look at CN years ago. One of my first publications, The Effect of Collinearity on Parameter Estimates in Nonlinear Mixed Effect Models (1999), showed that when you put correlated covariates into a model (with correlations greater than 0.75) the standard errors of the estimates become inflated and the estimates of the parameters themselves become biased. This is why we don't put weight and BSA on the same parameter, for example. You can spot this problem easily from the CN. So, although I would never choose a model based solely on its condition number, I always look at it as part of the totality of evidence for how good a model is. But maybe that's just me.
And to follow up with this statement from Ken:
That is, a high CN in any one of the three matrices (Hessian,
covariance matrix, correlation matrix) will result in a high CN in the others.
I would think that the correlation matrix will give you the smallest condition
number because it's scaled. I needed to see this for myself in R. I made a
covariance matrix and computed the eigenvalues then transformed it to a
correlation matrix. The condition number of the correlation matrix is lower
than the covariance matrix condition number.
> cov <- c(10, 2, 1, 2, 4, 3, 1, 3, 6)
> cov <- matrix(cov, nrow=3, byrow=TRUE)
> cov
     [,1] [,2] [,3]
[1,]   10    2    1
[2,]    2    4    3
[3,]    1    3    6
> p <- cov2cor(cov)
> p
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.3162278 0.1290994
[2,] 0.3162278 1.0000000 0.6123724
[3,] 0.1290994 0.6123724 1.0000000
> eig.cov <- eigen(cov)
> eig.p <- eigen(p)
> CN.cov <- eig.cov$values[1]/eig.cov$values[3]
> CN.p <- eig.p$values[1]/eig.p$values[3]
> CN.cov
[1] 6.68266
> CN.p
[1] 4.899988
So I guess we need Bob Bauer to chime in on this latter issue.
pete
-----Original Message-----
From: Ken Kowalski <[email protected]>
Sent: Tuesday, November 29, 2022 8:29 PM
To: Bonate, Peter <[email protected]>; 'Leonid Gibiansky'
<[email protected]>
Cc: 'Matthew Fidler' <[email protected]>; 'Kyun-Seop Bae'
<[email protected]>; [email protected]; 'Jeroen Elassaiss-Schaap
(PD-value B.V.)' <[email protected]>
Subject: RE: [NMusers] Condition number
Hi Pete,
I would say the Hessian would be the more appropriate matrix rather than the Jacobian, since the covariance matrix of the parameter estimates is typically estimated as the inverse of the Hessian in most nonlinear regression packages, and that is what NONMEM does if you use the MATRIX=R option on the $COV step instead of NONMEM's default sandwich estimator. The eigenvalues of the Hessian, the eigenvalues of the covariance matrix of the parameter estimates, and the eigenvalues of the correlation matrix of the parameter estimates are all going to be related. That is, a high CN in any one of the three matrices (Hessian, covariance matrix, correlation matrix) will result in a high CN in the others.
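[Editor's note] Ken's statement can be made exact for the first two matrices: for a symmetric positive definite matrix, the eigenvalues of the inverse are the reciprocals of the originals, so the CN of the Hessian equals the CN of the covariance matrix derived from it. A small check on a made-up 2x2 "Hessian" (Python rather than R, illustration only):

```python
import math

def eig2(a, b, c):
    """Eigenvalues (largest, smallest) of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    d = math.hypot((a - c) / 2.0, b)
    return mean + d, mean - d

def cn2(a, b, c):
    lmax, lmin = eig2(a, b, c)
    return lmax / lmin

a, b, c = 4.0, 2.0, 3.0              # made-up positive definite "Hessian" [[4, 2], [2, 3]]
det = a * c - b * b
inv = (c / det, -b / det, a / det)   # its inverse, standing in for the covariance matrix

cn_hess = cn2(a, b, c)
cn_cov = cn2(*inv)
print(cn_hess, cn_cov)               # the two condition numbers coincide
```

The correlation matrix is a diagonal rescaling of the covariance matrix, so its CN is a different (typically smaller) number, as Pete's R example in this thread shows.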
I have encountered NONMEM reporting a negative eigenvalue too. I assume this
is the result of a numerical precision issue because if it was truly negative,
then the Hessian would not be positive semi-definite and hence the COV step
should fail. I am not a numerical analyst so this is another issue that I
would be interested in hearing from Bob Bauer on how NONMEM can report a
negative eigenvalue.
Best,
Ken
Kenneth G. Kowalski
Kowalski PMetrics Consulting, LLC
Email: [email protected]
Cell: 248-207-5082
Hi Pete,
I'm really not trying to conflate these two different concepts. What you are describing is a desire for a diagnostic that relates to the numeric instability of matrix inversion. If that is your desire, then yes, the CN of the correlation matrix is not what you want. As I suggested previously, a CN derived from the Hessian (R matrix in NONMEM parlance) or from the covariance matrix (inverse R if the MATRIX=R option is employed) is probably what you want, because that is the matrix actually being inverted, and its CN will be larger because differences in the scales of the variances as well as collinearity both contribute to the potential numerical instability in the inversion process. However, my focus is more purely on the impact of collinearity, where the choice of model and the limitations of the data to support that choice can have a big impact on model stability.
From Bob Bauer's response earlier today it sounds like NONMEM behind the scenes is performing the eigenvalue analysis of the Hessian (R matrix) as the first step, and if that is successful (all eigenvalues positive such that the R matrix is positive definite) and hence invertible, then the COV step runs and the covariance matrix, correlation matrix, and eigenvalues (PRINT=E option of the $COV step) from the correlation matrix will be reported. Note that when the COV step fails we often get the warning message that the R matrix is non-positive semi-definite (NPSD), which implies one or more of the eigenvalues of the R matrix is 0 (singular) or negative. So clearly, NONMEM is calculating these eigenvalues behind the scenes, and it sounds like you would like NONMEM to report them even when it determines that they are all positive and the $COV step can run successfully. I see no reason why NONMEM could not make this an option so that you can assess for yourself how much loss in accuracy there might have been in inverting the R matrix. Maybe broach this with Bob?
Best,
Ken
Quoted reply history
-----Original Message-----
From: Bonate, Peter [mailto:[email protected]]luSent: Wednesday,
November 30, 2022 7:52 PM
To: Ken Kowalski <[email protected]>; 'Leonid Gibiansky'
<[email protected]>
Cc: 'Matthew Fidler' <[email protected]>; 'Kyun-Seop Bae'
<[email protected]>; [email protected]; 'Jeroen Elassaiss-Schaap
(PD-value B.V.)' <[email protected]>; Alan Maloney ([email protected])
<[email protected]>
Subject: RE: [NMusers] Condition number
Thanks Ken and Al. I miss these discussions, while others in NMusers are
probably thinking “how can there be that many emails on this”.
I think we are conflating different things. The reason we look at the CN is
that during the optimization process NONMEM has to invert a matrix (its either
gradient or the Hessian, I am not sure), and also again in the calculation of
the standard errors. The log10 CN is how many digits are lost in that
inversion process. You can have matrix instability for many reasons,
collinearity being one of them, but that is not the only reason. As you said,
you can have parameters that are widely different in scale; that can also cause
instability in that inversion process. Thus, a high CN does not always imply
you have collinearity.
So if NONMEM is reporting the eigenvalues of the correlation matrix, then this
has a couple of consequences:
• The CN no longer means how many digits are lost during inversion
• The CN no longer indicates how stable that matrix inversion is
• The only thing it is good for now is detecting collinearity.
• We use a cutoff value of 1000 because that implies we lose 3 digits of
accuracy during the inversion. This value may not be applicable to the
eigenvalues of a correlation matrix.
This is why using the correlation matrix makes no sense to me. Still doesn’t.
Ayyappa – see what a can of worms you opened. Lol.
pete
Peter Bonate, PhD
Executive Director
Pharmacokinetics, Modeling, and Simulation (PKMS) Clinical Pharmacology and
Exploratory Development (CPED) Astellas
1 Astellas Way
Northbrook, IL 60062
[email protected]
(224) 619-4901
Quote of the week –
“Dancing with the Stars” is not owned by Astellas.
-----Original Message-----
From: Bonate, Peter
Sent: Wednesday, November 30, 2022 9:13 AM
To: Ken Kowalski <[email protected]>; 'Leonid Gibiansky'
<[email protected]>
Cc: 'Matthew Fidler' <[email protected]>; 'Kyun-Seop Bae'
<[email protected]>; [email protected]; 'Jeroen Elassaiss-Schaap
(PD-value B.V.)' <[email protected]>
Subject: RE: [NMusers] Condition number
Just wanted to follow up on a few things.
First, Nick, glad to hear from you again.
I gave up trying to understand how NONMEM works years ago. I don't need to
know how the engine works in my car to drive it or make me a better driver.
But I did look at CN years ago. One of my first publications, The Effect of
Collinearity on Parameter Estimates in Nonlinear Mixed Effect Models (1999),
showed that when you put correlated covariates into a model (with correlations
greater than 0.75) the standard errors of the estimates become inflated, and the
estimates of the parameters themselves become biased. This is why we don't
put weight and BSA on the same parameter, for example. You can spot this
problem easily from the CN. So, although I would never choose a model based
solely on its condition number, I always look at it as part of the totality of
evidence for how good a model is. But maybe that's just me.
And to follow up with this statement from Ken:
That is, a high CN in any one of the three matrices (Hessian,
covariance matrix, correlation matrix) will result in a high CN in the others.
I would think that the correlation matrix will give you the smallest condition
number because it's scaled. I needed to see this for myself in R. I made a
covariance matrix and computed the eigenvalues then transformed it to a
correlation matrix. The condition number of the correlation matrix is lower
than the covariance matrix condition number.
> cov <- c(10, 2, 1, 2, 4, 3, 1, 3, 6)
> cov <- matrix(cov, nrow=3, byrow=TRUE)
> cov
[,1] [,2] [,3]
[1,] 10 2 1
[2,] 2 4 3
[3,] 1 3 6
> p <- cov2cor(cov)
> p
[,1] [,2] [,3]
[1,] 1.0000000 0.3162278 0.1290994
[2,] 0.3162278 1.0000000 0.6123724
[3,] 0.1290994 0.6123724 1.0000000
> eig.cov <- eigen(cov)
> eig.p <- eigen(p)
> CN.cov <- eig.cov$values[1]/eig.cov$values[3]
> CN.p <- eig.p$values[1]/eig.p$values[3]
> CN.cov
[1] 6.68266
> CN.p
[1] 4.899988
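A hedged extension of the transcript above: rescaling one variance in the same covariance matrix changes the covariance-matrix CN by orders of magnitude, while the correlation-matrix CN is untouched, since correlation is scale-invariant.

```r
# Rescale the first parameter of the covariance matrix above by 100.
cov <- matrix(c(10, 2, 1, 2, 4, 3, 1, 3, 6), nrow = 3, byrow = TRUE)
D <- diag(c(100, 1, 1))
cov.scaled <- D %*% cov %*% D
cn <- function(m) { ev <- eigen(m)$values; max(ev) / min(ev) }
cn(cov.scaled)            # orders of magnitude larger than 6.68266
cn(cov2cor(cov.scaled))   # 4.899988, unchanged
```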
So I guess we need Bob Bauer to chime in on this latter issue.
pete
Peter Bonate, PhD
Executive Director
Pharmacokinetics, Modeling, and Simulation (PKMS) Clinical Pharmacology and
Exploratory Development (CPED) Astellas
1 Astellas Way
Northbrook, IL 60062
[email protected]
(224) 619-4901
Quote of the week –
“Dancing with the Stars” is not owned by Astellas.
-----Original Message-----
From: Ken Kowalski <[email protected]>
Sent: Tuesday, November 29, 2022 8:29 PM
To: Bonate, Peter <[email protected]>; 'Leonid Gibiansky'
<[email protected]>
Cc: 'Matthew Fidler' <[email protected]>; 'Kyun-Seop Bae'
<[email protected]>; [email protected]; 'Jeroen Elassaiss-Schaap
(PD-value B.V.)' <[email protected]>
Subject: RE: [NMusers] Condition number
Hi Pete,
I would say the Hessian would be the more appropriate matrix rather than the
Jacobian, since the covariance matrix of the parameter estimates is typically
estimated as the inverse of the Hessian in most nonlinear regression packages,
which is what NONMEM does if you use the MATRIX=R option on the $COV step
instead of NONMEM's default sandwich estimator. The eigenvalues of the
Hessian, the eigenvalues of the covariance matrix of the parameter estimates,
and the eigenvalues of the correlation matrix of the parameter estimates are
all going to be related. That is, a high CN in any one of the three matrices
(Hessian, covariance matrix, correlation matrix) will result in a high CN in
the others.
I have encountered NONMEM reporting a negative eigenvalue too. I assume this
is the result of a numerical precision issue because if it was truly negative,
then the Hessian would not be positive semi-definite and hence the COV step
should fail. I am not a numerical analyst so this is another issue that I
would be interested in hearing from Bob Bauer on how NONMEM can report a
negative eigenvalue.
Best,
Ken
Kenneth G. Kowalski
Kowalski PMetrics Consulting, LLC
Email: [email protected]
Cell: 248-207-5082
-----Original Message-----
From: Bonate, Peter [mailto:[email protected]]
Sent: Tuesday, November 29, 2022 8:27 PM
To: Leonid Gibiansky <[email protected]>
Cc: Ken Kowalski <[email protected]>; Matthew Fidler
<[email protected]>; Kyun-Seop Bae <[email protected]>;
[email protected]; Jeroen Elassaiss-Schaap (PD-value B.V.)
<[email protected]>
Subject: Re: [NMusers] Condition number
This is great. Just like the glory days of NMusers. Any moment now Nick Holford
is going to chime in.
I’m not an expert in matrix algebra but is the correlation matrix the right one
to be using? We are concerned about inversion of the Hessian. That
instability is what affects our parameter estimates and standard errors.
Doesn’t that depend on the Jacobian? Shouldn’t we be looking at the
eigenvalues of the Jacobian matrix instead?
And to echo what was already said. Never use the condition number as an
absolute. It’s a yardstick. FYI- one time I got a negative eigenvalue from
nonmem and would not have known how unstable the model was unless I looked at
the eigenvalue.
Pete.
> On Nov 29, 2022, at 7:17 PM, Leonid Gibiansky <[email protected]>
> wrote:
>
> from the manual:
>
> Iteration -1000000003 indicates that this line contains the condition number
> , lowest, highest, Eigen values of the correlation matrix of the variances of
> the final parameters.
>
>
>
>> On 11/29/2022 7:59 PM, Ken Kowalski wrote:
>> Hi Matt,
>> I’m pretty sure Stu Beal told me many years ago that NONMEM calculates the
>> eigenvalues from the correlation matrix. Maybe Bob Bauer can chime in here?
>> Ken
>> *From:*Matthew Fidler [mailto:[email protected]]
>> *Sent:* Tuesday, November 29, 2022 7:56 PM
>> *To:* Ken Kowalski <[email protected]>
>> *Cc:* Kyun-Seop Bae <[email protected]>; [email protected];
>> Jeroen Elassaiss-Schaap (PD-value B.V.) <[email protected]>
>> *Subject:* Re: [NMusers] Condition number
>> Hi Ken, I am unsure, since
>> I don't have my NONMEM manual handy.
>> I based my understanding on reading about condition numbers in numerical
>> analysis, which seemed to use the parameter estimates:
>> https://en.wikipedia.org/wiki/Condition_number
>> If it uses the correlation matrix, it could be less sensitive.
>> Matt
>> On Tue, Nov 29, 2022 at 6:11 PM Ken Kowalski <[email protected]
>> <mailto:[email protected]>> wrote:
>> Hi Matt,
>> Correct me if I’m wrong but I thought NONMEM calculates the
>> condition number based on the correlation matrix of the parameter
>> estimates so it is scaled based on the standard errors of the estimates.
>> Ken
>> *From:*Matthew Fidler [mailto:[email protected]
>> <mailto:[email protected]>]
>> *Sent:* Tuesday, November 29, 2022 7:04 PM
>> *To:* Ken Kowalski <[email protected]
>> <mailto:[email protected]>>
>> *Cc:* Kyun-Seop Bae <[email protected]
>> <mailto:[email protected]>>; [email protected]
>> <mailto:[email protected]>; Jeroen Elassaiss-Schaap (PD-value
>> B.V.) <[email protected] <mailto:[email protected]>>
>> *Subject:* Re: [NMusers] Condition number
>> Hi Ken & Kyun-Seop,
>> I agree it should be taught, since it is prevalent in the industry,
>> and should be looked at as something to investigate further, but no
>> hard and fast rule should be applied as to whether the model is reasonable
>> and fit for purpose. That should be done in conjunction with other
>> diagnostic plots.
>> One thing that has always bothered me about the condition number is
>> that it is calculated based on the final parameter estimates, but
>> not the scaled parameter estimates. The scaling is supposed
>> to help put the gradient on a comparable scale and fix many
>> numerical problems here. Hence, if the scaling works as it is
>> supposed to, small changes may not affect the collinearity as
>> strongly as the calculated condition number suggests.
>> This is mainly why I see it as a number to keep in mind instead of a
>> hard and fast rule.
>> Matt
>> On Tue, Nov 29, 2022 at 5:09 PM Ken Kowalski <[email protected]
>> <mailto:[email protected]>> wrote:
>> Hi Kyun-Seop,
>> I would state things a little differently: rather than say
>> “devalue condition number and multi-collinearity,” we should
>> treat CN as a diagnostic, and rules such as CN>1000 should NOT be
>> used as a hard and fast rule to reject a model. I agree with
>> Jeroen that we should understand the implications of a high CN
>> and the impact multi-collinearity may have on the model
>> estimation and that there are other diagnostics such as
>> correlations, variance inflation factors (VIF), standard errors,
>> CIs, etc. that can also help with our understanding of the
>> effects of multi-collinearity and its implications for model
>> development.
>> That being said, if you have a model with a high CN and the
>> model converges with realistic point estimates and reasonable
>> standard errors then it may still be reasonable to accept that
>> model. However, in this setting I would probably still want to
>> re-run the model with different starting values and make sure it
>> converges to the same OFV and set of point estimates.
>> As the smallest eigenvalue goes to 0 and the CN goes to infinity
>> we end up with a singular Hessian matrix (R matrix) so we know
>> that at some point a high enough CN will result in convergence
>> and COV step failures. Thus, you shouldn’t simply dismiss CN as
>> not having any diagnostic value, just don’t apply it in a rule
>> such as CN>1000 to blindly reject a model. The CN>1000 rule
>> should only be used to call your attention to the potential for
>> an issue that warrants further investigation before accepting
>> the model or deciding how to alter the model to improve
>> stability in the estimation.
>> Best,
>> Ken
>> Kenneth G. Kowalski
>> Kowalski PMetrics Consulting, LLC
>> Email: [email protected] <mailto:[email protected]>
>> Cell: 248-207-5082
>> *From:*[email protected]
>> <mailto:[email protected]>
>> [mailto:[email protected]
>> <mailto:[email protected]>] *On Behalf Of *Kyun-Seop Bae
>> *Sent:* Tuesday, November 29, 2022 5:10 PM
>> *To:* [email protected] <mailto:[email protected]>
>> *Subject:* Fwd: [NMusers] Condition number
>> Dear All,
>> I would like to devalue condition number and multi-collinearity
>> in nonlinear regression.
>> The reason we consider condition number (or multi-collinearity)
>> is that this may cause the following fitting (estimation) problems:
>> 1. Fitting failure (fail to converge, fail to minimize)
>> 2. Unrealistic point estimates
>> 3. Too wide standard errors
>> If you do not see the above problems (i.e., no estimation
>> problem with modest standard error), you do not need to give
>> attention to the condition number.
>> I think I saw the 10^n (n = number of parameters) criterion in an old version of
>> Gabrielsson’s book many years ago (but not in the latest version).
>> Best regards,
>> Kyun-Seop Bae
>> On Tue, 29 Nov 2022 at 22:59, Ayyappa Chaturvedula
>> <[email protected] <mailto:[email protected]>> wrote:
>> Dear all,
>> I am wondering if someone can provide references for the
>> condition number thresholds we are seeing (<1000) etc. Also,
>> the other way I have seen when I was in graduate school that
>> condition number <10^n (n- number of parameters) is OK.
>> Personally, I am depending on correlation matrix rather than
>> condition number and have seen cases where condition number
>> is large (according to 1000 rule but less than 10^n rule)
>> but correlation matrix is fine.
>> I want to provide these for my teaching purposes and any
>> help is greatly appreciated.
>> Regards,
>> Ayyappa
Dear Pete, Ken, Nick, Bob and all,
I truly appreciate the spirit of discussion and helpful background information.
It is nostalgic to see all on this thread. Thank you.
Regards,
Ayyappa
Quoted reply history
> On Nov 30, 2022, at 7:43 PM, Ken Kowalski <[email protected]> wrote:
>
> Hi Pete,
>
> I'm really not trying to conflate these two different concepts. What you are
> describing is a desire to have a diagnostic that relates to the numeric
> instability of matrix inversion. If that is your desire, then yes the CN of
> the correlation matrix is not what you want. As I suggested previously, a CN
> derived from the Hessian (R-matrix in NONMEM parlance) or from the covariance
> matrix (inverse R if MATRIX=R option is employed) is probably what you want
> because this is the matrix that is actually being inverted and the CN will be
> larger because both differences in scales of the variances as well as
> collinearity issues will contribute to the potential numerical instability in
> the inversion process. However, my focus is more purely on the impact of
> collinearity where the choice of model and the limitations with the data to
> support the choice of model can have a big impact on model stability.
>
> From Bob Bauer's response earlier today it sounds like NONMEM behind the
> scenes is performing the eigenvalue analysis of the Hessian (R-matrix) as the
> first step and if that is successful (all eigenvalues positive such that the
> R matrix is positive semidefinite) and hence invertible then the COV step
> runs and the covariance matrix, correlation matrix and eigenvalues (PRINT=E
> option of $COV step) from the correlation matrix will be reported. Note when
> the COV step fails we often get the warning message that the R-matrix is
> non-positive semi-definite (NPSD) which implies one or more of the
> eigenvalues from the R matrix is 0 (singular) or negative. So clearly,
> NONMEM is calculating these eigenvalues behind the scenes and it sounds like
> you would like NONMEM to report these even if NONMEM determines that they are
> all positive and the $COV step can run successfully. I see no reason why
> NONMEM could not make this an option so that you can assess for yourself how
> much loss in accuracy there might have been in inverting the R matrix. Maybe
> broach this with Bob?
>
> Best,
>
> Ken
>
…trying to resend this, as I think only Pete got it, and not NMUSERS!
Hi All,
I wanted to make two comments related to the interesting discussions around
condition number (CN) and a related point regarding Dose-Exposure-Response
(D-E-R) modelling with weak data.
As a foreword, I always wish to simulate from any model I have developed. Thus
I am always interested in the joint distribution of the P parameters given the
data, and never just some point estimates of the P parameters. We can either
sample, say, 1000 parameter sets (e.g. using Bayesian MCMC) or we can use the
multivariate normal (MVN) approximation to draw 1000 samples using the
estimated parameters and the approximate var-cov matrix (in a maximum
likelihood estimation (MLE) world). The latter is weak for many reasons. For
example, it is not parameterisation invariant, so if you model ED50 as ED50 or
log(ED50) you will get a different MVN approximation (likelihood profiling may
help here, but not nearly enough!). In addition, we are just “hoping” the full
joint distribution is MVN based on crude curvature measures at the peak of the
likelihood function. These curvature measures can imply 0 correlation between
two parameters that are truly dependent (e.g. a U-shaped relationship). Thus I
would encourage all modellers to think of a “model” as these 1000 samples from
the P dimensional joint distribution of the parameters (e.g. I put 0.1% of my
belief to each of the 1000 parameter sets). Crude summaries of the marginal
distributions of each parameter that are typically reported (such as
mean/median, SE, 95% CrI(CI) etc.) are univariate, but our model is always
multivariate (hence why any model published is only useful if the authors
(minimally) provide the correlation matrix in addition to point estimates and
SE's.). In a Bayesian MCMC world you can make the P*P coplot to visualise these
correlations… for example, this could show a U-shaped relationship between two
parameters, whilst the equivalent MVN plot may (incorrectly) suggest the two
parameters are uncorrelated; in this case, the MVN approximation is not working
well, and hence could yield simulations that are not realistic (i.e. is the
model poor, or the MVN approximation poor?). When you realise you really do
want to draw random samples of the parameters from their joint distributions
based on a given model+data, you will realise why Bayesian MCMC (via
Hamiltonian MC (HMC)) is the way to go. After fitting a number of complex NLME
models in both MLE and Bayesian HMC frameworks over the last 10-15 years, I
consider the MLE+MVN as plain ugly (fast yes, but “best” no).
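To make the “1000 samples = the model” idea concrete, here is a minimal Python sketch (all numbers are hypothetical, not from any real analysis) that draws 1000 parameter sets from the MVN approximation and shows that the ED50 vs log(ED50) parameterisations imply different distributions for the same fit:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical MLE results for (Emax, ED50) and an approximate var-cov matrix
theta_hat = np.array([100.0, 25.0])            # Emax, ED50
cov_hat = np.array([[25.0, 10.0],
                    [10.0, 36.0]])             # curvature-based var-cov at the MLE

# 1000 parameter sets = "the model" (0.1% belief per set)
samples = rng.multivariate_normal(theta_hat, cov_hat, size=1000)

# The same fit with ED50 modelled as log(ED50): the delta method gives the
# approximate variance on the log scale, and back-transforming the draws
# yields a different (right-skewed, strictly positive) distribution for ED50.
var_log_ed50 = cov_hat[1, 1] / theta_hat[1] ** 2
ed50_from_log = np.exp(rng.normal(np.log(theta_hat[1]),
                                  np.sqrt(var_log_ed50), size=1000))

print(np.percentile(samples[:, 1], [2.5, 50, 97.5]).round(1))
print(np.percentile(ed50_from_log, [2.5, 50, 97.5]).round(1))
```

The two sets of percentiles disagree even though both came from the same fit; that is the parameterisation non-invariance described above.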
On the condition number (CN) topic
From a modelling perspective, the CN only makes sense for the correlation
matrix, not the covariance matrix or the Hessian (both are scale dependent!),
and I understand this is what NM correctly reports. As Ken mentioned, you should
always try to understand high correlations, and a heat plot showing the P*P
correlations is a good place to start. This may suggest a group of parameters
that can be reparameterised to reduce their correlations (e.g. to model 5 Emax
parameters as offsets to one Emax parameter, rather than as 6 unique Emax
parameters).
The CN will increase with any increase in the number of parameters (since no
additional parameter will be perfectly orthogonal to all other parameters).
Hence any “rules of thumb” are limited.
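As an illustration of both points (CN computed from the correlation matrix, and CN growing with P), a small sketch with illustrative correlation values only:

```python
import numpy as np

def condition_number(corr):
    """Ratio of the largest to the smallest eigenvalue of a correlation matrix."""
    eig = np.linalg.eigvalsh(corr)          # symmetric matrix -> real eigenvalues
    return eig.max() / eig.min()

def equicorrelation(p, rho):
    """P x P correlation matrix with a common pairwise correlation rho."""
    return np.full((p, p), rho) + (1.0 - rho) * np.eye(p)

# Even with the pairwise correlation fixed at 0.7, the CN rises as P grows
for p in (2, 5, 10):
    print(p, round(condition_number(equicorrelation(p, 0.7)), 1))
# prints: 2 5.7, then 5 12.7, then 10 24.3
```

For this structure the CN is (1 + (P-1)*rho)/(1 - rho), so it grows linearly in P even though no single pairwise correlation is extreme, which is why a fixed CN threshold means different things for models of different size.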
Lower correlations between parameters are desirable. From a numerical
perspective, a correlation of 0.999 vs 0.998 may be (numerically) as similar as
a correlation of 0.002 vs 0.001, but the former will have a far greater
influence on your subsequent simulations. High correlations may reflect poor
trial design; consider using a better trial design (see next comment!).
Lower correlations also help most estimation algorithms work quicker, so some
effort to better parameterise your model is often worthwhile (although Radford
Neal showed that HMC still works very well with a CN of 10000 using 100
parameters: https://youtu.be/ZGtezhDaSpM?t=1659).
On fitting Dose-Exposure-Response (D-E-R) models to weak data, and why a linear
D-E-R model is never sound
Firstly, all “dose-ranging” and “dose-finding” trials state their objective is
to “determine the dose (exposure) response relationship”. However most have
awful designs, with far too few dose levels that are far too closely spaced,
with small N. Any simple simulation and re-estimation effort would show just
how awful they are; they are invariably incapable of accurately and precisely
characterising the D-E-R relationship (for efficacy or safety). Pharma must do
so much better here!
Thus the first goal for any analyst is to ensure any trial design you will be
asked to analyse is fit for D-E-R modelling BEFORE you get any data. To avoid
the following “car crash” scenario, you need to use a suitable model (like the
sigmoidal Emax model), determine the range of credible parameter values, and
then use simulation/re-estimation to show your company how well or poorly
different candidate designs will quantify the true D-E-R relationship.
You will find that designs like Placebo, 1 mg, 2 mg and 4 mg are really poor;
you typically need much wider dose ranges, with wider dose spacing, and large
N. Optimal/adaptive D-E-R trials are best.
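A bare-bones version of this simulation/re-estimation exercise might look like the following Python sketch (the “true” sigmoidal Emax parameters, noise level, and candidate designs are all invented for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(7)

def sig_emax(dose, e0, emax, ed50, hill):
    """Sigmoidal Emax model."""
    return e0 + emax * dose**hill / (ed50**hill + dose**hill)

TRUE = dict(e0=0.0, emax=10.0, ed50=3.0, hill=2.0)   # assumed "truth"

def simulate_and_refit(doses, n_per_arm=20, sd=3.0):
    """Simulate one trial under the design, refit, return the estimated ED50."""
    d = np.repeat(doses, n_per_arm)
    y = sig_emax(d, **TRUE) + rng.normal(0.0, sd, d.size)
    try:
        popt, _ = curve_fit(sig_emax, d, y, p0=[0.0, 8.0, 2.0, 1.0],
                            bounds=([-np.inf, 0.0, 1e-6, 0.1],
                                    [np.inf, np.inf, np.inf, 10.0]))
        return popt[2]
    except RuntimeError:
        return np.nan                                # fit failed

narrow = np.array([0.0, 1.0, 2.0, 4.0])        # closely spaced, small range
wide = np.array([0.0, 0.5, 2.0, 8.0, 32.0])    # 64-fold dose range

for name, design in [("narrow", narrow), ("wide", wide)]:
    est = np.array([simulate_and_refit(design) for _ in range(200)])
    ok = est[np.isfinite(est)]
    print(name, np.percentile(ok, [2.5, 50, 97.5]).round(2))
```

The ED50 interval from the narrow design is typically far wider (and the fits less stable) than from the wide design, which is exactly the case you need to make to your company before the data arrive.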
The “car crash” scenario which I, and many, will have seen is being presented
with “weak” data from poorly designed D-E-R trials. The analyst is given data
from, for example, the Placebo, 1 mg, 2 mg and 4 mg arms of a (too small) trial
and asked to perform the D-E-R modelling. The doses are far too closely spaced,
and the observed data may even appear “linear”. The options are:
1) Option 1: Refer to your simulation/re-estimation work - you told them it
would be a “car crash”.
2) Option 2 (weak): Fit moderate priors for Emax and Hill, and get 1000 D-E-R
curves. Summarise, but warn that there is a lot of “guessing” here. Refer
senior management back to Option 1, and politely ask that they listen to you
next time :-)
3) Option 3 (wrong): Fit a linear D-E-R model. Do not do this! Think about the
1000 D-E-R curves you will show: 1000 linear relationships. As an expert in
clinical pharmacology, how can you present such nonsense!? You KNOW D-E-R
relationships are non-linear!
Option 2 will produce a wide “band” of D-E-R curves, with the 1000 D-E-R curves
more honestly reflecting the “ignorance” that such a weak design and dataset
yield. Comparisons across doses will be (rightly) imprecise.
In contrast, Option 3 will yield a “bow-tie” shaped uncertainty: high
precision around the middle of the dose range, with wider uncertainty towards
the extremes. This precision in the middle of the dose range is a “fool's
fallacy”, since it is conditional on the true D-E-R being 100% linear. But you
know that is wrong. Appearing linear and being linear are totally different. If
I gave you placebo (dose = 0) and dose = 100 mg, would you equally enjoy the
narrow precision at 50 mg? I hope not! Alas, great D-E-R modelling
often needs very wide dose ranges (think 10-100 fold) and large N if you truly
wish to precisely estimate D-E-R relationships. Linear D-E-R models are plain
wrong.
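The “bow-tie” follows directly from the textbook formula for the variance of a fitted regression mean, Var(yhat(x)) = sigma^2 * (1/n + (x - xbar)^2/Sxx); the sketch below (illustrative design and residual SD, not from any real trial) shows the standard error is smallest at the mean dose and grows towards and beyond the extremes:

```python
import numpy as np

doses = np.repeat([0.0, 1.0, 2.0, 4.0], 20)   # the narrow design, n = 80
sigma = 3.0                                   # assumed residual SD
n = doses.size
xbar = doses.mean()
sxx = ((doses - xbar) ** 2).sum()

def se_fit(x):
    """SE of the fitted straight-line mean at dose x."""
    return sigma * np.sqrt(1.0 / n + (x - xbar) ** 2 / sxx)

# Narrowest at the mean dose; widening (the bow-tie) away from it
for x in (0.0, xbar, 4.0, 8.0, 16.0):
    print(f"dose {x:5.2f}: SE {se_fit(x):.2f}")
```

That minimum at the mean dose is entirely conditional on the straight line being the true model; it says nothing about how wrong the line itself may be.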
I am quite unfamiliar with NM, but Bob Bauer told me a few years ago that a
full Bayesian HMC analysis was supported, so using moderate priors for Emax and
Hill should be possible (as in Option 2). In Stan, Option 2 works well in those
cases where you cannot influence the trial designs (e.g. in MBMA work, where
you integrate D-R models across many drugs/classes, often with limited dose
ranges for each drug, e.g. like here:
https://ascpt.onlinelibrary.wiley.com/doi/10.1002/cpt.1307).
I hope this is clear. Feel free to reach out to me personally if you would like
to follow up “quietly” on anything in the above.
Best wishes,
Al
Al Maloney PhD
Consultant Pharmacometrician
Hello Bill and others:
During estimation NONMEM transforms parameters into their “unconstrained”
domains (you can see what those transformations are in Appendix K of the
NONMEM 7 Technical Guide), and the parameters are moved in this unconstrained
domain. The R matrix produced at the end of estimation is in this unconstrained
domain. This unconstrained form is tested for positive definiteness as part of
an interim analysis, but it is then transformed back into the original user's
parameter domain and inverted by Cholesky decomposition. This is the final
variance-covariance matrix reported.
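For intuition (this is only a sketch of the idea, not NONMEM's actual code), a positive-definiteness test and the inversion can both be done with a single Cholesky decomposition:

```python
import numpy as np

def invert_if_positive_definite(r):
    """Return R^-1 via Cholesky, or None if R is not positive definite."""
    try:
        l = np.linalg.cholesky(r)        # fails iff R is not positive definite
    except np.linalg.LinAlgError:
        return None
    l_inv = np.linalg.inv(l)
    return l_inv.T @ l_inv               # R^-1 = (L^-1)' (L^-1) since R = L L'

r = np.array([[4.0, 1.0],
              [1.0, 3.0]])               # positive definite
bad = np.array([[1.0, 2.0],
                [2.0, 1.0]])             # indefinite (eigenvalues 3 and -1)

print(invert_if_positive_definite(bad))                            # None
print(np.allclose(invert_if_positive_definite(r) @ r, np.eye(2)))  # True
```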
Robert J. Bauer, Ph.D.
Senior Director
Pharmacometrics R&D
ICON Early Phase
731 Arbor way, suite 100
Blue Bell, PA 19422
Office: (215) 616-6428
Mobile: (925) 286-0769
[email protected]<mailto:[email protected]>
http://www.iconplc.com/
From: Bill Denney <[email protected]>
Sent: Wednesday, November 30, 2022 11:31 AM
To: Bauer, Robert <[email protected]>; [email protected]
Subject: [EXTERNAL] RE: [NMusers] Condition number
Hi everyone,
This has been a great discussion!
Bob: I’d like to clarify something that Pete, Matt, Ken, and Leonid were
discussing about how the covariance matrix is calculated. I believe that
NONMEM rescales the values for estimation and then reverses the rescaling for
reporting. Is the covariance matrix calculated on the rescaled values or on
the final parameter estimate values?
Thanks,
Bill
Hello Ken:
I am not aware of any case where some eigenvalues of a variance-covariance
matrix from a pure R matrix that was verified to be positive definite would be
negative, or where this would even occur for its correlation matrix.
Similarly, if the variance-covariance matrix is of sandwich form, such as
(Rinv)S(Rinv), and its components (R, S) were each verified to be positive
definite, then it and its correlation matrix would necessarily have all
positive eigenvalues.
I would need to see your NONMEM result file to understand why this would
happen. Are the negative eigenvalues very small but negative (such as 10^-15,
or something like that)?
Robert J. Bauer, Ph.D.
Senior Director
Pharmacometrics R&D
ICON Early Phase
731 Arbor way, suite 100
Blue Bell, PA 19422
Office: (215) 616-6428
Mobile: (925) 286-0769
[email protected]<mailto:[email protected]>
http://www.iconplc.com/
From: Ken Kowalski <[email protected]>
Sent: Thursday, December 1, 2022 6:59 AM
To: Bauer, Robert <[email protected]>; [email protected]
Cc: 'Bonate, Peter' <[email protected]>
Subject: [EXTERNAL] RE: [NMusers] Condition number
Hi Bob,
Could it possibly be related to the S matrix and the default sandwich estimator
used in estimating the covariance and correlation matrices?
Ken
From: Ken Kowalski [mailto:[email protected]]
Sent: Thursday, December 1, 2022 9:52 AM
To: 'Bauer, Robert' <[email protected]>; [email protected]
Cc: 'Bonate, Peter' <[email protected]>
Subject: RE: [NMusers] Condition number
Hey Bob,
I get that NONMEM can encounter negative eigenvalues during the R matrix
decomposition and inversion step, and if it does then the $COV step fails.
However, both Pete and I have encountered situations where the R matrix is
apparently positive definite, since the $COV step runs, but NONMEM reports a
negative eigenvalue for the correlation matrix with the PRINT=E option. I have
seen this happen only very rarely, but it has happened to me. How can this be
if the R matrix is positive definite?
Thanks,
Ken
Kenneth G. Kowalski
Kowalski PMetrics Consulting, LLC
Email: [email protected]<mailto:[email protected]>
Cell: 248-207-5082
From: [email protected]<mailto:[email protected]>
[mailto:[email protected]] On Behalf Of Bauer, Robert
Sent: Wednesday, November 30, 2022 1:53 PM
To: '[email protected]'
<[email protected]<mailto:[email protected]>>
Subject: RE: [NMusers] Condition number
Hello all:
Non-positive definiteness or negative eigenvalues are reported during the
analysis of the R matrix (decomposition and inversion), which occurs before
the correlation matrix is constructed. Often this is caused by numerical
imprecision. If the R matrix step fails, the $COV step fails to produce a
final variance-covariance matrix and, of course, does not produce a
correlation matrix. If the R matrix inversion step succeeds, the
variance-covariance matrix and its correlation matrix are produced, and the
correlation matrix is then assessed for its eigenvalues. So both the R matrix
(first step) and the correlation matrix (second step) are decomposed and
assessed.
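A tiny sketch of how such numerical imprecision arises: the correlation matrix of x, y, and x+y is exactly singular, so its smallest eigenvalue is 0 in exact arithmetic, but in floating point it is computed as a number on the order of 1e-16 that may come out slightly negative:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10000)
y = rng.normal(size=10000)
data = np.column_stack([x, y, x + y])    # third column is a linear combination

corr = np.corrcoef(data, rowvar=False)
eig = np.linalg.eigvalsh(corr)           # ascending order
print(eig)   # smallest eigenvalue is essentially 0, possibly a tiny negative
```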
Robert J. Bauer, Ph.D.
Senior Director
Pharmacometrics R&D
ICON Early Phase
731 Arbor way, suite 100
Blue Bell, PA 19422
Office: (215) 616-6428
Mobile: (925) 286-0769
[email protected]<mailto:[email protected]>
http://www.iconplc.com/