RE: Condition number

From: Peter Bonate
Date: December 01, 2022
Source: mail-archive.com
Thanks Ken and Al. I miss these discussions, while others in NMusers are probably thinking "how can there be that many emails on this". I think we are conflating different things.

The reason we look at the CN is that during the optimization process NONMEM has to invert a matrix (it's either the gradient or the Hessian, I am not sure), and again in the calculation of the standard errors. The log10 of the CN is how many digits are lost in that inversion process. You can have matrix instability for many reasons; collinearity is one of them, but it is not the only one. As you said, you can have parameters that are widely different in scale; that can also cause instability in the inversion process. Thus, a high CN does not always imply you have collinearity.

So if NONMEM is reporting the eigenvalues of the correlation matrix, then this has a couple of consequences:
• The CN no longer means how many digits are lost during inversion.
• The CN no longer indicates how stable that matrix inversion is.
• The only thing it is good for now is detecting collinearity.
• We use a cutoff value of 1000 because that implies we lose 3 digits of accuracy during the inversion. This value may not be applicable to the eigenvalues of a correlation matrix.

This is why using the correlation matrix makes no sense to me. Still doesn't.

Ayyappa – see what a can of worms you opened. Lol.

pete

Peter Bonate, PhD
Executive Director
Pharmacokinetics, Modeling, and Simulation (PKMS)
Clinical Pharmacology and Exploratory Development (CPED)
Astellas
1 Astellas Way
Northbrook, IL 60062
[email protected]
(224) 619-4901
Quote of the week – "Dancing with the Stars" is not owned by Astellas.
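[Editorial note: Pete's scale-versus-collinearity distinction is easy to check numerically. A minimal sketch, not from the thread and with hypothetical numbers: two uncorrelated parameters whose variances differ by a factor of 10^6 give an ill-conditioned covariance matrix even though there is no collinearity at all, while the corresponding correlation matrix has CN = 1.]

```python
import numpy as np

# Hypothetical example: two UNCORRELATED parameters whose estimates differ
# in scale by ~1e3, so their variances differ by 1e6. The covariance matrix
# is ill-conditioned purely because of scale, not collinearity.
cov = np.diag([1.0, 1e-6])           # variances on very different scales
eigvals = np.linalg.eigvalsh(cov)    # ascending order
cn_cov = eigvals[-1] / eigvals[0]    # condition number = largest / smallest
# cn_cov is ~1e6: log10(CN) ~ 6 digits lost when inverting this matrix

# The corresponding correlation matrix is the identity, so its CN is 1,
# correctly signalling "no collinearity" -- but saying nothing about the
# instability of inverting the covariance matrix itself.
sd = np.sqrt(np.diag(cov))
corr = cov / np.outer(sd, sd)
cn_corr = np.linalg.cond(corr)
```

Here log10(CN) of the covariance matrix says about 6 digits are lost in the inversion, yet the correlation matrix's CN of 1 carries none of that information, which is exactly the objection above.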
Quoted reply history
-----Original Message-----
From: Bonate, Peter
Sent: Wednesday, November 30, 2022 9:13 AM
To: Ken Kowalski <[email protected]>; 'Leonid Gibiansky' <[email protected]>
Cc: 'Matthew Fidler' <[email protected]>; 'Kyun-Seop Bae' <[email protected]>; [email protected]; 'Jeroen Elassaiss-Schaap (PD-value B.V.)' <[email protected]>
Subject: RE: [NMusers] Condition number

Just wanted to follow up on a few things. First, Nick, glad to hear from you again. I gave up trying to understand how NONMEM works years ago. I don't need to know how the engine in my car works to drive it or to make me a better driver.

But I did look at CN years ago. One of my first publications, "The Effect of Collinearity on Parameter Estimates in Nonlinear Mixed Effect Models" (1999), showed that when you put correlated covariates into a model (with correlations greater than 0.75), the standard errors of the estimates become inflated and the estimates of the parameters themselves become biased. This is why we don't put weight and BSA on the same parameter, for example. You can spot this problem easily from the CN. So, although I would never choose a model based solely on its condition number, I always look at it as part of the totality of evidence for how good a model is. But maybe that's just me.

And to follow up on this statement from Ken:

"That is, a high CN in any one of the three matrices (Hessian, covariance matrix, correlation matrix) will result in a high CN in the others."

I would think that the correlation matrix will give you the smallest condition number because it's scaled. I needed to see this for myself in R. I made a covariance matrix and computed the eigenvalues, then transformed it to a correlation matrix. The condition number of the correlation matrix is lower than the covariance matrix condition number.
> cov <- c(10, 2, 1, 2, 4, 3, 1, 3, 6)
> cov <- matrix(cov, nrow=3, byrow=TRUE)
> cov
     [,1] [,2] [,3]
[1,]   10    2    1
[2,]    2    4    3
[3,]    1    3    6
> p <- cov2cor(cov)
> p
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.3162278 0.1290994
[2,] 0.3162278 1.0000000 0.6123724
[3,] 0.1290994 0.6123724 1.0000000
> eig.cov <- eigen(cov)
> eig.p <- eigen(p)
> CN.cov <- eig.cov$values[1]/eig.cov$values[3]
> CN.p <- eig.p$values[1]/eig.p$values[3]
> CN.cov
[1] 6.68266
> CN.p
[1] 4.899988

So I guess we need Bob Bauer to chime in on this latter issue.

pete

-----Original Message-----
From: Ken Kowalski <[email protected]>
Sent: Tuesday, November 29, 2022 8:29 PM
To: Bonate, Peter <[email protected]>; 'Leonid Gibiansky' <[email protected]>
Cc: 'Matthew Fidler' <[email protected]>; 'Kyun-Seop Bae' <[email protected]>; [email protected]; 'Jeroen Elassaiss-Schaap (PD-value B.V.)' <[email protected]>
Subject: RE: [NMusers] Condition number

Hi Pete,

I would say the Hessian is the more appropriate matrix rather than the Jacobian, since the covariance matrix of the parameter estimates is typically estimated as the inverse of the Hessian in most nonlinear regression packages, and that is what NONMEM does if you use the MATRIX=R option on the $COV step instead of NONMEM's default sandwich estimator.

The eigenvalues of the Hessian, the eigenvalues of the covariance matrix of the parameter estimates, and the eigenvalues of the correlation matrix of the parameter estimates are all going to be related. That is, a high CN in any one of the three matrices (Hessian, covariance matrix, correlation matrix) will result in a high CN in the others.

I have encountered NONMEM reporting a negative eigenvalue too.
I assume this is the result of a numerical precision issue, because if the eigenvalue were truly negative then the Hessian would not be positive semi-definite and hence the COV step should fail. I am not a numerical analyst, so this is another issue where I would be interested in hearing from Bob Bauer on how NONMEM can report a negative eigenvalue.

Best,
Ken

Kenneth G. Kowalski
Kowalski PMetrics Consulting, LLC
Email: [email protected]
Cell: 248-207-5082

-----Original Message-----
From: Bonate, Peter [mailto:[email protected]]
Sent: Tuesday, November 29, 2022 8:27 PM
To: Leonid Gibiansky <[email protected]>
Cc: Ken Kowalski <[email protected]>; Matthew Fidler <[email protected]>; Kyun-Seop Bae <[email protected]>; [email protected]; Jeroen Elassaiss-Schaap (PD-value B.V.) <[email protected]>
Subject: Re: [NMusers] Condition number

This is great. Just like the glory days of NMusers. Any moment now Nick Holford is going to chime in.

I'm not an expert in matrix algebra, but is the correlation matrix the right one to be using? We are concerned about inversion of the Hessian. That instability is what affects our parameter estimates and standard errors. Doesn't that depend on the Jacobian? Shouldn't we be looking at the eigenvalues of the Jacobian matrix instead?

And to echo what was already said: never use the condition number as an absolute. It's a yardstick. FYI, one time I got a negative eigenvalue from NONMEM and would not have known how unstable the model was unless I had looked at the eigenvalue.

Pete.

> On Nov 29, 2022, at 7:17 PM, Leonid Gibiansky <[email protected]> wrote:
>
> From the manual:
>
> Iteration -1000000003 indicates that this line contains the condition number, lowest, and highest eigenvalues of the correlation matrix of the variances of the final parameters.
>
>> On 11/29/2022 7:59 PM, Ken Kowalski wrote:
>> Hi Matt,
>> I'm pretty sure Stu Beal told me many years ago that NONMEM calculates the eigenvalues from the correlation matrix.
>> Maybe Bob Bauer can chime in here?
>> Ken
>>
>> *From:* Matthew Fidler [mailto:[email protected]]
>> *Sent:* Tuesday, November 29, 2022 7:56 PM
>> *To:* Ken Kowalski <[email protected]>
>> *Cc:* Kyun-Seop Bae <[email protected]>; [email protected]; Jeroen Elassaiss-Schaap (PD-value B.V.) <[email protected]>
>> *Subject:* Re: [NMusers] Condition number
>>
>> Hi Ken,
>>
>> I am unsure, since I don't have my NONMEM manual handy. I based my understanding on reading about condition numbers in numerical analysis, which seemed to use the parameter estimates:
>> https://en.wikipedia.org/wiki/Condition_number
>> If it uses the correlation matrix, it could be less sensitive.
>>
>> Matt
>>
>> On Tue, Nov 29, 2022 at 6:11 PM Ken Kowalski <[email protected]> wrote:
>> Hi Matt,
>> Correct me if I'm wrong, but I thought NONMEM calculates the condition number based on the correlation matrix of the parameter estimates, so it is scaled based on the standard errors of the estimates.
>> Ken
>>
>> *From:* Matthew Fidler [mailto:[email protected]]
>> *Sent:* Tuesday, November 29, 2022 7:04 PM
>> *To:* Ken Kowalski <[email protected]>
>> *Cc:* Kyun-Seop Bae <[email protected]>; [email protected]; Jeroen Elassaiss-Schaap (PD-value B.V.) <[email protected]>
>> *Subject:* Re: [NMusers] Condition number
>>
>> Hi Ken & Kyun-Seop,
>>
>> I agree it should be taught, since it is prevalent in the industry, and should be looked at as something to investigate further, but no hard and fast rule should be applied to whether the model is reasonable and fit for purpose. That should be done in conjunction with other diagnostic plots.
>> One thing that has always bothered me about the condition number is that it is calculated based on the final parameter estimates, not the scaled parameter estimates. The scaling is supposed to help put the gradient on a comparable scale and fix many numerical problems here. Hence, if the scaling works as it is supposed to, small changes may not affect the collinearity as strongly as the calculated condition number suggests. This is mainly why I see it as a number to keep in mind instead of a hard and fast rule.
>>
>> Matt
>>
>> On Tue, Nov 29, 2022 at 5:09 PM Ken Kowalski <[email protected]> wrote:
>> Hi Kyun-Seop,
>>
>> I would state things a little differently: rather than say "devalue condition number and multi-collinearity", we should treat CN as a diagnostic, and rules such as CN > 1000 should NOT be used as hard and fast rules to reject a model. I agree with Jeroen that we should understand the implications of a high CN and the impact multi-collinearity may have on the model estimation, and that there are other diagnostics such as correlations, variance inflation factors (VIF), standard errors, CIs, etc. that can also help with our understanding of the effects of multi-collinearity and its implications for model development.
>>
>> That being said, if you have a model with a high CN and the model converges with realistic point estimates and reasonable standard errors, then it may still be reasonable to accept that model. However, in this setting I would probably still want to re-run the model with different starting values and make sure it converges to the same OFV and set of point estimates.
>>
>> As the smallest eigenvalue goes to 0 and the CN goes to infinity, we end up with a singular Hessian matrix (R matrix), so we know that at some point a high enough CN will result in convergence and COV step failures. Thus, you shouldn't simply dismiss CN as not having any diagnostic value; just don't apply it in a rule such as CN > 1000 to blindly reject a model. The CN > 1000 rule should only be used to call your attention to a potential issue that warrants further investigation before accepting the model or deciding how to alter the model to improve stability in the estimation.
>>
>> Best,
>> Ken
>>
>> Kenneth G. Kowalski
>> Kowalski PMetrics Consulting, LLC
>> Email: [email protected]
>> Cell: 248-207-5082
>>
>> *From:* [email protected] [mailto:[email protected]] *On Behalf Of* Kyun-Seop Bae
>> *Sent:* Tuesday, November 29, 2022 5:10 PM
>> *To:* [email protected]
>> *Subject:* Fwd: [NMusers] Condition number
>>
>> Dear All,
>>
>> I would like to devalue the condition number and multi-collinearity in nonlinear regression. The reason we consider the condition number (or multi-collinearity) is that it may cause the following fitting (estimation) problems:
>> 1. Fitting failure (fail to converge, fail to minimize)
>> 2. Unrealistic point estimates
>> 3. Too-wide standard errors
>> If you do not see the above problems (i.e., no estimation problem and modest standard errors), you do not need to give attention to the condition number.
>>
>> I think I saw a 10^n (n = number of parameters) criterion in an old version of Gabrielsson's book many years ago (but not in the latest version).
>>
>> Best regards,
>> Kyun-Seop Bae
>>
>> On Tue, 29 Nov 2022 at 22:59, Ayyappa Chaturvedula <[email protected]> wrote:
>> Dear all,
>> I am wondering if someone can provide references for the condition number thresholds we are seeing (<1000), etc. Also, another rule I saw when I was in graduate school is that a condition number < 10^n (n = number of parameters) is OK.
>> Personally, I rely on the correlation matrix rather than the condition number, and I have seen cases where the condition number is large (according to the 1000 rule, but less than the 10^n rule) but the correlation matrix is fine.
>> I want to provide these for my teaching purposes, and any help is greatly appreciated.
>> Regards,
>> Ayyappa
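[Editorial note: for the collinearity side of the discussion, the 2x2 case makes the CN > 1000 rule of thumb concrete. A sketch with hypothetical numbers, not from the thread: for a correlation matrix [[1, r], [r, 1]] the eigenvalues are 1 + r and 1 - r, so CN = (1 + r)/(1 - r), which crosses 1000 at roughly r = 0.998.]

```python
import numpy as np

# Hypothetical 2x2 correlation matrix for two nearly collinear parameter
# estimates. Eigenvalues of [[1, r], [r, 1]] are 1 + r and 1 - r, so the
# condition number is (1 + r) / (1 - r).
r = 0.999
corr = np.array([[1.0, r], [r, 1.0]])
cn = np.linalg.cond(corr)  # ratio of largest to smallest singular value

# The closed form agrees with the numerical value: ~1999, well past the
# CN > 1000 rule of thumb discussed in the thread.
cn_closed_form = (1 + r) / (1 - r)
```

Here a pairwise correlation of 0.999 alone pushes the CN to about 1999, which is consistent with the point that a high CN of the correlation matrix flags collinearity specifically, while the correlations themselves (as Ayyappa suggests) show the same thing more directly.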
Nov 29, 2022 Ayyappa Chaturvedula Condition number
Nov 29, 2022 Kenneth Kowalski RE: Condition number
Nov 29, 2022 Peter Bonate RE: Condition number
Nov 29, 2022 Jeroen Elassaiss-Schaap Re: Condition number
Nov 29, 2022 Kyun-Seop Bae Fwd: Condition number
Nov 30, 2022 Matt Fidler Re: Condition number
Nov 30, 2022 Kenneth Kowalski RE: Condition number
Nov 30, 2022 Leonid Gibiansky Re: Condition number
Nov 30, 2022 Peter Bonate Re: Condition number
Nov 30, 2022 Robert Bauer RE: Condition number
Nov 30, 2022 Bill Denney RE: Condition number
Dec 01, 2022 Kyun-Seop Bae Re: Condition number
Dec 01, 2022 Peter Bonate RE: Condition number
Dec 01, 2022 Kenneth Kowalski RE: Condition number
Dec 01, 2022 Ayyappa Chaturvedula Re: Condition number
Dec 01, 2022 Al Maloney Re: Condition number
Dec 01, 2022 Robert Bauer RE: [EXTERNAL] RE: Condition number
Dec 01, 2022 Robert Bauer Condition number
Dec 02, 2022 Robert Bauer Condition number