Re: Condition number

From: Ayyappa Chaturvedula
Date: December 01, 2022
Source: mail-archive.com
Dear Pete, Ken, Nick, Bob and all,

I truly appreciate the spirit of discussion and helpful background information. It is nostalgic to see everyone on this thread. Thank you.

Regards,
Ayyappa
Quoted reply history
> On Nov 30, 2022, at 7:43 PM, Ken Kowalski <[email protected]> wrote:
>
> Hi Pete,
>
> I'm really not trying to conflate these two different concepts. What you are describing is a desire for a diagnostic that relates to the numerical instability of matrix inversion. If that is your desire, then yes, the CN of the correlation matrix is not what you want. As I suggested previously, a CN derived from the Hessian (the R-matrix in NONMEM parlance) or from the covariance matrix (inverse R if the MATRIX=R option is employed) is probably what you want, because that is the matrix actually being inverted, and its CN will be larger because both differences in the scales of the variances and collinearity contribute to the potential numerical instability of the inversion. However, my focus is more purely on the impact of collinearity, where the choice of model and the limitations of the data to support that choice can have a big impact on model stability.
>
> From Bob Bauer's response earlier today, it sounds like NONMEM performs the eigenvalue analysis of the Hessian (R-matrix) behind the scenes as a first step. If that succeeds (all eigenvalues positive, so the R matrix is positive definite and hence invertible), the COV step runs and the covariance matrix, the correlation matrix, and the eigenvalues of the correlation matrix (PRINT=E option of the $COV step) are reported. Note that when the COV step fails we often get the warning that the R-matrix is non-positive semi-definite (NPSD), which implies that one or more of its eigenvalues is zero (singular) or negative. So NONMEM is clearly calculating these eigenvalues behind the scenes, and it sounds like you would like NONMEM to report them even when it determines that they are all positive and the $COV step can run successfully.
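[Editorial sketch, not part of the original emails.] Ken's point that the Hessian, the covariance matrix, and the correlation matrix share their conditioning rests partly on a simple fact: the covariance matrix is the inverse of the Hessian, and the eigenvalues of an inverse are the reciprocals of the original eigenvalues, so the max/min ratio is unchanged. A minimal pure-Python check on an illustrative 2x2 "Hessian" (the numbers are made up, not from any NONMEM run):

```python
import math

def eig_sym2(a, b, c):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]], largest first."""
    mean = (a + c) / 2.0
    half_gap = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean + half_gap, mean - half_gap

def cond(a, b, c):
    """Condition number = largest eigenvalue / smallest eigenvalue."""
    lmax, lmin = eig_sym2(a, b, c)
    return lmax / lmin

# Illustrative "Hessian" H = [[4, 1], [1, 2]].
# Its inverse (the "covariance matrix") is [[2, -1], [-1, 4]] / det, det = 7.
det = 4 * 2 - 1 * 1
cn_hessian = cond(4.0, 1.0, 2.0)
cn_cov = cond(2.0 / det, -1.0 / det, 4.0 / det)

# Eigenvalues of H^-1 are the reciprocals of those of H,
# so the two condition numbers agree.
print(round(cn_hessian, 6), round(cn_cov, 6))  # both print 2.783612
```

Note this only covers the Hessian/covariance pair; the correlation matrix additionally rescales by the standard errors, which is exactly the point under debate in the thread.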
> I see no reason why NONMEM could not make this an option, so that you can assess for yourself how much accuracy might have been lost in inverting the R matrix. Maybe broach this with Bob?
>
> Best,
>
> Ken
>
> -----Original Message-----
> From: Bonate, Peter [mailto:[email protected]]
> Sent: Wednesday, November 30, 2022 7:52 PM
> To: Ken Kowalski <[email protected]>; 'Leonid Gibiansky' <[email protected]>
> Cc: 'Matthew Fidler' <[email protected]>; 'Kyun-Seop Bae' <[email protected]>; [email protected]; 'Jeroen Elassaiss-Schaap (PD-value B.V.)' <[email protected]>; Alan Maloney <[email protected]>
> Subject: RE: [NMusers] Condition number
>
> Thanks Ken and Al. I miss these discussions, while others on NMusers are probably thinking "how can there be that many emails on this".
>
> I think we are conflating different things. The reason we look at the CN is that during the optimization process NONMEM has to invert a matrix (either the gradient or the Hessian, I am not sure which), and again in the calculation of the standard errors. The log10 of the CN is how many digits of accuracy are lost in that inversion. You can have matrix instability for many reasons; collinearity is one of them, but it is not the only one. As you said, parameters that differ widely in scale can also cause instability in the inversion. Thus, a high CN does not always imply collinearity.
>
> So if NONMEM is reporting the eigenvalues of the correlation matrix, that has a couple of consequences:
> • The CN no longer means how many digits are lost during inversion.
> • The CN no longer indicates how stable that matrix inversion is.
> • The only thing it is good for now is detecting collinearity.
> • We use a cutoff of 1000 because that implies we lose 3 digits of accuracy during the inversion. That value may not be applicable to the eigenvalues of a correlation matrix.
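[Editorial sketch, not from the thread.] Pete's point that scale differences alone can inflate the CN, with no collinearity at all, is easy to demonstrate: a diagonal covariance matrix diag(1e6, 1) has CN = 1e6 even though its off-diagonal correlation is exactly zero, while the corresponding correlation matrix is the identity, with CN = 1. A pure-Python sketch using the closed-form eigenvalues of a symmetric 2x2 matrix (the scale factor 1e6 is an arbitrary illustration):

```python
import math

def eig_sym2(a, b, c):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]], largest first."""
    mean = (a + c) / 2.0
    half_gap = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean + half_gap, mean - half_gap

def cond(a, b, c):
    """Condition number = largest eigenvalue / smallest eigenvalue."""
    lmax, lmin = eig_sym2(a, b, c)
    return lmax / lmin

# Covariance matrix with wildly different scales but ZERO correlation:
# diag(1e6, 1).  Its CN is 1e6 purely from scaling, not collinearity.
cn_cov = cond(1e6, 0.0, 1.0)

# The corresponding correlation matrix is the 2x2 identity: CN = 1.
cn_cor = cond(1.0, 0.0, 1.0)

print(cn_cov)  # 1000000.0
print(cn_cor)  # 1.0
```

This is exactly why the correlation-matrix CN isolates collinearity: the rescaling removes the scale contribution and keeps only the correlation structure.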
> This is why using the correlation matrix makes no sense to me. Still doesn't.
>
> Ayyappa – see what a can of worms you opened. Lol.
>
> pete
>
> Peter Bonate, PhD
> Executive Director
> Pharmacokinetics, Modeling, and Simulation (PKMS)
> Clinical Pharmacology and Exploratory Development (CPED)
> Astellas
> 1 Astellas Way
> Northbrook, IL 60062
> [email protected]
> (224) 619-4901
>
> Quote of the week – "Dancing with the Stars" is not owned by Astellas.
>
> -----Original Message-----
> From: Bonate, Peter
> Sent: Wednesday, November 30, 2022 9:13 AM
> To: Ken Kowalski <[email protected]>; 'Leonid Gibiansky' <[email protected]>
> Cc: 'Matthew Fidler' <[email protected]>; 'Kyun-Seop Bae' <[email protected]>; [email protected]; 'Jeroen Elassaiss-Schaap (PD-value B.V.)' <[email protected]>
> Subject: RE: [NMusers] Condition number
>
> Just wanted to follow up on a few things.
>
> First, Nick, glad to hear from you again.
>
> I gave up trying to understand how NONMEM works years ago. I don't need to know how the engine in my car works to drive it, or to be a better driver. But I did look at CN years ago. One of my first publications, "The Effect of Collinearity on Parameter Estimates in Nonlinear Mixed Effect Models" (1999), showed that when you put correlated covariates into a model (with correlations greater than 0.75), the standard errors of the estimates become inflated and the parameter estimates themselves become biased. This is why we don't put weight and BSA on the same parameter, for example. You can spot this problem easily from the CN. So although I would never choose a model based solely on its condition number, I always look at it as part of the totality of evidence for how good a model is. But maybe that's just me.
>
> And to follow up on this statement from Ken:
> "That is, a high CN in any one of the three matrices (Hessian, covariance matrix, correlation matrix) will result in a high CN in the others."
> I would think that the correlation matrix will give you the smallest condition number because it's scaled. I needed to see this for myself in R. I made a covariance matrix, computed its eigenvalues, then transformed it to a correlation matrix. The condition number of the correlation matrix is indeed lower than that of the covariance matrix:
>
> > cov <- c(10, 2, 1, 2, 4, 3, 1, 3, 6)
> > cov <- matrix(cov, nrow=3, byrow=TRUE)
> > cov
>      [,1] [,2] [,3]
> [1,]   10    2    1
> [2,]    2    4    3
> [3,]    1    3    6
> > p <- cov2cor(cov)
> > p
>           [,1]      [,2]      [,3]
> [1,] 1.0000000 0.3162278 0.1290994
> [2,] 0.3162278 1.0000000 0.6123724
> [3,] 0.1290994 0.6123724 1.0000000
> > eig.cov <- eigen(cov)
> > eig.p <- eigen(p)
> > CN.cov <- eig.cov$values[1]/eig.cov$values[3]
> > CN.p <- eig.p$values[1]/eig.p$values[3]
> > CN.cov
> [1] 6.68266
> > CN.p
> [1] 4.899988
>
> So I guess we need Bob Bauer to chime in on this latter issue.
>
> pete
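[Editorial cross-check, not part of the thread.] Pete's R session can be reproduced without R or numpy using the standard closed-form (trigonometric) eigenvalue solution for a symmetric 3x3 matrix. This sketch rebuilds his covariance matrix, converts it to a correlation matrix the way `cov2cor` does (dividing each entry by the square roots of the corresponding diagonal entries), and recovers the same two condition numbers:

```python
import math

def eig_sym3(A):
    """Eigenvalues of a symmetric 3x3 matrix, descending, via the
    standard trigonometric closed-form method (assumes A is not a
    multiple of the identity, so p > 0)."""
    p1 = A[0][1]**2 + A[0][2]**2 + A[1][2]**2
    q = (A[0][0] + A[1][1] + A[2][2]) / 3.0
    p2 = (A[0][0]-q)**2 + (A[1][1]-q)**2 + (A[2][2]-q)**2 + 2.0*p1
    p = math.sqrt(p2 / 6.0)
    # B = (A - q*I) / p
    B = [[(A[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    detB = (B[0][0]*(B[1][1]*B[2][2] - B[1][2]*B[2][1])
            - B[0][1]*(B[1][0]*B[2][2] - B[1][2]*B[2][0])
            + B[0][2]*(B[1][0]*B[2][1] - B[1][1]*B[2][0]))
    r = max(-1.0, min(1.0, detB / 2.0))   # clamp against rounding error
    phi = math.acos(r) / 3.0
    e1 = q + 2.0*p*math.cos(phi)                       # largest
    e3 = q + 2.0*p*math.cos(phi + 2.0*math.pi/3.0)     # smallest
    e2 = 3.0*q - e1 - e3                               # trace identity
    return e1, e2, e3

cov = [[10.0, 2.0, 1.0],
       [2.0,  4.0, 3.0],
       [1.0,  3.0, 6.0]]

# cov2cor equivalent: divide entry (i, j) by sqrt(diag_i * diag_j)
d = [math.sqrt(cov[k][k]) for k in range(3)]
cor = [[cov[i][j] / (d[i]*d[j]) for j in range(3)] for i in range(3)]

cn_cov = eig_sym3(cov)[0] / eig_sym3(cov)[2]
cn_cor = eig_sym3(cor)[0] / eig_sym3(cor)[2]
# Should match the R session: approximately 6.68266 and 4.899988
print(round(cn_cov, 5), round(cn_cor, 6))
```

The agreement confirms the thread's observation for this example: rescaling to a correlation matrix lowered the condition number (6.68 to 4.90), though here the matrix is well conditioned either way.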
> -----Original Message-----
> From: Ken Kowalski <[email protected]>
> Sent: Tuesday, November 29, 2022 8:29 PM
> To: Bonate, Peter <[email protected]>; 'Leonid Gibiansky' <[email protected]>
> Cc: 'Matthew Fidler' <[email protected]>; 'Kyun-Seop Bae' <[email protected]>; [email protected]; 'Jeroen Elassaiss-Schaap (PD-value B.V.)' <[email protected]>
> Subject: RE: [NMusers] Condition number
>
> Hi Pete,
>
> I would say the Hessian is the more appropriate matrix rather than the Jacobian, since the covariance matrix of the parameter estimates is typically estimated as the inverse of the Hessian in most nonlinear regression packages, and that is what NONMEM does if you use the MATRIX=R option on the $COV step instead of NONMEM's default sandwich estimator. The eigenvalues of the Hessian, of the covariance matrix of the parameter estimates, and of the correlation matrix of the parameter estimates are all related. That is, a high CN in any one of the three matrices (Hessian, covariance matrix, correlation matrix) will result in a high CN in the others.
>
> I have encountered NONMEM reporting a negative eigenvalue too. I assume this is the result of a numerical precision issue, because if an eigenvalue were truly negative, the Hessian would not be positive semi-definite and hence the COV step should fail. I am not a numerical analyst, so this is another issue on which I would be interested in hearing from Bob Bauer: how can NONMEM report a negative eigenvalue?
>
> Best,
>
> Ken
>
> Kenneth G. Kowalski
> Kowalski PMetrics Consulting, LLC
> Email: [email protected]
> Cell: 248-207-5082
>
> -----Original Message-----
> From: Bonate, Peter [mailto:[email protected]]
> Sent: Tuesday, November 29, 2022 8:27 PM
> To: Leonid Gibiansky <[email protected]>
> Cc: Ken Kowalski <[email protected]>; Matthew Fidler <[email protected]>; Kyun-Seop Bae <[email protected]>; [email protected]; Jeroen Elassaiss-Schaap (PD-value B.V.) <[email protected]>
> Subject: Re: [NMusers] Condition number
>
> This is great. Just like the glory days of NMusers. Any moment now Nick Holford is going to chime in.
>
> I'm not an expert in matrix algebra, but is the correlation matrix the right one to be using? We are concerned about inversion of the Hessian. That instability is what affects our parameter estimates and standard errors. Doesn't that depend on the Jacobian? Shouldn't we be looking at the eigenvalues of the Jacobian matrix instead?
>
> And to echo what was already said: never use the condition number as an absolute. It's a yardstick. FYI, one time I got a negative eigenvalue from NONMEM and would not have known how unstable the model was unless I had looked at the eigenvalue.
>
> Pete
>
>> On Nov 29, 2022, at 7:17 PM, Leonid Gibiansky <[email protected]> wrote:
>>
>> From the manual:
>>
>> Iteration -1000000003 indicates that this line contains the condition number, lowest, highest, Eigen values of the correlation matrix of the variances of the final parameters.
>>
>>> On 11/29/2022 7:59 PM, Ken Kowalski wrote:
>>> Hi Matt,
>>>
>>> I'm pretty sure Stu Beal told me many years ago that NONMEM calculates the eigenvalues from the correlation matrix. Maybe Bob Bauer can chime in here?
>>>
>>> Ken
>>>
>>> From: Matthew Fidler [mailto:[email protected]]
>>> Sent: Tuesday, November 29, 2022 7:56 PM
>>> To: Ken Kowalski <[email protected]>
>>> Cc: Kyun-Seop Bae <[email protected]>; [email protected]; Jeroen Elassaiss-Schaap (PD-value B.V.) <[email protected]>
>>> Subject: Re: [NMusers] Condition number
>>>
>>> Hi Ken,
>>>
>>> I am unsure, since I don't have my NONMEM manual handy. I based my understanding on reading about condition numbers in numerical analysis, which seemed to use the parameter estimates: https://en.wikipedia.org/wiki/Condition_number
>>>
>>> If it uses the correlation matrix, it could be less sensitive.
>>>
>>> Matt
>>>
>>> On Tue, Nov 29, 2022 at 6:11 PM Ken Kowalski <[email protected]> wrote:
>>> Hi Matt,
>>>
>>> Correct me if I'm wrong, but I thought NONMEM calculates the condition number based on the correlation matrix of the parameter estimates, so it is scaled based on the standard errors of the estimates.
>>>
>>> Ken
>>>
>>> From: Matthew Fidler [mailto:[email protected]]
>>> Sent: Tuesday, November 29, 2022 7:04 PM
>>> To: Ken Kowalski <[email protected]>
>>> Cc: Kyun-Seop Bae <[email protected]>; [email protected]; Jeroen Elassaiss-Schaap (PD-value B.V.) <[email protected]>
>>> Subject: Re: [NMusers] Condition number
>>>
>>> Hi Ken & Kyun-Seop,
>>>
>>> I agree it should be taught, since it is prevalent in the industry, and should be looked at as something to investigate further, but no hard and fast rule should be applied to whether the model is reasonable and fit for purpose. That should be done in conjunction with other diagnostic plots.
>>>
>>> One thing that has always bothered me about the condition number is that it is calculated based on the final parameter estimates, but not the scaled parameter estimates. Truly the scaling is supposed to help make the gradient on a comparable scale and fix many numerical problems here.
>>> Hence, if the scaling works as it is supposed to, small changes may not affect the collinearity as strongly as the calculated condition number suggests. This is mainly why I see it as a number to keep in mind instead of a hard and fast rule.
>>>
>>> Matt
>>>
>>> On Tue, Nov 29, 2022 at 5:09 PM Ken Kowalski <[email protected]> wrote:
>>> Hi Kyun-Seop,
>>>
>>> I would state things a little differently: rather than say "devalue condition number and multi-collinearity", we should treat CN as a diagnostic, and rules such as CN>1000 should NOT be used as hard and fast rules to reject a model. I agree with Jeroen that we should understand the implications of a high CN and the impact multi-collinearity may have on model estimation, and that there are other diagnostics, such as correlations, variance inflation factors (VIF), standard errors, CIs, etc., that can also help with our understanding of the effects of multi-collinearity and its implications for model development.
>>>
>>> That being said, if you have a model with a high CN and the model converges with realistic point estimates and reasonable standard errors, then it may still be reasonable to accept that model. However, in this setting I would probably still want to re-run the model with different starting values and make sure it converges to the same OFV and set of point estimates.
>>>
>>> As the smallest eigenvalue goes to 0 and the CN goes to infinity, we end up with a singular Hessian matrix (R matrix), so we know that at some point a high enough CN will result in convergence and COV step failures. Thus, you shouldn't simply dismiss CN as having no diagnostic value; just don't apply it in a rule such as CN>1000 to blindly reject a model. The CN>1000 rule should only be used to call your attention to a potential issue that warrants further investigation before accepting the model, or to decide how to alter the model to improve stability in the estimation.
>>>
>>> Best,
>>>
>>> Ken
>>>
>>> Kenneth G. Kowalski
>>> Kowalski PMetrics Consulting, LLC
>>> Email: [email protected]
>>> Cell: 248-207-5082
>>>
>>> From: [email protected] [mailto:[email protected]] On Behalf Of Kyun-Seop Bae
>>> Sent: Tuesday, November 29, 2022 5:10 PM
>>> To: [email protected]
>>> Subject: Fwd: [NMusers] Condition number
>>>
>>> Dear All,
>>>
>>> I would like to devalue condition number and multi-collinearity in nonlinear regression. The reason we consider the condition number (or multi-collinearity) is that it may cause the following fitting (estimation) problems:
>>> 1. Fitting failure (fail to converge, fail to minimize)
>>> 2. Unrealistic point estimates
>>> 3. Too wide standard errors
>>>
>>> If you do not see the above problems (i.e., no estimation problem and modest standard errors), you do not need to pay attention to the condition number. I think I saw a 10^n (n = number of parameters) criterion in an old version of Gabrielsson's book many years ago (but not in the latest version).
>>>
>>> Best regards,
>>> Kyun-Seop Bae
>>>
>>> On Tue, 29 Nov 2022 at 22:59, Ayyappa Chaturvedula <[email protected]> wrote:
>>> Dear all,
>>>
>>> I am wondering if someone can provide references for the condition number thresholds we are seeing (<1000) etc. Also, the other way I have seen when I was in graduate school is that a condition number <10^n (n = number of parameters) is OK. Personally, I depend on the correlation matrix rather than the condition number, and I have seen cases where the condition number is large (according to the 1000 rule but less than the 10^n rule) but the correlation matrix is fine. I want to provide these for my teaching purposes, and any help is greatly appreciated.
>>>
>>> Regards,
>>> Ayyappa
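[Editorial footnote, not from the thread.] Ken's observation that the CN diverges as the smallest eigenvalue approaches zero, and Pete's digits-lost reading of the 1000 cutoff, can both be seen on a 2x2 correlation matrix [[1, r], [r, 1]], whose eigenvalues are 1+r and 1-r, so CN = (1+r)/(1-r); log10(CN) is then the rough number of decimal digits lost in inverting it. A pure-Python sketch (the r values are arbitrary illustrations):

```python
import math

def corr_cond(r):
    """CN of the 2x2 correlation matrix [[1, r], [r, 1]];
    its eigenvalues are 1+r and 1-r."""
    return (1.0 + r) / (1.0 - r)

for r in (0.5, 0.9, 0.99, 0.999):
    cn = corr_cond(r)
    digits_lost = math.log10(cn)
    print(f"r={r}: CN={cn:.1f}, ~{digits_lost:.2f} digits lost")

# Only at r = 0.999 does the CN (about 1999) exceed the common 1000
# cutoff, i.e. roughly 3.3 digits of accuracy lost; as r -> 1 the
# matrix becomes singular and the CN diverges.
```

This also illustrates why the thread treats CN>1000 as a flag rather than a verdict: a CN of 199 (r = 0.99) already reflects strong collinearity yet passes the cutoff comfortably.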
Thread index:
Nov 29, 2022 Ayyappa Chaturvedula Condition number
Nov 29, 2022 Kenneth Kowalski RE: Condition number
Nov 29, 2022 Peter Bonate RE: Condition number
Nov 29, 2022 Jeroen Elassaiss-Schaap Re: Condition number
Nov 29, 2022 Kyun-Seop Bae Fwd: Condition number
Nov 30, 2022 Matt Fidler Re: Condition number
Nov 30, 2022 Kenneth Kowalski RE: Condition number
Nov 30, 2022 Leonid Gibiansky Re: Condition number
Nov 30, 2022 Peter Bonate Re: Condition number
Nov 30, 2022 Robert Bauer RE: Condition number
Nov 30, 2022 Bill Denney RE: Condition number
Dec 01, 2022 Kyun-Seop Bae Re: Condition number
Dec 01, 2022 Peter Bonate RE: Condition number
Dec 01, 2022 Kenneth Kowalski RE: Condition number
Dec 01, 2022 Ayyappa Chaturvedula Re: Condition number
Dec 01, 2022 Al Maloney Re: Condition number
Dec 01, 2022 Robert Bauer RE: [EXTERNAL] RE: Condition number
Dec 01, 2022 Robert Bauer Condition number
Dec 02, 2022 Robert Bauer Condition number