OFV higher with FOCEI than FO

9 messages 6 people Latest: Dec 11, 2008

OFV higher with FOCEI than FO

From: Ayyappa.5.chaturvedula Date: December 10, 2008 technical

Dear All,

I am analyzing a data set pooled from 4 clinical studies with rich sampling. When I fit a two-compartment oral absorption model with lag time using FO, minimization was successful and the COV step completed. However, minimization was not successful when I used the FO parameter estimates as initial estimates for a FOCE run. When I used FOCE with INTER, minimization was successful with the COV step, but the OFV is much higher with FOCEI than with FO (~25000 vs 20000). The parameter estimates make more sense with FOCEI than with FO.

My questions are: Can a result like this occur, or am I missing something? Can we compare OFV between different estimation methods? My understanding is no, and that the OFV from FO does not make a lot of sense.

Regards,
Ayyappa Chaturvedula
GlaxoSmithKline, 1500 Littleton Road, Parsippany, NJ 07054
Ph: 9738892200

OFV higher with FOCEI than FO

From: Ayyappa . 5 . Chaturvedula Date: December 10, 2008 technical

RE: OFV higher with FOCEI than FO

From: Ziad Hussein Date: December 10, 2008 technical

Hi Ayyappa,

Your understanding is correct: the objective function values from FOCEI and FO cannot be compared.

Regards,
Ziad

Dr Ziad Hussein
Senior Director, Pharmacometrics
ICON Development Solutions
Manchester, United Kingdom


RE: OFV higher with FOCEI than FO

From: Bob Leary Date: December 10, 2008 technical

As shown by X. Wang, FO, FOCE and LAPLACE form a hierarchy of approximations. Both the FO and FOCE methods are based on the same underlying Laplacian approximation to the integral of the joint likelihood function over the random effects (etas). The basic Laplace approximation requires the value of the joint likelihood function at its peak, and the second derivatives at the eta values at which the peak is reached.

The FOCE method adds one additional approximation: it obtains the Hessian matrix of second derivatives at the peak of the joint likelihood function from first derivatives. It still accurately determines the position of the peak in random effects (eta) space (the empirical Bayes estimates) and the function value at the peak. This determination of the EBEs is what the 'conditional step' is all about, and it is computationally costly.

Although the underlying Laplacian approximation is based on the local behavior of the joint log likelihood function in the neighborhood of its peak, FO does not investigate the behavior of the joint likelihood function near its peak at all (which is basically why FO estimates can be arbitrarily poor). Instead it guesstimates the value at the peak by extrapolating from eta=0, using a single Newton step based on approximate first and second derivatives at eta=0. It also simply assigns the FOCE approximate values of the second derivatives at eta=0 to the values at the peak in order to evaluate the Laplacian approximation. These additional approximations, which FO layers on top of the basic Laplacian and FOCE approximations, are quite dubious for significantly nonlinear model functions and often result in very poor quality parameter estimates compared to FOCE and Laplace.

Strictly speaking, FOCE and FO objective values cannot be compared in any consistently meaningful sense. But loosely speaking, since FO and FOCE share a common base Laplacian approximation and FO layers additional approximations on top of FOCE, the difference between FO and FOCE objective values reflects the effects of the additional FO approximations. Large differences may suggest that the additional FO approximations have large effects, and make the FO estimates even more suspect relative to FOCE.

Robert H. Leary, PhD
Principal Software Engineer
Pharsight Corp.
5520 Dillard Dr., Suite 210, Cary, NC 27511
Phone/Voice Mail: (919) 852-4625, Fax: (919) 859-6871
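To make the hierarchy described above concrete, here is a minimal Python sketch for a single subject with one observation and one eta. Everything in it is an illustrative assumption rather than NONMEM's implementation: the model f(eta) = exp(eta), the numerical values, the function names, and the use of a trapezoid-rule integral as the reference. The "FO-style" and "FOCE-style" values use the textbook linearization without interaction, and all values include the 2*pi constants that NONMEM's reported OFV omits.

```python
import math

def neg_log_joint(eta, y, f, omega, sigma):
    """-log of p(y | eta) * p(eta) for one observation and one eta."""
    resid = y - f(eta)
    return 0.5 * (resid ** 2 / sigma + math.log(2 * math.pi * sigma)
                  + eta ** 2 / omega + math.log(2 * math.pi * omega))

def find_mode(y, f, omega, sigma, lo=-10.0, hi=10.0):
    """Ternary search for the eta minimizing neg_log_joint (assumes one minimum)."""
    for _ in range(200):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if neg_log_joint(m1, y, f, omega, sigma) < neg_log_joint(m2, y, f, omega, sigma):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def m2ll_quadrature(y, f, omega, sigma, lo=-10.0, hi=10.0, n=20000):
    """Reference: -2 log of the marginal likelihood by the trapezoid rule."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        w = 0.5 if k in (0, n) else 1.0
        total += w * math.exp(-neg_log_joint(lo + k * h, y, f, omega, sigma))
    return -2.0 * math.log(total * h)

def m2ll_laplace(y, f, omega, sigma, d=1e-4):
    """Laplace: exact peak value, finite-difference second derivative at the peak."""
    g = lambda e: neg_log_joint(e, y, f, omega, sigma)
    eta_hat = find_mode(y, f, omega, sigma)
    hess = (g(eta_hat + d) - 2.0 * g(eta_hat) + g(eta_hat - d)) / d ** 2
    return 2.0 * g(eta_hat) + math.log(hess) - math.log(2 * math.pi)

def m2ll_linearized(y, f, df, omega, sigma, eta0):
    """Linearize f around eta0: eta0 = 0 is FO-style, eta0 = mode is FOCE-style."""
    grad = df(eta0)
    mean = f(eta0) - grad * eta0
    var = grad ** 2 * omega + sigma
    return (y - mean) ** 2 / var + math.log(2 * math.pi * var)

if __name__ == "__main__":
    y, omega, sigma = 3.0, 0.5, 0.25   # hypothetical data and variances
    f = df = math.exp                  # hypothetical model: f(eta) = exp(eta)
    eta_hat = find_mode(y, f, omega, sigma)
    print("quadrature:", m2ll_quadrature(y, f, omega, sigma))
    print("Laplace:   ", m2ll_laplace(y, f, omega, sigma))
    print("FOCE-style:", m2ll_linearized(y, f, df, omega, sigma, eta_hat))
    print("FO-style:  ", m2ll_linearized(y, f, df, omega, sigma, 0.0))
```

For a model that is linear in eta, all four values coincide, because the joint likelihood is then exactly Gaussian in eta. Any spread between them for a nonlinear model comes only from the approximations, which is one way to see why objective values from different methods are not directly comparable.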


RE: OFV higher with FOCEI than FO

From: Matt Hutmacher Date: December 10, 2008 technical

Hi Bob,

I would just add one point of clarification. My understanding is that the FOCE approximation is Laplace-based (related to the Laplace approximation) only if the within-subject residual error model does not contain any subject-specific random effects.

Wolfinger R (1993). Laplace's approximation for nonlinear mixed models. Biometrika 80, 791-795.
Vonesh ER, Chinchilli VM (1997). Linear and nonlinear models for the analysis of repeated measurements. Marcel Dekker.

Matt


RE: OFV higher with FOCEI than FO

From: Yaning Wang Date: December 10, 2008 technical

Matt:

That's not true. Those two references discuss when the linearized structural model can also be derived from a direct Laplacian approximation of the marginal likelihood. When there is an interaction between residual and between-subject variability (that is, when the residual error model contains a subject-specific random effect), linearizing the structural model around eta_hat can no longer be derived from the Laplacian approximation. But in NONMEM, FOCE with interaction (when the residual error model contains a subject-specific random effect) is still derived from the Laplacian approximation. In other words, NONMEM does not linearize the structural model in the FOCE with interaction case. I discussed this in detail in my paper (1).

If you add the following S-PLUS code to the S-PLUS code in my paper and use the simple numerical example, you can see how NONMEM calculates the objective function for FOCE with interaction. These things are further visualized in my talk recently put on the ACCP webpage ( http://www.accp1.org/pharmacometrics/PopPKCourse.html).

Yaning

    # Reproduce the NONMEM result using equation 28 of the paper,
    # which is a further approximation of the Laplacian method
    sum <- 0
    for (i in 1:10) {
      data1 <- data[data$ID == i, ]   # records for subject i
      cov <- data1$fp %*% t(data1$fp) * omega + diag(data1$f**2) * eps +
             2 * data1$fp %*% t(data1$fp) * omega * eps
      cov1 <- diag(data1$f**2) * eps
      ginv <- solve(cov1)
      sec <- t(data1$DV - data1$IPRE) %*% ginv %*% (data1$DV - data1$IPRE) +
             data1$ETA1[1]**2 / omega
      frs <- determinant(cov, logarithm = T)$modulus[[1]]   # log determinant of cov
      sum1 <- sec + frs
      sum <- sum + sum1
    }
    sum   # 39.45756, same as NONMEM OFV 39.458

1. Yaning Wang. Derivation of various NONMEM estimation methods. Journal of Pharmacokinetics and Pharmacodynamics. 34:575-93 (2007)

Yaning Wang, Ph.D.
Team Leader, Pharmacometrics
Office of Clinical Pharmacology
Office of Translational Science
Center for Drug Evaluation and Research
U.S. Food and Drug Administration
Phone: 301-796-1624
Email: yaning.wang

"The contents of this message are mine personally and do not necessarily reflect any position of the Government or the Food and Drug Administration."
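For readers without S-PLUS, the per-subject quantity computed inside the posted loop (sum1 = sec + frs) can be sketched in plain Python as below. This is only a transliteration of the posted code under stated assumptions: the function name and the input values are hypothetical, and because the posted cov is a diagonal matrix plus a rank-one term (there is a single eta), the log-determinant is obtained from the matrix determinant lemma instead of a linear algebra library.

```python
import math

def focei_subject_term(dv, ipre, f, fp, eta, omega, eps):
    """One subject's contribution, mirroring sum1 = sec + frs in the posted loop."""
    # sec: residual sum of squares weighted by cov1 = diag(f^2) * eps,
    # which is diagonal and so inverts elementwise, plus the eta penalty.
    sec = sum((y - p) ** 2 / (fi ** 2 * eps) for y, p, fi in zip(dv, ipre, f))
    sec += eta ** 2 / omega
    # frs: log|cov| with cov = c * fp fp' + diag(f^2) * eps,
    # where c = omega * (1 + 2 * eps) collects the two fp fp' terms.
    # Matrix determinant lemma: |D + c u u'| = |D| * (1 + c * u' D^-1 u).
    c = omega * (1.0 + 2.0 * eps)
    log_det_diag = sum(math.log(fi ** 2 * eps) for fi in f)
    quad = sum(g ** 2 / (fi ** 2 * eps) for g, fi in zip(fp, f))
    frs = log_det_diag + math.log(1.0 + c * quad)
    return sec + frs

if __name__ == "__main__":
    # Hypothetical values for a subject with two observations.
    term = focei_subject_term(dv=[1.0, 2.0], ipre=[1.1, 1.8], f=[1.1, 1.8],
                              fp=[0.5, 0.9], eta=0.2, omega=0.1, eps=0.04)
    print(term)
```

Summing this term over subjects corresponds to the running total in the posted loop. Reproducing the 39.458 quoted there would require the data set and estimates from the paper, which are not part of this thread.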


RE: OFV higher with FOCEI than FO

From: Yaning Wang Date: December 11, 2008 technical


RE: OFV higher with FOCEI than FO

From: Bob Leary Date: December 11, 2008 technical

Yaning,

(My apologies for citing your work as 'X Wang' in an earlier post.) Thanks for the cogent explanation. Indeed, the basic concept of the Laplacian approximation is to compute the integral of an arbitrary joint likelihood function by replacing it with a 'nearby' surrogate Gaussian function and then using the analytic integral of the Gaussian. 'Nearby' usually means that the approximating Gaussian locally matches the underlying function in terms of the function value at the peak (or, as in the case of FO, an approximate function value at the peak) and the second derivative at the peak, or at least some approximation to the second derivative. The first derivatives necessarily also match because they are zero at the peak.

This basic Laplacian idea of substituting a Gaussian function for the original integrand and then integrating the Gaussian is common to all NONMEM FOCE/FOCEI/FO/Laplace variants, regardless of whether the residual model has an eta-dependency. Indeed, the basic Laplacian approach generalizes to models with discrete or categorical responses, where the residual error model is replaced by a fairly arbitrary user-defined likelihood function. As your JPP paper shows, the variants simply differ in the details of how they approximate the peak value and second derivatives of the Gaussian surrogate.

Robert H. Leary, PhD
Principal Software Engineer
Pharsight Corp.
5520 Dillard Dr., Suite 210, Cary, NC 27511
Phone/Voice Mail: (919) 852-4625, Fax: (919) 859-6871
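The Gaussian-surrogate idea can be written compactly as follows. The notation is an assumption introduced here for illustration and is not taken from the thread or the JPP paper: g_i is minus the log of subject i's joint likelihood, p is the number of etas, H_i is the exact Hessian of g_i, and \tilde{H}_i is a Hessian built from first derivatives. The FO line is one reading of the "single Newton step from eta=0" description given earlier in the thread.

```latex
% Joint likelihood of subject i, written as exp(-g_i):
%   g_i(\eta) = -\log\big[\, p(y_i \mid \eta)\, p(\eta) \,\big]
\begin{align*}
\text{Laplace:}\quad
  -2\log\!\int e^{-g_i(\eta)}\,d\eta
  &\approx 2\,g_i(\hat\eta_i) + \log\big|H_i(\hat\eta_i)\big| - p\log(2\pi),
  \qquad \hat\eta_i = \arg\min_\eta g_i(\eta) \\[4pt]
\text{FOCE:}\quad
  &\approx 2\,g_i(\hat\eta_i) + \log\big|\tilde{H}_i(\hat\eta_i)\big| - p\log(2\pi) \\[4pt]
\text{FO:}\quad
  &\approx 2\Big[g_i(0) - \tfrac12\,\nabla g_i(0)^{\top}\,\tilde{H}_i(0)^{-1}\,\nabla g_i(0)\Big]
     + \log\big|\tilde{H}_i(0)\big| - p\log(2\pi)
\end{align*}
```

In this form the three variants share the same structure and differ only in the peak value and Hessian that are plugged in, which is the point made in the message above.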

Quoted reply history

-----Original Message----- From: Wang, Yaning [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 10, 2008 20:45 PM To: Matt Hutmacher; Bob Leary; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [email protected] Subject: RE: [NMusers] OFV higher with FOCEI than FO Matt: That's not true. Those two references are discussing when the linearized structure model can also be derived from direct Laplacian approximation of the marginal likelihood. When there is an interaction between residual and between subject variability (or residual error model contain subject-specific random effect), linearizing the structure model around eta_hat cannot be derived from the Laplacian approximation any more. But in NONMEM, FOCE with interaction (when residual error model contain subject-specific random effect) is still derived from Laplacian approximation. In other words, NONMEM does not linearize the structure model for FOCE with interaction case. I discussed this in details in my paper (1). Adding the following splus code to the splus code in my paper and using the simple numerical example, you can see how NONMEM is calculating the objective function for FOCE with interaction. These things are further visualized in my talk recently put on ACCP webpage ( http://www.accp1.org/pharmacometrics/PopPKCourse.html). Yaning #reproduce NONMEM result using my equation 28 which is further approximation of Laplacian method sum<-0 for (i in 1:10) { data1<-data[data$ID==i,] cov<-data1$fp%*%t(data1$fp)*omega+diag(data1$f**2)*eps+2*data1$fp%*%t(data1$fp)*omega*eps cov1<-diag(data1$f**2)*eps ginv<-solve(cov1) sec<-t(data1$DV-data1$IPRE)%*%ginv%*%(data1$DV-data1$IPRE)+data1$ETA1[1]**2/omega frs<-determinant(cov, logarithm=T)$modulus[[1]] sum1<-sec+frs sum<-sum+sum1 } sum#39.45756 same as NONMEM OFV 39.458 1. Yaning Wang. Derivation of various NONMEM estimation methods. Journal of Pharmacokinetics and pharmacodynamics. 34:575-93 (2007) Yaning Wang, Ph.D. 
Team Leader, Pharmacometrics Office of Clinical Pharmacology Office of Translational Science Center for Drug Evaluation and Research U.S. Food and Drug Administration Phone: 301-796-1624 Email: [EMAIL PROTECTED] "The contents of this message are mine personally and do not necessarily reflect any position of the Government or the Food and Drug Administration." _____ From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Matt Hutmacher Sent: Wednesday, December 10, 2008 2:04 PM To: 'Bob Leary'; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [email protected] Subject: RE: [NMusers] OFV higher with FOCEI than FO Hi Bob, I would just add one point of clarification. My understanding is that the FOCE approximate is a Laplace-based approximation (related to it) only if the within subject residual error model does not contain any subject-specific random effects. Wolfinger R (1993). Laplace's approximation for nonlinear mixed models. Biometrika 80, 791-795. Vonesh ER, Chinchilli VM (1997). Linear and nonlinear models for the analysis of repeated measurements. Marcel Dekker. Matt From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bob Leary Sent: Wednesday, December 10, 2008 12:11 PM To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [email protected] Subject: RE: [NMusers] OFV higher with FOCEI than FO As shown by X. Wang, FO, FOCE and LAPLACE form a hierarchy of approximations. Both the FO and FOCE methods are based on the same underlying Laplacian approximation to the integral of the joint likelihood function of the random effects (eta's). The basic Laplace approximation requires knowledge of the value of the joint likelihood function at its peak, and the second derivatives at the eta values at which the peak is reached. 
The FOCE method adds 1 additional approximation to get the Hessian matrix of second derivatives at the peak of the joint likelihood function from first derivatives, but accurately determines the position of the peak (the empirical Bayes estimates) in random effects (eta) space and the function value at the peak (this determination of the EBE's is what the 'conditional step' is all about and is computationally costly.) Although the underlying Laplacian approximation is based on the local behavior of the joint log likelihood function in the neighborhood of its peak, FO does not investigate the behavior of the joint likelihood function near its peak at all (which is basically why FO estimates can be arbitrarily poor). Instead it guestimates the value at the peak by extrapolating from eta=0, using a single Newton step based on approximate first and second derivatives at eta=0. It also simply assigns the FOCE approximate values of the second derivatives at eta=0 to the values at the peak in order to evaluate the Laplacian approximation. These additional approximations layered on top of the basic Laplacian and FOCE approximations by FO are quite dubious for significantly nonlinear model functions, and often result in very poor quality parameter estimates compared to FOCE and Laplace. Strictly speaking. FOCE and FO objective values cannot be compared in any consistently meaningful sense. But loosely speaking, since both FO and FOCE share a common base Laplacian approximation, but FO layers on additional approximations on top of FOCE, the difference in FO vs FOCE objective values reflects the effects of the additional FO approximations. Large differences may suggest that the additional FO approximations have large effects, and make the FO estimates even more suspect relative to FOCE. Robert H. Leary, PhD Principal Software Engineer Pharsight Corp. 
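The hierarchy described in the quoted message above can be illustrated numerically. The following is only a minimal sketch, not NONMEM's implementation: it assumes a single subject, one eta, a mono-exponential model, additive residual error that does not depend on eta, and made-up data. All names and numbers (`T`, `Y`, `A`, `K`, `SIGMA2`, `objective_values`) are hypothetical. It computes -2 log likelihood four ways: by numerical integration (reference), by Laplace with the exact second derivative, by an FOCE-style form with the first-derivative Hessian, and by an FO-style form built from a single Newton step at eta=0.

```python
import math

# Hypothetical single-subject example (all numbers are illustrative assumptions).
T = [0.5, 1.0, 2.0, 4.0, 8.0]        # sampling times
Y = [8.3, 6.1, 4.4, 1.5, 0.6]        # observations
A, K = 10.0, 0.3                     # f(t, eta) = A * exp(-K * exp(eta) * t)
SIGMA2 = 0.25                        # additive residual variance, independent of eta
LOG2PI = math.log(2.0 * math.pi)


def f(t, eta):
    return A * math.exp(-K * math.exp(eta) * t)


def df(t, eta):
    """First derivative of f with respect to eta."""
    return -K * math.exp(eta) * t * f(t, eta)


def d2f(t, eta):
    """Second derivative of f with respect to eta."""
    return df(t, eta) * (1.0 - K * math.exp(eta) * t)


def joint_m2ll(eta, omega2):
    """-2 log[p(y | eta) p(eta)], constants included."""
    ss = sum((y - f(t, eta)) ** 2 for t, y in zip(T, Y)) / SIGMA2
    return (ss + len(T) * (LOG2PI + math.log(SIGMA2))
            + eta ** 2 / omega2 + LOG2PI + math.log(omega2))


def hessians(eta, omega2):
    """Curvature of 0.5 * joint_m2ll: (first-derivative approximation, exact)."""
    approx = 1.0 / omega2 + sum(df(t, eta) ** 2 for t in T) / SIGMA2
    exact = approx - sum((y - f(t, eta)) * d2f(t, eta) for t, y in zip(T, Y)) / SIGMA2
    return approx, exact


def ebe(omega2, half_width=8.0, iters=200):
    """Conditional step: golden-section search for the peak of the joint likelihood."""
    hi = half_width * math.sqrt(omega2)
    lo = -hi
    r = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        a, b = hi - r * (hi - lo), lo + r * (hi - lo)
        if joint_m2ll(a, omega2) < joint_m2ll(b, omega2):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


def exact_m2ll(omega2, half_width=8.0, n_grid=4001):
    """Reference value: trapezoid-rule integration of the joint likelihood over eta."""
    w = half_width * math.sqrt(omega2)
    step = 2.0 * w / (n_grid - 1)
    vals = [joint_m2ll(-w + i * step, omega2) for i in range(n_grid)]
    m = min(vals)  # subtract the minimum before exponentiating, for stability
    dens = [math.exp(-0.5 * (v - m)) for v in vals]
    area = step * (sum(dens) - 0.5 * (dens[0] + dens[-1]))
    return m - 2.0 * math.log(area)


def objective_values(omega2):
    eta_hat = ebe(omega2)
    h_approx, h_exact = hessians(eta_hat, omega2)
    peak = joint_m2ll(eta_hat, omega2)
    # FO-style value: one Newton step from eta = 0 on the model linearised at eta = 0.
    h0, _ = hessians(0.0, omega2)
    grad0 = -sum((y - f(t, 0.0)) * df(t, 0.0) for t, y in zip(T, Y)) / SIGMA2
    eta1 = -grad0 / h0
    ss_lin = sum((y - f(t, 0.0) - df(t, 0.0) * eta1) ** 2 for t, y in zip(T, Y)) / SIGMA2
    peak_fo = (ss_lin + len(T) * (LOG2PI + math.log(SIGMA2))
               + eta1 ** 2 / omega2 + LOG2PI + math.log(omega2))
    return {
        "exact": exact_m2ll(omega2),
        "laplace": peak + math.log(h_exact) - LOG2PI,
        "foce": peak + math.log(h_approx) - LOG2PI,
        "fo": peak_fo + math.log(h0) - LOG2PI,
    }


if __name__ == "__main__":
    for name, value in objective_values(omega2=0.09).items():
        print(f"{name:8s} {value:.3f}")
```

When `omega2` is made very small the model is effectively linear in eta over the relevant range, and all four values should agree closely. With a larger `omega2` they can drift apart, which is the kind of gap between FO and FOCE objective values discussed in the message.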

RE: OFV higher with FOCEI than FO

From: Matt Hutmacher Date: December 11, 2008 technical

Yaning,

Perhaps I was not clear in my email. I should have stated it more explicitly, as follows. For the normal density case, application of the Laplace approximation yields

$$-2LL = (y - f(\eta))' \Sigma^{-1} (y - f(\eta)) + \eta' \Omega^{-1} \eta + \log|\Sigma|$$

where $y$ are the data, $f$ is the mean function, $\eta$ (eta) is the subject-specific random variable, $\Sigma$ (SIG) is the intrasubject residual variance, and $\Omega$ (OM) is the between-subject variance of the etas.

If SIG depends on eta, then the extended least squares form, i.e.

$$-2LL = (y - f(\hat\eta) + G\hat\eta)' M_\Sigma^{-1} (y - f(\hat\eta) + G\hat\eta) + \log|M_\Sigma|$$

where $M_\Sigma = G \Omega G' + \Sigma$ (MSIG) and $G$ is the derivative of $f$ with respect to eta evaluated at $\hat\eta$ (etahat), no longer represents a Laplace-based approximation to the marginal distribution of y. It can be made approximately Laplacian-based by various procedures, but it is not Laplacian-based anymore. See page 345 of Vonesh. Note that Wolfinger shows this derivation.

Matt
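For readers who want the intermediate steps, here is a sketch of why the two forms in the message coincide when SIG does not depend on eta. It is not taken from the cited references. It assumes the model is exactly linear in eta around etahat, drops the 2*pi constants, and uses the symbols $r$, $J$ and $H$, which are introduced here for illustration only.

```latex
\begin{align*}
&\text{Assume } f(\eta) = f(\hat\eta) + G(\eta - \hat\eta),\quad
  \Sigma \text{ free of } \eta,\quad r = y - f(\hat\eta) + G\hat\eta. \\
&\text{Joint objective: } J(\eta) = (r - G\eta)'\Sigma^{-1}(r - G\eta)
  + \eta'\Omega^{-1}\eta + \log|\Sigma| + \log|\Omega|. \\
&\text{Peak: } \hat\eta = H^{-1} G'\Sigma^{-1} r,\qquad
  H = G'\Sigma^{-1}G + \Omega^{-1}. \\
&\text{Laplace: } -2LL = J(\hat\eta) + \log|H|. \\
&\text{Woodbury identity: } (r - G\hat\eta)'\Sigma^{-1}(r - G\hat\eta)
  + \hat\eta'\Omega^{-1}\hat\eta = r'(\Sigma + G\Omega G')^{-1} r. \\
&\text{Determinant lemma: } \log|\Sigma| + \log|\Omega| + \log|H|
  = \log|\Sigma + G\Omega G'|. \\
&\Rightarrow\; -2LL = r' M_\Sigma^{-1} r + \log|M_\Sigma|,\qquad
  M_\Sigma = G\Omega G' + \Sigma.
\end{align*}
```

If SIG depends on eta, the derivatives of $\log|\Sigma(\eta)|$ and $\Sigma(\eta)^{-1}$ add terms to both the peak condition and $H$, so these steps no longer go through as written. How the resulting objective relates to a Laplace approximation is the point under discussion in this thread.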

Quoted reply history

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Wang, Yaning
Sent: Wednesday, December 10, 2008 8:45 PM
To: Matt Hutmacher; Bob Leary; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [email protected]
Subject: RE: [NMusers] OFV higher with FOCEI than FO

Matt:

That's not true. Those two references discuss when the linearized structure model can also be derived from a direct Laplacian approximation of the marginal likelihood. When there is an interaction between residual and between-subject variability (that is, the residual error model contains a subject-specific random effect), linearizing the structure model around eta_hat can no longer be derived from the Laplacian approximation. But in NONMEM, FOCE with interaction (when the residual error model contains a subject-specific random effect) is still derived from the Laplacian approximation. In other words, NONMEM does not linearize the structure model in the FOCE with interaction case. I discussed this in detail in my paper (1). By adding the following S-PLUS code to the S-PLUS code in my paper and using the simple numerical example, you can see how NONMEM calculates the objective function for FOCE with interaction. These things are further visualized in my talk recently put on the ACCP webpage (http://www.accp1.org/pharmacometrics/PopPKCourse.html).

Yaning

    #reproduce NONMEM result using my equation 28 which is further approximation of Laplacian method
    sum<-0
    for (i in 1:10) {
      data1<-data[data$ID==i,]
      cov<-data1$fp%*%t(data1$fp)*omega+diag(data1$f**2)*eps+2*data1$fp%*%t(data1$fp)*omega*eps
      cov1<-diag(data1$f**2)*eps
      ginv<-solve(cov1)
      sec<-t(data1$DV-data1$IPRE)%*%ginv%*%(data1$DV-data1$IPRE)+data1$ETA1[1]**2/omega
      frs<-determinant(cov, logarithm=T)$modulus[[1]]
      sum1<-sec+frs
      sum<-sum+sum1
    }
    sum #39.45756 same as NONMEM OFV 39.458

1. Yaning Wang. Derivation of various NONMEM estimation methods. Journal of Pharmacokinetics and Pharmacodynamics. 34:575-93 (2007)

Yaning Wang, Ph.D.
