Dear All,
I am analyzing a data set pooled from 4 clinical studies with rich
sampling. When I fit a 2-compartment oral absorption model with lag time
using FO, I got successful minimization with the COV step, but
minimization was not successful when I used the FO parameter estimates as
initial estimates for a FOCE run. When I used FOCE with INTER,
minimization was successful with the COV step, but the OFV is much higher
with FOCEI than with FO (~25000 vs 20000). The parameter estimates make
more sense with FOCEI than with FO. My questions are:
Is this possible, or am I missing something here?
Can we compare OFV between different estimation methods? (My understanding
is no, and the OFV from FO does not make a lot of sense.)
Regards,
Ayyappa Chaturvedula
GlaxoSmithKline
1500 Littleton Road,
Parsippany, NJ 07054
Ph:9738892200
OFV higher with FOCEI than FO
9 messages
6 people
Latest: Dec 11, 2008
Hi Ayyappa,
Your understanding is correct: the objective function values from FOCEI
and FO cannot be compared.
Regards,
Ziad
Dr Ziad Hussein
Senior Director, Pharmacometrics
ICON Development Solutions
Manchester
United Kingdom
As shown by X. Wang, FO, FOCE and LAPLACE form a hierarchy of
approximations. Both the FO and FOCE methods are based on the same
underlying Laplacian approximation to the integral of the joint likelihood
function of the random effects (etas). The basic Laplace approximation
requires knowledge of the value of the joint likelihood function at its
peak, and the second derivatives at the eta values at which the peak is
reached.
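
For reference, the approximation described above can be written out for a single subject as follows. The notation is an assumption of this sketch and does not come from the post: l_i(eta) is the joint log-likelihood of subject i's data and eta, eta-hat_i is its maximizer, H_i is the matrix of second derivatives of l_i, L_i is the marginal likelihood, and q is the number of etas.

```latex
\begin{align*}
L_i = \int \exp\{ l_i(\eta) \}\, d\eta
  &\approx \exp\{ l_i(\hat\eta_i) \}\,(2\pi)^{q/2}\,
     \bigl| -H_i(\hat\eta_i) \bigr|^{-1/2},
  \qquad
  H_i(\eta) = \frac{\partial^2 l_i(\eta)}{\partial\eta\,\partial\eta^{\mathsf T}}, \\
-2 \log L_i
  &\approx -2\, l_i(\hat\eta_i) + \log \bigl| -H_i(\hat\eta_i) \bigr| - q \log(2\pi).
\end{align*}
```

Only the peak value and the curvature at the peak enter the formula, which are the two quantities named above.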
The FOCE method adds one additional approximation to get the Hessian
matrix of second derivatives at the peak of the joint likelihood function
from first derivatives, but accurately determines the position of the peak
(the empirical Bayes estimates) in random effects (eta) space and the
function value at the peak. (This determination of the EBEs is what the
'conditional step' is all about, and it is computationally costly.)
Although the underlying Laplacian approximation is based on the local
behavior of the joint log likelihood function in the neighborhood of its
peak, FO does not investigate the behavior of the joint likelihood function
near its peak at all (which is basically why FO estimates can be
arbitrarily poor). Instead it guesstimates the value at the peak by
extrapolating from eta=0, using a single Newton step based on approximate
first and second derivatives at eta=0. It also simply assigns the FOCE
approximate values of the second derivatives at eta=0 to the values at the
peak in order to evaluate the Laplacian approximation.
These additional approximations, which FO layers on top of the basic
Laplacian and FOCE approximations, are quite dubious for significantly
nonlinear model functions, and often result in very poor quality parameter
estimates compared to FOCE and Laplace.
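
To make the contrast concrete, here is a minimal R sketch for one hypothetical subject at fixed parameter values. It is not NONMEM's implementation. Every number, the one-eta exponential model, and the additive error (so no interaction) are assumptions chosen for illustration, and FO is written in its usual linearized-about-eta=0 form, which is assumed to match the description above. The sketch compares each approximate -2 log-likelihood with a brute-force numerical integral.

```r
# Toy single-subject example; all numbers are hypothetical.
tm    <- c(0.5, 1, 2, 4)   # sampling times
omega <- 0.5               # SD of eta
sigma <- 0.5               # SD of the additive residual error
f <- function(eta) 10 * exp(-0.5 * exp(eta) * tm)   # model prediction
g <- function(eta) -0.5 * exp(eta) * tm * f(eta)    # d f / d eta
y <- f(0.8)                # noise-free data from a subject with eta = 0.8
n <- length(y)

# Joint log-likelihood of the data and eta
joint_ll <- function(eta) {
  sum(dnorm(y, f(eta), sigma, log = TRUE)) + dnorm(eta, 0, omega, log = TRUE)
}

# Reference: -2 log marginal likelihood from a Riemann sum on a fine grid
grid <- seq(-3, 3, by = 0.001)
m2ll_ref <- -2 * log(sum(exp(sapply(grid, joint_ll))) * 0.001)

# FOCE-type: locate the peak (the EBE), use the exact value there, and
# build the curvature from first derivatives only
eta_hat   <- optimize(function(e) -joint_ll(e), c(-3, 3), tol = 1e-8)$minimum
h_foce    <- sum(g(eta_hat)^2) / sigma^2 + 1 / omega^2
m2ll_foce <- -2 * joint_ll(eta_hat) + log(h_foce) - log(2 * pi)

# FO: linearize the model around eta = 0 and never locate the peak
g0 <- g(0)
c0 <- omega^2 * outer(g0, g0) + sigma^2 * diag(n)
r0 <- y - f(0)
m2ll_fo <- n * log(2 * pi) +
  determinant(c0, logarithm = TRUE)$modulus[[1]] +
  sum(r0 * solve(c0, r0))

print(c(reference = m2ll_ref, FOCE = m2ll_foce, FO = m2ll_fo))
```

In this toy case the FOCE-type value lies much closer to the reference than the FO value does. The direction and size of the FO error depend on the model and data, so nothing here predicts which method gives the lower OFV in a real analysis.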
Strictly speaking, FOCE and FO objective values cannot be compared in any
consistently meaningful sense. But loosely speaking, since FO and FOCE
share a common base Laplacian approximation and FO layers additional
approximations on top of FOCE, the difference between FO and FOCE
objective values reflects the effects of the additional FO approximations.
Large differences may suggest that the additional FO approximations have
large effects, and make the FO estimates even more suspect relative to
FOCE.
Robert H. Leary, PhD
Principal Software Engineer
Pharsight Corp.
5520 Dillard Dr., Suite 210
Cary, NC 27511
Phone/Voice Mail: (919) 852-4625, Fax: (919) 859-6871
Hi Bob,
I would just add one point of clarification. My understanding is that the
FOCE approximation is Laplace-based (related to it) only if the
within-subject residual error model does not contain any subject-specific
random effects.
Wolfinger R (1993). Laplace's approximation for nonlinear mixed models.
Biometrika 80, 791-795.
Vonesh ER, Chinchilli VM (1997). Linear and nonlinear models for the
analysis of repeated measurements. Marcel Dekker.
Matt
Matt:
That's not true. Those two references discuss when the linearized
structural model can also be derived from a direct Laplacian approximation
of the marginal likelihood. When there is an interaction between residual
and between-subject variability (that is, the residual error model
contains a subject-specific random effect), linearizing the structural
model around eta_hat can no longer be derived from the Laplacian
approximation. But in NONMEM, FOCE with interaction (when the residual
error model contains a subject-specific random effect) is still derived
from the Laplacian approximation. In other words, NONMEM does not
linearize the structural model in the FOCE with interaction case. I
discussed this in detail in my paper (1). By adding the following S-PLUS
code to the S-PLUS code in my paper and using the simple numerical
example, you can see how NONMEM calculates the objective function for FOCE
with interaction. These points are further visualized in my talk recently
posted on the ACCP webpage
(http://www.accp1.org/pharmacometrics/PopPKCourse.html).
Yaning
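# Reproduce the NONMEM result using equation 28 of my paper, which is a
# further approximation of the Laplacian method.
# Assumes data, omega and eps are already defined by the code in the paper.
ofv <- 0
for (i in 1:10) {
  data1 <- data[data$ID == i, ]
  # Covariance matrix used in the log-determinant term
  cov_i <- data1$fp %*% t(data1$fp) * omega + diag(data1$f^2) * eps +
    2 * data1$fp %*% t(data1$fp) * omega * eps
  # Residual error covariance, diag(f^2) * eps, and its inverse
  cov1 <- diag(data1$f^2) * eps
  ginv <- solve(cov1)
  # Weighted residual sum of squares plus the eta penalty
  sec <- t(data1$DV - data1$IPRE) %*% ginv %*% (data1$DV - data1$IPRE) +
    data1$ETA1[1]^2 / omega
  frs <- determinant(cov_i, logarithm = TRUE)$modulus[[1]]
  ofv <- ofv + sec + frs
}
ofv  # 39.45756, same as the NONMEM OFV of 39.458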
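
Read term by term, the loop above accumulates the sum below over the 10 subjects. The symbols are introduced here only for illustration and are not the paper's notation: Omega and Sigma stand for the scalars `omega` and `eps`, f_i and g_i for the columns `f` and `fp` of subject i's rows (the thread does not say at which eta they are evaluated), y_i and y-hat_i for `DV` and `IPRE`, and eta-hat_i for the first `ETA1` value of subject i.

```latex
\begin{align*}
C_i &= \Omega\, g_i g_i^{\mathsf T} + \Sigma\,\operatorname{diag}(f_i^2)
       + 2\,\Omega\,\Sigma\, g_i g_i^{\mathsf T}, \\
V_i &= \Sigma\,\operatorname{diag}(f_i^2), \\
\mathrm{OFV} &= \sum_{i=1}^{10} \Bigl[ \log |C_i|
       + (y_i - \hat y_i)^{\mathsf T} V_i^{-1} (y_i - \hat y_i)
       + \hat\eta_i^{\,2} / \Omega \Bigr].
\end{align*}
```

The log-determinant is the `frs` term in the code, and the two quadratic pieces together are the `sec` term.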
1. Yaning Wang. Derivation of various NONMEM estimation methods. Journal
of Pharmacokinetics and Pharmacodynamics 34:575-93 (2007).
Yaning Wang, Ph.D.
Team Leader, Pharmacometrics
Office of Clinical Pharmacology
Office of Translational Science
Center for Drug Evaluation and Research
U.S. Food and Drug Administration
Phone: 301-796-1624
Email: yaning.wang
"The contents of this message are mine personally and do not necessarily
reflect any position of the Government or the Food and Drug
Administration."
Yaning -
(My apologies for citing your work as 'X Wang' in an earlier post.) Thanks
for the cogent explanation. Indeed, the basic concept of the Laplacian
approximation is to compute the integral of an arbitrary joint likelihood
function by replacing it with a 'nearby' surrogate Gaussian function and
then using the analytic integral of the Gaussian. 'Nearby' usually means
that the approximating Gaussian locally matches the underlying function in
terms of function value at the peak (or, as in the case of FO, approximate
function value at the peak) and second derivative at the peak, or at least
some approximation to the second derivative (the first derivatives
necessarily also match because they are zero at the peak). This basic
Laplacian idea of substituting a Gaussian function for the original
integrand and then integrating the Gaussian is common to all NONMEM
FOCE/FOCEI/FO/Laplace variants, regardless of whether the residual model
has an eta-dependency. Indeed, the basic Laplacian approach generalizes to
models with discrete or categorical responses, where the residual error
model is replaced by a fairly arbitrary user-defined likelihood function.
As your JPP paper shows, the variants simply differ in the details of how
they approximate the peak value and second derivatives of the Gaussian
surrogate.
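
As a small illustration of the surrogate-Gaussian idea with a non-Gaussian, user-defined likelihood, the R sketch below applies it to one subject with Poisson counts. The data, the Poisson model, and the parameter values are all hypothetical and are not taken from the thread. The sketch locates the peak, matches the curvature there, integrates the Gaussian analytically, and compares the result with a brute-force sum.

```r
# Hypothetical count data for one subject; all values are made up.
y      <- c(3, 5, 4, 6)   # observed counts
lambda <- 3               # typical-value Poisson mean
omega  <- 0.4             # SD of eta

# Joint log-likelihood: Poisson likelihood times the normal density of eta
joint_ll <- function(eta) {
  sum(dpois(y, lambda * exp(eta), log = TRUE)) + dnorm(eta, 0, omega, log = TRUE)
}

# Locate the peak; the first derivative is zero there
eta_hat <- optimize(function(e) -joint_ll(e), c(-3, 3), tol = 1e-8)$minimum

# Second derivative of -joint_ll at the peak (analytic for this model)
h <- length(y) * lambda * exp(eta_hat) + 1 / omega^2

# Gaussian with the same peak value and curvature, integrated analytically
laplace <- exp(joint_ll(eta_hat)) * sqrt(2 * pi / h)

# Brute-force reference: Riemann sum on a fine grid
grid <- seq(-3, 3, by = 0.001)
reference <- sum(exp(sapply(grid, joint_ll))) * 0.001

print(c(laplace = laplace, reference = reference))
```

Here the exact second derivative is available in closed form, so this corresponds to the plain Laplace variant. Replacing `h` with an approximation is where the other variants would differ.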
Robert H. Leary, PhD
Principal Software Engineer
Pharsight Corp.
5520 Dillard Dr., Suite 210
Cary, NC 27511
Phone/Voice Mail: (919) 852-4625, Fax: (919) 859-6871
This email message (including any attachments) is for the sole use of the
intended recipient and may contain confidential and proprietary information.
Any disclosure or distribution to third parties that is not specifically
authorized by the sender is prohibited. If you are not the intended recipient,
please contact the sender by reply email and destroy all copies of the original
message.
Quoted reply history
-----Original Message-----
From: Wang, Yaning [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 10, 2008 20:45 PM
To: Matt Hutmacher; Bob Leary; [EMAIL PROTECTED]; [EMAIL PROTECTED];
[email protected]
Subject: RE: [NMusers] OFV higher with FOCEI than FO
Matt:
That's not true. Those two references are discussing when the linearized
structure model can also be derived from direct Laplacian approximation of the
marginal likelihood. When there is an interaction between residual and between
subject variability (or residual error model contain subject-specific random
effect), linearizing the structure model around eta_hat cannot be derived from
the Laplacian approximation any more. But in NONMEM, FOCE with interaction
(when residual error model contain subject-specific random effect) is still
derived from Laplacian approximation. In other words, NONMEM does not linearize
the structure model for FOCE with interaction case. I discussed this in details
in my paper (1). Adding the following splus code to the splus code in my paper
and using the simple numerical example, you can see how NONMEM is calculating
the objective function for FOCE with interaction. These things are further
visualized in my talk recently put on ACCP webpage (
http://www.accp1.org/pharmacometrics/PopPKCourse.html).
Yaning
#reproduce NONMEM result using my equation 28 which is further approximation of
Laplacian method
sum<-0
for (i in 1:10) {
data1<-data[data$ID==i,]
cov<-data1$fp%*%t(data1$fp)*omega+diag(data1$f**2)*eps+2*data1$fp%*%t(data1$fp)*omega*eps
cov1<-diag(data1$f**2)*eps
ginv<-solve(cov1)
sec<-t(data1$DV-data1$IPRE)%*%ginv%*%(data1$DV-data1$IPRE)+data1$ETA1[1]**2/omega
frs<-determinant(cov, logarithm=T)$modulus[[1]]
sum1<-sec+frs
sum<-sum+sum1
}
sum#39.45756 same as NONMEM OFV 39.458
1. Yaning Wang. Derivation of various NONMEM estimation methods. Journal of
Pharmacokinetics and pharmacodynamics. 34:575-93 (2007)
Yaning Wang, Ph.D.
Team Leader, Pharmacometrics
Office of Clinical Pharmacology
Office of Translational Science
Center for Drug Evaluation and Research
U.S. Food and Drug Administration
Phone: 301-796-1624
Email: [EMAIL PROTECTED]
"The contents of this message are mine personally and do not necessarily
reflect any position of the Government or the Food and Drug Administration."
_____
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Matt Hutmacher
Sent: Wednesday, December 10, 2008 2:04 PM
To: 'Bob Leary'; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [email protected]
Subject: RE: [NMusers] OFV higher with FOCEI than FO
Hi Bob,
I would just add one point of clarification. My understanding is that the FOCE
approximate is a Laplace-based approximation (related to it) only if the within
subject residual error model does not contain any subject-specific random
effects.
Wolfinger R (1993). Laplace's approximation for nonlinear mixed models.
Biometrika 80, 791-795.
Vonesh ER, Chinchilli VM (1997). Linear and nonlinear models for the analysis
of repeated measurements. Marcel Dekker.
Matt
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bob Leary
Sent: Wednesday, December 10, 2008 12:11 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [email protected]
Subject: RE: [NMusers] OFV higher with FOCEI than FO
As shown by Y. Wang, FO, FOCE and LAPLACE form a hierarchy of approximations.

Both the FO and FOCE methods are based on the same underlying Laplacian
approximation to the integral of the joint likelihood function over the
random effects (etas). The basic Laplace approximation requires the value
of the joint likelihood function at its peak and the second derivatives at
the eta values where the peak is reached.

The FOCE method adds one further approximation: it builds the Hessian matrix
of second derivatives at the peak of the joint likelihood function from
first derivatives. It still accurately determines the position of the peak
in random effects (eta) space (the empirical Bayes estimates) and the
function value at the peak. This determination of the EBEs is what the
'conditional step' is all about, and it is computationally costly.

Although the underlying Laplacian approximation is based on the local
behavior of the joint log likelihood function in the neighborhood of its
peak, FO does not investigate the behavior of the joint likelihood function
near its peak at all, which is basically why FO estimates can be arbitrarily
poor. Instead it guesstimates the value at the peak by extrapolating from
eta=0, using a single Newton step based on approximate first and second
derivatives at eta=0. It also simply takes the FOCE-style approximate second
derivatives evaluated at eta=0 and uses them as the values at the peak in
order to evaluate the Laplacian approximation.

These additional approximations, which FO layers on top of the basic
Laplacian and FOCE approximations, are quite dubious for significantly
nonlinear model functions, and often result in very poor quality parameter
estimates compared to FOCE and Laplace.

Strictly speaking, FOCE and FO objective values cannot be compared in any
consistently meaningful sense. But loosely speaking, since FO and FOCE share
a common base Laplacian approximation and FO layers additional
approximations on top of FOCE, the difference between FO and FOCE objective
values reflects the effects of the additional FO approximations. Large
differences may suggest that the additional FO approximations have large
effects, and make the FO estimates even more suspect relative to FOCE.
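To make the hierarchy concrete, here is a minimal Python sketch for a single subject with one eta and additive residual error (so no interaction). It is not NONMEM's implementation. The mono-exponential model, the data values, the parameter values and all function names are hypothetical and chosen only for illustration. It evaluates -2 log L at fixed population parameters in four ways: brute-force numerical integration over eta, Laplace with the exact second derivative at the peak, FOCE with the second derivative built from first derivatives at the peak, and FO with a single Newton step from eta=0 and the curvature taken at eta=0.

```python
import math

def joint_m2ll(eta, y, model, sigma2, omega2):
    """-2 log of the joint density p(y | eta) * p(eta) for one subject."""
    f, _, _ = model(eta)
    ss = sum((yj - fj) ** 2 for yj, fj in zip(y, f))
    return (ss / sigma2 + len(y) * math.log(2 * math.pi * sigma2)
            + eta ** 2 / omega2 + math.log(2 * math.pi * omega2))

def find_mode(y, model, sigma2, omega2, lo=-5.0, hi=5.0, step=0.01):
    """Conditional step: locate the EBE by grid search, then golden-section."""
    phi = lambda e: joint_m2ll(e, y, model, sigma2, omega2)
    n = int(round((hi - lo) / step))
    best = min((lo + i * step for i in range(n + 1)), key=phi)
    a, b = best - step, best + step
    r = (math.sqrt(5) - 1) / 2
    for _ in range(80):
        c, d = b - r * (b - a), a + r * (b - a)
        if phi(c) < phi(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

def approximations(y, model, sigma2, omega2):
    """Return exact, Laplace, FOCE and FO values of -2 log L for one subject."""
    log2pi = math.log(2 * math.pi)
    phi = lambda e: joint_m2ll(e, y, model, sigma2, omega2)

    # Conditional methods: peak of the joint density and the curvature there.
    eta_hat = find_mode(y, model, sigma2, omega2)
    f, g, h = model(eta_hat)
    h_foce = sum(gj ** 2 for gj in g) / sigma2 + 1 / omega2   # first derivatives only
    h_lap = h_foce - sum((yj - fj) * hj for yj, fj, hj in zip(y, f, h)) / sigma2
    laplace = phi(eta_hat) + math.log(h_lap) - log2pi
    foce = phi(eta_hat) + math.log(h_foce) - log2pi

    # FO: linearize at eta = 0, take one Newton step, reuse the curvature at 0.
    f0, g0, _ = model(0.0)
    res0 = [yj - fj for yj, fj in zip(y, f0)]
    h0 = sum(gj ** 2 for gj in g0) / sigma2 + 1 / omega2
    b0 = sum(gj * rj for gj, rj in zip(g0, res0)) / sigma2
    fo = phi(0.0) - b0 ** 2 / h0 + math.log(h0) - log2pi      # extrapolated peak

    # Reference value: trapezoidal integration of the joint density over eta.
    lo, hi, n = -5.0, 5.0, 20000
    d = (hi - lo) / n
    phi_min = phi(eta_hat)
    total = 0.0
    for i in range(n + 1):
        w = 0.5 if i in (0, n) else 1.0
        total += w * math.exp(-0.5 * (phi(lo + i * d) - phi_min))
    exact = phi_min - 2 * math.log(total * d)
    return {"exact": exact, "laplace": laplace, "foce": foce, "fo": fo}

def decay_model(times, dose=100.0, k=0.2):
    """Hypothetical model f = dose * exp(-k * exp(eta) * t), with derivatives."""
    def model(eta):
        u = [k * math.exp(eta) * t for t in times]
        f = [dose * math.exp(-uj) for uj in u]
        g = [-uj * fj for uj, fj in zip(u, f)]              # df/d(eta)
        h = [uj * (uj - 1) * fj for uj, fj in zip(u, f)]    # d2f/d(eta)^2
        return f, g, h
    return model

y = [81.0, 63.2, 42.1, 16.0, 3.5]   # made-up observations for one subject
result = approximations(y, decay_model([0.5, 1, 2, 4, 8]), sigma2=4.0, omega2=0.25)
for name, value in result.items():
    print(f"{name:8s} {value:10.4f}")
```

For a model that is linear in eta all four numbers coincide, since every approximation is then exact. The sketch only shows how the approximations differ at one fixed set of parameter values. It says nothing about which minimized OFV will be larger in a real analysis.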
Robert H. Leary, PhD
Principal Software Engineer
Pharsight Corp.
5520 Dillard Dr., Suite 210
Cary, NC 27511
Phone/Voice Mail: (919) 852-4625, Fax: (919) 859-6871
Yaning,
Perhaps I was not clear in my email. I should have stated it more
explicitly, as follows.
For the normal density case, application of the Laplace approximation
yields
-2LL = (y-f(eta))'*SIG^-1*(y-f(eta)) + eta'*OM^-1*eta + log|SIG|
where y are the data, f is the mean function, eta is the subject-specific
random variable, SIG is the intrasubject residual variance, and OM is the
between-subject variance of the etas. If SIG depends on eta, then the
extended least squares form, i.e.
-2LL = (y-f(etahat)-G*etahat)'*MSIG^-1*(y-f(etahat)-G*etahat) + log|MSIG|
where MSIG = G*OM*G' + SIG, no longer represents a Laplace-based
approximation to the marginal distribution of y. It can be made
approximately Laplacian-based by various procedures, but it is not strictly
Laplacian-based anymore.
See page 345 of Vonesh and Chinchilli. Note that Wolfinger shows this
derivation.
Matt
Quoted reply history
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Wang, Yaning
Sent: Wednesday, December 10, 2008 8:45 PM
To: Matt Hutmacher; Bob Leary; [EMAIL PROTECTED];
[EMAIL PROTECTED]; [email protected]
Subject: RE: [NMusers] OFV higher with FOCEI than FO
Matt:
That's not true. Those two references discuss when the linearized
structural model can also be derived from a direct Laplacian approximation
of the marginal likelihood. When there is an interaction between residual
and between-subject variability (i.e. the residual error model contains a
subject-specific random effect), linearizing the structural model around
eta_hat can no longer be derived from the Laplacian approximation. But in
NONMEM, FOCE with interaction (when the residual error model contains a
subject-specific random effect) is still derived from the Laplacian
approximation. In other words, NONMEM does not linearize the structural
model in the FOCE with interaction case. I discussed this in detail in my
paper (1). If you add the following S-PLUS code to the S-PLUS code in my
paper and use the simple numerical example, you can see how NONMEM
calculates the objective function for FOCE with interaction. These things
are further visualized in my talk recently put on the ACCP webpage
(http://www.accp1.org/pharmacometrics/PopPKCourse.html).
Yaning
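# Reproduce the NONMEM result using my equation 28, which is a further
# approximation of the Laplacian method.
# data, omega and eps are assumed to be defined already by the code in the paper.
ofv <- 0
for (i in 1:10) {
  data1 <- data[data$ID == i, ]
  # cov enters only through its log-determinant; its last term carries the
  # omega*eps product
  cov <- data1$fp %*% t(data1$fp) * omega + diag(data1$f**2) * eps +
    2 * data1$fp %*% t(data1$fp) * omega * eps
  # cov1 is the matrix inverted for the weighted residual term
  cov1 <- diag(data1$f**2) * eps
  ginv <- solve(cov1)
  # weighted residual sum of squares plus the eta penalty
  sec <- t(data1$DV - data1$IPRE) %*% ginv %*% (data1$DV - data1$IPRE) +
    data1$ETA1[1]**2 / omega
  # log-determinant term
  frs <- determinant(cov, logarithm = TRUE)$modulus[[1]]
  ofv <- ofv + sec + frs
}
ofv  # 39.45756, same as NONMEM OFV 39.458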
1. Yaning Wang. Derivation of various NONMEM estimation methods. Journal of
Pharmacokinetics and Pharmacodynamics 34:575-593 (2007).
Yaning Wang, Ph.D.
Team Leader, Pharmacometrics
Office of Clinical Pharmacology
Office of Translational Science
Center for Drug Evaluation and Research
U.S. Food and Drug Administration
Phone: 301-796-1624
Email: [EMAIL PROTECTED]
"The contents of this message are mine personally and do not necessarily
reflect any position of the Government or the Food and Drug Administration."
_____
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Matt Hutmacher
Sent: Wednesday, December 10, 2008 2:04 PM
To: 'Bob Leary'; [EMAIL PROTECTED];
[EMAIL PROTECTED]; [email protected]
Subject: RE: [NMusers] OFV higher with FOCEI than FO
Hi Bob,
I would just add one point of clarification. My understanding is that the
FOCE approximation is Laplace-based (that is, related to the Laplace
approximation) only if the within-subject residual error model does not
contain any subject-specific random effects.
Wolfinger R (1993). Laplace's approximation for nonlinear mixed models.
Biometrika 80, 791-795.
Vonesh ER, Chinchilli VM (1997). Linear and nonlinear models for the
analysis of repeated measurements. Marcel Dekker.
Matt