RE: Theoretical questions about Beal's M2 method
Dear Jun and NMusers,
I would like to make a correction to my email below. Just to make things
clearer, let me restate the way that NONMEM’s PHI function works: internally it
takes the absolute value of the argument (X) and computes the approximate area
of normal density curve from |X| to the positive infinity. Since the standard
normal density curve is symmetric around 0, the approximated value is actually
the same as the area from minus infinity to -|X|. However, regardless of
whether X is positive or negative, the PHI function returns the cdf.
Assuming f(t) is smaller than QL, the argument ((QL-f(t))/g(t)) is positive.
The SMALLER f(t) makes the distance from QL larger (assuming g(t) is same or
smaller) and so is the argument. Thus, the area between the argument and
positive infinity gets smaller and the cdf gets BIGGER. I missed the part that
the argument is not just f(t) but the difference from QL. Therefore, as Jun
pointed out, the smaller f(t), the higher the cdf. This makes sense because the
censored observation with smaller f(t) is more likely to be BQL than the one
with bigger f(t) (in M3, the likelihood for BQL observation is the probability
that the observation is BQL). For example, if the f(t) is same as QL, there is
a 50% chance that an observation is BQL. If f(t) is 0, the probability of being
BQL will be 1.
However, it may not necessarily be the case that the censored observation with
smaller f(t) will have more contribution to the objective function. For
uncensored observations, the closer f(t) to the observations is more favorable
as the algorithm tries to maximize the likelihood (actually minimize -2 log
likelihood). For censored observations, the smaller f(t) is more favorable as
it makes cdf bigger. The answer we will get from the software should be
somewhere in between where the likelihood for all the data (the product) is
maximized.
Thank you,
Jae Eun
Quoted reply history
________________________________
From: [email protected] [mailto:[email protected]] On
Behalf Of Ahn, Jae Eun
Sent: Monday, February 23, 2009 12:02 PM
To: Jun Shen
Cc: [email protected]
Subject: RE: [NMusers] Theoretical questions about Beal's M2 method
Dear Jun,
For your thought # 3, it is not quite true that “If we use cdf to represent the
contribution of censored observations, then the smaller concentrations will
have larger contribution to the objective function“. Assuming f(t) is smaller
than QL (which is usually the case for BQL data), the argument for the PHI
function becomes a positive number. When the approximated value for cdf is
returned, the final output is actually “1-phi” for a positive argument (note
that NONMEM takes the absolute value for the argument when it computes the
integral and returns the value according to the sign of the argument; see
below). So using your terms, the smaller the f(t), yes, the higher the phi();
however, the cdf will be smaller because it is actually 1-phi().
; below is a part of NONMEM’s PHI function.
ARG=X
IF (ARG.LT.0) PHI=PHITL(ARG)
IF (ARG.GT.0) PHI=ONE-PHITL(ARG)
IF (ARG.EQ.0) PHI=P5
Jae Eun Ahn
________________________________
From: [email protected] [mailto:[email protected]] On
Behalf Of Jun Shen
Sent: Sunday, February 22, 2009 8:09 PM
To: Sebastien Bihorel
Cc: Leonid Gibiansky; [email protected]
Subject: Re: [NMusers] Theoretical questions about Beal's M2 method
Sebastien,
I have been reading the BQL papers recently. Here is what I think.
1. M2 can be regarded as a truncated distribution method. You can look at
it this way. Any probability density function (pdf) MUST be integrated into a
cumulative distribution function (cdf) of 1 on its entire sample space, which
is negative infinity to positive infinity for a normal distribution. If the BQL
observations are discarded, the sample space is truncated from negative
infinity to QL. This results in a cdf less than 1. In order to bring the cdf
back to 1 on the new sample space (from QL to positive infinity), the pdf
function has to be corrected by ( 1- phi((QL-f(t)/g(t))). Wikipedia has an
exellent entry to explain the math
( http://en.wikipedia.org/wiki/Truncated_distribution)
2. The question I have is from M3 and M4 where the likelihood of an
uncencored observation is expressed in pdf and the likelihood of a cencored
observation is in cdf. Well, you can still combine everything together to get
an objective function. But what is the definition of likelihood here exactly,
pdf or cdf?
3. If we use cdf to represent the contribution of censored observations,
then the smaller concentrations will have larger contribution to the objective
function. For example, in M3 the likelihood of a censored observation is
l(t)=phi((QL-f(t))/sqrt(g(t))). The smaller the f(t), the higher the phi().
Does that make sense?
I believe the clear undersdanding on these theoretical concepts is just as
important as to know those "know-how". Appreciate more comments.
--
Jun Shen PhD
PK/PD Scientist
BioPharma Services
Millipore Corporation
15 Research Park Dr.
St Charles, MO 63304
Direct: 636-720-1589
On Mon, Feb 16, 2009 at 4:22 PM, Sebastien Bihorel
<[email protected]> wrote:
Thanks Leonid,
However, there is still one point that is unclear to me. You have demonstrated
that p(y | y > LLQ) = p(y) / p(y>LOQ), given the assumptions of the text. Now,
this is a discrete probability, while l(t) is a likelihood... How can one
mathematically demonstrate the expression of l(t) used by Dr. Beal starting
from the previous expression of p(y | y > LLQ)?
*Sebastien Bihorel, PharmD, PhD*
PKPD Scientist
Cognigen Corp
Email: [email protected]
<mailto:[email protected]>
Phone: (716) 633-3463 ext. 323
Leonid Gibiansky wrote:
You can view it as:
p(y ∩ y > LLQ) = 0 when y < LLQ
p(y ∩ y > LLQ) = p(y) when y > LLQ
Another way to look on this is to say that
p(y | y > LLQ) is proportional to p(y) and should integrate to 1
integral(p(y)) over y > LLQ is ( 1- phi((LLQ-f(t)/g(t))) that immediately
leads to l(t) below.
As to the 0 to 1 restriction, l(t) is the density, not probability. It should
integrate to one but can be smaller or greater than 1 (any positive number).
Leonid
--------------------------------------
Leonid Gibiansky, Ph.D.
President, QuantPharm LLC
web: www.quantpharm.com
e-mail: LGibiansky at quantpharm.com
tel: (301) 767 5566
Sebastien Bihorel wrote:
Dear colleagues,
In a paper dated from 2001, Dr. Beal presented several methods to handle data
below the quantification limit (Journal of Pharmacokinetics and
Pharmacodynamics, Vol. 28, No. 5, October 2001), including the M2 method that
can be implemented in NONMEM 6 via the YLO functionnality. I would like to
submit some questions to the list about the theory associated to the M2 method.
I quote:
"...the BQL observations can be discarded, and under the assumption that all
the D(t) [the distribution of residual errors] are normal, the method of
maximum conditional likelihood estimation can be applied to the remaining
observations (method M2). With this method, the likelihood for the data,
conditional on the fact that by design, all (remaining) observations are above
the QL, is maximized with respect to the model parameters. The density function
of the distribution on possible observations at time t, evaluated at y(t), is
1/sqrt( 2*pi*g(t) ))*exp( -0.5*( y(t)-f(t) )^2/g(t) ) and the probability that
an observation at time t is above the QL is 1- phi((QL-f(t)/g(t))), where phi
is the cumulative normal distribution function. Therefore, conditional on the
observation at time t being above QL, the likelihood for y(t) is the ratio:
l(t)=(1/sqrt( 2*pi*g(t) ))*exp( -0.5*( y(t)-f(t) )^2/g(t) ) /( 1-
phi((QL-f(t)/g(t))) [equation 1]"
Now, lets A and B be two events. The probability of A, given B is: p(A|B) =
p(A∩B) / p(B)
In the context of Dr. Beal's paper, I interpret A as simply the observation
y(t) and B as the fact that y(t) is above QL, and thus have the following
questions about equation 1:
- it looks like p(A∩B) in equation 1 simplifies to the probability of y(t)
given the model parameters, i.e. p(A). Which part of the problem allows this
simplification?
- how can l(t) be constrained between 0 and 1 if both numerator and denominator
can vary between 0 and 1?
Any comment from nmusers will be greatly appreciated.