NONMEM Bug - Noninfluential Correlated Eta
From:"S.Beal"
Subject:NONMEM Bug - Noninfluential Correlated Eta
Date: Thu, 14 Nov 2002 12:08:33 -0800 (PST)
Nick Holford and Diane Mould have recently helped identify an interesting
bug in NONMEM. It affects results obtained with conditional estimation
only, and it has been present in all NONMEM releases since conditional
estimation was introduced in 1992. It cannot be easily fixed in NONMEM
Version V. It is identifiable in a fairly easily way, and chances are that
if you had been affected by it, and paying attention, you would have
noticed it. However, the circumstances in which it arises (see I below)
are rather rare, and to date, Nick and Diane may well have been the only
ones affected by this bug.
This bug affects NONMEM results in two different ways:
A) Population estimates obtained via conditional estimation are not
correct. However, limited investigation and informal reasoning (see below)
suggest that the discrepancies will not be large. Most analyses obtain
population estimates using the First-Order estimation method, and because
this method is not a conditional estimation method, such estimates are
unaffected by the bug.
B) Conditional on whatever values are used for the population parameters,
estimates of individuals' parameters are not correct. So if, for example,
the First-Order estimation method is used to obtain population estimates,
but then the POSTHOC option is used to obtain estimates of individuals'
parameters, these latter estimates are affected by the bug. Certain such
estimates are quite noticably affected, allowing the bug to be clearly
observed. Certain other such estimates are far less adversely affected.
The discussion below contains:
(I) a description of the circumstances in which the bug arises
(II) a description of the bug's effects and how one might notice that
there is indeed some problem
(III) a description of a method with which one might revisit an analysis
where this bug occurred and investigate to what extent the results were
affected by the bug
This method might also allow one to redo the entire analysis "correctly",
and it can be used to more correctly implement future analyses where it
is known that the bug will occur.
I. Circumstances for the Bug to Appear
There are three conditions, all of which must occur together.
A.
As indicated above, conditional estimation (including POSTHOC estimation)
is used.
B.
The model contains an eta whose value for some individual has no influence
on the individual's data. For example, suppose that PK and PD responses
are being fit simultaneously, that the PD model is an EMAX model with an
eta (eta1) on EMAX, but no PD responses are available from some individual.
Then the value of eta1 for the individual in question has no influence on
the individual's data. We shall call the eta "a noninfluential eta" (for
the data from the given individual). With most studies where multiple
types of response are observed, all types are observed with all patients,
and so a noninfluential eta does not exist.
C.
The noninfluential eta is correlated with another eta that does influence
the data. That is, the structure of OMEGA matrix must be specified such
that a nonzero covariance element between the two eta's exists. We shall
call such a noninfluential eta "a correlated noninfluential eta". Com-
monly, but by no means invariably, population modelers have constrained the
OMEGA matrix so that it has only zero-valued covariance elements. But
even when a noninfluential eta is correlated with another eta, usually this
other eta is also noninfluential; e.g. etas affecting PK are usually not
taken to be correlated with etas affecting PD.
II. The Bug's Effects
If there are too few data available from an individual, an estimate of the
vector of etas variables pertaining to the individual cannot be obtained
from these data alone. So in general, along with the individual's data, a
multivariate normal prior distribution on the eta vector is used to further
constrain the estimate. The parameters of this distribution are the popu-
lation parameters: the mean (which is the same as the mode) and variance-
covariance matrix of the distribution are taken to be the 0 vector and the
OMEGA matrix, respectively. Then the estimate of the eta vector is taken
to be the mode of the posterior Bayes distribution, given the individual's
data. If an eta in noninfluential, so that there are no data at all which
bear directly on its value, then its estimate will depend heavily on the
prior constraint. The estimate is identical to the corresponding element
of the mode of the conditional prior distribution, given that the values of
the influential etas are equal to their estimates. If the eta is nonin-
fluential and noncorrelated, the conditioning has no effect, and the esti-
mate is simply 0. If the eta is a correlated noninfluential eta, the
correlation affects the conditioning in such a way that estimate is gen-
erally different from 0. A scatterplot between the estimates of a corre-
lated noninfluential eta and those of an eta with which it is correlated
should reflect the correlation.
The effect of the bug is that this is not actually what happens. If an eta
is used in the control stream, which, for some individual is a correlated
noninfluential eta, then the estimate of the eta vector for this individual
is not exactly the posterior mode; it lies closer to the 0 vector than it
should. (Of course, if there are no actual observations for the indivi-
dual, then the posterior mode is in fact 0.)
The error in the estimate of an influential eta is not likely to be large.
As it turns out, for the purpose of computing the estimate, a value of
OMEGA is taken which is too small, and this is what leads to the incorrect
estimate. It may be seen analytically that this results in a relative
error in the expected value of the estimate which, as the amount of data
from the individual increases, decreases inversely proportional to this
amount. Also, because the incorrect estimate lies between the correct
estimate and 0, the relative error cannot be larger than 100%. It is pos-
sible that as the amount of data decreases, this bound is approached, but
at the same time the correct estimate itself becomes small.
On the other hand, with the bug the estimate of a correlated noninfluential
eta is always exactly 0 (even if it very highly correlated with an influen-
tial eta whose estimate is far from 0). If one displays the values for
this eta in a table or scatterplot, one should see this spurious zero
value.
More generally, consider a parameter modeled in terms of eta variables.
For a given individual, we usually call the value of the parameter, the
"individual's parameter" (e.g. the "individual's CL"). To estimate this
value, the estimate of the eta vector is substituted into the model for the
parameter. The typical individual is the (hypothetical) individual whose
eta vector is 0, and the parameter for this individual is obtained by sub-
stituting the 0 vector into the model for the parameter; the resulting
value is regarded as a population parameter. If an eta is used in the con-
trol stream, which, for some individual is a correlated noninfluential eta,
then the estimate of the individual's parameter lies closer to the parame-
ter for the typical individual than it should. If the parameter itself is
modeled with only influential etas, the discrepancy is likely not to be
much. If it is modeled with a noninfluential correlated eta, the
discrepancy can be much greater. If it is modeled only with noninfluential
correlated etas, the estimate is exactly the typical individual's parame-
ter.
Using the First-Order estimation method, estimates of the population param-
eters THETA, OMEGA (and SIGMA) are obtained independently from estimates of
the eta vector. Using conditional estimation methods, estimates of these
parameters depend on the estimates of the eta vector from each individual.
(Of course, an estimate of the eta vector itself depends on values for the
population parameters; see NONMEM Users Guide VII for how this circularity
is handled.) If the latter estimates are not quite correct, then the
former estimates are also not quite correct. (Of course, only estimates of
influential elements of the eta vector affect the estimates of the popula-
tion parameters.) However, compare the First-Order estimation method (FO)
with the First-Order Conditional Estimation method (FOCE). Both methods
may be viewed as using the same general objective function, which involves
values for eta vectors for all individuals, but each method uses different
values (again; see NONMEM Users Guide VII). FO uses the 0 vector for all
of these values, and FOCE, in an attempt to reduce bias, uses the posterior
modes (described above). If, due to the bug, the values actually used are
closer to 0 than they should be, then FOCE is simply "more FO-like" than it
should be. In this case FOCE may not be entirely as effective in reducing
bias as one hopes it will be, but at worst, its estimates are probably
acceptable in the same way that FO estimates often are.
When the correct FOCE results are already close to the FO results, as when
the amounts of data from all but an insignificant fraction of individuals
are small, what discrepancies exist due to the bug must be small. The
correct FOCE results can differ substantially from FO results only when
there are at least moderate amounts of data from a significant fraction of
individuals, and so it is only in this circumstance that FOCE is really
helpful. As explained above, discrepancies in the computation of the pos-
terior modes become small as the amounts of data from the individuals
increase, and therefore, so do the discrepancies in the computation of the
FOCE objective function. Therefore, it is unlikely that descrepancies in
FOCE estimates will be large.
Generally, it will not be possible to spot the bug by examining only popu-
lation estimates. One will need to look closely at estimates of individual
parameters or of etas, as explained above.
III. Zeta Transformation
There is a way to rewrite a model that has used etas, so that all etas
become influential for every individual, and thus the bug will not occur.
However, this technique is somewhat unwieldy, and one would not want to use
it as a matter of course. I suggest that it be tried only when it is
desired to check the effects of the bug on an old analysis, or to avoid the
bug in a future analysis where one knows the bug will arise.
Consider a subset of n etas whose variance-covariance matrix is a diagonal
block of OMEGA with nonzero covariance elements - i.e. the initial esti-
mates of the elements of the matrix is given by an $OMEGA record with
option BLOCK(n) - and which includes a correlated noninfluential eta. Call
such a subset a "subset of concern". Such a subset might consist of all
etas in the model (and then there is only a single $OMEGA record). How-
ever, by definition, a subset of size n=1 cannot be "of concern".
Let the n etas be renamed zetas, and consider a new set of etas, related to
the zetas as follows:
zeta1 = eta1 + (eta2 + eta3 + ... + etan)/2
zeta2 = eta2 + (eta1 + eta3 + ... + etan)/2
.
.
.
zetan = etan + (eta1 + eta2 + ... + etam)/2
where m=n-1.
If an individual's data are influenced by at least one zeta, then because
all zetas depend on all etas, it follows that there can only be influential
etas for the individual's data, and so there cannot be a correlated nonin-
fluential eta for these data. If an individual's data are not influenced
by any zeta, then these data are also not influenced by *any* eta, and by
definition, there can be no *correlated* noninfluential eta. Therefore, in
both cases the bug will not be active. (NONMEM accepts a model such that
the data from some individuals are not influenced by any etas in the
model.)
We call the above transformation of new etas to old etas, the zeta
transformation. This transformation should be applied separately to all
subsets of concern.
So basically, the NONMEM user need only take abbreviated code such as the
following,
CL =THETA(1)*EXP(ETA(1))
V =THETA(2)*EXP(ETA(2))
KA =THETA(3)*EXP(ETA(3))
ALAG=THETA(4)+EXP(ETA(4))
and assuming that {eta1, eta2, eta3} contains a correlated noninfluential
eta and that the variance-covariance matrix of these etas is a diagonal
block of OMEGA, replace this code with:
ZETA1=ETA(1)+HALF*(ETA(2)+ETA(3))
ZETA2=ETA(2)+HALF*(ETA(1)+ETA(3))
ZETA3=ETA(3)+HALF*(ETA(1)+ETA(2))
CL=THETA(1)*EXP(ZETA1)
V =THETA(2)*EXP(ZETA2)
KA=THETA(3)*EXP(ETA(3))
ALAG=THETA(4)+EXP(ETA(4))
In the original code, we have used an eta on the ALAG parameter simply for
illustrative purposes. Most often it is not possible to obtain an estimate
of interindividual variability in the absorption lag parameter. Note that
if indeed, the only etas are the four shown above, then because the subset
{eta4} is a set of size 1, it cannot be of concern.
It is easy to implement the type of abbreviated code illustrated above.
However, there is further complication. The initial estimates given for
the diagonal block of OMEGA, i.e. in the $OMEGA BLOCK(n) record, must now
be those for the new etas. How will they be obtained?
Using matrix notation, the new etas are given by
eta = A zeta
where
Aij= -1/(1+(n-1)/2) if i!=j
= (2+(n-2))/(1+(n-1)/2) if i=j
If OMO is the original matrix of initial estimates, then (A OMO A) is the
new matrix of initial estimates.
Here is a FORTRAN program that will produce (A OMO A). It will prompt the
user to key in the dimension of OMO, and then OMO itself. The program will
output the new matrix of initial estimates.
double precision omega(10,10),od,d,s,b(10,10),a(10,10)
write (6,*) 'input dimension of diagonal block'
read (5,*) n
write (6,*) 'input diagonal block itself in lower-triangular form'
read (5,*) ((omega(i,j),j=1,i),i=1,n)
od=-1/(1+.5*(n-1))
d=(2+(n-2))/(1+.5*(n-1))
do 5 j=1,n
do 5 i=1,n
if (i.eq.j) then
a(i,j)=d
else
a(i,j)=od
endif
5 continue
do 10 j=1,n
do 10 i=1,j
10 omega(i,j)=omega(j,i)
do 20 j=1,n
do 20 i=1,j
s=0.
do 15 k=1,n
do 15 l=1,n
15 s=s+a(i,k)*omega(k,l)*a(j,l)
b(i,j)=s
20 b(j,i)=s
write (6,*) n,((b(i,j),j=1,i),i=1,n)
end
If OMO is, e.g.
.1 .01 .01
.01 .1 .01
.01 .01 .1
then the lower triangle is
.1
.01 .1
.01 .01 .1
and one keys in: .1 .01 .1 .01 .01 .1
This program would be used for all subsets of concern.
Then, after running NONMEM and obtaining final population estimates, one
needs to obtain the final estimate for the diagonal block of the original
OMEGA matrix. We have
zeta = B eta
where
Bij= . 5 if i!=j
= 1.0 if i=j
If OMN is the final estimate of the new diagonal block, then (B OMN B) is
the final estimate of the original diagonal block.
Here is a FORTRAN program that will produce (B OMN B). It will prompt the
user to key in the dimension of OMN, and then OMN itself. The program will
output the final estimate of the original diagonal block.
double precision omega(10,10),od,d,s,b(10,10),a(10,10)
write (6,*) 'input dimension of diagonal block'
read (5,*) n
write (6,*) 'input diagonal block itself in lower-triangular form'
read (5,*) ((omega(i,j),j=1,i),i=1,n)
od=.5
d=1.
do 5 j=1,n
do 5 i=1,n
if (i.eq.j) then
a(i,j)=d
else
a(i,j)=od
endif
5 continue
do 10 j=1,n
do 10 i=1,j
10 omega(i,j)=omega(j,i)
do 20 j=1,n
do 20 i=1,j
s=0.
do 15 k=1,n
do 15 l=1,n
15 s=s+a(i,k)*omega(k,l)*a(j,l)
b(i,j)=s
20 b(j,i)=s
write (6,*) n,((b(i,j),j=1,i),i=1,n)
end
This program would be used for all subsets of concern.
______________________________________________________________________