NONMEM Bug - Noninfluential Correlated Eta

From: Stuart Beal Date: November 14, 2002 technical Source: cognigencorp.com
From:"S.Beal" Subject:NONMEM Bug - Noninfluential Correlated Eta Date: Thu, 14 Nov 2002 12:08:33 -0800 (PST) Nick Holford and Diane Mould have recently helped identify an interesting bug in NONMEM. It affects results obtained with conditional estimation only, and it has been present in all NONMEM releases since conditional estimation was introduced in 1992. It cannot be easily fixed in NONMEM Version V. It is identifiable in a fairly easily way, and chances are that if you had been affected by it, and paying attention, you would have noticed it. However, the circumstances in which it arises (see I below) are rather rare, and to date, Nick and Diane may well have been the only ones affected by this bug. This bug affects NONMEM results in two different ways: A) Population estimates obtained via conditional estimation are not correct. However, limited investigation and informal reasoning (see below) suggest that the discrepancies will not be large. Most analyses obtain population estimates using the First-Order estimation method, and because this method is not a conditional estimation method, such estimates are unaffected by the bug. B) Conditional on whatever values are used for the population parameters, estimates of individuals' parameters are not correct. So if, for example, the First-Order estimation method is used to obtain population estimates, but then the POSTHOC option is used to obtain estimates of individuals' parameters, these latter estimates are affected by the bug. Certain such estimates are quite noticably affected, allowing the bug to be clearly observed. Certain other such estimates are far less adversely affected. The discussion below contains: (I) a description of the circumstances in which the bug arises (II) a description of the bug's effects and how one might notice that there is indeed some problem (III) a description of a method with which one might revisit an analysis where this bug occurred and investigate to what extent the results were affected by the bug This method might also allow one to redo the entire analysis "correctly", and it can be used to more correctly implement future analyses where it is known that the bug will occur. I. Circumstances for the Bug to Appear There are three conditions, all of which must occur together. A. As indicated above, conditional estimation (including POSTHOC estimation) is used. B. The model contains an eta whose value for some individual has no influence on the individual's data. For example, suppose that PK and PD responses are being fit simultaneously, that the PD model is an EMAX model with an eta (eta1) on EMAX, but no PD responses are available from some individual. Then the value of eta1 for the individual in question has no influence on the individual's data. We shall call the eta "a noninfluential eta" (for the data from the given individual). With most studies where multiple types of response are observed, all types are observed with all patients, and so a noninfluential eta does not exist. C. The noninfluential eta is correlated with another eta that does influence the data. That is, the structure of OMEGA matrix must be specified such that a nonzero covariance element between the two eta's exists. We shall call such a noninfluential eta "a correlated noninfluential eta". Com- monly, but by no means invariably, population modelers have constrained the OMEGA matrix so that it has only zero-valued covariance elements. But even when a noninfluential eta is correlated with another eta, usually this other eta is also noninfluential; e.g. etas affecting PK are usually not taken to be correlated with etas affecting PD. II. The Bug's Effects If there are too few data available from an individual, an estimate of the vector of etas variables pertaining to the individual cannot be obtained from these data alone. So in general, along with the individual's data, a multivariate normal prior distribution on the eta vector is used to further constrain the estimate. The parameters of this distribution are the popu- lation parameters: the mean (which is the same as the mode) and variance- covariance matrix of the distribution are taken to be the 0 vector and the OMEGA matrix, respectively. Then the estimate of the eta vector is taken to be the mode of the posterior Bayes distribution, given the individual's data. If an eta in noninfluential, so that there are no data at all which bear directly on its value, then its estimate will depend heavily on the prior constraint. The estimate is identical to the corresponding element of the mode of the conditional prior distribution, given that the values of the influential etas are equal to their estimates. If the eta is nonin- fluential and noncorrelated, the conditioning has no effect, and the esti- mate is simply 0. If the eta is a correlated noninfluential eta, the correlation affects the conditioning in such a way that estimate is gen- erally different from 0. A scatterplot between the estimates of a corre- lated noninfluential eta and those of an eta with which it is correlated should reflect the correlation. The effect of the bug is that this is not actually what happens. If an eta is used in the control stream, which, for some individual is a correlated noninfluential eta, then the estimate of the eta vector for this individual is not exactly the posterior mode; it lies closer to the 0 vector than it should. (Of course, if there are no actual observations for the indivi- dual, then the posterior mode is in fact 0.) The error in the estimate of an influential eta is not likely to be large. As it turns out, for the purpose of computing the estimate, a value of OMEGA is taken which is too small, and this is what leads to the incorrect estimate. It may be seen analytically that this results in a relative error in the expected value of the estimate which, as the amount of data from the individual increases, decreases inversely proportional to this amount. Also, because the incorrect estimate lies between the correct estimate and 0, the relative error cannot be larger than 100%. It is pos- sible that as the amount of data decreases, this bound is approached, but at the same time the correct estimate itself becomes small. On the other hand, with the bug the estimate of a correlated noninfluential eta is always exactly 0 (even if it very highly correlated with an influen- tial eta whose estimate is far from 0). If one displays the values for this eta in a table or scatterplot, one should see this spurious zero value. More generally, consider a parameter modeled in terms of eta variables. For a given individual, we usually call the value of the parameter, the "individual's parameter" (e.g. the "individual's CL"). To estimate this value, the estimate of the eta vector is substituted into the model for the parameter. The typical individual is the (hypothetical) individual whose eta vector is 0, and the parameter for this individual is obtained by sub- stituting the 0 vector into the model for the parameter; the resulting value is regarded as a population parameter. If an eta is used in the con- trol stream, which, for some individual is a correlated noninfluential eta, then the estimate of the individual's parameter lies closer to the parame- ter for the typical individual than it should. If the parameter itself is modeled with only influential etas, the discrepancy is likely not to be much. If it is modeled with a noninfluential correlated eta, the discrepancy can be much greater. If it is modeled only with noninfluential correlated etas, the estimate is exactly the typical individual's parame- ter. Using the First-Order estimation method, estimates of the population param- eters THETA, OMEGA (and SIGMA) are obtained independently from estimates of the eta vector. Using conditional estimation methods, estimates of these parameters depend on the estimates of the eta vector from each individual. (Of course, an estimate of the eta vector itself depends on values for the population parameters; see NONMEM Users Guide VII for how this circularity is handled.) If the latter estimates are not quite correct, then the former estimates are also not quite correct. (Of course, only estimates of influential elements of the eta vector affect the estimates of the popula- tion parameters.) However, compare the First-Order estimation method (FO) with the First-Order Conditional Estimation method (FOCE). Both methods may be viewed as using the same general objective function, which involves values for eta vectors for all individuals, but each method uses different values (again; see NONMEM Users Guide VII). FO uses the 0 vector for all of these values, and FOCE, in an attempt to reduce bias, uses the posterior modes (described above). If, due to the bug, the values actually used are closer to 0 than they should be, then FOCE is simply "more FO-like" than it should be. In this case FOCE may not be entirely as effective in reducing bias as one hopes it will be, but at worst, its estimates are probably acceptable in the same way that FO estimates often are. When the correct FOCE results are already close to the FO results, as when the amounts of data from all but an insignificant fraction of individuals are small, what discrepancies exist due to the bug must be small. The correct FOCE results can differ substantially from FO results only when there are at least moderate amounts of data from a significant fraction of individuals, and so it is only in this circumstance that FOCE is really helpful. As explained above, discrepancies in the computation of the pos- terior modes become small as the amounts of data from the individuals increase, and therefore, so do the discrepancies in the computation of the FOCE objective function. Therefore, it is unlikely that descrepancies in FOCE estimates will be large. Generally, it will not be possible to spot the bug by examining only popu- lation estimates. One will need to look closely at estimates of individual parameters or of etas, as explained above. III. Zeta Transformation There is a way to rewrite a model that has used etas, so that all etas become influential for every individual, and thus the bug will not occur. However, this technique is somewhat unwieldy, and one would not want to use it as a matter of course. I suggest that it be tried only when it is desired to check the effects of the bug on an old analysis, or to avoid the bug in a future analysis where one knows the bug will arise. Consider a subset of n etas whose variance-covariance matrix is a diagonal block of OMEGA with nonzero covariance elements - i.e. the initial esti- mates of the elements of the matrix is given by an $OMEGA record with option BLOCK(n) - and which includes a correlated noninfluential eta. Call such a subset a "subset of concern". Such a subset might consist of all etas in the model (and then there is only a single $OMEGA record). How- ever, by definition, a subset of size n=1 cannot be "of concern". Let the n etas be renamed zetas, and consider a new set of etas, related to the zetas as follows: zeta1 = eta1 + (eta2 + eta3 + ... + etan)/2 zeta2 = eta2 + (eta1 + eta3 + ... + etan)/2 . . . zetan = etan + (eta1 + eta2 + ... + etam)/2 where m=n-1. If an individual's data are influenced by at least one zeta, then because all zetas depend on all etas, it follows that there can only be influential etas for the individual's data, and so there cannot be a correlated nonin- fluential eta for these data. If an individual's data are not influenced by any zeta, then these data are also not influenced by *any* eta, and by definition, there can be no *correlated* noninfluential eta. Therefore, in both cases the bug will not be active. (NONMEM accepts a model such that the data from some individuals are not influenced by any etas in the model.) We call the above transformation of new etas to old etas, the zeta transformation. This transformation should be applied separately to all subsets of concern. So basically, the NONMEM user need only take abbreviated code such as the following, CL =THETA(1)*EXP(ETA(1)) V =THETA(2)*EXP(ETA(2)) KA =THETA(3)*EXP(ETA(3)) ALAG=THETA(4)+EXP(ETA(4)) and assuming that {eta1, eta2, eta3} contains a correlated noninfluential eta and that the variance-covariance matrix of these etas is a diagonal block of OMEGA, replace this code with: ZETA1=ETA(1)+HALF*(ETA(2)+ETA(3)) ZETA2=ETA(2)+HALF*(ETA(1)+ETA(3)) ZETA3=ETA(3)+HALF*(ETA(1)+ETA(2)) CL=THETA(1)*EXP(ZETA1) V =THETA(2)*EXP(ZETA2) KA=THETA(3)*EXP(ETA(3)) ALAG=THETA(4)+EXP(ETA(4)) In the original code, we have used an eta on the ALAG parameter simply for illustrative purposes. Most often it is not possible to obtain an estimate of interindividual variability in the absorption lag parameter. Note that if indeed, the only etas are the four shown above, then because the subset {eta4} is a set of size 1, it cannot be of concern. It is easy to implement the type of abbreviated code illustrated above. However, there is further complication. The initial estimates given for the diagonal block of OMEGA, i.e. in the $OMEGA BLOCK(n) record, must now be those for the new etas. How will they be obtained? Using matrix notation, the new etas are given by eta = A zeta where Aij= -1/(1+(n-1)/2) if i!=j = (2+(n-2))/(1+(n-1)/2) if i=j If OMO is the original matrix of initial estimates, then (A OMO A) is the new matrix of initial estimates. Here is a FORTRAN program that will produce (A OMO A). It will prompt the user to key in the dimension of OMO, and then OMO itself. The program will output the new matrix of initial estimates. double precision omega(10,10),od,d,s,b(10,10),a(10,10) write (6,*) 'input dimension of diagonal block' read (5,*) n write (6,*) 'input diagonal block itself in lower-triangular form' read (5,*) ((omega(i,j),j=1,i),i=1,n) od=-1/(1+.5*(n-1)) d=(2+(n-2))/(1+.5*(n-1)) do 5 j=1,n do 5 i=1,n if (i.eq.j) then a(i,j)=d else a(i,j)=od endif 5 continue do 10 j=1,n do 10 i=1,j 10 omega(i,j)=omega(j,i) do 20 j=1,n do 20 i=1,j s=0. do 15 k=1,n do 15 l=1,n 15 s=s+a(i,k)*omega(k,l)*a(j,l) b(i,j)=s 20 b(j,i)=s write (6,*) n,((b(i,j),j=1,i),i=1,n) end If OMO is, e.g. .1 .01 .01 .01 .1 .01 .01 .01 .1 then the lower triangle is .1 .01 .1 .01 .01 .1 and one keys in: .1 .01 .1 .01 .01 .1 This program would be used for all subsets of concern. Then, after running NONMEM and obtaining final population estimates, one needs to obtain the final estimate for the diagonal block of the original OMEGA matrix. We have zeta = B eta where Bij= . 5 if i!=j = 1.0 if i=j If OMN is the final estimate of the new diagonal block, then (B OMN B) is the final estimate of the original diagonal block. Here is a FORTRAN program that will produce (B OMN B). It will prompt the user to key in the dimension of OMN, and then OMN itself. The program will output the final estimate of the original diagonal block. double precision omega(10,10),od,d,s,b(10,10),a(10,10) write (6,*) 'input dimension of diagonal block' read (5,*) n write (6,*) 'input diagonal block itself in lower-triangular form' read (5,*) ((omega(i,j),j=1,i),i=1,n) od=.5 d=1. do 5 j=1,n do 5 i=1,n if (i.eq.j) then a(i,j)=d else a(i,j)=od endif 5 continue do 10 j=1,n do 10 i=1,j 10 omega(i,j)=omega(j,i) do 20 j=1,n do 20 i=1,j s=0. do 15 k=1,n do 15 l=1,n 15 s=s+a(i,k)*omega(k,l)*a(j,l) b(i,j)=s 20 b(j,i)=s write (6,*) n,((b(i,j),j=1,i),i=1,n) end This program would be used for all subsets of concern. ______________________________________________________________________