RE: Funny behaviour with MCETA>1 and parallel computation
Paolo,
Just to confuse matters a little further, it should be borne in mind that the
function that is optimized to get the ‘optimal’ ETA value is the joint
likelihood, while the individual contribution to the overall OBJ function is
based on the marginal likelihood. The marginal likelihood integrates the joint
likelihood over eta space, and in the FOCE approximation, the marginal
likelihood is a function of both the maximum value of the objective function
used to find eta and the FOCE approximation to the Hessian at this optimal eta.
Thus it is perfectly possible that if you run the eta optimization from two
different starting points and get different results ETA1 and ETA2, with ETA2
being better in the sense of having a better joint likelihood objective
function than ETA1, ETA1 may still have the better overall FOCE marginal
likelihood. Also, in this case, where there are apparently multiple optima in
the joint likelihood function, the FOCE approximation itself is extremely
dubious. You might want to try one of the EM methods here.
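
As a rough illustration of the point above, here is a toy sketch in Python. It
is not NONMEM's code: the one-dimensional joint density, the pattern-search
optimizer, the starting points and all numbers are made up, and the exact
curvature (a Laplace-style approximation) stands in for the FOCE approximation
to the Hessian. It only shows that the mode with the better joint objective can
have the worse approximate marginal objective.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)

def joint_density(eta):
    # Hypothetical joint density: a broad mode near 0 and a narrow, taller mode near 3.
    broad = 0.9 * math.exp(-0.5 * eta ** 2) / SQRT_2PI
    narrow = 0.1 * math.exp(-0.5 * ((eta - 3.0) / 0.05) ** 2) / (0.05 * SQRT_2PI)
    return broad + narrow

def joint_objective(eta):
    # -2 log(joint likelihood): the quantity the eta search minimizes in this sketch.
    return -2.0 * math.log(joint_density(eta))

def local_minimum(f, eta, step=0.1, tol=1e-9):
    # Simple derivative-free descent: move downhill, halve the step when stuck.
    while step > tol:
        if f(eta + step) < f(eta):
            eta += step
        elif f(eta - step) < f(eta):
            eta -= step
        else:
            step /= 2.0
    return eta

def laplace_objective(eta_hat, h=1e-4):
    # -2 log of the Laplace approximation to the integral of the joint density:
    # joint objective at the mode + log(curvature of -log joint) - log(2*pi).
    curvature = (joint_objective(eta_hat + h) - 2.0 * joint_objective(eta_hat)
                 + joint_objective(eta_hat - h)) / (2.0 * h * h)
    return joint_objective(eta_hat) + math.log(curvature) - math.log(2.0 * math.pi)

# Two starting points that lead to two different local optima.
eta1 = local_minimum(joint_objective, 0.5)
eta2 = local_minimum(joint_objective, 2.9)

for name, eta in (("ETA1", eta1), ("ETA2", eta2)):
    print(name, round(eta, 4),
          "joint:", round(joint_objective(eta), 3),
          "approx. marginal:", round(laplace_objective(eta), 3))
```

In this toy case ETA2 sits on the narrow mode: its joint objective is lower,
but the large curvature there penalizes the approximate marginal objective, so
ETA1 ends up with the better value.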
Quoted reply history
From: [email protected] [mailto:[email protected]] On
Behalf Of Bauer, Robert
Sent: Wednesday, July 30, 2014 11:51 AM
To: Paolo Denti; nmusers
Subject: RE: [NMusers] Funny behaviour with MCETA>1 and parallel computation
Paolo:
This may not be a matter of MCETA; it might have something to do with that
particular individual’s data plus your model, and whether there is some unusual
evaluation that can accidentally occur, causing the optimization for that
subject to fail. Would you mind sharing with me your control stream file and
data set, so that I might give it a try?
Robert J. Bauer, Ph.D.
Vice President, Pharmacometrics, R&D
ICON Development Solutions
7740 Milestone Parkway
Suite 150
Hanover, MD 21076
Tel: (215) 616-6428
Mob: (925) 286-0769
Email: [email protected]<mailto:[email protected]>
Web: http://www.iconplc.com/
From: Paolo Denti [mailto:[email protected]]
Sent: Wednesday, July 30, 2014 3:54 AM
To: Bauer, Robert; nmusers
Subject: Re: [NMusers] Funny behaviour with MCETA>1 and parallel computation
Hi Bob,
thanks for the prompt response and suggestions.
I have tried using RANMETHOD=P and increasing MCETA to 1000, but
unfortunately without success.
Even using MCETA=1000, the result is the same: when I use the parallel
computation feature, it performs worse than using MCETA=0. With one processor,
things work as expected.
Maybe I did not explain the situation well. The issue is not that the
optimisation takes a slightly different path and reaches a different
minimum. That would not surprise me, as I know different rounding and other
random factors can influence that.
The problem here is that when using parallel computation, even on the first
iteration (using MAXEVAL=0), MCETA>1 gives a worse OFV than MCETA=0. This still
does not make any sense to me, irrespective of random number generators,
numerical approximation, etc. My understanding is that, in each individual,
NONMEM will try 0 and other initial estimates to find the optimal ETAs, and
then it will choose the solution giving the lowest individual OFV. Even if this
is done with different seeds and on different CPUs, they will all try 0, so
whatever MCETA=0 gives should be an upper bound for the OFV for that
individual. Then all these individual OFVs are summed together to find the
total OFV. Since NONMEM is trying 0 in each subject, plus other random values
that may vary, it should at least be able to use those results. In each
individual it can only do better by trying extra values, and if all the
individual OFVs are lower than or at worst the same as the ones provided by
MCETA=0, then the total can only be better.
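
The upper-bound argument above can be sketched in Python. This is only an
illustration under stated assumptions, not a description of what NONMEM does
internally: the objective function, the optimizer, the seed and the random
starting values are all hypothetical, and the sketch assumes that the quantity
used to pick the best candidate is the same quantity that is reported as the
individual OFV.

```python
import random

def objective(eta):
    # Hypothetical individual objective with two local minima (near -1 and near 2).
    return (eta + 1.0) ** 2 * (eta - 2.0) ** 2 - 0.5 * eta

def local_minimum(f, eta, step=0.1, tol=1e-9):
    # Simple derivative-free descent: move downhill, halve the step when stuck.
    while step > tol:
        if f(eta + step) < f(eta):
            eta += step
        elif f(eta - step) < f(eta):
            eta -= step
        else:
            step /= 2.0
    return eta

def best_of_starts(f, starts):
    # Optimize from every starting value and keep the candidate with the lowest f.
    candidates = [local_minimum(f, start) for start in starts]
    best = min(candidates, key=f)
    return best, f(best)

rng = random.Random(12345)  # arbitrary seed, for reproducibility only
extra_starts = [rng.gauss(0.0, 2.0) for _ in range(20)]

# Start at 0 only, versus 0 plus extra random starting values.
eta_single, ofv_single = best_of_starts(objective, [0.0])
eta_multi, ofv_multi = best_of_starts(objective, [0.0] + extra_starts)

print("start at 0 only:  eta =", round(eta_single, 4), "objective =", round(ofv_single, 4))
print("0 + extra starts: eta =", round(eta_multi, 4), "objective =", round(ofv_multi, 4))
```

Because 0 is always among the starting values and the same function is used
both to select and to report, the multi-start result cannot be worse than the
single-start result in this sketch. If the selection were made on a different
quantity than the one reported, that guarantee would no longer hold.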
Am I missing something? Is NONMEM maybe minimising only some other individual
likelihood in the MAP step, and that does not coincide with the individual OFV?
To better understand, I have looked at the values of individual OFVs in the two
runs (MCETA=1000 and MCETA=0, both with parallel computation) and all the tables
are exactly the same cell by cell (to the 5th digit or so), except for the
records of that outlying individual. In spite of trying 1000 initial estimates
(including 0), MCETA=1000 gets the individual ETA for that subject wrong, and
gives a worse iOFV than MCETA=0. And it's not a matter of numerical noise: the
individual OFV is 100 points worse and the individual parameters are very
different.
MCETA=0 is a sub-case of MCETA>1, so MCETA>1 should not do any worse. I have
tried and re-tried, and it is unlikely that the estimator was unlucky every
time with that subject, even trying 1000 initial estimates.
I don't know how to explain this. But maybe I am misunderstanding how this
works?
Any explanation?
Thank you,
Paolo