RE: Funny behaviour with MCETA>1 and parallel computation

From: Bob Leary
Date: July 30, 2014
Source: mail-archive.com
Paolo,

Just to confuse matters a little further, it should be borne in mind that the function that is optimized to get the ‘optimal’ ETA value is the joint likelihood, while the individual contribution to the overall OBJ function is based on the marginal likelihood. The marginal likelihood integrates the joint likelihood over eta space, and in the FOCE approximation, the marginal likelihood is a function of both the maximum value of the objective function used to find eta and the FOCE approximation to the Hessian at this optimal eta.

Thus it is perfectly possible that if you run the eta optimization from two different starting points and get different results ETA1 and ETA2, with ETA2 being better in the sense of having a better joint likelihood objective function than ETA1, ETA1 may still have the better overall FOCE marginal likelihood.

Also, in this case, where there are apparently multiple optima in the joint likelihood function, the FOCE approximation itself is extremely dubious. You might want to try one of the EM methods here.
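The ETA1/ETA2 point can be illustrated with a toy one-dimensional example. The sketch below is not NONMEM code: it assumes a hypothetical bimodal joint likelihood (`joint`), uses a plain Laplace approximation as a stand-in for the FOCE treatment of the Hessian, and all function names are invented for illustration.

```python
import math

def joint(eta):
    """Hypothetical unnormalised joint likelihood: a broad low peak plus a narrow high peak."""
    broad = 1.0 * math.exp(-((eta + 2.0) ** 2) / (2.0 * 1.0 ** 2))
    narrow = 2.0 * math.exp(-((eta - 3.0) ** 2) / (2.0 * 0.1 ** 2))
    return broad + narrow

def neg_log_joint(eta):
    """Function minimised in the eta search (assumed form: -log of the joint likelihood)."""
    return -math.log(joint(eta))

def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def find_mode(f, start, iters=50, h=1e-4):
    """Local search: Newton iterations with finite-difference derivatives."""
    x = start
    for _ in range(iters):
        grad = (f(x + h) - f(x - h)) / (2.0 * h)
        x -= grad / second_derivative(f, x, h)
    return x

def laplace_minus2ll(f, eta_hat):
    """-2 log of the 1-D Laplace approximation to the integral of exp(-f)."""
    hess = second_derivative(f, eta_hat)
    return 2.0 * f(eta_hat) + math.log(hess) - math.log(2.0 * math.pi)

# Two starting points that end up in different local optima.
eta1 = find_mode(neg_log_joint, -1.5)   # broad, low peak
eta2 = find_mode(neg_log_joint, 2.95)   # narrow, high peak

for name, eta in (("ETA1", eta1), ("ETA2", eta2)):
    print(name, round(eta, 3),
          "joint -2LL:", round(2.0 * neg_log_joint(eta), 3),
          "approx. marginal -2LL:", round(laplace_minus2ll(neg_log_joint, eta), 3))
```

Under these assumptions, the narrow peak (ETA2) has the better joint objective but the worse approximated marginal objective, because the log-Hessian term penalises its sharp curvature.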
Quoted reply history
From: [email protected] [mailto:[email protected]] On Behalf Of Bauer, Robert
Sent: Wednesday, July 30, 2014 11:51 AM
To: Paolo Denti; nmusers
Subject: RE: [NMusers] Funny behaviour with MCETA>1 and parallel computation

Paolo:

This may not be a matter of MCETA. It might have something to do with that particular individual’s data plus your model: is there some unusual evaluation that can accidentally occur, causing the optimization for that subject to fail? Would you mind sharing your control stream file and data set with me, so that I might give it a try?

Robert J. Bauer, Ph.D.
Vice President, Pharmacometrics, R&D
ICON Development Solutions
7740 Milestone Parkway, Suite 150
Hanover, MD 21076
Tel: (215) 616-6428
Mob: (925) 286-0769
Email: [email protected]
Web: http://www.iconplc.com/

From: Paolo Denti [mailto:[email protected]]
Sent: Wednesday, July 30, 2014 3:54 AM
To: Bauer, Robert; nmusers
Subject: Re: [NMusers] Funny behaviour with MCETA>1 and parallel computation

Hi Bob,

thanks for the prompt response and suggestions. I have tried to implement with RANMETHOD=P and increasing MCETA to 1000, but unfortunately without success. Even using MCETA=1000, the result is the same: when I use the parallel computation feature, it performs worse than using MCETA=0. With one processor, things work as expected.

Maybe I did not explain the situation well. The issue is not that the optimisation takes a slightly different path and reaches a different minimum. That would not surprise me, as I know different rounding and other random factors can influence that. The problem here is that when using parallel computation, even on the first iteration (using MAXEVAL=0), MCETA>1 gives a worse OFV than MCETA=0. This still does not make any sense to me, irrespective of random number generators, numerical approximation, etc.
My understanding is that, in each individual, NONMEM will try 0 and other initial estimates to find the optimal ETAs, and then it will choose the solution giving the lowest individual OFV. Even if this is done with different seeds and on different CPUs, they will all try 0, so whatever MCETA=0 gives out should be the upper bound for the OFV for that individual. Then all these individual OFVs are summed together to find the total OFV. Since NONMEM is trying 0 in each subject - plus other random values that may vary - it should at least be able to use those results. In each individual it can only do better by trying extra values, and if all the individual OFVs are lower than or at worst the same as the ones provided by MCETA=0, then the total can only be better. Am I missing something? Is NONMEM maybe minimising only some other individual likelihood in the MAP step, one that does not coincide with the individual OFV?

To better understand, I have looked at the values of individual OFVs in the two runs (MCETA=1000 and MCETA=0, both with parallel computation) and all the tables are exactly the same cell by cell (to the 5th digit or so), except for the records of that outlying individual. In spite of trying 1000 initial estimates (including 0), MCETA=1000 gets the individual ETA for that subject wrong, and gives a worse iOFV than MCETA=0. And it's not a matter of numerical noise: the individual OFV is 100 points worse and the individual parameters are very different. MCETA=0 is a sub-case of MCETA>1, so MCETA>1 should not do any worse. I have tried and re-tried, and it is unlikely that the estimator was unlucky every time with that subject, even trying 1000 initial estimates. I don't know how to explain this. But maybe I am misunderstanding how this works? Any explanation?

Thank you,
Paolo
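The upper-bound argument in this message can be written down as a small multi-start search. This is only a sketch of the reasoning, not of how NONMEM implements MCETA: the objective function, the gradient-descent local search, the seed, and all names are hypothetical. The guarantee holds when the quantity used to pick the winner is the same quantity that is reported afterwards.

```python
import random

def objective(eta):
    """Hypothetical individual objective with two local minima (not a PK model)."""
    u = eta - 1.0
    return (u * u - 4.0) ** 2 / 8.0 - 0.3 * u

def local_minimize(f, start, step=0.02, iters=2000, h=1e-6):
    """Plain gradient descent using a central-difference gradient."""
    eta = start
    for _ in range(iters):
        grad = (f(eta + h) - f(eta - h)) / (2.0 * h)
        eta -= step * grad
    return eta

def multi_start(f, extra_starts):
    """Search from 0 and from each extra start; keep the candidate with the lowest f."""
    candidates = [local_minimize(f, s) for s in [0.0] + list(extra_starts)]
    return min(candidates, key=f)

# Zero start only, versus zero plus 20 random starts (arbitrary seed and range).
rng = random.Random(2014)
extra = [rng.uniform(-4.0, 4.0) for _ in range(20)]
eta_zero = local_minimize(objective, 0.0)
eta_best = multi_start(objective, extra)

print("zero start only :", objective(eta_zero))
print("with extra tries:", objective(eta_best))
```

Because the zero start is always among the candidates and the winner is chosen on `objective` itself, the multi-start result can never be worse than the zero-start result on that same criterion.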
Jul 29, 2014 Paolo Denti Funny behaviour with MCETA>1 and parallel computation
Jul 29, 2014 Robert Bauer RE: Funny behaviour with MCETA>1 and parallel computation
Jul 30, 2014 Paolo Denti Re: Funny behaviour with MCETA>1 and parallel computation
Jul 30, 2014 Robert Bauer RE: Funny behaviour with MCETA>1 and parallel computation
Jul 30, 2014 Bob Leary RE: Funny behaviour with MCETA>1 and parallel computation