RE: Simple parallel benchmark for Nonmem 7.2 with large Bayes problem

From: Mark Sale
Date: May 20, 2011
Source: mail-archive.com
Dieter,

We never expected parallel NONMEM to perform well with problems of this size. In our early benchmarks, the benefit really starts with problems that run for at least 20 minutes. The math is pretty simple: if a function evaluation takes more than about half a second (note that a "typical" NONMEM run may have 3000 function evaluations), it is worth sending out to multiple processes.

That was our conclusion with the file-based method; MPI might be more efficient (but I'm told that behind the curtains they both do pretty much the same thing: the OS buffers data blocks of this size very well, and the data never actually goes to the physical disc). Our early benchmarks were also run with multiple computers, across a 100 Mb/s LAN. There is likely also better performance with the very clever load balancing and dynamic sizing that Bob Bauer has put into the new release.

But don't expect any benefit with 1-minute runs; there is I/O overhead involved in sending out the data, even on the same CPU. Note that our benchmarks had a base run time of 6 hours. See our poster at http://2009.go-acop.org/acop2009/posters .

Mark

Mark Sale MD
President, Next Level Solutions, LLC
www.NextLevelSolns.com
919-846-9185
A carbon-neutral company
See our real time solar energy production at: http://enlighten.enphaseenergy.com/public/systems/aSDz2458
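The rule of thumb above can be checked with a little arithmetic. This is only a sketch: the function names are made up for illustration, and the 0.5 s threshold and 3000 evaluations are the rough figures from the message, not NONMEM settings.

```python
# Rough arithmetic behind the "half a second per function evaluation" rule.
# All names are hypothetical; the numbers are the approximate figures
# quoted in the message, not values read from NONMEM.

def estimated_run_time_s(seconds_per_evaluation, n_evaluations=3000):
    """Total run time if every function evaluation costs the same."""
    return seconds_per_evaluation * n_evaluations

def worth_parallelizing(seconds_per_evaluation, threshold_s=0.5):
    """True when one evaluation is slower than the rule-of-thumb threshold."""
    return seconds_per_evaluation > threshold_s

if __name__ == "__main__":
    total_s = estimated_run_time_s(0.5)
    print(f"{total_s / 60:.1f} minutes")  # 25.0 minutes
```

At exactly half a second per evaluation, 3000 evaluations come to about 25 minutes, which is in line with the "at least 20 minutes" figure.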
Quoted reply history
-------- Original Message --------
Subject: [NMusers] Simple parallel benchmark for Nonmem 7.2 with large Bayes problem
From: "Dieter Menne" <[email protected]>
Date: Fri, May 20, 2011 3:36 pm
To: "nmuser list" <[email protected]>

Here are some quick-and-dirty results of my first benchmark with parallel processing in NONMEM 7.2.

Running Win7, 64 bit, Intel i7, with 4 CPU (and 4 hyperthreading cores). One computer only. Using file message passing. Could not get MPI to work in this configuration.

call nmfe72 mtl_KPreM2Pre_T2L2_.ctl -parafile= fpiwini8.pnm [nodes]= (1 or 4 or 8)

10 iterations of a very large Bayes problem (which should not profit from multiple cores, according to the manual)

nodes  time
1      45 s
4      25 s
8      40 s

So about a factor of 2 between 1 and 4 cores. It is not surprising that 8 gives worse values because these are not real CPUs. More surprising is the fact that with 8 "CPU", I have 100% load on all of them (huh?), while with 4 CPUs, I have the expected 50%.

Dieter
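For reference, the speedup and parallel efficiency implied by the timings in the quoted message can be worked out as follows. This is a sketch with hypothetical names; the only data used are the three timings reported in the message (45 s, 25 s, 40 s).

```python
# Speedup and parallel efficiency from the three timings quoted above.
# Function and variable names are made up for illustration.

timings_s = {1: 45.0, 4: 25.0, 8: 40.0}  # nodes -> wall-clock seconds

def speedup_and_efficiency(timings):
    """Return {nodes: (speedup, efficiency)} relative to the 1-node run."""
    base = timings[1]
    result = {}
    for nodes, seconds in timings.items():
        speedup = base / seconds
        result[nodes] = (speedup, speedup / nodes)
    return result

if __name__ == "__main__":
    for nodes, (s, e) in sorted(speedup_and_efficiency(timings_s).items()):
        print(f"{nodes} nodes: speedup {s:.2f}, efficiency {e:.2f}")
```

On these numbers the 4-node run is about 1.8 times faster than the single-node run, and the 8-node run only about 1.1 times faster.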
May 20, 2011 Dieter Menne Simple parallel benchmark for Nonmem 7.2 with large Bayes problem
May 20, 2011 Mark Sale RE: Simple parallel benchmark for Nonmem 7.2 with large Bayes problem
May 21, 2011 Ron Keizer Re: Simple parallel benchmark for Nonmem 7.2 with large Bayes problem
May 21, 2011 Nick Holford Re: Simple parallel benchmark for Nonmem 7.2 with large Bayes problem
May 23, 2011 Ron Keizer Re: Simple parallel benchmark for Nonmem 7.2 with large Bayes problem
May 23, 2011 Xavier Woot de Trixhe RE: Simple parallel benchmark for Nonmem 7.2 with large Bayes problem