RE: setup of parallel processing and supporting software - help wanted

From: Ruben Faelens Date: December 09, 2015 job Source: mail-archive.com

Hi Pavel,

In general, parallelization discussions always revolve around the following question: “Can you create independent blocks of work?” You should make a clear distinction here between parallelizing a single NONMEM run and running several NONMEM runs in parallel. Let’s talk about estimating a single model, doing a covariate search, and doing a bootstrap.

Very roughly speaking, NONMEM works as follows:
1) Pick a THETA
2) Estimate the probability curve of all ETAs for all subjects
3) Compute the integral over all probability curves to find a probability for THETA
4) Pick a more likely THETA, rinse and repeat

Parallelizing NONMEM means parallelizing step #2. Steps #1, #3 and #4 cannot be parallelized. In practice, we simply split up the subjects into N groups. Each worker calculates the probability curves for all subjects in its group and sends the results back to the main worker, which can then perform steps #3 and #4.

This works well for very complex models with a considerable estimation step: ODE systems. If you have a very fast model (e.g. a simple $PRED section) and a huge dataset, it might be faster to run all of this locally on a single node.

Note that NONMEM 7.3 does not parallelize anything other than the subject ETA estimation step! NONMEM 7.4 will also parallelize the TABLE and/or COVARIANCE estimation step.

Conclusion: Parallelizing NONMEM only works well in specific cases.

You can execute the parallelization by specifying a parafile. If your system administrator specified a parametric parafile, you can also choose the number of CPUs to parallelize over using [nodes]=x in nmfe, or with -nodes=xxx in PsN execute.

Let’s now talk about a covariate search. In this case, we want to evaluate 12 models; we can evaluate them concurrently, as there is no dependence between them. PsN works wonderfully here: you can configure the number of parallel runs using the -threads=xxx switch.
1) Estimate the base model
2) Create 12 instances of the base model, adding a single covariate to each instance. Launch all of these instances in parallel.
3) Once these 12 instances have completed, select the most significant covariate.
4) Create 11 instances of the model from step #3, adding a single covariate to each instance. Launch all of these instances in parallel.
5) etc.

As you can see, there is still some dependence: we need all results from step #2 to evaluate step #3. On top of that, parallelizing step #2 means you will have to collect all of those results back over the network to do step #3 (I/O impact). If you do not parallelize, they will already be sitting in main memory.

In practice, we use the following calculation:
· #threads = #max_covariate_steps
· #nodes = #CPU_available / #max_covariate_steps

So for a cluster of 20 CPUs:
· A covariate search with 5 covariates would be launched using: scm myModel.scm -threads=5 -nodes=4
· A covariate search with 20 covariates would be launched using: scm myModel.scm -threads=20 -nodes=1

Remember that running multiple NONMEM runs concurrently should always be preferred over parallelizing a single NONMEM run.

Finally, let’s talk about bootstrapping. In this case, there is no dependence between the results, so the problem can be perfectly parallelized. Always prefer to run multiple NONMEM runs concurrently, instead of parallelizing a single NONMEM run:

bootstrap myModel -samples=2000 -threads=2000 -nodes=1

Final summary:
For a single NONMEM run, parallelization may or may not work, depending on how complex your model $PRED code is.
For a covariate search, prefer running multiple runs at the same time, rather than parallelizing single runs.
For bootstraps, always run multiple runs at the same time. Never parallelize a single NONMEM run.

Kind regards,
Ruben
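The thread/node rule of thumb from the message above can be written down as a small helper. This is only a sketch: the function names `split_cpus` and `scm_command` are made up for illustration and are not part of PsN, and rounding down with a minimum of one node is an assumption for cases where the CPU count does not divide evenly (the examples in the message only cover even splits).

```python
from typing import Tuple


def split_cpus(cpus_available: int, max_parallel_models: int) -> Tuple[int, int]:
    """Return (threads, nodes) following the rule of thumb in the message.

    threads: number of NONMEM runs started concurrently (PsN -threads)
    nodes:   number of CPUs each single run is parallelized over (PsN -nodes)

    Assumption (not stated in the message): integer division, and never
    fewer than one node per run.
    """
    if cpus_available < 1 or max_parallel_models < 1:
        raise ValueError("both arguments must be positive integers")
    threads = max_parallel_models
    nodes = max(1, cpus_available // max_parallel_models)
    return threads, nodes


def scm_command(config_file: str, cpus_available: int, n_covariates: int) -> str:
    """Build an scm command line in the same form as the examples above."""
    threads, nodes = split_cpus(cpus_available, n_covariates)
    return f"scm {config_file} -threads={threads} -nodes={nodes}"


if __name__ == "__main__":
    # The two 20-CPU examples from the message.
    print(scm_command("myModel.scm", 20, 5))
    print(scm_command("myModel.scm", 20, 20))
```

For 20 CPUs this reproduces the two scm command lines given in the message. For the 12-covariate example, integer division leaves each run with a single node, which matches the advice to prefer concurrent runs over parallelizing a single run.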

Quoted reply history

From: [email protected] [mailto:[email protected]] On Behalf Of Pavel Belo
Sent: Tuesday, December 8, 2015 22:54
To: [email protected]
Subject: [NMusers] setup of parallel processing and supporting software - help wanted

Hello Team,

We hear different opinions about the effectiveness of parallel processing with NONMEM, from very helpful to less helpful. It can be task dependent. How useful is it in phase 3 for basic and covariate models, as well as for bootstrapping? We have reached a non-exploratory (production) point where popPK is on the critical path and sophisticated but slow home-made utilities may be insufficient.

Are there efficient/quick companies/institutions that set up parallel processing, supporting software and, possibly, some other utilities (cloud computing, ...)? A group which used to help us a while ago disappeared somewhere...

Thanks,
Pavel

Dec 08, 2015	Pavel Belo	setup of parallel processing and supporting software - help wanted
Dec 08, 2015	Mark Sale	Re: setup of parallel processing and supporting software - help wanted
Dec 08, 2015	Robert Bauer	RE: setup of parallel processing and supporting software - help wanted
Dec 09, 2015	Ruben Faelens	RE: setup of parallel processing and supporting software - help wanted
Dec 09, 2015	Bill Knebel	Re: setup of parallel processing and supporting software - help wanted
Dec 09, 2015	Mark Sale	Re: setup of parallel processing and supporting software - help wanted
Dec 09, 2015	Bob Leary	RE: setup of parallel processing and supporting software - help wanted
