Re: setup of parallel processing and supporting software - help wanted
Maybe a little more clarification:
Thanks to Bob for pointing out that the
PARSE_TYPE=2 or 4
option implements some code for load balancing; there really is no
downside, so it should probably always be used.
Contrary to other comments, NONMEM 7.3 (and 7.2) does parallelize the
covariance step. Ruben is correct that the $TABLE step is not parallelized in
7.3.
With regard to "sometimes it works and sometimes it doesn't," we can be more
specific than that. The parallelization takes place at the level of the
calculation of the objective function. The data are split up, and the OBJ for
each subset of the data is computed by a separate process. When all processes
are done, the results are compiled by the manager program. The total round-trip
time for one process is then the calculation time + I/O time. Without
parallelization, there is no I/O time. For each parallel process, the I/O time
is essentially fixed (in our benchmarks, maybe 20-40 msec per process on a
single machine). The variable of interest, then, is the calculation time. If
the calculation time is 1 msec and the I/O time is 20 msec, parallelizing to 2
cores cuts the calculation time to 0.5 msec but adds 40 msec (2 * 20 msec) of
I/O time, for a total of 40.5 msec, which is much slower. If the calculation
time is 500 msec and you parallelize to 2 cores, the total time is 250 msec
(for calculation) + 2 * 20 msec (for I/O) = 290 msec. The key parameter, then,
is the time for a single objective-function evaluation (not the total run
time). If the time for a single function evaluation is > 500 msec,
parallelization will be helpful (on a single machine). There really isn't
anything mystical about when it helps and when it doesn't. The efficiency
depends very little on the size of the data set, except that the limit of
parallelization is the number of subjects (the data set must be split up by
subject).
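The arithmetic above can be put in a small back-of-the-envelope sketch (plain Python for illustration, not NONMEM code; the 20 msec per-process I/O cost is the benchmark figure quoted above and will differ on your hardware):

```python
def round_trip_ms(calc_ms, cores, io_ms_per_process=20.0):
    """Total round-trip time for one objective-function evaluation.

    A serial run (cores == 1) incurs no inter-process I/O overhead;
    a parallel run pays a fixed I/O cost for each process it spawns.
    """
    if cores == 1:
        return calc_ms
    return calc_ms / cores + io_ms_per_process * cores

# Fast model: parallelizing makes things worse (1 msec -> 40.5 msec).
print(round_trip_ms(1.0, 2))    # 40.5
# Slow model: parallelizing helps (500 msec -> 290 msec).
print(round_trip_ms(500.0, 2))  # 290.0
```

Playing with `cores` in this toy model also shows why adding ever more processes eventually hurts: the I/O term grows linearly while the calculation term shrinks.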
Mark Sale M.D.
Vice President, Modeling and Simulation
Nuventra, Inc. ™
2525 Meridian Parkway, Suite 280
Research Triangle Park, NC 27713
Office (919)-973-0383
[email protected]<[email protected]>
http://www.nuventra.com
Empower your Pipeline
________________________________
From: [email protected] <[email protected]> on behalf of
Faelens, Ruben (Belgium) <[email protected]>
Sent: Wednesday, December 9, 2015 5:42 AM
To: Pavel Belo; [email protected]
Subject: RE: [NMusers] setup of parallel processing and supporting software -
help wanted
Hi Pavel,
In general, parallelization discussions always revolve around the following
question: “Can you create independent blocks of work?”
You should make a clear distinction here between parallelizing a single nonmem
run and running several nonmem runs in parallel. Let's talk about estimating a
single model, doing a covariate search, and doing a bootstrap.
Very roughly speaking, Nonmem works as follows:
1) Pick a THETA
2) Estimate the probability curve of all ETA’s for all subjects
3) Compute the integral over all probability curves to find a probability
for THETA
4) Pick a more likely THETA, rinse and repeat
Parallelizing NONMEM means parallelizing step #2. Steps #1, #3, and #4 cannot
be parallelized.
In practice, we simply split up the subjects into N groups. Each worker
calculates the probability curve for all subjects in its group and sends the
results back to the main worker, which can then calculate step #3 and step #4.
This works well for very complex models with a considerable per-subject
estimation cost: ODE systems, for example. If you have a very fast model (e.g.
a simple $PRED section) and a huge dataset, it might be faster to run all of
this locally on a single node.
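As an illustration only (this is not NONMEM's actual internal code), the split-and-sum pattern described above looks roughly like this: the manager deals subjects into groups, each worker sums the OBJ contributions of its own subjects, and the manager adds the worker totals, which yields exactly the same OBJ as a serial run. All names and the per-subject values here are hypothetical.

```python
def split_subjects(subject_ids, n_workers):
    """Deal subject IDs round-robin into n_workers groups."""
    groups = [[] for _ in range(n_workers)]
    for i, sid in enumerate(subject_ids):
        groups[i % n_workers].append(sid)
    return groups

def worker_obj(group, per_subject_obj):
    """Each worker sums the OBJ contributions of its own subjects."""
    return sum(per_subject_obj[sid] for sid in group)

# Hypothetical per-subject OBJ contributions for 4 subjects.
per_subject = {1: 10.0, 2: 12.5, 3: 9.0, 4: 11.5}
groups = split_subjects(list(per_subject), 2)
total = sum(worker_obj(g, per_subject) for g in groups)
print(total)  # 43.0, identical to the serial sum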
Note that Nonmem 7.3 does not parallelize anything other than the subject ETA
estimation step! Nonmem 7.4 will also parallelize the TABLE and/or COVARIANCE
steps.
Conclusion: parallelizing nonmem only works well in specific cases. You
execute the parallelization by specifying a parafile. If your system
administrator has provided a parametric parafile, you can also choose the
number of CPUs to parallelize over using [nodes]=x in nmfe, or with -nodes=xxx
in PsN execute.
Let's now talk about a covariate search. In this case, we want to evaluate 12
models; we can evaluate them concurrently, as there is no dependence between
them. PsN works wonderfully here: you can configure the number of parallel runs
using the -threads=xxx switch.
1) Estimate the base model
2) Create 12 instances of the base model, adding a single covariate to
each instance. Launch all of these instances in parallel.
3) Once these 12 instances have completed, select the most significant
covariate.
4) Create 11 instances of the model from step #3, adding a single
covariate to each instance. Launch all of these instances in parallel.
5) etc.
As you can see, there is still some dependence: we need all results from step
#2 to evaluate step #3. On top of that, parallelizing step #2 means you will
have to collect all of those results back over the network to do step #3 (I/O
impact). If you do not parallelize, they will already be sitting in main memory.
In practice, we use the following calculation:
· #threads = #max_covariate_steps
· #CPU_available / #max_covariate_steps = #nodes
So for a cluster of 20 CPUs:
· A covariate search with 5 covariates would be launched using: scm
myModel.scm -threads=5 -nodes=4
· A covariate search with 20 covariates would be launched using: scm
myModel.scm -threads=20 -nodes=1
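The thread/node arithmetic above can be written as a small helper (a sketch under the stated rule of thumb: one PsN thread per candidate covariate, the remaining CPUs divided into nodes per run; the function name is my own, not part of PsN):

```python
def psn_allocation(n_cpus, n_covariates):
    """threads = number of candidate covariates; nodes = CPUs per run."""
    threads = n_covariates
    nodes = max(1, n_cpus // threads)
    return threads, nodes

print(psn_allocation(20, 5))   # (5, 4)  -> scm myModel.scm -threads=5 -nodes=4
print(psn_allocation(20, 20))  # (20, 1) -> scm myModel.scm -threads=20 -nodes=1
```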
Remember that running multiple nonmem runs concurrently should always be
preferred over parallelizing a single nonmem run.
Finally, let’s talk about bootstrapping:
In this case, there is no dependence between the results. This problem can be
perfectly parallelized. In this case, always prefer to run multiple nonmem runs
concurrently, instead of parallelizing a single nonmem run.
bootstrap myModel -samples=2000 -threads=2000 -nodes=1
Final summary:
For a single nonmem run, parallelization may or may not work, depending on how
complex your model $PRED code is.
For a covariate search, prefer running multiple runs at the same time rather
than parallelizing single runs.
For bootstraps, always run multiple runs at the same time. Never parallelize a
single nonmem run.
Kind regards,
Ruben
From: [email protected] [mailto:[email protected]] On
Behalf Of Pavel Belo
Sent: Tuesday, December 8, 2015 10:54 PM
To: [email protected]
Subject: [NMusers] setup of parallel processing and supporting software - help
wanted
Hello The Team,
We hear different opinions about the effectiveness of parallel processing with
NONMEM, ranging from very helpful to less helpful. It can be task dependent.
How useful is it in phase 3 for basic and covariate models, as well as for
bootstrapping?
We have reached a non-exploratory (production) point where popPK is on the
critical path and sophisticated but slow home-made utilities may be
insufficient. Are there efficient/quick companies/institutions that set up
parallel processing, supporting software, and possibly some other utilities
(cloud computing, ...)? A group that used to help us a while ago has
disappeared somewhere...
Thanks,
Pavel