Hi,
I'm trying to help our local grid computing guys to get NONMEM running with MPI on a Linux based grid.
We have NONMEM running with a pnm file asking for 8 nodes, but the actual run time is 10% longer than with a regular 'transfer' run. The only clue I can see to the problem is the "Exit status=1" after the "MPI TRANSFER TYPE SELECTED" message. The NONMEM run appears to execute OK, except there is no evidence that MPI is operating. Can anybody tell me if this exit status is normal under Linux? If not, what might it mean?
Nick
Recompiling certain components
USING PARALLEL PROFILE wfn_mpi8.pnm
MPI TRANSFER TYPE SELECTED
Exit status = 1
IN MPI
Starting MPI version of nonmem execution ...
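[Archive note: one quick sanity check that MPI really is farming work out is to look for the per-worker subdirectories a parallel run creates, and for multiple nonmem processes. A rough sketch, with the directory layout assumed from typical NONMEM parallel runs rather than verified on this grid:]

```shell
# Sketch: count the worker subdirectories a parallel NONMEM run is
# expected to create inside the run directory (layout is an assumption).
count_workers() {
  find "$1" -maxdepth 1 -type d -name 'worker*' | wc -l
}

# A second rough check: more than one nonmem process should be running,
# e.g.  ps -ef | grep -c '[n]onmem'
```

If `count_workers /path/to/run` reports 0 while the pnm file asks for 8 nodes, the workers were probably never launched.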
--
Nick Holford, Professor Clinical Pharmacology
Dept Pharmacology & Clinical Pharmacology, Bldg 503 Room 302A
University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand
tel:+64(9)923-6730 fax:+64(9)373-7090 mobile:+64(21)46 23 53
email: [email protected]
http://www.fmhs.auckland.ac.nz/sms/pharmacology/holford
[NMusers] MPI on linux Exit status=1 (5 messages, 4 people; latest: Sep 29, 2012)
Hi Nick,
We have a working nm7 parallel setup using MPI on linux over our SGE cluster.
Based on your e-mail I found a match somewhere in the SGE stdout capture:
CREATING MUMODEL ROUTINE...
Recompiling certain components
USING PARALLEL PROFILE mpihydra.pnm
MPI TRANSFER TYPE SELECTED
Exit status = 1
IN MPI
Starting MPI version of nonmem execution ...
And thereafter I had a decent parallel run. Does the system create
subdirectories for each worker etc in your situation?
You might also want to look for this kind of statement in nonmem's report:
#PARA: PARAFILE=mpihydra.pnm, PROTOCOL=MPI, NODES= 16
Hope this helps,
Jeroen
J. Elassaiss-Schaap, Senior Principal Scientist
Phone: +31 412 66 9320
MSD | PK, PD and Drug Metabolism | Clinical PK-PD
Mail stop KR 4406 | PO Box 20, 5340 BH Oss, NL
Notice: This e-mail message, together with any attachments, contains information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station, New Jersey, USA 08889) and/or its affiliates (direct contact information for affiliates is available at http://www.merck.com/contact/contacts.html) that may be confidential, proprietary, copyrighted and/or legally privileged. It is intended solely for the use of the individual or entity named on this message. If you are not the intended recipient, and have received this message in error, please notify us immediately by reply e-mail and then delete it from your system.
Nick,
There is also a run#.log file that shows the parallelization as it occurs. You can verify that the run is working as expected by looking at that file and seeing that parts of the data are being sent out to the nodes and back. We also use SGE with Linux for parallel runs and have not had any major issues.
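[Archive note: a small helper along the lines Bill suggests. The log file name and the message patterns are assumptions, taken from the run#.log convention mentioned above and the PARAPRINT output shown later in this thread:]

```shell
# Sketch: pull the work-distribution trace out of a NONMEM run log.
# The STARTING/COLLECTING message wording is assumed from the
# PARAPRINT=1 output format; adjust the pattern to your NONMEM version.
show_parallel_trace() {
  grep -E 'STARTING SUBJECTS|COLLECTING SUBJECTS' "$1"
}
```

Usage would be something like `show_parallel_trace run1.log`; no output at all suggests the run never actually parallelized.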
Bill
~~~~~~~~~~~~~~~~~~~~~~~~
Bill Knebel, PharmD, PhD
Principal Scientist II
Metrum Research Group LLC
2 Tunxis Road, Suite 112
Tariffville, CT 06081
O: 860.735.7043
C: 860.930.1370
F: 860.760.6014
The exit status=1 just means MPI is selected rather than FPI.
Robert J. Bauer, Ph.D.
Vice President, Pharmacometrics, R&D
ICON Development Solutions
7740 Milestone Parkway
Suite 150
Hanover, MD 21076
Tel: (215) 616-6428
Mob: (925) 286-0769
Email: [email protected]
Web: http://www.iconplc.com/
Bill, Jeroen, Bob,
Thanks for your helpful replies informing me that Exit status=1 is normal behaviour with MPI.
I can see no signs of a run#.log file.
I see this #PARA: PARAFILE=wfn_mpi8.pnm, PROTOCOL=MPI, NODES= 8 in the NONMEM output listing.
Using PARAPRINT=1 I get this sort of thing in stdout.
MONITORING OF SEARCH:
STARTING SUBJECTS 1 TO 202 ON MANAGER: OK
STARTING SUBJECTS 203 TO 404 ON WORKER1: OK
STARTING SUBJECTS 405 TO 605 ON WORKER2: OK
STARTING SUBJECTS 606 TO 806 ON WORKER3: OK
STARTING SUBJECTS 807 TO 1007 ON WORKER4: OK
STARTING SUBJECTS 1008 TO 1208 ON WORKER5: OK
STARTING SUBJECTS 1209 TO 1409 ON WORKER6: OK
STARTING SUBJECTS 1410 TO 1610 ON WORKER7: OK
COLLECTING SUBJECTS 1 TO 202 ON MANAGER
COLLECTING SUBJECTS 203 TO 404 ON WORKER1
COLLECTING SUBJECTS 405 TO 605 ON WORKER2
COLLECTING SUBJECTS 606 TO 806 ON WORKER3
COLLECTING SUBJECTS 807 TO 1007 ON WORKER4
COLLECTING SUBJECTS 1008 TO 1208 ON WORKER5
COLLECTING SUBJECTS 1209 TO 1409 ON WORKER6
COLLECTING SUBJECTS 1410 TO 1610 ON WORKER7
...
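[Archive note: the subject ranges in that trace are just an even split of the 1610 subjects over 8 nodes, with the first n mod k nodes taking one extra subject. A small sketch that reproduces the split (my reconstruction of the arithmetic, not NONMEM's actual allocation code):]

```shell
# Sketch: split n subjects as evenly as possible over k nodes.
# Nodes 1..(n mod k) get one extra subject; the rest get floor(n/k).
partition() {
  n=$1; k=$2
  base=$((n / k)); rem=$((n % k)); start=1; i=1
  while [ "$i" -le "$k" ]; do
    size=$base
    [ "$i" -le "$rem" ] && size=$((size + 1))
    end=$((start + size - 1))
    echo "node $i: subjects $start to $end"
    start=$((end + 1)); i=$((i + 1))
  done
}
```

Running `partition 1610 8` reproduces the ranges in the trace above (1 to 202, 203 to 404, 405 to 605, ..., 1410 to 1610), which at least confirms the manager is handing out the whole data set.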
My local MPI expert suggested the following for the COMMANDS section of the pnm file. Our system has a file listing available hosts pointed to by $LOADL_HOSTFILE. The -n 1 is intended to spread the job over different hosts with just one node session per host. I've tried with and without -n 1 and used mpirun instead of mpiexec but it seems to produce the same results.
$COMMANDS ;each node gets a command line, used to launch the node session
; $@ sends all arguments on the user's command line to the manager process
1:mpiexec -wdir "$(pwd)" -f "$LOADL_HOSTFILE" -n 1 $(pwd)/<<nmexec>> $@
; Only specific arguments should be sent to the workers, which are identified by reserved variable names
2-[nodes]: -wdir "$(pwd)/worker{#-1}" -n 1 $(pwd)/<<nmexec>>
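[Archive note: if I read the pnm convention correctly, the {#-1} placeholder in the 2-[nodes] line expands to the node number minus one, which is how nodes 2 through 8 end up in directories worker1 through worker7 (matching the WORKER1..WORKER7 names in the PARAPRINT trace above). A throwaway sketch of that expansion, as my reading of the convention rather than NONMEM's actual code:]

```shell
# Sketch: resolve the worker directory name for a given node number,
# mimicking the {#-1} substitution in the pnm COMMANDS template.
worker_dir() {
  echo "worker$(( $1 - 1 ))"
}
```

So node 2 maps to worker1, node 8 to worker7; if those directories are missing after a run, the 2-[nodes] command lines were probably never executed.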
Any comments?
Thanks
Nick
Bob wrote:
> The exit status=1 just means MPI is selected rather than FPI.
> Robert J. Bauer, Ph.D.