MPI on linux Exit status=1

5 messages 4 people Latest: Sep 29, 2012

From: Nick Holford Date: September 28, 2012 technical
Hi,

I'm trying to help our local grid computing guys to get NONMEM running with MPI on a Linux-based grid.

We have NONMEM running with a pnm file asking for 8 nodes, but the actual run time is 10% longer than with a regular 'transfer' run. The only clue I can see to the problem is the "Exit status=1" after the "MPI TRANSFER TYPE SELECTED" message. The NONMEM run appears to execute OK except for no evidence that MPI is operating. Can anybody tell me if this exit status is normal under Linux? If not, what might it mean?

Nick

Recompiling certain components
USING PARALLEL PROFILE wfn_mpi8.pnm
MPI TRANSFER TYPE SELECTED
Exit status = 1
IN MPI
Starting MPI version of nonmem execution ...

--
Nick Holford, Professor Clinical Pharmacology
Dept Pharmacology & Clinical Pharmacology, Bldg 503 Room 302A
University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand
tel:+64(9)923-6730 fax:+64(9)373-7090 mobile:+64(21)46 23 53
email: [email protected]
http://www.fmhs.auckland.ac.nz/sms/pharmacology/holford

RE: MPI on linux Exit status=1

From: Jeroen Elassaiss-Schaap Date: September 28, 2012 technical
Hi Nick,

We have a working nm7 parallel setup using MPI on linux over our SGE cluster. Based on your e-mail I found a match somewhere in the SGE stdout capture:

CREATING MUMODEL ROUTINE...
Recompiling certain components
USING PARALLEL PROFILE mpihydra.pnm
MPI TRANSFER TYPE SELECTED
Exit status = 1
IN MPI
Starting MPI version of nonmem execution ...

And thereafter I had a decent parallel run. Does the system create subdirectories for each worker etc. in your situation? You might also want to look for this kind of statement in nonmem's report:

#PARA: PARAFILE=mpihydra.pnm, PROTOCOL=MPI, NODES= 16

Hope this helps,
Jeroen

J. Elassaiss-Schaap, Senior Principal Scientist, Phone: +31 412 66 9320
MSD | PK, PD and Drug Metabolism | Clinical PK-PD
Mail stop KR 4406 | PO Box 20, 5340 BH Oss, NL
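[Editor's note] Jeroen's two checks (the #PARA line in the report and the per-worker subdirectories) can be scripted. A minimal sketch, assuming the report file is named run1.lst (the real name depends on your run script); the sample #PARA line is written to a file first only so the snippet is self-contained:

```shell
# Sample report fragment, created here so the check can be demonstrated
# stand-alone; on a real system you would grep your actual .lst file.
cat > run1.lst <<'EOF'
#PARA: PARAFILE=mpihydra.pnm, PROTOCOL=MPI, NODES= 16
EOF

# 1. Did NONMEM report an MPI parallel run?
grep '^#PARA' run1.lst

# 2. Were per-worker subdirectories created in the run directory?
ls -d worker* 2>/dev/null || echo "no worker directories found"
```

If the #PARA line is absent, or no worker directories appear, NONMEM fell back to a serial run despite the pnm file.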

Re: MPI on linux Exit status=1

From: Bill Knebel Date: September 28, 2012 technical
Nick,

There is also a run#.log file that shows the parallelization as it occurs. You can verify that the run is working as expected by looking at the file and seeing that parts of the data are being sent out to the nodes and back. We also use SGE with linux for parallel runs and have not had any major issues.

Bill

~~~~~~~~~~~~~~~~~~~~~~~~
Bill Knebel, PharmD, PhD
Principal Scientist II
Metrum Research Group LLC
2 Tunxis Road, Suite 112
Tariffville, CT 06081
O: 860.735.7043 C: 860.930.1370 F: 860.760.6014
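[Editor's note] The log check Bill describes can be automated. A sketch, using a four-line sample in place of a real run#.log; the STARTING/COLLECTING format follows the PARAPRINT output quoted elsewhere in this thread:

```shell
# Sample log fragment, written here only so the snippet is self-contained.
cat > runlog.sample <<'EOF'
STARTING SUBJECTS 1 TO 202 ON MANAGER: OK
STARTING SUBJECTS 203 TO 404 ON WORKER1: OK
COLLECTING SUBJECTS 1 TO 202 ON MANAGER
COLLECTING SUBJECTS 203 TO 404 ON WORKER1
EOF

# Count the subject blocks dispatched to or collected from workers
# (excludes the manager's own share).
grep -c 'ON WORKER' runlog.sample   # → 2
```

A count of zero would mean all subjects stayed on the manager, i.e. no effective parallelization.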

RE: MPI on linux Exit status=1

From: Robert Bauer Date: September 28, 2012 technical
The exit status=1 just means MPI is selected rather than FPI.

Robert J. Bauer, Ph.D.
Vice President, Pharmacometrics, R&D
ICON Development Solutions
7740 Milestone Parkway, Suite 150
Hanover, MD 21076
Tel: (215) 616-6428 Mob: (925) 286-0769
Email: [email protected]
Web: http://www.iconplc.com/

Re: MPI on linux Exit status=1

From: Nick Holford Date: September 29, 2012 technical
Bill, Jeroen, Bob,

Thanks for your helpful replies informing me that Exit status=1 is normal behaviour with MPI.

I can see no signs of a run#.log file. I see this in the NONMEM output listing:

#PARA: PARAFILE=wfn_mpi8.pnm, PROTOCOL=MPI, NODES= 8

Using PARAPRINT=1 I get this sort of thing in stdout:

MONITORING OF SEARCH:
STARTING SUBJECTS 1 TO 202 ON MANAGER: OK
STARTING SUBJECTS 203 TO 404 ON WORKER1: OK
STARTING SUBJECTS 405 TO 605 ON WORKER2: OK
STARTING SUBJECTS 606 TO 806 ON WORKER3: OK
STARTING SUBJECTS 807 TO 1007 ON WORKER4: OK
STARTING SUBJECTS 1008 TO 1208 ON WORKER5: OK
STARTING SUBJECTS 1209 TO 1409 ON WORKER6: OK
STARTING SUBJECTS 1410 TO 1610 ON WORKER7: OK
COLLECTING SUBJECTS 1 TO 202 ON MANAGER
COLLECTING SUBJECTS 203 TO 404 ON WORKER1
COLLECTING SUBJECTS 405 TO 605 ON WORKER2
COLLECTING SUBJECTS 606 TO 806 ON WORKER3
COLLECTING SUBJECTS 807 TO 1007 ON WORKER4
COLLECTING SUBJECTS 1008 TO 1208 ON WORKER5
COLLECTING SUBJECTS 1209 TO 1409 ON WORKER6
COLLECTING SUBJECTS 1410 TO 1610 ON WORKER7
...

My local MPI expert suggested the following for the COMMANDS section of the pnm file. Our system has a file listing available hosts pointed to by $LOADL_HOSTFILE. The -n 1 is intended to spread the job over different hosts with just one node session per host. I've tried with and without -n 1, and used mpirun instead of mpiexec, but it seems to produce the same results.

$COMMANDS
;each node gets a command line, used to launch the node session
; $@ sends all arguments on the user's command line to the manager process
1:mpiexec -wdir "$(pwd)" -f "$LOADL_HOSTFILE" -n 1 $(pwd)/<<nmexec>> $@
; Only specific arguments should be sent to the workers, which are identified by reserved variable names
2-[nodes]: -wdir "$(pwd)/worker{#-1}" -n 1 $(pwd)/<<nmexec>>

Any comments?

Thanks

Nick

Bob wrote:
> The exit status=1 just means MPI is selected rather than FPI.
> Robert J. Bauer, Ph.D.
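[Editor's note] One way to sanity-check the $COMMANDS logic outside NONMEM is to expand it by hand against a sample hostfile. The sketch below only prints the launch lines it would generate; the hostnames, the nonmem_exe name, and the hosts.sample file are placeholders, not Nick's actual setup. (A separate smoke test of the MPI installation itself is to run a trivial program such as hostname under mpiexec with the same hostfile and confirm it reports each host.)

```shell
# Stand-in for $LOADL_HOSTFILE: one available host per line (placeholder names).
LOADL_HOSTFILE=hosts.sample
printf 'node01\nnode02\nnode03\nnode04\n' > "$LOADL_HOSTFILE"

nodes=$(wc -l < "$LOADL_HOSTFILE")

# Manager (line 1 of $COMMANDS): launched via mpiexec against the hostfile,
# one session (-n 1), with the user's arguments ($@) passed through.
echo "1: mpiexec -wdir $(pwd) -f $LOADL_HOSTFILE -n 1 $(pwd)/nonmem_exe \$@"

# Workers 2..nodes: each runs in its own worker{#-1} directory.
i=1
while [ "$i" -lt "$nodes" ]; do
    echo "$((i + 1)): -wdir $(pwd)/worker$i -n 1 $(pwd)/nonmem_exe"
    i=$((i + 1))
done
```

Printing the expansion this way makes it easy to spot a wrong working directory or worker count before burning cluster time on a real run.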