Error files when using multicore runs and psn ==> Fatal Error: Record SIZES is not valid

7 messages 7 people Latest: Jul 17, 2012
Hi Robert,

Thanks for your quick reply. Unfortunately, I could not make it work.

I have 17,000 records. So, just after $PROB, I inserted your suggestion:
$SIZES LIM1 = 20000
and I got the following error:
> Fatal Error: Record SIZES is not valid

So I also tried the default value for LIM1:
$SIZES LIM1=10000
and also the example given on page 166 of Help guide viii:
$SIZES LIM1=30000 MAXFCN=2000000 NO=500
and always got the "Fatal Error: Record SIZES is not valid" message.

I looked into Help guide viii (pp 166-167 and 463-464) but did not find any relevant information on how to set $SIZES. What would be your suggestion?

Thanks again for your help, because it's highly frustrating not being able to use multiple cores when you have such long runs.

Kind regards

Pascal Girard, PhD
Head of Modeling & Simulation - Oncology
Global Exploratory Medicine
Merck Serono S.A. · Geneva
Tel: +41.22.414.3549
Cell: +41.79.508.7898
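To make the sizing rule from Robert's advice (quoted in the reply history below) concrete: LIM1 should be at least as large as the number of data records. The following is only a minimal sketch of that arithmetic. The function names, the single header line, and the 10% headroom are illustrative assumptions, not NONMEM conventions.

```python
def count_data_records(lines, header_lines=1):
    """Count non-empty data lines, skipping an assumed number of header lines."""
    body = list(lines)[header_lines:]
    return sum(1 for line in body if line.strip())


def sizes_record(n_records, headroom_percent=10):
    """Build a $SIZES record whose LIM1 is n_records plus some headroom.

    Integer arithmetic (rounding the headroom up) keeps the result exact.
    """
    extra = (n_records * headroom_percent + 99) // 100
    return f"$SIZES LIM1={n_records + extra}"
```

For the 17,000 records mentioned above, `sizes_record(17000)` gives `$SIZES LIM1=18700`. Where that record has to sit in the control stream is a separate question, taken up in the replies.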
Quoted reply history
From: "Bauer, Robert"
Date: 11/07/2012 18:55
Subject: RE: [NMusers] Error files when using multicore runs and psn

Pascal:

I cannot help regarding having all console messages sent to the proper files in the PSN environment, but I can assist in avoiding your present NONMEM error. If you insert at the beginning of the control stream file
$SIZES LIM1=??
and insert a large enough value for ??, then file buffer 10 will not be used, and the error is avoided. The value should be at least as large as the number of data records (lines) in your data file (see section I.6 of ..\guides\nm720.pdf).

Although nmfe72 in parallel mode has been tested successfully in our hands to use the file buffers for large data sets, it may not work in all grid environments. Setting the LIM values large enough avoids using buffer files, and utilizes only memory. The problem also runs faster when buffer files are not used.

Robert J. Bauer, Ph.D.
Vice President, Pharmacometrics, R&D
ICON Development Solutions
7740 Milestone Parkway
Suite 150
Hanover, MD 21076
Tel: (215) 616-6428
Mob: (925) 286-0769
Web: www.iconplc.com

________________________________

Sent: Wednesday, July 11, 2012 11:19 AM
Subject: [NMusers] Error files when using multicore runs and psn

Dear All,

We are using psn version: 3.4.2 together with NONMEM 7.2.0 on a Linux Sun Grid Engine (SGE). When using multi-cores run on SGE, it happens sometimes that NONMEM returns a log file where the "MONITORING OF SEARCH" starts and nothing is reported.

Looking into the psn directory, I found files which have the name of my script file + an extension made of letters and numbers that contains an error message that is not shown on the log file. For example my nm-tran script file is run003.mod and my log file run003.lst ends with:

MONITORING OF SEARCH:

Stop Time:
Wed Jul 10 21:05:18 CEST 2012

Then I recover a file named run003.mod.o9501 in run003/NM_run1 directory created by psn. Sometimes this file contains an explicit error message, sometimes more cabalistic information as:

WARNINGS AND ERRORS (IF ANY) FOR PROBLEM 1

(WARNING 2) NM-TRAN INFERS THAT THE DATA ARE POPULATION.
CREATING MUMODEL ROUTINE...
Recompiling certain components

USING PARALLEL PROFILE mpi_12cores.pnm
MPI TRANSFER TYPE SELECTED
Exit status = 1
IN MPI
Starting MPI version of nonmem execution ...
License Registered to: Merck KGaA
Expiration Date: 14 SEP 2013
Current Date: 11 JUL 2012
Days until program expires : 428

Iterative Two Stage (No Prior)
MONITORING OF SEARCH:

At line 240 of file (unit = 10, file = 'WK1_FILE10')
Fortran runtime error: End of file
Fatal error in MPI_Send: Other MPI error, error stack:
MPI_Send(174).....................: MPI_Send(buf=0xde71a0, count=80030, MPI_INTEGER, dest=1, tag=1, MPI_COMM_WORLD) failed
MPIDI_CH3I_Progress(150)..........:
MPID_nem_mpich2_blocking_recv(948):
MPID_nem_tcp_connpoll(1720).......:
state_commrdy_handler(1556).......:
MPID_nem_tcp_recv_handler(1446)...: socket closed
rank 1 in job 1 deda1x0481_36189 caused collective abort of all ranks
exit status of rank 1: return code 2

Questions:
1) Is there a way to force psn and/or NONMEM to collect the error message in the log file when using multi-cores run ?
2) What about "cabalistic" error messages as the one above?

Thank you for your help,

Kind regards

Pascal Girard, PhD
Head of Modeling & Simulation - Oncology
Global Exploratory Medicine
Merck Serono S.A. · Geneva
Tel: +41.22.414.3549
Cell: +41.79.508.7898
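Question 1 above is not answered in this thread. As a stopgap, the scheduler output files can be scanned by hand or with a small script. The sketch below is one hypothetical way to do that in Python. The file-name pattern is only inferred from the run003.mod.o9501 example, the keywords "error" and "abort" are a guess at what is worth flagging, and none of the function names come from PsN or NONMEM.

```python
import re
from pathlib import Path

# Assumed naming pattern, based on the run003.mod.o9501 example in the thread.
LOG_NAME = re.compile(r".+\.o\d+$")
# Assumed keywords; adjust to whatever your runs actually print.
ERROR_HINT = re.compile(r"error|abort", re.IGNORECASE)


def find_scheduler_logs(run_dir):
    """Return files under run_dir whose names end in .o<digits>, sorted by path."""
    return sorted(
        p for p in Path(run_dir).rglob("*")
        if p.is_file() and LOG_NAME.match(p.name)
    )


def error_lines(text):
    """Return the lines of text that mention an error or an abort."""
    return [line.strip() for line in text.splitlines() if ERROR_HINT.search(line)]


def report(run_dir):
    """Collect 'path: line' strings for every flagged line under run_dir."""
    found = []
    for log in find_scheduler_logs(run_dir):
        for line in error_lines(log.read_text(errors="replace")):
            found.append(f"{log}: {line}")
    return found
```

Calling `report("run003")` after a failed run would list the flagged lines next to the file they came from, which at least saves opening each NM_run directory in turn.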
Hi Pascal,

$SIZES should be before $PROB. It should be the first uncommented line in the file.

Regards,
Katya

Ekaterina Gibiansky, Ph.D.
CEO&CSO, QuantPharm LLC
Web: www.quantpharm.com
Tel: (301)-717-7032
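The placement rule Katya describes can be checked mechanically before submitting a run. The following is a rough sketch, not part of NONMEM or PsN. It assumes that NM-TRAN comment lines start with ";" and that any record name beginning with "$SIZ" is a $SIZES record. Both function names are made up for illustration.

```python
def first_record(control_stream):
    """Return the name of the first non-blank, non-comment record, or None."""
    for line in control_stream.splitlines():
        stripped = line.strip()
        # Skip blank lines and lines assumed to be comments (leading ';').
        if not stripped or stripped.startswith(";"):
            continue
        return stripped.split()[0].upper()
    return None


def sizes_is_first(control_stream):
    """True if the first uncommented record looks like $SIZES."""
    record = first_record(control_stream)
    return record is not None and record.startswith("$SIZ")
```

With $SIZES placed after $PROB, as in the failing attempt, `sizes_is_first` returns False. Moving the record to the top of the file makes it return True.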
Hi Pascal,

I have already used it with NONMEM 7.2 and it works, but not in psn.

$SIZES LIM6=1000
$PROBLEM XXXXXX

Best regards,

Nastya
Quoted reply history
________________________________

Sent: Thu 7/12/2012 11:09
To: Kassir Nastya; Bauer, Robert
Subject: RE: [NMusers] Error files when using multicore runs and psn ==> Fatal Error: Record SIZES is not valid

Hi Nastya,

I tried your trick, which would have broken the first table law rule "The first NM-TRAN control record must be a $PROBLEM record", and put $SIZES as the very first record and got the following error:

>_read_problems: First non-comment line in modelfile run.mod is not a $PROB
>record. NONMEM syntax violation.

So the first table law rule still resists or you may have a different NONMEM version. :-)

I also checked the bug list ftp://nonmem.iconplc.com/Public/nonmem720/nm720_bug_list.pdf, but nothing is mentioned.

Anyway, thanks for the suggestion!

Kind regards

Pascal

________________________________

From: "Kassir Nastya"
Date: 12/07/2012 16:13
Subject: RE: [NMusers] Error files when using multicore runs and psn ==> Fatal Error: Record SIZES is not valid

Hi Pascal,

$SIZES goes at the beginning of your control stream, before $PROB.

I hope it helps.

Best regards,

Nastya

Nastya Kassir, Pharm.D.
Senior Scientist
Pharsight Consulting Services(tm)
A division of Certara(tm)
Phone: 1 (514) 789-2180 # 2157
Mobile: 1 (438) 862-0935
Fax: (514) 789-2192
www.pharsight.com
Pascal,

That error looks like a PSN error rather than NONMEM. I can confirm that for NM7.2 the $SIZES record must be the first record of the control stream. You might try bypassing PSN if possible and I bet you will not get that error.

Bill

~~~~~~~~~~~~~~~~~~~~~~~~
Bill Knebel, PharmD, PhD
Principal Scientist II
Metrum Research Group LLC
2 Tunxis Road, Suite 112
Tariffville, CT 06081
O: 860.735.7043
C: 860.930.1370
F: 860.760.6014
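For anyone who wants to try Bill's suggestion from a script, the sketch below only assembles an nmfe72 command line and does not run it. The argument order (control stream, then report file) and the `-parafile=` option reflect my understanding of the NONMEM 7.2 documentation and should be verified against your installation. The helper name is hypothetical, and the file names are the ones used in this thread.

```python
def nmfe_command(control_file, report_file, parafile=None, nmfe="nmfe72"):
    """Build (but do not run) a direct nmfe72 invocation that bypasses PsN.

    Assumption: nmfe72 takes the control stream and the report file as
    positional arguments, and a parallel profile via -parafile=<file>.
    """
    cmd = [nmfe, control_file, report_file]
    if parafile:
        cmd.append(f"-parafile={parafile}")
    return cmd
```

`nmfe_command("run003.mod", "run003.lst", parafile="mpi_12cores.pnm")` returns a list that could be passed to `subprocess.run` on a machine where nmfe72 is on the PATH. If the $SIZES error disappears in that direct run, the restriction comes from PsN rather than from NONMEM.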
Hi Pascal,

According to the documentation, PsN 3.4.8 and later supports $SIZES.
http://psn.sourceforge.net/pdfdocs/PsN_and_NONMEM7.pdf

We are running version 3.5.3 and that works fine.

Regards
Julia

"If we knew what it was we were doing, it would not be called research, would it?" Albert Einstein

http://nz.linkedin.com/in/juliakorell
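The version threshold Julia cites can be turned into a quick check. This is only an illustrative sketch: it assumes PsN version strings are plain dotted integers such as 3.4.2, and the function names are invented here rather than taken from PsN.

```python
def parse_version(text):
    """Turn a dotted version string such as '3.4.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def supports_sizes(psn_version, minimum="3.4.8"):
    """True if psn_version is at least the release said to support $SIZES."""
    return parse_version(psn_version) >= parse_version(minimum)
```

Comparing tuples rather than strings matters here: as text, "3.4.10" would sort before "3.4.8", while as tuples it correctly counts as the later release. For the versions mentioned in this thread, 3.4.2 fails the check and 3.5.3 passes.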
Quoted reply history
On Fri, Jul 13, 2012 at 3:42 AM, Pascal wrote:

> Hi Nastya,
>
> You were right: it is my psn installation that refuses to see anything
> other than $PROB as the first line, as I just checked by running it
> directly from nmfe72 ....
>
> You cannot stop the progress!
>
> Thanks again and Kind regards
>
> Pascal
> License Registered to: Merck KGaA > Expiration Date: 14 SEP 2013 > Current Date: 11 JUL 2012 > Days until program expires : 428 > > > Iterative Two Stage (No Prior) > MONITORING OF SEARCH: > > At line 240 of file (unit = 10, file = 'WK1_FILE10') > Fortran runtime error: End of file > Fatal error in MPI_Send: Other MPI error, error stack: > MPI_Send(174).....................: MPI_Send(buf=0xde71a0, count=80030, > MPI_INTEGER, dest=1, tag=1, MPI_COMM_WORLD) failed > MPIDI_CH3I_Progress(150)..........: > MPID_nem_mpich2_blocking_recv(948): > MPID_nem_tcp_connpoll(1720).......: > state_commrdy_handler(1556).......: > MPID_nem_tcp_recv_handler(1446)...: socket closed > rank 1 in job 1 deda1x0481_36189 caused collective abort of all ranks > exit status of rank 1: return code 2 > > Questions: > 1) Is there a way to force psn and/or NONMEM to collect the error message > in the log file when using multi-cores run ? > 2) What about "cabalistic" error messages as the one above? > > Thank you for your help, > > Kind regards > > Pascal Girard, PhD > [email protected] > Head of Modeling & Simulation - Oncology > Global Exploratory Medicine > Merck Serono S.A. · Geneva > Tel: +41.22.414.3549 > Cell: +41.79.508.7898 > > This message and any attachment are confidential and may be privileged or > otherwise protected from disclosure. If you are not the intended recipient, > you must not copy this message or attachment or disclose the contents to > any other person. If you have received this transmission in error, please > notify the sender immediately and delete the message and any attachment > from your system. Merck KGaA, Darmstadt, Germany and any of its > subsidiaries do not accept liability for any omissions or errors in this > message which may arise as a result of E-Mail-transmission or for damages > resulting from any unauthorized changes of the content of this message and > any attachment thereto. 
Merck KGaA, Darmstadt, Germany and any of its > subsidiaries do not guarantee that this message is free of viruses and does > not accept liability for any damages caused by any virus transmitted > therewith. > > Click http://www.merckgroup.com/disclaimer < > http://www.merckgroup.com/disclaimer> < > http://www.merckgroup.com/disclaimer http://www.merckgroup.com/disclaimer > > to access the German, French, Spanish and Portuguese versions of this > disclaimer. >
Pascal, I don't know what the "first table law rule" is, but you are correct. The on-line help entry for $PROBLEM is wrong. $SIZES may precede $PROBLEM. This help item will be fixed with NONMEM 7.3. (My understanding is that NONMEM 7.3 will be released in the summer of 2013.)
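The two pieces of advice in this thread (put $SIZES before $PROBLEM, and make LIM1 at least as large as the number of data records) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the data file is assumed to have exactly one header line and one record per non-empty line, and an existing $SIZES record is detected by a simple "$SIZ" prefix check.

```python
def count_data_records(data_text, header_lines=1):
    """Count non-empty lines after the header; each is taken as one data record.

    Assumption: one record per line and `header_lines` header lines.
    """
    lines = [ln for ln in data_text.splitlines() if ln.strip()]
    return max(len(lines) - header_lines, 0)


def prepend_sizes(control_text, lim1):
    """Return the control stream with '$SIZES LIM1=<lim1>' as its first record.

    Raises ValueError if a record starting with '$SIZ' is already present,
    so an existing setting is never silently duplicated.
    """
    for line in control_text.splitlines():
        if line.lstrip().upper().startswith("$SIZ"):
            raise ValueError("control stream already has a $SIZES record")
    return "$SIZES LIM1={}\n{}".format(lim1, control_text)


# Tiny made-up inputs, for illustration only.
data = "ID,TIME,DV\n1,0,0.5\n1,1,0.9\n2,0,0.4\n"
control = "$PROBLEM run003\n$INPUT ID TIME DV\n"

n_records = count_data_records(data)
patched = prepend_sizes(control, lim1=n_records)
print(patched)
```

In practice one would probably round LIM1 up to leave some headroom, as in the 17,000-record case above where 20000 was tried.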
Hi Pascal,

This is not an error in PsN, it is a feature! By default, NONMEM recompiles the sizes component of a run before each execution. This can take time, so there is an option to turn it off and use the default sizes (as in NONMEM versions 7.1.x and 6.x): the "-prdefault" option, i.e. "nmfe72 run1.mod run1.lst -prdefault".

PsN allows you to specify these extra NONMEM command line options in the configuration file "psn.conf". If you look in your psn.conf file, I suspect you will find something like:

    [default_options]
    nmfe_options=prdefault,xmloff

In the case you mention, you actually need to recompile the sizes before your NONMEM run, so this "prdefault" option should be removed. In PsN you can do this on the command line using the "-nmfe_options" argument. For example:

    execute run1.mod -nmfe_options=xmloff

Here, I turn off the "prdefault" NONMEM option but keep the "xmloff" option.

You can find this information in the following documentation:
http://psn.sourceforge.net/pdfdocs/common_options_defaults_versions_psn.pdf
http://psn.sourceforge.net/pdfdocs/psn_configuration.pdf

Best regards,
Andy

Andrew Hooker, Ph.D.
Associate Professor of Pharmacometrics
Dept. of Pharmaceutical Biosciences
Uppsala University
Box 591, 751 24, Uppsala, Sweden
Phone: +46 18 471 4355
www.farmbio.uu.se/research/researchgroups/pharmacometrics/
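As a small illustration of the override described above, the following Python sketch takes the nmfe_options value found in psn.conf, drops "prdefault", and builds the matching "execute" command line. The helper names are hypothetical, and the sketch only assembles the command as a list of arguments; it does not read psn.conf or run PsN. How PsN treats an empty -nmfe_options value is not covered in this thread, so the sketch refuses that case rather than guessing.

```python
def drop_nmfe_option(value, unwanted):
    """Remove one option from a comma-separated nmfe_options value."""
    kept = [opt.strip() for opt in value.split(",")]
    kept = [opt for opt in kept if opt and opt != unwanted]
    return ",".join(kept)


def execute_command(model, nmfe_options):
    """Build the PsN 'execute' argument list with an explicit -nmfe_options."""
    if not nmfe_options:
        # Behaviour with an empty value is unknown here; edit psn.conf instead.
        raise ValueError("no nmfe options left to pass on the command line")
    return ["execute", model, "-nmfe_options={}".format(nmfe_options)]


# Value as suspected in psn.conf above; 'prdefault' is the option to remove.
remaining = drop_nmfe_option("prdefault,xmloff", "prdefault")
print(" ".join(execute_command("run1.mod", remaining)))
```

The resulting argument list could be passed to subprocess.run on a machine where PsN is installed.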