Hi Robert,
Thanks for your quick reply. Unfortunately, I could not make it work.
I have 17,000 records, so just after $PROB I inserted your suggestion:
$SIZES LIM1 = 20000
and I got the following error :
> Fatal Error: Record SIZES is not valid
So I tried also the default value for LIM1:
$SIZES LIM1=10000
and also the example given on page 166 of Help guide viii
$SIZES LIM1=30000 MAXFCN=2000000 NO=500
and always got the "Fatal Error: Record SIZES is not valid" message.
I looked into Help guide viii (pp 166-167 and 463-464) but did not find
any relevant information on how to set $SIZES. What would you suggest?
Thanks again for your help; it is highly frustrating not to be able
to use multiple cores for such long runs.
Kind regards
Pascal Girard, PhD
[email protected]
Head of Modeling & Simulation - Oncology
Global Exploratory Medicine
Merck Serono S.A. · Geneva
Tel: +41.22.414.3549
Cell: +41.79.508.7898
From: "Bauer, Robert" <[email protected]>
To: "[email protected]" <[email protected]>,
"[email protected]" <[email protected]>
Cc: "[email protected]"
<[email protected]>,
"[email protected]" <[email protected]>
Date: 11/07/2012 18:55
Subject: RE: [NMusers] Error files when using multicore runs and psn
Sent by: [email protected]
Pascal:
I cannot help regarding having all console messages sent to the proper
files in the PSN environment, but I can assist in avoiding your present
NONMEM error. If you insert at the beginning of the control stream file
$SIZES LIM1=??
and insert a large enough value for ??, then file buffer 10 will not be
used, and the error is avoided. The value should be at least as large as
the number of data records (lines) in your data file (see section I.6 of
..\guides\nm720.pdf).
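To make the sizing rule concrete, here is a minimal Python sketch (a helper outside NONMEM, not part of nmfe72 or PSN). It assumes the data file is plain text with one record per line. Counting every non-empty line, including any header, can only overestimate, which is safe because LIM1 just needs to be at least the record count. The function names, the 10% headroom, and the reading of "at the beginning of the control stream file" as "first non-comment record" are assumptions for illustration.

```python
def count_records(data_path):
    """Count non-empty lines in the data file (header lines included)."""
    with open(data_path, encoding="utf-8", errors="replace") as fh:
        return sum(1 for line in fh if line.strip())


def suggest_sizes_record(n_records, headroom_percent=10):
    """Build a $SIZES record with LIM1 at least as large as n_records.

    The headroom is an arbitrary safety margin (assumption), rounded up
    with integer arithmetic.
    """
    lim1 = n_records + (n_records * headroom_percent + 99) // 100
    return f"$SIZES LIM1={lim1}"


def sizes_is_first_record(control_text):
    """Return True if the first non-blank, non-comment line starts with $SIZES.

    Lines starting with ';' are treated as NM-TRAN comments.
    """
    for line in control_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(";"):
            continue
        return stripped.upper().startswith("$SIZES")
    return False
```

For a data file of 17,000 records, `suggest_sizes_record(17000)` returns `$SIZES LIM1=18700`, and `sizes_is_first_record` returns False for a control stream in which `$PROB` comes before `$SIZES`.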
Although we have successfully tested nmfe72 in parallel mode with file
buffers for large data sets, it may not work in all grid environments.
Setting the LIM values large enough avoids using buffer files, so that
only memory is used. The problem also runs faster when buffer files are
not used.
Robert J. Bauer, Ph.D.
Vice President, Pharmacometrics, R&D
ICON Development Solutions
7740 Milestone Parkway
Suite 150
Hanover, MD 21076
Tel: (215) 616-6428
Mob: (925) 286-0769
Email: [email protected]
Web: www.iconplc.com
From: [email protected] [mailto:[email protected]]
On Behalf Of [email protected]
Sent: Wednesday, July 11, 2012 11:19 AM
To: [email protected]
Cc: [email protected];
[email protected]
Subject: [NMusers] Error files when using multicore runs and psn
Dear All,
We are using psn version 3.4.2 together with NONMEM 7.2.0 on a Linux Sun
Grid Engine (SGE). With multi-core runs on SGE, NONMEM sometimes returns
a log file in which "MONITORING OF SEARCH" starts and nothing further is
reported.
Looking into the psn directory, I found files named after my script file
plus an extension made of letters and numbers. They contain an error
message that is not shown in the log file. For example, my nm-tran
script file is run003.mod and my log file run003.lst ends with:
MONITORING OF SEARCH:
Stop Time:
Wed Jul 10 21:05:18 CEST 2012
Then I find a file named run003.mod.o9501 in the run003/NM_run1 directory
created by psn. Sometimes this file contains an explicit error message,
sometimes more cabalistic information such as:
WARNINGS AND ERRORS (IF ANY) FOR PROBLEM 1
(WARNING 2) NM-TRAN INFERS THAT THE DATA ARE POPULATION.
CREATING MUMODEL ROUTINE...
Recompiling certain components
USING PARALLEL PROFILE mpi_12cores.pnm
MPI TRANSFER TYPE SELECTED
Exit status = 1
IN MPI
Starting MPI version of nonmem execution ...
License Registered to: Merck KGaA
Expiration Date: 14 SEP 2013
Current Date: 11 JUL 2012
Days until program expires : 428
Iterative Two Stage (No Prior)
MONITORING OF SEARCH:
At line 240 of file (unit = 10, file = 'WK1_FILE10')
Fortran runtime error: End of file
Fatal error in MPI_Send: Other MPI error, error stack:
MPI_Send(174).....................: MPI_Send(buf=0xde71a0, count=80030,
MPI_INTEGER, dest=1, tag=1, MPI_COMM_WORLD) failed
MPIDI_CH3I_Progress(150)..........:
MPID_nem_mpich2_blocking_recv(948):
MPID_nem_tcp_connpoll(1720).......:
state_commrdy_handler(1556).......:
MPID_nem_tcp_recv_handler(1446)...: socket closed
rank 1 in job 1 deda1x0481_36189 caused collective abort of all ranks
exit status of rank 1: return code 2
Questions:
1) Is there a way to force psn and/or NONMEM to collect the error message
in the log file for multi-core runs?
2) What about "cabalistic" error messages such as the one above?
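Regarding question 1, a possible stopgap outside psn and NONMEM is to scan the run directory for the job output files and pull out the lines that mention an error. The following is a minimal Python sketch under stated assumptions: job output files are recognized by a name ending in `.o` followed by digits (as in run003.mod.o9501), and a line is of interest if it contains the word "error" or the phrase "exit status". The function name and both patterns are hypothetical choices, not psn or NONMEM features.

```python
import re
from pathlib import Path

# Assumption: job output files end in ".o" followed by digits.
JOB_OUTPUT = re.compile(r"\.o\d+$")
# Assumption: lines of interest mention "error" or "exit status".
ERROR_LINE = re.compile(r"\berror\b|exit status", re.IGNORECASE)


def collect_error_lines(run_dir):
    """Return (file, line number, text) for matching lines under run_dir."""
    hits = []
    for path in sorted(Path(run_dir).rglob("*")):
        # Skip directories and files that do not look like job output.
        if not path.is_file() or not JOB_OUTPUT.search(path.name):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if ERROR_LINE.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `collect_error_lines("run003")` would walk run003/NM_run1 and any sibling directories. On the output shown above it would pick up, among others, the "Fortran runtime error: End of file" line, while skipping the "WARNINGS AND ERRORS (IF ANY)" banner.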
Thank you for your help,
Kind regards
Pascal Girard, PhD
[email protected]
Head of Modeling & Simulation - Oncology
Global Exploratory Medicine
Merck Serono S.A. · Geneva
Tel: +41.22.414.3549
Cell: +41.79.508.7898