Error files when using multicore runs and psn
Dear All,
We are using PsN version 3.4.2 together with NONMEM 7.2.0 on a Linux cluster
running Sun Grid Engine (SGE). When we run multi-core jobs on SGE, NONMEM
sometimes returns a log file in which the "MONITORING OF SEARCH" section
starts but nothing further is reported.
Looking into the PsN directory, I found files named after my script file
plus an extension made of letters and numbers. These files contain an error
message that does not appear in the log file. For example, my NM-TRAN
script file is run003.mod, and my log file run003.lst ends with:
MONITORING OF SEARCH:
Stop Time:
Wed Jul 10 21:05:18 CEST 2012
I then find a file named run003.mod.o9501 in the run003/NM_run1 directory
created by PsN. Sometimes this file contains an explicit error message, and
sometimes more cabalistic output such as:
WARNINGS AND ERRORS (IF ANY) FOR PROBLEM 1
(WARNING 2) NM-TRAN INFERS THAT THE DATA ARE POPULATION.
CREATING MUMODEL ROUTINE...
Recompiling certain components
USING PARALLEL PROFILE mpi_12cores.pnm
MPI TRANSFER TYPE SELECTED
Exit status = 1
IN MPI
Starting MPI version of nonmem execution ...
License Registered to: Merck KGaA
Expiration Date: 14 SEP 2013
Current Date: 11 JUL 2012
Days until program expires : 428
Iterative Two Stage (No Prior)
MONITORING OF SEARCH:
At line 240 of file (unit = 10, file = 'WK1_FILE10')
Fortran runtime error: End of file
Fatal error in MPI_Send: Other MPI error, error stack:
MPI_Send(174).....................: MPI_Send(buf=0xde71a0, count=80030,
MPI_INTEGER, dest=1, tag=1, MPI_COMM_WORLD) failed
MPIDI_CH3I_Progress(150)..........:
MPID_nem_mpich2_blocking_recv(948):
MPID_nem_tcp_connpoll(1720).......:
state_commrdy_handler(1556).......:
MPID_nem_tcp_recv_handler(1446)...: socket closed
rank 1 in job 1 deda1x0481_36189 caused collective abort of all ranks
exit status of rank 1: return code 2
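As a stopgap, the hidden messages can be gathered with a small script. The following is a minimal Python sketch, assuming the files follow the default SGE naming of job name plus ".o" plus job id (as in run003.mod.o9501 above) and sit somewhere under the PsN run directory. The function names and the keyword list are illustrative assumptions, not part of PsN or NONMEM, and the keyword match is deliberately loose, so it can also flag harmless header lines.

```python
import re
from pathlib import Path

# Assumption: SGE writes job stdout to <job name>.o<job id>, e.g. run003.mod.o9501.
SGE_OUT = re.compile(r"\.o\d+$")
# Assumption: these keywords are taken from the output shown above; not exhaustive.
ERROR_HINT = re.compile(r"error|fatal|abort|exit status", re.IGNORECASE)


def find_sge_outputs(run_dir):
    """Return SGE output files found anywhere under a PsN run directory."""
    return sorted(
        p for p in Path(run_dir).rglob("*")
        if p.is_file() and SGE_OUT.search(p.name)
    )


def suspicious_lines(text):
    """Return (line number, line) pairs that look like error messages."""
    return [
        (i, line.rstrip())
        for i, line in enumerate(text.splitlines(), start=1)
        if ERROR_HINT.search(line)
    ]


def report(run_dir):
    """Print every suspicious line, prefixed with its file and line number."""
    for path in find_sge_outputs(run_dir):
        text = path.read_text(errors="replace")
        for lineno, line in suspicious_lines(text):
            print(f"{path}:{lineno}: {line}")
```

Calling report("run003") after a failed run would then list the matching lines from every such file under run003, including NM_run1. This only collects what SGE already wrote; it does not change what NONMEM or PsN put in the .lst file.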
Questions:
1) Is there a way to force PsN and/or NONMEM to write the error message
to the log file for multi-core runs?
2) How should "cabalistic" error messages such as the one above be interpreted?
Thank you for your help,
Kind regards
Pascal Girard, PhD
Head of Modeling & Simulation - Oncology
Global Exploratory Medicine
Merck Serono S.A. · Geneva
Tel: +41.22.414.3549
Cell: +41.79.508.7898