Stop During PsN

5 messages 2 people Latest: May 20, 2014

Stop During PsN

From: Xinting Wang Date: May 17, 2014 technical
Dear all,

I would very much appreciate your help with the following problems.

1. While running PsN, I came across a problem where no computation takes place. After the usual procedure, the process seemed to stop at some step. Although the worker1 to worker5 directories were created for parallel processing, nothing followed "Monitoring of Search" in the OUTPUT and psn.lst files. This condition continued for 24 hours. I am not sure whether the problem comes from the code or from an error in the computing system. Has anybody else come across a similar issue?

2. To reduce the load on the hard disk I used -clean=3 in the scm run. However, after the process stopped, it was impossible to resume it using the -directory option. Is it still possible to resume the previous task? If so, how? Repeating 13 rounds of the search is certainly something nobody wants to do.

Thanks a lot.

--
Xinting

Re: Stop During PsN

From: Kajsa Harling Date: May 19, 2014 technical
Dear Xinting,

1. This seems like an error during the NONMEM run itself. I suggest you run the same control stream with parallel nmfe72/73 on its own, without PsN, to see if the problem persists. If the model runs fine in parallel without PsN, then please contact me directly and I will help you diagnose the problem. If the error occurs independently of PsN, then I hope someone else can help you.

2. You do not need to rerun the 13 iterations. To resume an interrupted scm it is always best, regardless of the clean option, to set [included_relations] in the scm config file to the set of relations that were included at the time of the interruption. To see which relations those are, look in the scm logfile in the top level of the scm directory. Then start a new scm run in a new directory with the new scm config file and the *same* input control stream as in the original run (no relations added in the input control stream). For details and syntax of the [included_relations] section please see the scm userguide.

Best regards,
Kajsa

--
Kajsa Harling, PhD
System Developer
Department of Pharmaceutical Biosciences
Uppsala University
[email protected]
+46-(0)18-471 4308
http://www.farmbio.uu.se/research/researchgroups/pharmacometrics/
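When testing whether a NONMEM run has really stopped, it can help to check how long ago the output file was last written. The following is only a minimal sketch: the helper names are hypothetical, the idle threshold is an arbitrary choice, and a long estimation step can legitimately be quiet for a while, so the result is a hint rather than a diagnosis.

```python
import os
import time


def seconds_since_modified(path, now=None):
    """Seconds elapsed since the file at `path` was last written."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def looks_stalled(path, max_idle_seconds, now=None):
    """True if the file has not changed for longer than max_idle_seconds."""
    return seconds_since_modified(path, now) > max_idle_seconds
```

For example, `looks_stalled("psn.lst", 6 * 3600)` would report whether psn.lst in the current directory has been untouched for more than six hours.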

Re: Stop During PsN

From: Xinting Wang Date: May 19, 2014 technical
Dear Kajsa,

Thanks very much for your kind help. However, I am not entirely clear about your second point and hope that you could explain a little more.

Previously, when I wanted to resume a PsN run, I used -directory= in the command; according to the PsN project homepage, this resumes a PsN run after a stop. This works fine without the -clean=3 option, but starts a new covariate search if -clean=3 was used. However, this method seems to be different from the one you describe.

It seems to me that the method you describe requires two steps. The first is to have only the relations included at the time of the stop in the scm file. Should I use the -directory= option on the command line?

Secondly, if I start a new run in a new directory, why should I use exactly the same original input? By doing so, would the program be able to skip the already established covariates and continue from there? What if I use the unidentified relations in the new run, together with a mod file containing the relations listed in the scm logfile at the time of the stop?

One last small question: a few days ago, the program reported an error saying that "psn.lst" is not created [something like this] in the middle of the search. The cov-par relation was completed successfully before the crash. Do you have any idea what the possible reasons might be?

I am sorry for throwing so many questions at you, but I would highly appreciate your help.

Many thanks.

Best Regards
Xinting

Re: Stop During PsN

From: Kajsa Harling Date: May 20, 2014 technical
Dear Xinting,

> Previously when I wanted to resume a PsN run I used the -directory= in the command, and according to the PsN project homepage, this would enable a resume of a PsN run after stop. Such a method works fine without the -clean=3 option, but initiated a new search for covariates if I have -clean=3 option. However, this method seems to be different from the above mentioned.

Yes, it is a different method. Setting -clean=3 removes files that are needed to resume an interrupted run. I will make this clearer in the documentation; it is not explained properly at the moment.

> It seems to me that the method you mentioned requires 2 steps. The first one is to have only the included relations at the time of stop in the scm file. Should I use the -directory= option in the command line?

The steps you need to perform are as follows:

1) Make a copy of your original scm configuration file, give it a new name and save it in the same folder as the original scm config file.

2) Open the original scm log file in the top level of the original scm run folder and locate the *last* place where it says "Relations included after this step". Copy those relations from the log file to a new [included_relations] section at the end of the new configuration file. An example of an [included_relations] section with correct syntax is found in http://psn.sourceforge.net/pdfdocs/config_template_backward.scm and other details are found in the scm userguide.

3) If you set the directory option in your original config file, change the setting of that option in the new config file to something different.

4) Start a new scm with the same command as you used for the original scm, except that you use the new config file instead of the old one, and either set -directory to the name of a folder that does not yet exist or omit the -directory option completely, in which case the program will select a unique name for you.

> Secondly, if I start a new run in a new directory, why should I use the exactly same original input?

You do not use exactly the same input; you use a new setting of [included_relations]. The input model, however, should be exactly the same.

> By doing so, would the program be able to skip the already established covariates and start from there onwards?

Yes, it will start from the set of relations specified in [included_relations].

> What if I use the unidentified relationships in the new run, and using a mod file with relations in the scm logfile at the time of stop?

That will usually not work.

> One last small question is that a few days ago, an error message saying "psn.lst" is not created [something like this] was put forward by the program during the middle of searching. The cov-par relation was completed successfully before crash. Do you have any idea what might be the possible reasons of this?

I need the complete error message in its context of the scm run messages to be able to answer your question.

Best regards,
Kajsa

--
Kajsa Harling, PhD
System Developer
Department of Pharmaceutical Biosciences
Uppsala University
[email protected]
+46-(0)18-471 4308
http://www.farmbio.uu.se/research/researchgroups/pharmacometrics/
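Steps 1 and 2 above can be sketched in Python. This is a hedged illustration, not part of PsN: the function names are hypothetical, and the code assumes that each relation line in the scm log has the form "PARAMETER  COVARIATE-STATE ..." (for example "CL      WT-2    AGE-2") and that the config syntax is "PARAMETER=COVARIATE-STATE,...". Check both against your own log file and the template linked above before relying on it.

```python
import re

MARKER = "Relations included after this step"
# Assumed layout of one relation line in the log: a parameter name followed by
# one or more COVARIATE-STATE tokens, e.g. "CL      WT-2    AGE-2".
RELATION_LINE = re.compile(r"^\s*(\w+)((?:\s+\w+-\d+)+)\s*$")


def last_included_relations(log_text):
    """Return {parameter: [covariate-state, ...]} from the last marker block."""
    lines = log_text.splitlines()
    # Indices of every marker line; only the *last* one is used (step 2).
    starts = [i for i, line in enumerate(lines) if MARKER in line]
    if not starts:
        return {}
    relations = {}
    for line in lines[starts[-1] + 1:]:
        if not line.strip() and not relations:
            continue  # tolerate blank lines directly after the marker
        match = RELATION_LINE.match(line)
        if not match:
            break  # first non-relation line ends the block
        relations[match.group(1)] = match.group(2).split()
    return relations


def included_relations_section(relations):
    """Format the relations as an [included_relations] config section."""
    body = [f"{param}={','.join(covs)}" for param, covs in relations.items()]
    return "\n".join(["[included_relations]"] + body) + "\n"
```

The returned section text would be appended to the end of the copied configuration file; the new scm run is then started as described in step 4.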

Re: Stop During PsN

From: Xinting Wang Date: May 20, 2014 technical
Dear Kajsa,

Thanks very much for your reply. Your answers cleared up many open problems. However, my experience with the following point differs from your answer.

> What if I use the unidentified relationships in the new run, and using a mod file with relations in the scm logfile at the time of stop?

> That will usually not work.

While waiting for answers, I tested this method and found that it actually works. The procedure is as follows:

1. Use the model with the relations included at the time of the crash as the base model: copy it from the m1 folder to the folder where the scm file is located [change the $DATA location].

2. Make a copy of the original scm file under a different name, with the following changes:
   - change the mod file to the new base model mod file;
   - delete all the included relations;
   - change the search direction from both to forward. [A full backward search will be started after the full model has been established.]

3. Submit the scm file without the -directory option.

I am still waiting for the final results, but so far this method seems to work: new covariate-parameter relations were added on top of the previous ones. From my understanding this should be equivalent to the method using [included_relations] that you described, although it may be a little more complex.

Best Regards