Dear all,
I would like to share with the group an issue that I encountered using NONMEM
and which appears to me to be undesired behavior. Since it is a confidential
matter, I unfortunately can't share code or data.
I have run a simple PK model with 39 data items in $INPUT. After a successful
run I started a covariate search using PsN. To my surprise, the OFVs when
including covariates in the forward step all turned out to be higher than the
OFV of the base model, by roughly 180 units.
I realized that PsN in the scm routine adds =DROP to some variables in $INPUT
that are not used in a given covariate test run.
I then ran the base model again, dropping some variables from $INPUT. Indeed,
a run with 3 or more variables dropped (using DROP or SKIP) resulted in a
higher OFV (~180 units), the model being otherwise identical.
In the lst files of both models I noticed a difference in the line saying
"0FORMAT FOR DATA", and indeed, looking at the temporarily created FDATA
files, it is obvious that the format of the file from the model with DROPped
items is different.
In my concrete case the issue only happens when dropping 3 or more variables.
I get the same behavior with NM 7.3 and 7.4.2, both on Windows 10 and in a
Linux environment.
The problem is fixed by using the WIDE option in $DATA.
I'm not aware of any recommendation or advice to use the WIDE option when
DROP statements are used, but I am happy to learn about it in case I missed
it.
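For anyone who wants to check for the same symptom, the "0FORMAT FOR DATA" lines can be pulled out of the two .lst files and compared programmatically. A minimal Python sketch, assuming the format specification is printed on the single line following that marker (the run names below are hypothetical):

```python
def data_format_lines(lst_path):
    """Collect the Fortran format specification(s) printed on the line
    after '0FORMAT FOR DATA' in a NONMEM output (.lst) file."""
    lines = open(lst_path).read().splitlines()
    return [lines[i + 1].strip()
            for i, line in enumerate(lines)
            if line.startswith("0FORMAT FOR DATA")]

# hypothetical run names:
# if data_format_lines("run_base.lst") != data_format_lines("run_drop3.lst"):
#     print("FDATA format differs between the two runs")
```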
Would be great to hear if anyone else had a similar problem in the past.
Best regards, Andreas.
Andreas Lindauer, PhD
Agriculture, Food and Life
Life Science Services - Exprimo
Senior Consultant
Information in this email and any attachments is confidential and intended
solely for the use of the individual(s) to whom it is addressed or otherwise
directed. Please note that any views or opinions presented in this email are
solely those of the author and do not necessarily represent those of the
Company. Finally, the recipient should check this email and any attachments for
the presence of viruses. The Company accepts no liability for any damage caused
by any virus transmitted by this email. All SGS services are rendered in
accordance with the applicable SGS conditions of service available on request
and accessible at http://www.sgs.com/en/Terms-and-Conditions.aspx
Potential bug in NM 7.3 and 7.4.2
9 messages
6 people
Latest: Nov 21, 2018
Dear Ana,
No, the variables that I dropped were not part of the model. In fact, in my
case, the issue occurs with the 3rd variable that I drop, but it doesn't
actually matter which one the third one is. My guess is that with 3 (or more)
columns fewer, in this particular case, NONMEM somehow has trouble finding the
right data format.
Regards, Andreas.
Quoted reply history
From: Ruiz, Ana (Clinical Pharmacology) <[email protected]>
Sent: Tuesday, 20 November 2018 15:35
To: Lindauer, Andreas (Barcelona) <[email protected]>
Subject: Re: [EXTERNAL] [NMusers] Potential bug in NM 7.3 and 7.4.2
Hi Andreas,
Do you have body weight or any other variable in the base model other than DV?
Did you include that variable in the config file using do_not_drop? Just
checking the easiest option...
Ana
Never seen it.
This will not solve the problem, but just for diagnostics, have you found out
what is "damaged" in the created data files? Is the number of subjects (and
number of data records) the same in both versions (reported in the output
file)? Among the columns used in the base model (ID, TIME, AMT, RATE, DV,
EVID, MDV), which are different (can be checked if printed out to a .tab
file)? And which of the data file versions is interpreted correctly by the
NONMEM code: with or without the WIDE option?
Thanks
Leonid
Quoted reply history
Never seen it before. Do you have spaces in any of your variables? I guess
this is not the problem, but you might be dropping variables other than the
ones you want to drop if spaces are shifting your columns.
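That possibility can be ruled out programmatically. A small stdlib sketch that flags csv fields containing embedded spaces (the file name is hypothetical):

```python
import csv

def fields_with_spaces(path):
    """Return (row, column) positions of csv fields containing embedded
    spaces, which could shift how the columns are interpreted."""
    hits = []
    with open(path, newline="") as f:
        for r, row in enumerate(csv.reader(f)):
            for c, field in enumerate(row):
                if " " in field.strip():
                    hits.append((r, c))
    return hits

# hypothetical file name:
# print(fields_with_spaces("data.csv"))
```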
Quoted reply history
And one more question: do you have lines longer than the 80- or 300-character
thresholds that become shorter than these thresholds when you drop the third
variable?
Regards,
Katya
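That length check is quick to script. A minimal Python sketch that reports the longest record and how many records exceed the 80- and 300-character marks (the data file name is hypothetical):

```python
def line_length_profile(path):
    """Longest record in the data file, plus counts of records exceeding
    the 80- and 300-character thresholds discussed in the thread."""
    lengths = [len(line.rstrip("\r\n")) for line in open(path)]
    return (max(lengths),
            sum(n > 80 for n in lengths),
            sum(n > 300 for n in lengths))

# hypothetical file name:
# longest, over80, over300 = line_length_profile("data.csv")
```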
Quoted reply history
Dear Andreas
I think your issue needs to be addressed through the PsN configuration; I
don't think it is a NONMEM issue.
When running scm with PsN you need to specify in the command file the
columns that you want/need to keep with a command like
do_not_drop = C,RACE,WGT
Please refer to the PsN users guides or the Uppsala Pharmacometric Group
for more information.
Kind Regards,
Franziska
Franziska Schaedeli Stark, PhD
Senior Pharmacometrician
Senior Principal Scientist
Pharmaceutical Sciences - Clinical Pharmacology
Roche Pharma Research & Early Development (pRED)
Roche Innovation Center Basel
F. Hoffmann-La Roche Ltd
Grenzacherstrasse 124
Bldg 1 - Floor 17 - Office N661
4070 Basel
Phone: +41 61 688 5819
Mob +41 79 773 12 61
mailto: [email protected]
Quoted reply history
On Tue, Nov 20, 2018 at 8:21 PM Leonid Gibiansky <[email protected]>
wrote:
> Thanks!
> One more question: in the "bad" model, if you look at the output (tab) file,
> can you detect the differences, or is it different inside but outputs
> correct DV values to the tab file? I think you describe it in the email
> below (that the output is "bad") but I just wanted to be 100% sure. Then at
> least we can compare the output with the true data (say, in R: read the tab
> file and the csv file and compare) and detect the problem without looking at
> FDATA.
> Thanks
> Leonid
>
> On 11/20/2018 1:53 PM, Lindauer, Andreas (Barcelona) wrote:
> > @Leonid
> > It is the very DV column that is damaged.
> > In the 'good' model (the one with fewer than 3 variables dropped, or when
> > using the WIDE option), DVs show up in sdtab as they are in the input
> > file, while the 'bad' model cuts off the decimals, e.g.
> > 3.17, 3.19, 3.74 in the input data file (and in the good sdtab) become
> > 3.0, 3.0, 3.0 with the bad model
> >
> > @Katya
> > Yes, originally I did have lines longer than 80 characters but not
> longer than 300. I just did a quick test with keeping all lines <80 chars
> and the issue remains.
> >
> > @Alejandro
> > No, I don't have spaces in my variables, neither in the names nor in the
> > records themselves.
> >
> > @Luann
> > Yes I'm using a csv file. As far as I can see all my variables are
> numeric, and do not contain special characters. The datafile is correctly
> opened in Excel and R. But I will double check.
> >
> > Thanks to all for helping to detect the problem. I will try to make a
> > reproducible example with dummy data that can be shared.
> >
> > Regards, Andreas.
@Franziska
It is not an scm issue. The scm routine just made me aware of this problem by
including the DROP statements. I then manually tested it outside of scm with
the same result.
@Leonid
With the 'bad' model, incorrect DVs (with decimals cut off) are indeed output
to the tab file.
@all
I have investigated further and think the problem is related to the length of
the lines in the datafile, as Katya suspected.
It turns out that when preparing the dataset in R I did not round derived
variables (e.g. LNDV, some covariates), and as a result some variables have up
to 15 decimal places. This may not be a problem from a computational point of
view, but it means that some lines in the datafile (when opened in a text
editor) are as long as 160 characters. If I round my variables to, say, 3
decimals, the issue is gone.
I suspect that there is a problem in the way NONMEM generates the FDATA file
when there are data records exceeding a specific number of characters. I
always thought the limit was 300, but apparently it may be less.
Again, thanks to all for thinking through this.
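The rounding workaround described above can be applied when writing the data file. A stdlib sketch that rewrites a csv with named numeric columns rounded to 3 decimals (the file and column names are hypothetical):

```python
import csv

def round_columns(src, dst, cols, digits=3):
    """Rewrite a csv file, rounding the named numeric columns so that no
    record carries 15-decimal derived values into the NONMEM data file."""
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            for c in cols:
                row[c] = format(round(float(row[c]), digits), "g")
            writer.writerow(row)

# hypothetical file and column names:
# round_columns("data_raw.csv", "data.csv", ["LNDV", "WT"])
```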
Quoted reply history
Thanks!
Checked on NM 7.4.3: same result. I also output DV and LNDV for the "bad" run (in the tab file, in addition to the DV that was appended). Interestingly, DV was cut (as in the appended version) but LNDV was intact and printed with correct rounding. I also added digits to AMT and TIME, and those were rounded correctly. So maybe this is specific to the DV field. (Regarding checking whether specific runs are affected: if AMT, TIME, and covariates are not affected, then it is sufficient to check, in the scripts for diagnostic plots, that DV in the output equals DV in the data file, up to reasonable rounding, to detect this bug.)
Thanks
Leonid
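The check suggested above (DV in the output equals DV in the data file, up to reasonable rounding) can be scripted without any NONMEM tooling. A Python sketch, assuming a standard sdtab layout with one 'TABLE NO.' line, one header line, and one row per data record; the file names are hypothetical:

```python
import csv

def dv_mismatches(csv_path, sdtab_path, tol=1e-3):
    """Compare the DV column of the source csv with the DV column NONMEM
    echoed to the sdtab file; return the row indices where they differ."""
    with open(csv_path, newline="") as f:
        src = [float(row["DV"]) for row in csv.DictReader(f)]
    with open(sdtab_path) as f:
        lines = f.read().splitlines()
    # line 0 is 'TABLE NO.  1'; line 1 holds the column names
    col = lines[1].split().index("DV")
    out = [float(line.split()[col]) for line in lines[2:] if line.strip()]
    return [i for i, (a, b) in enumerate(zip(src, out)) if abs(a - b) > tol]

# hypothetical file names:
# bad = dv_mismatches("data.csv", "sdtab002")
# if bad: print("DV damaged in records:", bad)
```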
On 11/21/2018 6:02 AM, Lindauer, Andreas (Barcelona) wrote:
> Hi all,
>
> I managed to create a dummy dataset and model code without proprietary
> information.
> I've put the files on github for those of you who are interested in
> investigating further.
>
> https://github.com/lindauer1980/NONMEM_DROP_ISSUE
>
> Included in this repository are 4 model files,
>
> 1) without dropping
> 2) 3 variables dropped,
> 3) with the WIDE option and 3 dropped variables
> 4) 3 dropped variables and variables in input file rounded
>
> #1, #3 and #4 give the same OFV, while #2 results in an OFV that is ~220
> units higher. If you run the models you will observe that in the DV column of
> the sdtab output of #2 the values are truncated to the integer, which clearly
> contradicts the input dataset.
>
> Again thank you very much for looking into this issue.
>
> Best regards, Andreas.
>
> -----Original Message-----
> From: [email protected] <[email protected]> On Behalf Of
> Lindauer, Andreas (Barcelona)
> Sent: Mittwoch, 21. November 2018 08:40
> To: [email protected]
> Subject: RE: [NMusers] Potential bug in NM 7.3 and 7.4.2
>
> @Franziska
> It is not an scm issue. The scm routine just made me aware of this problem by
> including the DROP statements. I then manually tested it outside of scm with
> the same result.
>
> @Leonid
> With the 'bad' model indeed incorrect DV's (with decimals cut off) are output
> to the tab file.
>
> @all
> I have investigated further and think the problem is related to the length of
> the lines in the data file, as Katya suspected.
> It turns out that when preparing the dataset in R I did not round derived
> variables (e.g. LNDV, some covariates), and as a result some variables have up
> to 15 decimal places. This may not be a problem from a computational point of
> view, but it results in some lines of the data file (when opened in a text
> editor) being as long as 160 characters. If I round my variables to, say, 3
> decimals, the issue is gone.
> I suspect there is a problem in the way NONMEM generates the FDATA file when
> data records exceed a specific number of characters.
>
> I always thought the limit was 300, but apparently it may be less.
>
> Again, thanks to all for thinking through this.
>
> -----Original Message-----
> From: Leonid Gibiansky <[email protected]>
> Sent: Dienstag, 20. November 2018 20:13
> To: Lindauer, Andreas (Barcelona) <[email protected]>;
> [email protected]
> Cc: Ekaterina Gibiansky <[email protected]>
> Subject: Re: [NMusers] Potential bug in NM 7.3 and 7.4.2
>
> Thanks!
> One more question: in the "bad" model, if you look at the output (tab) file, can you
> detect the differences, or is it only different internally while correct DV values are
> output to the tab file? I think you describe it in the email below (that the output is
> "bad"), but I just wanted to be 100% sure. Then at least we can compare the output with
> the true data (say, in R: read the tab file and the csv file and compare) and detect
> the problem without looking at FDATA.
> Thanks
> Leonid
>
> On 11/20/2018 1:53 PM, Lindauer, Andreas (Barcelona) wrote:
>
> > @Leonid
> > It is the DV column itself that is damaged.
> > In the 'good' model, the one with fewer than 3 variables dropped or with
> > the WIDE option, DVs show up in sdtab as they are in the input file, while
> > the 'bad' model cuts off the decimals; e.g.
> > 3.17, 3.19, 3.74 in the input data file (and in the good sdtab) become
> > 3.0, 3.0, 3.0 with the bad model.
> >
> > @Katya
> > Yes, originally I did have lines longer than 80 characters but not longer than
> > 300. I just did a quick test with keeping all lines <80 chars and the issue
> > remains.
> >
> > @Alejandro
> > No, I don't have spaces in my variables, neither in the names nor in the
> > records themselves.
> >
> > @Luann
> > Yes, I'm using a csv file. As far as I can see, all my variables are numeric
> > and do not contain special characters. The data file opens correctly in
> > Excel and R. But I will double-check.
> >
> > Thanks to everyone for helping to track down the problem. I will try to make
> > a reproducible example with dummy data that can be shared.
> >
> > Regards, Andreas.
> >
> > -----Original Message-----
> > From: Ekaterina Gibiansky <[email protected]>
> > Sent: Dienstag, 20. November 2018 16:29
> > To: Leonid Gibiansky <[email protected]>; Lindauer, Andreas
> > (Barcelona) <[email protected]>; [email protected]
> > Subject: Re: [NMusers] Potential bug in NM 7.3 and 7.4.2
> >
> > And one more question: do you have long lines (relative to the 80- and
> > 300-character thresholds) that become shorter than these thresholds when you
> > drop the third variable?
> >
> > Regards,
> >
> > Katya
> >
> > On 11/20/2018 10:01 AM, Leonid Gibiansky wrote:
> >
> > > Never seen it.
> > >
> > > This will not solve the problem, but just for diagnostics, have you
> > > found out what is "damaged" in the created data files: is the number
> > > of subjects (and the number of data records) the same in both versions
> > > (as reported in the output file)? Among the columns used in the base model
> > > (ID, TIME, AMT, RATE, DV, EVID, MDV), which are different (this can be
> > > checked by printing them out to a .tab file)? And which of the data file
> > > versions is interpreted correctly by the NONMEM code: with or without the
> > > WIDE option?
> > >
> > > Thanks
> > > Leonid
> > >
> > > On 11/20/2018 6:45 AM, Lindauer, Andreas (Barcelona) wrote:
> > >
> > > > Dear all,
> > > >
> > > > I would like to share with the group an issue that I encountered
> > > > using NONMEM and which appears to me to be undesired behavior.
> > > > Since it is a confidential matter, I unfortunately can't share code
> > > > or data.
> > > >
> > > > I have run a simple PK model with 39 data items in $INPUT. After a
> > > > successful run I started a covariate search using PsN. To my
> > > > surprise, the OFVs when including covariates in the forward step
> > > > all turned out to be higher than the OFV of the base model, by
> > > > ~180 units.
> > > > I realized that PsN in the scm routine adds =DROP to some variables
> > > > in $INPUT that are not used in a given covariate test run.
> > > > I then ran the base model again, DROPping some variables from
> > > > $INPUT. Indeed, a run with 3 or more variables dropped (using
> > > > DROP or SKIP) resulted in a higher OFV (~180 units), the model
> > > > otherwise being identical.
> > > > In the lst files of both models I noticed a difference in the line
> > > > saying "0FORMAT FOR DATA" and in fact when looking at the
> > > > temporarily created FDATA files, it is obvious that the format of
> > > > the file from the model with DROPped items is different.
> > > > In my concrete case the issue only happens when dropping 3 or more
> > > > variables. I get the same behavior with NM 7.3 and 7.4.2, both on
> > > > Windows 10 and in a Linux environment.
> > > > The problem is fixed by using the WIDE option in $DATA.
> > > > I'm not aware of any recommendation or advice to use the WIDE option
> > > > when using DROP statements in the dataset, but I am happy to learn
> > > > about it in case I missed it.
> > > >
> > > > It would be great to hear if anyone else has had a similar problem in the past.
> > > >
> > > > Best regards, Andreas.
> > > >
> > > > Andreas Lindauer, PhD
> > > > Agriculture, Food and Life
> > > > Life Science Services - Exprimo
> > > > Senior Consultant
> > > >
Andreas:
Thanks, this should allow me to determine how to resolve the issue.
Robert J. Bauer, Ph.D.
Senior Director
Pharmacometrics R&D
ICON Early Phase
820 W. Diamond Avenue
Suite 100
Gaithersburg, MD 20878
Office: (215) 616-6428
Mobile: (925) 286-0769
[email protected]<mailto:[email protected]>
http://www.iconplc.com/