Hello Nonmem Community,
It seems that NONMEM developers may advise starting with a full OMEGA matrix at the beginning of model development, whereas Monolix developers may advise starting with a diagonal matrix. Is there something different in the NONMEM SAEM algorithm that keeps the model stable when many statistically insignificant correlations/covariances are estimated in the model?
It seems that NONMEM SAEM can be very stable in very “hard cases” (many outliers, a partially misspecified model, an overparameterized model, etc.). The OMEGA matrix is part of that puzzle.
When it is impossible to test every correlation coefficient for significance due to some limitations, this becomes a regulatory issue. We may need to be able to state that the model is safe and sound even when the OMEGA matrix may be overparameterized (i.e., it tries to estimate too many insignificant parameters).
Kind regards,
Pavel
OMEGA matrix
14 messages
7 people
Latest: Oct 02, 2014
Hi Pavel,
My question is: why is it desirable to fit a complete omega matrix if its physical interpretation is unclear? Etas represent variation of unknown origin, i.e. variation not explained by the structural model. A full omega matrix allows the unknown variation of one parameter to have a (linear?) relationship with something else that is also unknown. If unknown A is found to have a linear relationship with unknown B, then what knowledge is gained? I do think it can be instructive to look at correlations and to use this information to build a better structural model. But I think a diagonal OMEGA matrix is more desirable if it works ok.
warm regards,
Douglas Eleveld
Quoted reply history
Hi Douglas,
My own thinking is that you should fit the largest omega structure that can
be supported by the data rather than just always assuming a diagonal omega
structure. This does not necessarily mean always fitting a full block omega
structure, as it can often lead to an ill-conditioned model; however, there
may be a reduced block omega structure that is more parsimonious than the
diagonal omega structure. Getting the omega structure right is particularly
important for simulation of individual responses. For example, if you
always simulate from a diagonal omega structure for CL and V when there is
evidence that the random effects are highly positively correlated, then you
may end up simulating individual PK profiles for combinations of individual
CLs and Vs that are not represented in your data (i.e., high correlation
would suggest that individuals with high CL will tend to also have high V
and vice versa whereas a simulation assuming that they are independent will
result in simulating for some individuals with high CL and low V and some
individuals with low CL and high V that might not be represented in your
data). This could lead to simulations that over-predict the variation in
the concentration-time profiles even though the diagonal omega may be
sufficient for purposes of predicting central tendency in the PK profile.
You can confirm this with a VPC by looking at your ability to predict, say, the 10th
and 90th percentiles in comparison to the observed 10th and 90th percentiles
in your data. That is, if you simulate from the diagonal omega when there
is correlation in the random effects, you may find that your predictions of
the 10th and 90th percentiles are more extreme than those in your observed
data. I see this all the time in VPC plots where the majority of the
observed data are well within the predictions of the 10th and 90th
percentiles when we should expect about 10% of our data above the 90th
percentile prediction and 10% below the 10th percentile prediction.
Best regards,
Ken
Kenneth G. Kowalski
President & CEO
A2PG - Ann Arbor Pharmacometrics Group, Inc.
110 Miller Ave., Garden Suite
Ann Arbor, MI 48104
Work: 734-274-8255
Cell: 248-207-5082
Fax: 734-913-0230
[email protected]
www.a2pg.com
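[Editor's note] To illustrate Ken's point, here is a minimal R sketch (an editorial illustration with hypothetical typical values and a hypothetical correlation, not numbers taken from the thread) comparing simulated concentration percentiles when the CL-V correlation is kept versus dropped:

set.seed(123)
library(MASS)                        # for mvrnorm()

n    <- 2000
tvcl <- 5                            # typical CL (L/h), hypothetical
tvv  <- 50                           # typical V (L), hypothetical
omcl <- 0.09                         # variance of ETA(CL), hypothetical
omv  <- 0.09                         # variance of ETA(V), hypothetical
rho  <- 0.8                          # assumed strong positive correlation

om_full <- matrix(c(omcl, rho * sqrt(omcl * omv),
                    rho * sqrt(omcl * omv), omv), 2, 2)
om_diag <- diag(c(omcl, omv))

conc_at <- function(om, t = 12, dose = 100) {
  eta <- mvrnorm(n, mu = c(0, 0), Sigma = om)
  cl  <- tvcl * exp(eta[, 1])
  v   <- tvv  * exp(eta[, 2])
  dose / v * exp(-cl / v * t)        # 1-compartment IV bolus concentration at time t
}

quantile(conc_at(om_full), c(0.1, 0.9))   # simulating with the correlated etas
quantile(conc_at(om_diag), c(0.1, 0.9))   # independent etas: wider 10th-90th spread

With these hypothetical settings the diagonal-omega simulation gives a visibly wider 10th-90th percentile band at 12 h, which is the over-prediction of variability described above.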
Quoted reply history
Dear Pavel,
To answer your question I suggest you go on Bob Bauer's NONMEM 7 course. The
understanding I gleaned from that course (which I think was enhanced by the
excellent wine we had at lunch in Alicante) was that with appropriate MU
parameterisation there is virtually no computational disadvantage to estimating
the full block with the newer algorithms. So you might as well do it, at least
in early runs where you want an idea of which parameter correlations might be
useful/reasonably estimated.
BW,
Joe
Joseph F Standing
MRC Fellow, UCL Institute of Child Health
Antimicrobial Pharmacist, Great Ormond Street Hospital
Tel: +44(0)207 905 2370
Mobile: +44(0)7970 572435
Quoted reply history
Dear Pavel, others,
The underlying technical difference is that SAEM is, at its core, a sampling methodology. Off-diagonal elements (as explained by Bob Bauer) are available as sample correlations and do not have to be computed separately, in contrast to linearization approaches such as FOCE.
The more interesting question to me, as also alluded to by Ken, is what criteria to set up for inclusion of an off-diagonal element. I completely support his argument for the simulation performance of the model, as judged e.g. using a VPC. Whether to score it as an additional degree of freedom may be up for debate. An off-diagonal element in essence limits the freedom of the model, as the random space in which samples can be generated becomes smaller. From that perspective one could argue to retain any off-diagonal element that deviates sufficiently from zero regardless of ofv changes, and not to apply the concept of over-parametrization (or at least not in comparison to other types of parameters). In practice, inclusion of an important off-diagonal element is usually accompanied by a sound improvement in ofv anyway.
More can be found in earlier discussions on this list; see e.g.
https://www.mail-archive.com/[email protected]/msg02736.html for quite an
extensive one from 2010. An R script to visualize the impact on the parameter space can also be found there ;-).
In cases where a larger full or banded omega block is found, I would advise exploring its properties further using matrix decomposition approaches (PCA etc.) to evaluate propagated correlations across the matrix, but also on the basis of physiology/pharmacology, as a data sample may not be informative enough to support robust interpretation of correlations. A discussion along those lines in reporting seems the more fruitful approach to me.
Best regards,
Jeroen
http://pd-value.com
-- More value out of your data!
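[Editor's note] As an aside, a minimal R sketch (hypothetical variances; not the script referenced in the linked 2010 discussion) of two ideas from the message above: an off-diagonal element shrinks the random space from which individual parameters are sampled, and an eigen/PCA decomposition summarizes how correlations propagate through the omega matrix:

omcl <- 0.09; omv <- 0.09; rho <- 0.8          # hypothetical values
om_full <- matrix(c(omcl, rho * sqrt(omcl * omv),
                    rho * sqrt(omcl * omv), omv), 2, 2)
om_diag <- diag(c(omcl, omv))

# Generalized variance (determinant) as the "volume" of the random space:
# the correlated omega spans a smaller space than the diagonal one.
det(om_full)
det(om_diag)

# Eigen (PCA-type) decomposition of omega: eigenvalues show how much
# variability each principal direction carries; eigenvectors show how the
# etas load onto those directions (the propagated correlation structure).
eig <- eigen(om_full)
eig$values
eig$vectors

A near-zero eigenvalue would flag an omega block that is close to singular, which ties in with the ill-conditioning discussion later in the thread.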
Quoted reply history
Dear Jeroen and the NONMEM Team,
Your email is definitely informative.
1. "Off-diagonal elements (as explained by Bob Bauer) are available as sample correlations and do not have to be computed separately, in contrast to linearization approaches such as FOCE." This may explain the stability of the results when a very large block matrix is used. On the other hand, it is not clear why Monolix SAEM would not work the same way. Is Monolix estimating correlations? Also, when we speak of "sample correlations", we may be talking about correlations between observed minus individual predicted values. Shrinkage can possibly affect such correlations.
2. "I would advise exploring its properties further using matrix decomposition approaches (PCA etc.)". Do you suggest decomposing the omega matrix and exploring the derived variables instead of the original off-diagonal elements? It seems straightforward, but I recall error messages.
If you can point to some publications for items 1 and 2 above, it would be greatly appreciated.
The importance of improving the OMEGA matrix may come from PD modeling. PD models are frequently more empirical than PK models, and strong correlations can appear from nowhere. They are difficult to interpret, but important to account for when simulations are requested by the agencies. There are correlations that change from 0 to 0.6 when models differ only slightly, indicating that they may be insignificant. Monolix allows us to set a single correlation to zero; NONMEM may require a different approach. I am searching for these different approaches because, after many years, I am emotionally attached to NONMEM and because NONMEM is very flexible.
Kind regards,
Pavel
Quoted reply history
Hi Jeroen,
I think we might be on the same page but I wanted to get clarification about
your suggestion that we “not apply the concept of over-parameterization” with
respect to evaluating the omega structure. I’m assuming by
‘over-parameterization’ you mean a model that has more elements in omega than
might be necessary to be parsimonious. If so, I certainly agree, but I wouldn't
necessarily call a model with more parameters than needed for parsimony
over-parameterized. An over-parameterized model is one in which
there can be an infinite set of solutions to the parameter values that yields
the same fit. Such a setting can occur when the R-matrix in NONMEM is
singular. Such over-parameterized models are often also referred to as being
ill-conditioned or not stable. I think we should always avoid
over-parameterization, ill-conditioning and unstable models regardless of the
source (i.e., fixed effects, IIV random effects and omega-structure, or
residual error structure). However, I do agree that parsimony in omega is
probably not as important as, say, looking for a parsimonious set of covariate
parameter fixed effects when performing covariate modeling to obtain a final
model for prediction purposes. This is why in my earlier response I
suggested fitting the “largest omega structure that can be supported by the
data”. What I meant by this statement is that we fit the largest number of
elements of omega while avoiding over-parameterization or ill-conditioning.
Such an omega structure might not be parsimonious (i.e., the smallest omega
structure that adequately describes the features in the data). The point I
was trying to make is that the smallest omega structure that adequately
describes the features in the data may not be a diagonal omega structure (i.e.,
when correlations do exist) particularly if we are interested in describing the
variation in the data and not just in predictions of central tendency.
Best,
Ken
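[Editor's note] A small R sketch of the ill-conditioning idea Ken describes (an editorial illustration with made-up correlation matrices of parameter estimates): the condition number, the ratio of the largest to the smallest eigenvalue, blows up as the matrix approaches singularity, i.e. as the data become unable to distinguish the parameters.

# Hypothetical correlation matrices of parameter estimates
r_ok  <- matrix(c(1.0, 0.30,
                  0.30, 1.0), 2, 2)
r_bad <- matrix(c(1.0, 0.999,
                  0.999, 1.0), 2, 2)     # two parameters nearly confounded

cond_number <- function(r) {
  ev <- eigen(r, symmetric = TRUE, only.values = TRUE)$values
  max(ev) / min(ev)                      # large values indicate ill-conditioning
}

cond_number(r_ok)    # modest
cond_number(r_bad)   # huge: model close to over-parameterized / unstable

A rule of thumb often quoted in the population-modeling literature is that condition numbers above about 1000 signal ill-conditioning, although the cut-off is debatable.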
Quoted reply history
From: [email protected] [mailto:[email protected]] On
Behalf Of Jeroen Elassaiss-Schaap
Sent: Monday, September 29, 2014 7:00 PM
To: [email protected]; [email protected]; [email protected];
[email protected]; [email protected]
Subject: Re: [NMusers] OMEGA matrix
Dear Pavel, others,
The underlying technical difference is that SAEM is in its core a sampling
methodology. Off-diagonal elements (as explained by Bob Bauer) are available as
sample correlations and do not have to be separately computed in contrast to
linearization approaches such as FOCE.
The more interesting question to me, as also eluted to by Ken, is what criteria
to set up for inclusion of an off-diagonal element. I completely support his
argument for simulation performance of the model, as e.g. judged using a VPC.
Whether to score it as an additional degree of freedom may be up to debate. An
off-diagonal element in essence limits the freedom of the model as the random
space in which samples can be generated will be smaller. In that perspective
one could argue to retain any off-diagonal element that is sufficiently
deviating from zero regardless of ofv changes, and to not apply the concept of
over-parametrization (or at least not in comparison to other types of
parameters). In practice inclusion of an important off-diagonal is mostly
accompanied by a sound improvement in ofv anyway.
More can be found in earlier discussions we had on this list, see e.g.
https://www.mail-archive.com/[email protected]/msg02736.html for quite an
extensive one from 2010. Here also an r-script to visualize the parameter space
impact can be found ;-).
In cases where a larger full or banded omega block is found, I would advice to
explore its properties further using matrix decomposition approaches (PCA etc)
to evaluate propagated correlations across the matrix. But also on the basis
of physiology/pharmacology as a data sample may not be informative enough to
support robust interpretation of correlations. A discussion along those lines
in reporting seems the more fruitful to me.
Best regards,
Jeroen
http://pd-value.com
-- More value out of your data!
-----Original Message-----
From: [email protected] [mailto:[email protected]] On
Behalf Of Standing Joseph (GREAT ORMOND STREET HOSPITAL FOR CHILDREN NHS
FOUNDATION TRUST)
Sent: Friday, September 26, 2014 09:15
To: Kowalski, Ken; 'Eleveld, DJ'; 'Pavel Belo'; [email protected]
Subject: RE: [NMusers] OMEGA matrix
Dear Pavel,
To answer your question I suggest you go on Bob Bauer's NONMEM 7 course. The
understanding I gleaned from that course (which I think was enhanced by the
excellent wine we had at lunch in Alicante) was that with appropriate MU
parameterisation there is virtually no computational disadvantage to estimating
the full block with the newer algorithms. So you might as well do it, at least
in early runs where you want an idea of which parameter correlations might be
useful/reasonably estimated.
BW,
Joe
Joseph F Standing
MRC Fellow, UCL Institute of Child Health
Antimicrobial Pharmacist, Great Ormond Street Hospital
Tel: +44(0)207 905 2370
Mobile: +44(0)7970 572435
_____
From: [email protected] [[email protected]] On Behalf Of
Ken Kowalski [[email protected]]
Sent: 25 September 2014 22:43
To: 'Eleveld, DJ'; 'Pavel Belo'; [email protected]
Subject: RE: [NMusers] OMEGA matrix
Warning: This message contains unverified links which may not be safe. You
should only click links if you are sure they are from a trusted source.
Hi Douglas,
My own thinking is that you should fit the largest omega structure that can
be supported by the data rather than just always assuming a diagonal omega
structure. This does not necessarily mean always fitting a full block omega
structure, as it can often lead to an ill-conditioned model, however, there
may be a reduced block omega structure that is more parsimonious than the
diagonal omega structure. Getting the omega structure right is particularly
important for simulation of individual responses. For example, if you
always simulate from a diagonal omega structure for CL and V when there is
evidence that the random effects are highly positively correlated then you
may end up simulating individual PK profiles for combinations of individual
CLs and Vs that are not represented in your data (i.e., high correlation
would suggest that individuals with high CL will tend to also have high V
and vice versa whereas a simulation assuming that they are independent will
result in simulating for some individuals with high CL and low V and some
individuals with low CL and high V that might not be represented in your
data). This could lead to simulations that over-predict the variation in
the concentration-time profiles even though the diagonal omega may be
sufficient for purposes of predicting central tendency in the PK profile.
You can confirm this by VPC looking at your ability to predict say the 10th
and 90th percentiles in comparison to the observed 10th and 90th percentiles
in your data. That is, if you simulate from the diagonal omega when there
is correlation in the random effects you may find that your prediction of
the 10th and 90th percentiles are more extreme than that in your observed
data. I see this all the time in VPC plots where the majority of the
observed data are well within the predictions of the 10th and 90th
percentiles when we should expect about 10% of our data above the 90th
percentile prediction and 10% below the 10th percentile prediction.
Best regards,
Ken
Kenneth G. Kowalski
President & CEO
A2PG - Ann Arbor Pharmacometrics Group, Inc.
110 Miller Ave., Garden Suite
Ann Arbor, MI 48104
Work: 734-274-8255
Cell: 248-207-5082
Fax: 734-913-0230
[email protected]
www.a2pg.com
-----Original Message-----
From: [email protected] [mailto:[email protected]] On
Behalf Of Eleveld, DJ
Sent: Thursday, September 25, 2014 4:36 PM
To: Pavel Belo; [email protected]
Subject: RE: [NMusers] OMEGA matrix
Hi Pavel,
My question is: Why is it desirable to fit a complete omega matrix if its
physical interpretation is unclear? Etas are variation of unknown origin
i.e. not explained by the structural model. A full omega matrix allows the
unknown variation of one paramater to have a (linear?) relationship with
some other thing that is also unknown. If unknown A is found to have a
linear relationship with unknown B, then what knowlegde is gained? I do
think it can be instructive to to look at correlations and use this
information to make a better structural model. But I think diagonal OMEGA
matrix is more desirable if it works ok.
warm regards,
Douglas Eleveld
_____
From: [email protected] [[email protected]] on behalf
of Pavel Belo [[email protected]]
Sent: Thursday, September 25, 2014 4:24 PM
To: [email protected]
Subject: [NMusers] OMEGA matrix
Hello Nonmem Community,
It seems like NONMEM developers may advise to start with full OMEGA matrix
at the beginning of model development. Monolix developers may advise to
start with a diagonal matrix. Is there something different in NONMEM SAEM
algorithms that makes model stable when a lot of statistically insignificant
correlations/covariances are estimated in the model?
It seems like NONMEM SAEM can be very stable in very "hard cases" (a lot of
outliers, partially misspecified model, overparameterized model, etc.). The
omega matrix is a part of the puzzle.
When it is impossible to test every correlation coefficient for significance
due to some limitations, it becomes a regulatory issue. We may need to be
able to make a statement that the model is safe and sound even when OMEGA
matrix can be overparameterized (tries to estimate too many insignificant
parameters within the OMEGA matrix).
Kind regards,
Pavel
_____
Hi,
As pointed out by others, I agree it is essential to consider the existence of random-effect correlations if you wish to make model predictions, e.g. to use a VPC to evaluate a model.
I agree with Jeroen that this should primarily be an informed choice based on physiology/pharmacology. 'Blue sky' searches for correlations which would have no rational explanation or interpretation should be done with a great deal of caution.
It can be tricky to explore all possible combinations using the change in OFV (e.g. with the likelihood ratio test) to guide model selection. A more straightforward approach is to bootstrap the model with a full covariance block for all the random effects you suspect may be correlated.
Bootstrapping today is usually a practical option because runs can be easily performed in parallel on multiple processors on the same machine or on a cluster. I typically use 100 bootstrap replicates for this purpose and look for correlations which include zero in the 95% bootstrap confidence interval. If I find such correlations then I know I should be able to remove those covariances from the covariance block. I can then re-run the bootstrap and obtain confidence intervals on all the parameters including the correlations. Confidence intervals calculated from asymptotic standard errors (if you can get them) are usually unreliable compared with parametric bootstrap confidence intervals ( http://www.page-meeting.org/default.asp?abstract=3143 ).
I don't agree with Ken that "ill-conditioning" or "not stable" based on failure of the $COVARIANCE step should be used to judge the adequacy of the results. Experimentally it has been shown that the bootstrap distribution of parameter uncertainty is not different when comparing runs which terminated and those which were successful or which completed the $COVARIANCE step ( http://www.mail-archive.com/nmusers%40globomaxnm.com/msg03401.html ). See also http://holford.fmhs.auckland.ac.nz/docs/bootstrap-and-confidence-intervals.pdf slides 24 to 31.
Best wishes,
Nick
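[Editor's note] A minimal R sketch of the bootstrap screen Nick describes (an editorial illustration; boot_pars stands in for a hypothetical table of correlation estimates from, say, 100 bootstrap fits):

# Each row = one bootstrap replicate; each column = an estimated correlation
# between two random effects (hypothetical example data).
set.seed(1)
boot_pars <- data.frame(
  cor_CL_V  = rnorm(100, mean = 0.60, sd = 0.10),   # clearly non-zero
  cor_CL_KA = rnorm(100, mean = 0.05, sd = 0.15)    # probably removable
)

ci <- t(sapply(boot_pars, quantile, probs = c(0.025, 0.975)))  # 95% percentile CIs
ci
# Flag correlations whose 95% CI includes zero -> candidates to drop from the block
rownames(ci)[ci[, 1] < 0 & ci[, 2] > 0]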
Quoted reply history
Dear Pavel,
With regard to how this is handled differently in Monolix, you had probably better ask their developers (or perhaps Marc is listening in...). It would not surprise me if it were for statistical reasons rather than practical ones. On the subject of shrinkage: the sampling I was talking about takes place at the level of the parameters and the likelihood evaluation, not data sampling per se, and therefore shrinkage does not directly affect these correlations.
With regard to omega matrix evaluation: similar to the covariance matrix of estimates, one can further analyze the omega matrix. It is actually a long time since I tried that; I do not recall numerical issues, though you may need to tweak the tolerance for the SVD at some point.
This type of analysis is, as it appears, textbook material. I searched for some reference material; you may find this paper useful:
iasri.res.in/ebook/EBADAT/3.../2-regdiagfeb07.pdf
Otherwise the topics are also discussed on Wikipedia, of course.
On the last topic you mention: in NONMEM one cannot fix individual correlations to zero. Similar results, however, can be obtained by developing a banded omega matrix; elements in the bottom-left corner of the matrix can then be fixed to zero.
Hope this helps,
Jeroen
http://pd-value.com
-- More value out of your data!
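[Editor's note] For illustration, a small R sketch (hypothetical variances) of what a banded/block-structured omega looks like once the etas are ordered so that the uncorrelated one falls outside the estimated block; the off-block corner elements are exactly zero:

# Hypothetical 3x3 omega: ETA(1)-ETA(2) share a block, ETA(3) is independent
omega <- matrix(0, 3, 3,
                dimnames = list(c("CL", "V", "KA"), c("CL", "V", "KA")))
omega[1:2, 1:2] <- c(0.09, 0.05,
                     0.05, 0.09)
omega[3, 3]     <- 0.16
omega                          # bottom-left / top-right corners stay at zero

# Quick checks before relying on such a matrix: positive definiteness and
# the singular values mentioned above (a tolerance may matter if any are tiny).
chol(omega)                    # errors if the matrix is not positive definite
svd(omega)$d

In a NONMEM control stream the same structure corresponds to estimating the correlated etas in one $OMEGA BLOCK and listing the remaining eta in a separate $OMEGA record, which forces the between-block covariances to zero.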
Quoted reply history
On Sep 30, 2014, 7:15 PM, at 7:15 PM, Pavel Belo <[email protected]> wrote:
>
>
>
>
>
>Sorry Jeroen, I have to correct you name in the email:
>
>
>
>Dear Jeroen and the NONMEM Team,
>
>
>
>Your email is definitely informative.
>
>
>
> 1. "Off-diagonal elements (as explained by Bob Bauer) are available
>as sample correlations and do not have to be separately computed in
>contrast to linearization approaches such as FOCE." It may explain the
>
>stability of the results when very large block matrix is used. On the
>other hand, it is not clear why Monolix SAEM may not work the same
>way.
>Is Monolix estimating correlations? Also, when we deal with "sample
>correlations", we may be talking about correlations between observed
>minus individual predicted values. Shrinkage can possibly affect such
>correlations.
>
>
>
>
> 1. "I would advise to explore its properties further using matrix
>decomposition approaches (PCA etc)". Do you suggest to decompose the
>omega matrix and explore derived variables instead of original
>off-diagonal elements? It seems straightforward, but I recall error
>messages.
>
>
>
>
>If you can point at some publications for both items 1 and 2 above, it
>will be greatly appreciated.
>
>
>
>The importance of improving the OMEGA matrix may come from PD
>modeling.
>PD models are frequently more empirical than PK models and strong
>correlations come from nowhere. They are difficult to interpret, but
>important to account for when simulations are requested by the
>agencies. There are correlations, which change from 0 to 0.6 when
>models are slightly different indicating that they may be
>insignificant. Monolix allows us to set a single correlation to zero.
>
>NONMEM may require a different approach. I am searching for the
>different approaches because after many years I am emotionally attached
>
>to NONMEM and because NONMEM is very flexible.
>
>
>
>
>
>Kind regards,
>
>Pavel
>
>
>
>
>
>
Quoted reply history
Pavel,
[Apologies if this appears for a 3rd time on nmusers - I sent it earlier but it did not show up yet]
You made some remarks about NONMEM and Monolix which I don't fully understand. You wrote "Monolix allows us to set a single correlation to zero. NONMEM may require a different approach."
Let me try to explain how I understand how correlations may be defined in NONMEM and Monolix. Imagine this NONMEM covariance block:
$OMEGA BLOCK(5)
1
1 1
1 1 1
1 1 1 1
1 1 1 1 1
The Monolix GUI allows you to specify correlations between random effects in a very similar format (I omit the upper-triangular off-diagonal elements because they are always the same as the lower-triangular off-diagonal elements).
1
1 1
1 1 1
1 1 1 1
1 1 1 1 1
Using NONMEM this can be modified to a series of band matrices like this:
$OMEGA BLOCK(5)
1
1 1
1 1 1
1 1 1 1
0 1 1 1 1
$OMEGA BLOCK(5)
1
1 1
1 1 1
0 1 1 1
0 0 1 1 1
$OMEGA BLOCK(5)
1
1 1
0 1 1
0 0 1 1
0 0 0 1 1
If you try to set up a band matrix with Monolix you will get an error:
"The IIV covariance structure is not valid. It must be symmetric and block defined."
If you set correlations to zero with the Monolix GUI, you can only do this when the zeros define blocks which are not linked in any way with another correlation (as they are in the band matrices above), e.g.
1
1 1
1 1 1
0 0 0 1
0 0 0 1 1
You would specify this as two blocks for NONMEM like this:
$OMEGA BLOCK(3)
1
1 1
1 1 1
$OMEGA BLOCK(2)
1
1 1
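For what it is worth, here is a small illustrative sketch (plain Python with NumPy, external to both programs) of one way to check whether a given zero pattern is "block defined" in the sense of that Monolix error message, at least as I read it: the non-zero off-diagonals must be rearrangeable into disjoint, fully non-zero blocks.

import numpy as np

def components(pattern):
    # Connected components of the graph whose edges are the non-zero off-diagonals.
    n = pattern.shape[0]
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            i = stack.pop()
            if i in comp:
                continue
            comp.add(i)
            stack.extend(j for j in range(n) if j != i and pattern[i, j] != 0)
        seen |= comp
        comps.append(sorted(comp))
    return comps

def block_defined(pattern):
    # True if every connected component forms a full (all non-zero) block.
    return all(np.all(pattern[np.ix_(c, c)] != 0) for c in components(pattern))

band = np.array([[1, 1, 1, 1, 0],
                 [1, 1, 1, 1, 1],
                 [1, 1, 1, 1, 1],
                 [1, 1, 1, 1, 1],
                 [0, 1, 1, 1, 1]])
blocks = np.array([[1, 1, 1, 0, 0],
                   [1, 1, 1, 0, 0],
                   [1, 1, 1, 0, 0],
                   [0, 0, 0, 1, 1],
                   [0, 0, 0, 1, 1]])
print(block_defined(band))    # False: a band structure, which Monolix rejects
print(block_defined(blocks))  # True: equivalent to the two $OMEGA BLOCKs above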
Best wishes,
Nick
--
Nick Holford, Professor Clinical Pharmacology
Dept Pharmacology & Clinical Pharmacology, Bldg 503 Room 302A
University of Auckland,85 Park Rd,Private Bag 92019,Auckland,New Zealand
office:+64(9)923-6730 mobile:NZ +64(21)46 23 53
email:[email protected]
http://holford.fmhs.auckland.ac.nz/
Holford SD, Allegaert K, Anderson BJ, Kukanich B, Sousa AB, Steinman A, Pypendop,
B., Mehvar, R., Giorgi, M., Holford,N.H.G. Parent-metabolite pharmacokinetic models
- tests of assumptions and predictions. Journal of Pharmacology & Clinical
Toxicology. 2014;2(2):1023-34.
Ribba B, Holford N, Magni P, Trocóniz I, Gueorguieva I, Girard P, Sarr,C.,
Elishmereni,M., Kloft,C., Friberg,L. A review of mixed-effects models of tumor
growth and effects of anticancer drug treatment for population analysis. CPT:
pharmacometrics & systems pharmacology. 2014;Accepted 15-Mar-2014.
It appears my message did not go through as well. So, I trimmed off part of
the email thread to minimize the length in hopes that this will now go through.
Hi All,
I don’t want to re-hash old ground as Nick and I have agreed to disagree about
the value of the $COV step. I still maintain that the output from the $COV
step provides useful diagnostic information. It has never been my position
that failure or success of the $COV step in and of itself is informative of
ill-conditioning or instability of the model. There are certainly cases where
the COV step fails and it is not related to ill-conditioning and successful COV
steps where the diagnostics from the COV step output suggests that the model is
ill-conditioned. So simple success/failure of the $COV step in and of itself
is not very useful. That being said, I still believe we should avoid
over-fitting, over-parameterization, ill-conditioning, instability, etc. and
acknowledge the limitations of our data. How one goes about that assessment
whether through bootstrapping, inspection of $COV step output, or some other
diagnostic assessments is not as critical to me.
Best,
Ken
Quoted reply history
From: [email protected] [mailto:[email protected]] On
Behalf Of Nick Holford
Sent: Tuesday, September 30, 2014 4:51 PM
To: nmusers
Subject: Re: [NMusers] OMEGA matrix
Hi,
As pointed out by others I agree it is essential to consider the existence of
random effect correlations if you wish to make model predictions e.g. to use a
VPC to evaluate a model.
I agree with Jeroen that this should primarily be an informed choice based
on physiology/pharmacology. 'Blue sky' searches for correlations which
would have no rational explanation or interpretation should be done with a
great deal of caution.
It can be tricky to explore all possible combinations using the change in OFV
(e.g. with the likelihood ratio test) to guide model selection. A more
straightforward approach is to bootstrap the model with a full covariance block
for all the random effects you suspect may be correlated.
Bootstrapping today is usually a practical option because runs can be easily
performed in parallel on multiple processors on the same machine or on a
cluster. I typically use 100 bootstrap replicates for this purpose and look
for correlations which include zero in the 95% bootstrap confidence interval.
If I find such correlations then I know I should be able to remove those
covariances from the covariance block. I can then re-run the bootstrap and
obtain confidence intervals on all the parameters including the correlations.
Confidence intervals calculated from asymptotic standard errors (if you can get
them) are usually unreliable compared with parametric bootstrap confidence
intervals ( http://www.page-meeting.org/default.asp?abstract=3143).
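A minimal sketch of that screening step, assuming the bootstrap estimates have already been collected into one table with a row per replicate; the file name and the OMEGA column labels are assumptions, not a fixed convention:

import pandas as pd

boot = pd.read_csv("bootstrap_results.csv")              # one row per bootstrap run
offdiag = [c for c in boot.columns if c.startswith("OMEGA") and c.count(".") >= 2]
# keep only off-diagonal elements, e.g. OMEGA.2.1. but not OMEGA.1.1.
offdiag = [c for c in offdiag if c.split(".")[1] != c.split(".")[2]]

for col in offdiag:
    lo, hi = boot[col].quantile([0.025, 0.975])
    includes_zero = lo <= 0 <= hi
    print(f"{col}: 95% CI [{lo:.4f}, {hi:.4f}]"
          f"{'  <- CI includes zero, candidate for removal' if includes_zero else ''}")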
I don't agree with Ken that "ill-conditioning" or "not stable" based on failure
of the $COVARIANCE step should be used to judge the adequacy of the results.
Experimentally it has been shown that the bootstrap distribution of parameter
uncertainty is not different when comparing runs which terminated and those
which were successful or which completed the $COVARIANCE step.
http://www.mail-archive.com/nmusers%40globomaxnm.com/msg03401.html. See also
http://holford.fmhs.auckland.ac.nz/docs/bootstrap-and-confidence-intervals.pdf
slides 24 to 31.
Best wishes,
Nick
On 1/10/2014 7:57 a.m., Ken Kowalski wrote:
Hi Jeroen,
I think we might be on the same page but I wanted to get clarification about
your suggestion that we “not apply the concept of over-parameterization” with
respect to evaluating the omega structure. I’m assuming by
‘over-parameterization’ you mean a model that has more elements in omega than
might be necessary to be parsimonious. If so, I certainly agree, but I wouldn't
necessarily call a model that has more parameters than needed for parsimony
over-parameterized. An over-parameterized model is one in which
there can be an infinite set of solutions to the parameter values that yields
the same fit. Such a setting can occur when the R-matrix in NONMEM is
singular. Such over-parameterized models are often also referred to as being
ill-conditioned or not stable. I think we should always avoid
over-parameterization, ill-conditioning and unstable models regardless of the
source (i.e., fixed effects, IIV random effects and omega-structure, or
residual error structure). However, I do agree that parsimony in omega is
probably not as important as say looking for a parsimonious set of covariate
parameter fixed effects when performing covariate modeling to obtain a final
model for prediction purposes. This is why in my earlier response below I
suggested fitting the “largest omega structure that can be supported by the
data”. What I meant by this statement is that we fit the largest number of
elements of omega while avoiding over-parameterization or ill-conditioning.
Such an omega structure might not be parsimonious (i.e., the smallest omega
structure that adequately describes the features in the data). The point I
was trying to make is that the smallest omega structure that adequately
describes the features in the data may not be a diagonal omega structure (i.e.,
when correlations do exist) particularly if we are interested in describing the
variation in the data and not just in predictions of central tendency.
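One common way to make "ill-conditioned" concrete is the condition number (ratio of largest to smallest eigenvalue) of the correlation matrix of the parameter estimates reported by the $COV step; a rule of thumb sometimes quoted is that values much above about 1000 are a warning sign. A small sketch with a made-up 3x3 matrix in place of real $COV output:

import numpy as np

# Correlation matrix of the parameter estimates (invented values; in practice
# it would be read from the NONMEM .cor file or equivalent output).
corr = np.array([[1.00, 0.40, 0.97],
                 [0.40, 1.00, 0.45],
                 [0.97, 0.45, 1.00]])

eigvals = np.linalg.eigvalsh(corr)             # symmetric -> real eigenvalues
condition_number = eigvals.max() / eigvals.min()
print(f"condition number = {condition_number:.1f}")
# A very large ratio suggests near-collinear parameter estimates, i.e. the kind
# of over-parameterization described above.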
Best,
Ken
Hi Jeroen,
I have also seen that adding correlations often gives an impressive improvement
in the objective function. However, very often when I test that model using
cross-validation the predictive performance is *worse* than the model without
the correlation. I would call that classic over-fitting. It's the same thing you
would expect when adding an unnecessary theta to the model. This was a bit of
an eye-opener for me, to see these things as equivalent. Try it, you might be
surprised.
It might depend on how you evaluate and use your models. I care most about how
well a model predicts datasets it has never seen, and less about whether it is
the absolute best model for the current dataset. Your applications may be different.
Whether adding a correlation counts as an additional degree of freedom or not
was always confusing to me. Now I just let NONMEM decide; it tells you
how many parameters it is optimizing.
warm regards,
Douglas Eleveld
________________________________
Quoted reply history
From: Jeroen Elassaiss-Schaap [mailto:[email protected]]
Sent: September 30, 2014 1:00 AM
To: [email protected]; [email protected]; [email protected];
[email protected]; Eleveld, DJ
Subject: Re: [NMusers] OMEGA matrix
Dear Pavel, others,
The underlying technical difference is that SAEM is at its core a sampling
methodology. Off-diagonal elements (as explained by Bob Bauer) are available as
sample correlations and do not have to be separately computed, in contrast to
linearization approaches such as FOCE.
The more interesting question to me, as also alluded to by Ken, is what criteria
to set up for inclusion of an off-diagonal element. I completely support his
argument for the simulation performance of the model, as e.g. judged using a VPC.
Whether to score it as an additional degree of freedom may be up for debate. An
off-diagonal element in essence limits the freedom of the model, as the random
space in which samples can be generated will be smaller. In that perspective
one could argue to retain any off-diagonal element that deviates sufficiently
from zero regardless of OFV changes, and to not apply the concept of
over-parametrization (or at least not in comparison to other types of
parameters). In practice, inclusion of an important off-diagonal is mostly
accompanied by a sound improvement in OFV anyway.
More can be found in earlier discussions we had on this list, see e.g.
https://www.mail-archive.com/[email protected]/msg02736.html for quite an
extensive one from 2010. Here also an r-script to visualize the parameter space
impact can be found ;-).
In cases where a larger full or banded omega block is found, I would advise
exploring its properties further using matrix decomposition approaches (PCA
etc.) to evaluate propagated correlations across the matrix, but also on the
basis of physiology/pharmacology, as a data sample may not be informative
enough to support robust interpretation of correlations. A discussion along
those lines in the reporting seems the more fruitful to me.
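As a rough sketch of what such a decomposition could look like, here is an eigen-decomposition (PCA) of an invented 3x3 OMEGA block after scaling it to a correlation matrix; the numbers and the Python snippet are purely illustrative:

import numpy as np

# Invented estimated full OMEGA block (variances on the diagonal).
omega = np.array([[0.090, 0.055, 0.010],
                  [0.055, 0.160, 0.020],
                  [0.010, 0.020, 0.250]])

sd = np.sqrt(np.diag(omega))
corr = omega / np.outer(sd, sd)                 # correlation matrix of the etas

eigvals, eigvecs = np.linalg.eigh(corr)         # ascending eigenvalues
order = np.argsort(eigvals)[::-1]
explained = eigvals[order] / eigvals.sum()

for k, (lam, frac) in enumerate(zip(eigvals[order], explained), start=1):
    loadings = np.round(eigvecs[:, order][:, k - 1], 2)
    print(f"PC{k}: eigenvalue {lam:.2f} ({frac:.0%} of total), loadings {loadings}")
# Components with near-zero eigenvalues point to (near-)redundant combinations
# of etas, i.e. correlations the data may not really support.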
Best regards,
Jeroen
http://pd-value.com
-- More value out of your data!
-----Original Message-----
From: [email protected] On Behalf Of Standing Joseph (GREAT ORMOND STREET
HOSPITAL FOR CHILDREN NHS FOUNDATION TRUST)
Sent: Friday, September 26, 2014 09:15
To: Kowalski, Ken; 'Eleveld, DJ'; 'Pavel Belo'; [email protected]
Subject: RE: [NMusers] OMEGA matrix
Dear Pavel,
To answer your question I suggest you go on Bob Bauer's NONMEM 7 course. The
understanding I gleaned from that course (which I think was enhanced by the
excellent wine we had at lunch in Alicante) was that with appropriate MU
parameterisation there is virtually no computational disadvantage to estimating
the full block with the newer algorithms. So you might as well do it, at least
in early runs where you want an idea of which parameter correlations might be
useful/reasonably estimated.
BW,
Joe
Joseph F Standing
MRC Fellow, UCL Institute of Child Health
Antimicrobial Pharmacist, Great Ormond Street Hospital
Tel: +44(0)207 905 2370
Mobile: +44(0)7970 572435
________________________________
From: [email protected]<mailto:[email protected]>
[[email protected]<mailto:[email protected]>] On Behalf
Of Ken Kowalski [[email protected]<mailto:[email protected]>]
Sent: 25 September 2014 22:43
To: 'Eleveld, DJ'; 'Pavel Belo';
[email protected]<mailto:[email protected]>
Subject: RE: [NMusers] OMEGA matrix
Hi Douglas,
My own thinking is that you should fit the largest omega structure that can
be supported by the data rather than just always assuming a diagonal omega
structure. This does not necessarily mean always fitting a full block omega
structure, as it can often lead to an ill-conditioned model, however, there
may be a reduced block omega structure that is more parsimonious than the
diagonal omega structure. Getting the omega structure right is particularly
important for simulation of individual responses. For example, if you
always simulate from a diagonal omega structure for CL and V when there is
evidence that the random effects are highly positively correlated then you
may end up simulating individual PK profiles for combinations of individual
CLs and Vs that are not represented in your data (i.e., high correlation
would suggest that individuals with high CL will tend to also have high V
and vice versa whereas a simulation assuming that they are independent will
result in simulating for some individuals with high CL and low V and some
individuals with low CL and high V that might not be represented in your
data). This could lead to simulations that over-predict the variation in
the concentration-time profiles even though the diagonal omega may be
sufficient for purposes of predicting central tendency in the PK profile.
You can confirm this by VPC looking at your ability to predict say the 10th
and 90th percentiles in comparison to the observed 10th and 90th percentiles
in your data. That is, if you simulate from the diagonal omega when there
is correlation in the random effects you may find that your prediction of
the 10th and 90th percentiles are more extreme than that in your observed
data. I see this all the time in VPC plots where the majority of the
observed data are well within the predictions of the 10th and 90th
percentiles when we should expect about 10% of our data above the 90th
percentile prediction and 10% below the 10th percentile prediction.
Best regards,
Ken
Kenneth G. Kowalski
President & CEO
A2PG - Ann Arbor Pharmacometrics Group, Inc.
110 Miller Ave., Garden Suite
Ann Arbor, MI 48104
Work: 734-274-8255
Cell: 248-207-5082
Fax: 734-913-0230
[email protected]<mailto:[email protected]>
http://www.a2pg.com
________________________________
Douglas makes an important point in this discussion. That is, the method used
to judge the parsimony of the model must consider the performance of the model
for its intended purpose.
Consider the parsimony principle: "all things being equal, choose the
simpler model". The key is in how to judge the first part of that
statement.
A model developed based on goodness of fit metrics such as AIC, BIC, or
repeated likelihood ratio tests, may be the most parsimonious model for
predicting the current data set. This doesn't ensure that the model will be
"equal" in performance to more complex models for the purpose of predicting
the typical value in an external data set - external cross validation might
be required for that conclusion. Further, if the purpose is to develop a
model that is a reliable stochastic simulation tool, a simulation-based
model checking method should be part of the assessment of "equal"
performance when arriving at a parsimonious model.
Since most of our modeling goals go far beyond prediction of the current
data set, it's necessary to move beyond metrics solely based on objective
function and degrees of freedom when selecting a model. In other words, it
may be perfectly fine (and even parsimonious) for a model to include more
parameters than the likelihood ratio test tells you to, if those parameters
improve performance for the intended purpose.
Best regards,
Marc
Hi All,
I agree with everything that Marc and Douglas have pointed out. I too do not
advise building the omega structure based on repeated likelihood ratio tests.
The approach I take is more akin to what Joe had suggested earlier using SAEM
to fit the full block omega structure and then look for patterns in the
estimated omega matrix. Even with FOCE estimation I will often fit a full
block omega structure just to look for such patterns. The full block omega
structure may be over-parameterized and sometimes may not even converge.
Nevertheless, as a diagnostic run it can be useful for uncovering patterns that
may lead to reduced omega structures with more stable model fits (i.e., not
over-parameterized). I’m not necessarily driven to find a parsimonious omega
structure as I’ll certainly err on the side of including additional elements in
omega provided there is sufficient support to estimate these parameters (i.e.,
a stable model fit). For example, I will select a full omega structure
regardless of the magnitude of the correlations if the model is stable and not
over-parameterized. I have no issue with those who want to identify a
parsimonious omega structure, however, I still maintain that a diagonal omega
structure often is not the most parsimonious.
I also agree with Marc’s comment that we must judge parsimony relative to the
intended purpose of the model. If we are only interested in our model to
predict central tendency, then a diagonal omega structure may be all that is
needed. I would contend, however, that we often want to use our models for
more than just predicting central tendency. If we perform VPCs,
cross-validation, or external validations on independent datasets, but the
statistics we summarize to assess predictive performance are only those
involving central tendency then we’re not really going to get a robust
assessment of the omega structure. To evaluate the omega structure we need to
use VPC statistics that describe variation and other percentiles besides the
median. My impression is that we aren’t as rigorous in our assessments of
whether our models can adequately describe the variation in our data. As I
stated earlier, I see so many standard VPC plots where virtually 100% of the
observed data are contained well within the 5th and 95th percentiles. The
presenter will often claim that these VPC plots support the adequacy of the
predictions but clearly the model is over-predicting the variation. The
over-prediction of the variation may or may not be related to the omega
structure as it could also be related to skewed or non-normal random effect
distributions. However, if a diagonal omega structure was used and I saw
this over-prediction in the variation in a VPC plot, one of the first things I
would do is re-evaluate the omega structure and see if an alternative omega
structure can lead to improvements in predicting these percentiles.
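A minimal sketch of such a percentile check, comparing the observed 10th and 90th percentiles per time bin with the range of the same percentiles across simulated replicates; the file and column names (TIME, DV, REP) are assumptions:

import numpy as np
import pandas as pd

obs = pd.read_csv("observed.csv")        # columns: ID, TIME, DV
sim = pd.read_csv("simulated.csv")       # columns: REP, ID, TIME, DV

bins = np.quantile(obs["TIME"], np.linspace(0, 1, 9))            # 8 time bins
obs["BIN"] = pd.cut(obs["TIME"], bins, include_lowest=True, duplicates="drop")
sim["BIN"] = pd.cut(sim["TIME"], bins, include_lowest=True, duplicates="drop")

for q in (0.10, 0.90):
    obs_q = obs.groupby("BIN", observed=True)["DV"].quantile(q)
    # per-replicate percentile, then its 2.5th-97.5th range across replicates
    sim_q = sim.groupby(["REP", "BIN"], observed=True)["DV"].quantile(q).unstack("REP")
    lower, upper = sim_q.quantile(0.025, axis=1), sim_q.quantile(0.975, axis=1)
    outside = (obs_q < lower) | (obs_q > upper)
    print(f"{q*100:.0f}th percentile: observed value outside the simulated "
          f"95% band in {int(outside.sum())} of {len(obs_q)} time bins")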
Best,
Ken
Quoted reply history