General question on modeling

19 messages 13 people Latest: Mar 21, 2007

General question on modeling

From: Mark Sale Date: March 19, 2007 technical
Dear Colleagues,

I've lately been reviewing the literature on model building/selection algorithms. I have been unable to find any even remotely rigorous discussion of the way we all build NONMEM models. The "structural first, then variances / forward addition / backward elimination" approach is mentioned in a number of places (Ene Ette's in Ann Pharmacother, 2004; Jaap Mandema's pop PK series in J PK Biopharm, 1992; José Pinheiro's paper from the Joint Statistical Meetings in 1994; Peter Bonate's AAPS Journal article in 2005; Mats Karlsson's AAPS PharmSci, 2002; the FDA guidance on population PK). It is most explicitly stated in the NONMEM manuals (Vol. 5, figure 11.1), without any reference. From the NONMEM manuals it has been reproduced in many courses and has become axiomatic.

I've looked at the statistics literature on forward addition/backward elimination in both linear and logistic regression, where it is at least formally discussed (with some disagreement about whether it is "correct"). But I am unable to find any justification for the structural-first, then covariates (driven by post-hoc plots), then variance-effects approach we use. (I'm sure many people will point out that it is not nearly that linear a process, although in figure 11.1, Vol. 5 of the NONMEM manuals, it is depicted as a step-by-step algorithm, without any looping back.) Can anyone point me to any rigorous discussion of this model-building strategy?

Mark Sale MD
Next Level Solutions, LLC
www.NextLevelSolns.com
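For concreteness, the forward-addition/backward-elimination procedure Mark describes can be sketched in a few lines. This is an illustrative toy, not a NONMEM tool: the OFV table below is invented to stand in for actual model fits, and the 3.84/10.83 cutoffs are the conventional chi-square criteria (p < 0.05 and p < 0.001, 1 df) often quoted for covariate searches.

```python
# Toy illustration of a stepwise covariate-search algorithm.
# The OFV values below are invented stand-ins for real NONMEM runs;
# lower OFV is better, and keys are sets of included covariates.
OFV = {
    frozenset(): 1000.0,
    frozenset({"WT"}): 980.0,
    frozenset({"AGE"}): 998.0,
    frozenset({"CRCL"}): 985.0,
    frozenset({"WT", "AGE"}): 975.0,
    frozenset({"WT", "CRCL"}): 960.0,
    frozenset({"AGE", "CRCL"}): 982.0,
    frozenset({"WT", "AGE", "CRCL"}): 958.0,
}

def stepwise(candidates, add_cut=3.84, drop_cut=10.83):
    """Forward addition, then backward elimination, on the mock OFV table.

    add_cut and drop_cut are the chi-square cutoffs for 1 df at
    p < 0.05 and p < 0.001, the asymmetric criteria often quoted
    for population PK covariate searches.
    """
    current = frozenset()
    # Forward addition: repeatedly add the covariate giving the
    # largest OFV drop, while that drop exceeds add_cut.
    while True:
        best, best_drop = None, add_cut
        for c in candidates - current:
            drop = OFV[current] - OFV[current | {c}]
            if drop > best_drop:
                best, best_drop = c, drop
        if best is None:
            break
        current |= {best}
    # Backward elimination: remove any covariate whose deletion
    # raises the OFV by less than drop_cut.
    while True:
        worst, worst_rise = None, drop_cut
        for c in current:
            rise = OFV[current - {c}] - OFV[current]
            if rise < worst_rise:
                worst, worst_rise = c, rise
        if worst is None:
            break
        current -= {worst}
    return current
```

On this toy table, `stepwise({"WT", "AGE", "CRCL"})` selects WT and CRCL: AGE never buys a large enough OFV drop to enter, and neither selected covariate can be removed at the stricter backward cutoff.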

Re: General question on modeling

From: Anthony J. Rossini Date: March 19, 2007 technical
I'd highly recommend reading Frank Harrell's book on regression modeling (Regression Modeling Strategies) if you think that stepwise regression makes any sense. While much of the book applies to linear and generalized linear (i.e. categorical, etc.) regression models, nonlinear models (and mixed effects models) generally fall under "well, if the simple case was like that, it can't be any simpler for the harder cases...". Frank demonstrates some of the reasons that p-values from models generated using stepwise modeling are fairly useless (i.e. they don't follow the behavior you'd expect from p-values). The literature to start looking at would be modern variable-selection techniques for linear regression, i.e. work at Stanford Statistics by Hastie, Tibshirani, and their collaborators and former grad students (LASSO, LARS, elastic nets, and similar approaches).
-- best, -tony
Muttenz, Switzerland. "Commit early, commit often, and commit in a repository from which we can easily roll-back your mistakes" (AJR, 4Jan05)
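The LASSO that Tony points to does its variable selection quite differently from stepwise testing: coefficients are shrunk continuously, and some land exactly at zero. The heart of the method is the soft-thresholding operator; a minimal sketch of it (standard textbook form, not tied to any particular library):

```python
def soft_threshold(z, gamma):
    """Soft-thresholding: S(z, gamma) = sign(z) * max(|z| - gamma, 0).

    This is the exact solution of the one-dimensional lasso problem
    min_b 0.5 * (b - z)**2 + gamma * abs(b), and it is the building
    block of the coordinate-descent algorithms used to fit the lasso.
    """
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0
```

Any coefficient whose (partial-residual) correlation z is smaller in magnitude than the penalty gamma is set exactly to zero, which is how the lasso drops variables continuously rather than through a sequence of hypothesis tests.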

Re: General question on modeling

From: Nick Holford Date: March 19, 2007 technical
Mark,

If we are talking about science then we are not talking about regulatory decision making. The criteria used for regulatory approval and labelling are based on pragmatism, not science: e.g. using intention-to-treat analysis (use effectiveness rather than method effectiveness), or dividing a continuous variable like renal function into two categories for dose adjustment. This kind of pragmatism is more art than science because it neither correctly describes the drug properties (ITT typically underestimates the true effect size) nor rationally treats the patient with extreme renal function values.

As Steve reminded us, all models are wrong. The issue is not whether some ad hoc model building algorithm is correct or has the right type 1 error properties under some null that is largely irrelevant to the purpose. The issue is: does the model work well enough to satisfy its purpose? Metrics of model performance should be used to decide if a model is adequate, not a string of dubiously applied P values.

The search process is up to you. I think from your knowledge of computer search methods you will appreciate that those methods that involve more randomness/wild jumps in the algorithm generally have a better chance of approaching a global minimum.

IMHO the covariate search process is like the search for the Holy Grail. It's fundamentally a process for those with a religious belief that there is some special set of as yet unidentified covariates that will explain between subject variability. As a non-believer, I think that all the major leaps in explaining BSV come from prior knowledge (weight, renal function, drug interactions, genetic polymorphisms), and none have been discovered by trying all the available covariates during a blind search. If you have a counter-example then please let me know, and tell me how much the BSV variance was reduced when this unsuspected covariate was added to a model with appropriate prior-knowledge covariates.
Nick

Mark Sale - Next Level Solutions wrote:
> Steve,
> I was pretty sure I'd get skewered for the suggestion that this was a linear decision-making process (please note the disclaimer in my question). Wasn't sure if it would be Nick or you. As a devout Bayesian, I certainly support the idea of letting prior knowledge (any prior knowledge, not just knowledge of biology) drive the model building, or at least the models that are considered justifiable.
> But I have to admit that I'm uncomfortable with the concept of the "art" of modeling. Beauty is, after all, in the eye of the beholder, and how can we possibly base regulatory decisions on art? Shouldn't we be striving for something more objective than art? If this is art, how do we deal with the reality that two modelers will get different answers (I know,... neither of which is right)? In the end we do need to recommend only one dosing regimen. If I were taking the drug, I'd like that decision based on science, not on art. (Although in the 19th century, tuberculosis was referred to as "the beautiful death" - maybe that is what you mean? ;-) )
> But that is all off the subject; I'm still not sure if there is any rigorous justification for the way we build models, use of prior knowledge notwithstanding.
> You suggest (I think) that we should select our model based on what inference we want to examine. I agree. But that is not the question either. There are volumes written about how to identify the best/better model once you've found it. I'm interested in how we find it.
>
> Mark Sale MD
> Next Level Solutions, LLC
> www.NextLevelSolns.com

-- Nick Holford, Dept Pharmacology & Clinical Pharmacology, University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand. email: n.holford http://www.health.auckland.ac.nz/pharmacology/staff/nholford/
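Nick's remark that search methods with more randomness and wild jumps have a better chance of approaching a global minimum can be illustrated with a toy simulated-annealing covariate search. Everything here is invented for illustration (the `toy_ofv` surface in particular); the point is only that occasionally accepted uphill moves let the search escape local minima that a purely greedy procedure cannot.

```python
import math
import random

def anneal(ofv, n_cov, steps=2000, t0=5.0, seed=0):
    """Simulated-annealing search over covariate inclusion flags.

    ofv(state) returns a mock objective-function value (lower is
    better) for a tuple of 0/1 flags. Early on, uphill moves are
    accepted with probability exp(-delta/t); as the temperature t
    cools, the search becomes increasingly greedy.
    """
    rng = random.Random(seed)
    state = tuple(rng.randint(0, 1) for _ in range(n_cov))
    best, best_val = state, ofv(state)
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-9      # linear cooling schedule
        i = rng.randrange(n_cov)                # flip one covariate in/out
        cand = state[:i] + (1 - state[i],) + state[i + 1:]
        delta = ofv(cand) - ofv(state)
        if delta < 0 or rng.random() < math.exp(-delta / t):
            state = cand
            if ofv(state) < best_val:
                best, best_val = state, ofv(state)
    return best, best_val

# Invented toy OFV surface: every added covariate helps, with a small
# bump at (1, 0, 1); the global minimum is at (1, 1, 1) with OFV 70.
def toy_ofv(state):
    return 100.0 - 10.0 * sum(state) + (5.0 if state == (1, 0, 1) else 0.0)
```

On this tiny three-covariate surface, `anneal(toy_ofv, 3)` finds the global minimum; on realistic search spaces there is of course no such guarantee, which is Nick's point about the search process being up to you.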

Re: General question on modeling

From: Paul Hutson Date: March 19, 2007 technical
Joga Gobburu: In the context of Nick, Mark, and Steve's comments, can you provide any insight into the FDA's current attitude, preferred methodology, or a reference for model construction and testing? Thanks!

Paul
-- Paul R. Hutson, Pharm.D., Associate Professor, UW School of Pharmacy, 777 Highland Avenue, Madison WI 53705-2222. Tel 608.263.2496, Fax 608.265.5421, Pager 608.265.7000, p7856

RE: General question on modeling

From: Stephen Duffull Date: March 19, 2007 technical
Mark

> I've lately been reviewing the literature on model building/selection algorithms. I have been unable to find any even remotely rigorous discussion of the way we all build NONMEM models. The structural first, then variances/forward addition/backward elimination is generally mentioned in a number of places

I sort of hope that there is no prescriptive approach to model building for nonlinear mixed effects models, since this would suggest that if you follow a set recipe you will end up with a model that works every time.

I'm sure everyone has anecdotes where a "nonlinear" approach to model building worked best, e.g. adding covariates prior to completion of building the structural PK model, as is sometimes necessary to be able to build an adequate structural model.

Surely the idea is to let the sciences of biological systems and statistics inform the modeller on how best to go about making their model (I have even heard some refer to this as the "art" of model building :-) ).

After all, if we believe that all models are wrong, then all we really want from our model is one that performs well for the inference we wish to draw from it.

Steve
--
Professor Stephen Duffull
Chair of Clinical Pharmacy
School of Pharmacy
University of Otago
PO Box 913 Dunedin
New Zealand
E: [EMAIL PROTECTED] P: +64 3 479 5044 F: +64 3 479 7034
Design software: www.winpopt.com


RE: General question on modeling

From: Stephen Duffull Date: March 20, 2007 technical
Mark

> But, I have to admit that I'm uncomfortable with the concept of the "art" of modeling.

I agree - I like to think of it as a science of modelling - but I have heard (at conferences) the "science" of modelling referred to as the "art" of modelling.

> decisions on art? Shouldn't we be striving for something more objective than art?

We have that now. The model should perform well in the area that it's supposed to. There are a number of diagnostic and evaluation techniques that one can use to ask the question "Is my model any good for the purpose for which I built it?". I think the underlying concept of striving for a single method for building models is inherently flawed.

> If this is art, how do we deal with the reality that two modelers will get different answers (I know,... neither of which is right), but in the end we do need to recommend only one dosing regimen.

By different answers - are you referring to different models? In that case the models would presumably be sufficiently confluent that their predictions of the substantive inference (e.g. dosing regimen) would be the same, or at least very similar (to within an acceptable dose size).

IMHO, a mistake is made in drug development when we try to find the best single model at every stage of the process. Why not have a selection of plausible models which all provide essentially the same inferences? In this case, when we design the next study, our design will incorporate a quantitative measure of our uncertainty in the model, rather than just saying "this is the model and that's that".

> You suggest (I think) that we should select our model based on what inference we want to examine. I agree. But that is not the question either. There are volumes written about how to identify the best/better model once you've found it. I'm interested in how we find it.

This is my point exactly - I don't believe there is an absolute, linear method available for finding the best model within the framework of hierarchical nonlinear models (there - I've said it).

Steve
--
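One common way to carry a selection of plausible models forward, as Steve suggests, rather than committing to a single "best" model, is to weight them, for example by Akaike weights. A minimal sketch (the OFV and parameter-count numbers in the usage note are invented):

```python
import math

def akaike_weights(ofvs, n_params):
    """Akaike weights for a set of candidate models.

    ofvs are -2*log-likelihood values (NONMEM-style objective
    function values), so AIC = OFV + 2*p. The weights give the
    relative support for each model and can be used to average
    predictions across plausible models instead of committing
    to a single "best" one.
    """
    aics = [o + 2 * p for o, p in zip(ofvs, n_params)]
    best = min(aics)
    raw = [math.exp(-(a - best) / 2) for a in aics]
    total = sum(raw)
    return [r / total for r in raw]
```

With three hypothetical candidates (OFVs 960, 962, 975 with 5, 6, and 5 parameters), the first model gets about 88% of the weight, but the second retains real support - exactly the model uncertainty that a single-model choice throws away.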

RE: General question on modeling

From: Mark Sale Date: March 20, 2007 technical
Steve,

I think we're in complete agreement, with one exception. You write:

> By different answers - are you referring to different models? In which case the models would presumably be sufficiently confluent that their predictions of the substantive inference (e.g. dosing regimen) would be the same or at least very similar (to within an acceptable dose size).

No, I meant that one model suggests the dose should be 100 BID and the other suggests it should be 200 QD. Or that the ED50 is 50 mg, and so the dose should be (maybe) 100 mg, or the ED50 is 200 mg, so the dose should be (maybe) 400 mg. Which do you choose? (In the real world, commercial gets to choose, so it will be QD - and it will be a blue pill.) While we do, in general, have tools to determine which of these two models is "better", do we have tools that will ensure that we even have these two models to evaluate? Or, given the tools we have, are we likely to get one and never even consider the other? Again, we have lots of discussion about "which of these two models is better", and very little about how to find these two models to compare in the first place. There is certainly no single criterion by which to evaluate the models; it must be purpose-specific.

Mark Sale MD
Next Level Solutions, LLC
www.NextLevelSolns.com
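Mark's ED50 example can be made concrete with the simple Emax model: the dose achieving a given fraction of maximal effect scales linearly with ED50, so quadrupling the estimated ED50 quadruples the recommended dose. A small illustrative helper (the two-thirds-of-Emax target below is an assumption, chosen so that ED50 = 50 mg gives 100 mg):

```python
def dose_for_fraction(ed50, f):
    """Dose giving a target fraction f of Emax under the Emax model.

    From E(D) = Emax * D / (ED50 + D), solving E(D) = f * Emax
    gives D = f * ED50 / (1 - f). Purely illustrative: the point
    is that the dose scales linearly with the estimated ED50.
    """
    if not 0.0 <= f < 1.0:
        raise ValueError("f must be in [0, 1)")
    return f * ed50 / (1.0 - f)
```

Targeting two-thirds of Emax, ED50 = 50 mg gives a 100 mg dose and ED50 = 200 mg gives 400 mg: the two "plausible" models really do recommend very different regimens.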

Re: General question on modeling

From: Paul Hutson Date: March 20, 2007 technical
Title: Paul R Joga Gobburu: In the context of Nick, Mark, and Steve's comments, can you provide any insight to us about the FDA's current attitude, preferred methodolgy, or a reference for model construction and testing? Thanks! Paul Nick Holford wrote: Mark, If we are talking about science then we are not talking about regulatory decision making. The criteria used for regulatory approval and labelling are based on pragmatism not science e.g. using intention to treat analysis (use effectiveness rather than method effectiveness), dividing a continuous variable like renal function into two categories for dose adjustment. This kind of pragmatism is more art than science because it does not correctly describe the drug properties (ITT typically underestimates the true effect size) nor rationally treat the patient with extreme renal function values. As Steve reminded us all models are wrong. The issue is not whether some ad hoc model building algorithm is correct or has the right type 1 error properties under some null that is largely irrelevant to the purpose. The issue is does the model work well enough to satisfy its purpose. Metrics of model performance should be used to decide if a model is adequate not a string of dubiously applied P values. The search process is up to you. I think from your knowledge of computer search methods you will appreciate that those methods that involve more randomness/wild jumps in the algorithm generally have a better chance of approaching a global minimum. IMHO the covariate search process is like the search for the Holy Grail. Its fundamentally a process for those with a religious belief that there is some special set of as yet unidentified covariates that will explain between subject variability. 
As a non believer I think that all the major leaps in explaining BSV comes from prior knowledge (weight, renal function, drug interactions, genetic polymorphisms) and none have been discovered by trying all the available covariates during a blind search. If you have a counter example then please let me know and tell me how much the BSV variance was reduced when this unsuspected covariate was added to a model with appropriate prior knowledge covariates. Nick Mark Sale - Next Level Solutions wrote: Steve, I was pretty sure I'd get skewered for the suggestion that this was a linear decision making process (please note the disclaimer in my question). Wasn't sure if it would be Nick or you. As a devout Bayesian, I certainly support the idea of letting prior knowledge (any prior knowledge, not just knowledge of biology) drive the model buildling, or at least the models that are considered justifiable. But, I have to admit that I'm uncomfortable with the concept of the "art" of modeling. Beauty is, after all in the eye of the beholder, and how can we possibly base regulatory decisions on art? Shouldn't we be striving for something more objective than art? If this is art, how do we deal with the reality that two modelers will get different answers (I know,... neither of which is right), but in the end we do need to recommend only one dosing regimen. If I were taking the drug, I'd like that decision based on science, not on art. (although in the 19th centruy, tubercolis was refered to as "the beautiful death" - maybe that is what you mean? ;-) ). But, that is all off the subject, still not sure if there is any rigorous justification for the way we build models, use of prior knowledge not-with-standing. You suggest (I think) that we should select our model based on what inference we want to examine. I agree. But that is not the question either. There are volumes written about how to identify the best/better model once you've found it. I'm interest in how we find it. 
Mark Sale MD Next Level Solutions, LLC www.NextLevelSolns.com

-------- Original Message -------- Subject: RE: [NMusers] General question on modeling From: "Stephen Duffull" <[EMAIL PROTECTED]> Date: Mon, March 19, 2007 5:52 pm To: "'Mark Sale - Next Level Solutions'" <[EMAIL PROTECTED]> Cc: <[email protected]>

Mark > I've lately been reviewing the literature on model building/selection algorithms. I have been unable to find any even remotely rigorous discussion of the way we all build NONMEM models. The structural first, then variances/forward addition/backward elimination is generally mentioned in a number of places

I sort of hope that there is no prescriptive approach to model building for nonlinear mixed effects models, since this would suggest that if you follow a set recipe you will end up with a model that works every time. I'm sure everyone has anecdotes where a "nonlinear" approach to model building worked best, e.g. adding covariates prior to completion of building the structural PK model, as is sometimes necessary to be able to build an adequate structural model. Surely the idea is to let the sciences of biological systems and statistics inform the modeller on how best to go about making their model (I have even heard some refer to this as the "art" of model building :-) ). After all, if we believe that all models are wrong then all we really want from our model is one that performs well for the inference we wish to draw from it. Steve

-- Professor Stephen Duffull Chair of Clinical Pharmacy School of Pharmacy University of Otago PO Box 913 Dunedin New Zealand E: [EMAIL PROTECTED] P: +64 3 479 5044 F: +64 3 479 7034 Design software: www.winpopt.com

-- Nick Holford, Dept Pharmacology & Clinical Pharmacology University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand email:[EMAIL PROTECTED] tel:+64(9)373-7599x86730 fax:373-7556 http://www.health.auckland.ac.nz/pharmacology/staff/nholford/

-- Paul R. Hutson, Pharm.D. 
Associate Professor UW School of Pharmacy 777 Highland Avenue Madison WI 53705-2222 Tel 608.263.2496 Fax 608.265.5421 Pager 608.265.7000, p7856
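The "type 1 error properties" Nick alludes to, and the Harrell critique of stepwise p-values raised earlier in the thread, can be made concrete with a small simulation (an illustrative Python sketch, not from the thread; the sample size, number of candidates, and alpha level are all assumptions): forward addition applied to data sets in which no covariate has any real effect still "finds" at least one significant covariate in a large fraction of replicates.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def rss(X, y):
    """Residual sum of squares from an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def forward_select(X, y, alpha=0.05):
    """Greedy forward addition using the partial F-test at level alpha."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    ones = np.ones((n, 1))
    rss_cur = rss(ones, y)
    while remaining:
        best = None
        for j in remaining:
            Xj = np.hstack([ones, X[:, selected + [j]]])
            rss_j = rss(Xj, y)
            df2 = n - Xj.shape[1]
            F = (rss_cur - rss_j) / (rss_j / df2)
            pval = stats.f.sf(F, 1, df2)
            if best is None or pval < best[1]:
                best = (j, pval, rss_j)
        if best[1] >= alpha:
            break
        selected.append(best[0])
        remaining.remove(best[0])
        rss_cur = best[2]
    return selected

# Null case: the response is independent of all 20 candidate covariates.
hits = sum(bool(forward_select(rng.standard_normal((50, 20)),
                               rng.standard_normal(50)))
           for _ in range(200))
print(f"null data sets with >=1 'significant' covariate: {hits / 200:.2f}")
```

With 20 candidates screened at alpha = 0.05, the chance that the first step alone clears the threshold somewhere is already large, which is why a nominal p-value attached to the selected covariate does not behave like a p-value.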

General question on modeling

From: Michael Fossler Date: March 20, 2007 technical
Interesting topic. I have little to add to what has been presented already, except to state that we are unlikely to come to a consensus regarding best practices in modeling, with or without FDA input. Statisticians have been arguing over these very same issues in the linear regression world for decades and are at the same place we are. I would agree with Pete that in general, covariate analysis adds very little to most analyses. It was my naive hope a few years ago that pharmacogenomics would bring us better covariates which would explain more variability in drug PK and response. I am older now (and more jaded), and am increasingly uncertain as to whether PG will bring us anything usable, despite glowing articles in Time and Newsweek about "personalized medicine". Mike Fossler

General question on modeling

From: Peter Bonate Date: March 20, 2007 technical
Sometimes these threads kill me. There is a degree of art to modeling. The art is the intangible things that we do during model development. If there were no art, if it were all based on science, then all modelers would be equal and two modelers would always come to the same model. The fact that we don't is the uniqueness of the process, and therein lies the art. I would also like to argue that for most drugs, covariate inclusion in a model often reduces BSV and residual variability by very little. There are very few magic-bullet covariates like GFR with aminoglycosides. I would think that if two experienced modelers analyzed the same data set and came up with different models, on examination those models would probably turn out to have similar predictive performance. A classic example of this is when you do all possible regressions with a multiple linear regression model. Pete Bonate Peter Bonate, PhD, FCP
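Pete's "all possible regressions" point - that several distinct models often predict about equally well - can be sketched as follows (an illustrative Python example; the simulated data set, the correlated-covariate setup, and the 5% tolerance are assumptions, not from the thread):

```python
from itertools import combinations
import numpy as np

rng = np.random.default_rng(1)
n = 200
# Two covariates that both track the same latent signal, plus two of pure noise.
z = rng.standard_normal(n)
X = np.column_stack([z + 0.3 * rng.standard_normal(n),
                     z + 0.3 * rng.standard_normal(n),
                     rng.standard_normal(n),
                     rng.standard_normal(n)])
y = z + 0.5 * rng.standard_normal(n)

def rmse(cols):
    """Fit OLS (with intercept) on the first half, score on the held-out half."""
    cols = list(cols)
    Xt = np.hstack([np.ones((100, 1)), X[:100][:, cols]])
    Xv = np.hstack([np.ones((100, 1)), X[100:][:, cols]])
    beta, *_ = np.linalg.lstsq(Xt, y[:100], rcond=None)
    return float(np.sqrt(np.mean((y[100:] - Xv @ beta) ** 2)))

# "All possible regressions": every subset of the 4 candidate covariates.
scores = {cols: rmse(cols) for k in range(5) for cols in combinations(range(4), k)}
best = min(scores.values())
near_best = sorted(c for c, s in scores.items() if s <= 1.05 * best)
print("models within 5% of the best held-out RMSE:", near_best)
```

Because the first two covariates carry essentially the same information, several different subsets land within a few percent of the best held-out error: two modelers picking different ones would be "right" in the only sense that matters for prediction.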
Quoted reply history
-----Original Message----- From: [EMAIL PROTECTED] <[EMAIL PROTECTED]> To: 'Mark Sale - Next Level Solutions' <[EMAIL PROTECTED]> CC: [email protected] <[email protected]> Sent: Mon Mar 19 19:42:18 2007 Subject: RE: [NMusers] General question on modeling Mark > But, I have to admit that I'm uncomfortable with the concept > of the "art" of modeling. I agree - I like to think of it as a science of modelling - but I have heard (at conferences) the "science" of modelling referred to as the "art" of modelling. > decisions on art? Shouldn't we be striving for something > more objective than art? We have that now. The model should perform well in the area that it's supposed to. There are a number of diagnostic and evaluation techniques that one can use to ask the question "Is my model any good for the purpose for which I built it?". I think the underlying concept of striving for a single method for building models is inherently flawed. > If this is art, how do we deal with > the reality that two modelers will get different answers (I > know,... neither of which is right), but in the end we do > need to recommend only one dosing regimen. By different answers - are you referring to different models? In which case the models would presumably be sufficiently confluent that their predictions of the substantive inference (e.g. dosing regimen) would be the same or at least very similar (to within an acceptable dose size). IMHO, a mistake is made in drug development when we try and find the best single model at every stage of the process. Why not have a selection of plausible models which all provide essentially the same inferences. In this case when we design the next study our design will incorporate a quantitative measure of our uncertainty in the model, rather than just saying - "this is the model and that's that". > You suggest (I think) that we should select our model based > on what inference we want to examine. I agree. But that is > not the question either. 
There are volumes written about how > to identify the best/better model once you've found it. I'm > interest in how we find it. This is my point exactly - I don't believe there is an absolute, linear method available for finding the best model within the framework of hierarchical nonlinear models (there - I've said it). Steve --

RE: General question on modeling

From: Michael Looby Date: March 20, 2007 technical
Dear All, Certainly an interesting discussion. While developing a model of the relationship between the continuous values of a covariate and a response is of benefit in terms of characterising the dependency, it is not a given that dosing on a continuous scale adds value in terms of better therapy. The key to determining the number of steps in a covariate-based dosage algorithm will be the amount of variability accounted for by the covariate: the more variability the covariate accounts for, the more steps are worth having. To picture this, think of the extremes: if the covariate accounts for all the variability then continuous adjustment will be optimal, and at the other (absurd) extreme, if the covariate does not account for any variability then no adjustment will be best. I mention the latter because very often most covariates tested account for very little variability despite the huge effort put into testing them. From my perspective, adding covariates only adds benefit if they reduce model bias and/or explain enough variability to have benefit for the purpose of individualisation. These thoughts should be central to those involved in this activity. Kind regards Mick

Mark Sale - Next Level Solutions <[EMAIL PROTECTED]> Sent by: [EMAIL PROTECTED] 20.03.2007 11:29 To: cc: [email protected], (bcc: Michael Looby/PH/Novartis) Subject: RE: [NMusers] General question on modeling

Mark, Wow, are we getting off the original subject (which we always do). I'd suggest that oncologists and epileptologists are exceptions - they have learned to deal with individualized dosing because of the toxicity of the drugs they use. Many, many studies have documented the issues of mis-dosing drugs, and estimated the resulting fatalities. Making dosing more complicated is unlikely to help. In addition, each company very much wants their drug to be simpler to use than their competitors'. Mark Sale MD Next Level Solutions, LLC www.NextLevelSolns.com
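Mick's two extremes - a covariate explaining all of the variability (continuous adjustment optimal) versus none (no adjustment optimal) - can be put in numbers with a small simulation (an illustrative Python sketch; the lognormal clearance model, 30% BSV, and the band-median dosing rule are assumptions, not from the thread):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
cov = rng.standard_normal(n)   # standardised covariate (e.g. a renal function marker)

def exposure_cv(frac_explained, n_bands):
    """CV of steady-state exposure (dose/CL) when the covariate explains a given
    fraction of the variance in log CL and dosing uses n_bands covariate bands
    (n_bands = 1 means no dose adjustment at all)."""
    log_cl = (np.sqrt(frac_explained) * cov
              + np.sqrt(1.0 - frac_explained) * rng.standard_normal(n))
    cl = np.exp(0.3 * log_cl)                          # roughly 30% BSV in CL
    edges = np.quantile(cov, np.linspace(0.0, 1.0, n_bands + 1))
    band = np.clip(np.searchsorted(edges, cov, side="right") - 1, 0, n_bands - 1)
    # Each band gets the dose that centres exposure for its median clearance.
    dose = np.array([np.median(cl[band == b]) for b in range(n_bands)])[band]
    exposure = dose / cl
    return float(np.std(exposure) / np.mean(exposure))

for frac in (0.0, 0.5, 0.9):
    print(frac, [round(exposure_cv(frac, b), 3) for b in (1, 2, 8)])
```

When the covariate explains nothing, extra bands leave the exposure CV unchanged; when it explains most of the clearance variability, going from one band to several cuts the exposure CV substantially, and the gain from each additional band shrinks.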
Quoted reply history
> -------- Original Message -------- > Subject: Re: [NMusers] General question on modeling > From: Nick Holford <[EMAIL PROTECTED]> > Date: Mon, March 19, 2007 9:36 pm > To: [email protected] > > Mark, > > > Reality is that the vast majority of providers couldn't > > deal with renal function as a continuous variable in dosing. Writing a > > label requiring them to do so would not result in an optimal outcome. > > The vast majority of providers are perfectly able to deal with renal function as a continuous variable. They don't do it because they don't appreciate the mistakes they are encouraged to make by untested labelling strategies. > > Clinical trials have shown clinicians can be encouraged to use quantitative dosing on a continuous scale with a proven benefit in outcome by ignoring the drug label advice e.g. > > Evans W, Relling M, Rodman J, Crom W, Boyett J, Pui C. Conventional compared with individualized chemotherapy for childhood acute lymphoblastic leukemia. New England Journal of Medicine 1998;338:499-505 > > BTW I'm still waiting to hear if you have an example of finding the Holy Grail...


RE: General question on modeling

From: James G Wright Date: March 20, 2007 technical
Mark, I think we need to make a distinction between scientific investigation and an experiment. An individual experiment should be reproducible, and our equivalent is the estimation of a given model on a given dataset. The process of scientific investigation varies substantially among investigators in any scientific field. I am not optimistic that scientific research (which implicitly includes the generation of hypotheses, which are partially synonymous with models) can ever be reduced to an algorithm. Best regards, James G Wright PhD Scientist Wright Dose Ltd Tel: 44 (0) 772 5636914 www.wright-dose.com
Quoted reply history
-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mark Sale - Next Level Solutions Sent: 20 March 2007 13:10 Cc: [email protected] Subject: RE: [NMusers] General question on modeling

Pete, Beg to differ, but ... In all other sciences, being able to independently reproduce results is the hallmark of a valid piece of work (remember cold fusion? No one else could reproduce it: invalid. Then there were angiogenic factors; no one else could reproduce them for a long time, then when Folkman showed people how, the work was accepted as valid). Why are we so special that it is OK for the same experiment to give different results - even different conclusions - and both are valid? I think this is more than just differences in interpreting data - it's like two people doing a t test and getting different answers. If that happens, we need to question whether the t test is a valid method. But, I agree that covariates are a fairly trivial contributor to explaining variability. The biggest contributor to variability is time (high concentration just after dose, low long after dose). So, usually it is the structural model that drives pretty much everything. It matters more whether you choose an Emax or an indirect response model for your PD than whether you put age in as a predictor of Emax. Mark Mark Sale MD Next Level Solutions, LLC www.NextLevelSolns.com

Re: General question on modeling

From: Tim Bergsma Date: March 20, 2007 technical
A lot of ink has been shed over the contrast between art and science. E.g., to what extent should model building be characterized as art? The concept "art" suggests "subjectivity", and clearly there is a subjective element to most model building. Unfortunately, the concept "art" also suggests "arbitrary preference". Most scientists probably do not consider their preferences to be arbitrary. What is missing is the concept of informed preference, or even informed instinct, which may differ across individuals, and may give contrasting yet equally valid results. For clarity, we could call this "skill". It results from training and experience, yet involves a type of knowledge that is difficult to articulate. Philosopher Michael Polanyi referred to this sort of knowledge as "the tacit dimension". In his book by the same title, he championed the idea that "we know more than we can tell". Tim Bergsma, Ph.D.

Re: General question on modeling

From: Alison Boeckmann Date: March 20, 2007 technical
Dear nmusers, I'd like to add a historical perspective. Mark's original question that started this discussion had to do with Fig. 11.1 of the NONMEM Users Guide Part V. The chapter on model building was written by Lewis Sheiner, and was pretty much identical to his corresponding lecture in the NONMEM short course. This dates it to approx. 1984. Did Lewis have any rigorous reason for presenting this approach, or did it simply seem "right" to him? He was a great intuitive thinker. The only way to know what was in his mind at the time might be to 1) check the literature as of that time, and 2) ask the people who were fellows at that time. But remember that early NONMEM users were constrained by very slow computers. Working with large models was prohibitively costly, so there was good reason to stay with a simple structural model and only add interindividual (ETA) effects later, because they added so much to the compute time. There may not have been much literature on this strategy because (so far as I understand) Sheiner and Beal were among the first to do modelling with both intra- and inter-individual random effects, and there was not much in the way of software for it before NONMEM. -- Alison Boeckmann [EMAIL PROTECTED]

Re: General question on modeling

From: Marc Gastonguay Date: March 20, 2007 technical
Hello Mark & nmusers, I'm just catching up with the flurry of emails on this topic... I don't think that anyone mentioned full model approaches to covariate modeling, although we have discussed this topic in detail in past nmusers threads. Frank Harrell's Regression Modeling Strategies text (I'll second Tony's recommendation) advocates this method as an alternative to stepwise methods when the purpose is to estimate the effect of covariates. The text also includes a useful discussion of the choice of modeling strategy as it relates to modeling objectives (e.g. prediction, effect estimation, hypothesis testing). In our group, we routinely apply full model methods for population PK covariate modeling and have managed to make useful inferences about covariate effects while avoiding stepwise methods and p-values altogether. Some examples of this method will be presented at ASCPT later this week. I also agree with the sentiment expressed by several contributors that we shouldn't be so concerned with finding the one perfect model. Instead, we should probably spend more time evaluating the impact of model deficiencies on the intended model-based applications and inferences. In addition to Harrell's book, some relevant references are listed below. Best regards, Marc

Marc R. Gastonguay, Ph.D. Scientific Director, Metrum Institute [www.metruminstitute.org] President & CEO, Metrum Research Group LLC [www.metrumrg.com] Email: [EMAIL PROTECTED] Direct: +1.860.670.0744 Main: +1.860.735.7043

1. Wählby U, Jonsson EN, Karlsson MO. Comparison of stepwise covariate model building strategies in population pharmacokinetic-pharmacodynamic analysis. AAPS PharmSci 2002;4(4), article 27 (http://www.aapspharmsci.org). (The full model approach is described in the Discussion section.)
2. Steyerberg EW, Eijkemans MJ, Habbema JD. Stepwise selection in small data sets: a simulation study of bias in logistic regression analysis. J Clin Epidemiol 1999;52(10):935-942.
3. Harrell FE Jr, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 1996;15(4):361-387.
4. Steyerberg EW, Eijkemans MJ, Harrell FE Jr, Habbema JD. Prognostic modelling with logistic regression analysis: a comparison of selection and estimation methods in small data sets. Stat Med 2000;19(8):1059-1079.
5. Gastonguay MR. A full model estimation approach for covariate effects: inference based on clinical importance and estimation precision. AAPS J 2004;6(S1):Abstract W4354 (http://metrumrg.com/publications/full_model.pdf).
6. Agoram B, Heatherington AC, Gastonguay MR. Development and evaluation of a population pharmacokinetic-pharmacodynamic model of darbepoetin alfa in patients with nonmyeloid malignancies undergoing multicycle chemotherapy. AAPS PharmSci 2006;8(3).
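The full model idea Marc describes can be sketched in miniature (an illustrative linear stand-in in Python, not code from the cited references; the covariate names, effect sizes, and data are invented): fit all pre-specified covariates at once and read off each estimate with its confidence interval, judged against clinical relevance rather than against stepwise p-value thresholds.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 120
names = ["WT", "AGE", "CRCL"]                 # invented, pre-specified covariates
X = rng.standard_normal((n, 3))               # standardised covariate values
y = 0.6 * X[:, 2] + rng.standard_normal(n)    # here only the third one matters

# Fit the full model once: intercept plus every pre-specified covariate.
Xd = np.hstack([np.ones((n, 1)), X])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ beta
dof = n - Xd.shape[1]
s2 = float(resid @ resid) / dof
se = np.sqrt(np.diag(s2 * np.linalg.inv(Xd.T @ Xd)))
tcrit = stats.t.ppf(0.975, dof)

# Report every effect with its interval; judge clinical relevance, don't prune.
for name, b, s in zip(names, beta[1:], se[1:]):
    print(f"{name}: {b:+.2f} (95% CI {b - tcrit * s:+.2f} to {b + tcrit * s:+.2f})")
```

Nothing is selected in or out: the unimportant covariates simply report intervals centred near zero, and the interval width itself documents how precisely each effect was estimated.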

Re: General question on modeling

From: Tobias Sing Date: March 21, 2007 technical
Mark & list, I'm a newbie to the list. I hope I'm not duplicating anything mentioned yesterday (the archive seems to be updated with a delay), but this is a topic I'm also very much interested in, so I'd like to share my current view on it (I'd be happy to hear dissenting and agreeing opinions alike).
Quoted reply history
> On Monday 19 March 2007 19:32, Mark Sale - Next Level Solutions wrote:
> Dear Colleagues,
> I've lately been reviewing the literature on model building/selection algorithms. The structural first, then variances/forward addition/backward elimination is generally mentioned in a number of places [...] Can anyone point me to any rigorous discussion of this model building strategy?

There can be no rigorous general (i.e. problem-independent) statement about the superiority of any variable or model selection strategy over another:

* Wolpert, D.H. and Macready, W.G. (1997). No free lunch theorems for search. IEEE Transactions on Evolutionary Computation (cf. http://citeseer.ist.psu.edu/wolpert95no.html and http://en.wikipedia.org/wiki/No-free-lunch_theorem).

Thus, the only justification for advocating a particular strategy _without making use of problem-specific knowledge_ is the empirical observation that it often works well in practice. Other approaches besides forward addition/backward elimination also often work well. An up-to-date overview (opening a whole journal special issue on variable selection):

* Guyon, I. and Elisseeff, A. (2003). An Introduction to Variable and Feature Selection. Journal of Machine Learning Research 3(Mar):1157-1182. http://www.jmlr.org/papers/volume3/guyon03a/guyon03a.pdf

More or less subtle forms of overfitting always play a role in model selection, and with limited data it is generally not possible to simultaneously select an optimal model _and_ obtain optimally accurate performance estimates, whether one relies on p-values, AIC/BIC/..., (double-)bootstrap-, or (double-)cross-validation-based procedures. However, the "double" versions, which resample the entire modeling process, help a lot in obtaining more reliable estimates when doing a lot of "data dredging".

Harrell's (fantastic) book was mentioned by some previous posters. In my personal opinion and experience, it is a bit too negative about stepwise variable selection and its simplified version, univariable screening (e.g. on pp. 56-60). In fact, Guyon/Elisseeff and many others have noted that greedy search strategies (such as forward/backward selection) are "particularly computationally advantageous and robust against overfitting" compared to many more sophisticated approaches.

Finally, for me, three important eye-openers on modeling, model uncertainty, and model selection in general (the first two also referenced in Harrell's book) were:

* Lehmann, E. L. (1990). Model Specification: The Views of Fisher and Neyman, and Later Developments. Statistical Science 5:2, pp. 160-168.
* Chatfield, C. (1995). Model uncertainty, data mining and statistical inference. Journal of the Royal Statistical Society A 158, pp. 419-466.
* Breiman, L. (2001). Statistical modeling: the two cultures (plus many discussion articles in the same issue). Statistical Science 16, pp. 199-231.

I hope this didn't sound too disappointing. Put positively, the fact that very few generic things can be said about the model selection process can be considered a "full employment theorem" for modelers... :)

Cheers,
Tobias.

--
Tobias Sing
Computational Biology and Applied Algorithmics
Max Planck Institute for Informatics
Saarbrücken, Germany
Phone: +49 681 9325 315
Fax: +49 681 9325 399
http://www.tobiassing.net
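[Editorial sketch] The greedy forward-addition strategy discussed in this message can be illustrated in a few lines of Python. Everything here is a toy: the scoring function is a made-up stand-in for whatever criterion one would actually maximize (e.g. a drop in the objective function value, or negative AIC), and the covariate names and effect sizes are placeholders, not taken from any real analysis.

```python
def forward_select(candidates, score, min_improve=0.0):
    """Greedy forward addition: repeatedly add the single covariate that
    most improves `score`; stop when no addition helps by more than
    `min_improve`.  `score` maps a frozenset of covariate names to a
    number to be maximized (e.g. negative OFV or negative AIC)."""
    selected, best = frozenset(), score(frozenset())
    while True:
        trials = [(score(selected | {c}), c) for c in candidates - selected]
        if not trials:
            return selected, best
        new_best, choice = max(trials)
        if new_best - best <= min_improve:
            return selected, best
        selected, best = selected | {choice}, new_best

# Hypothetical additive score: covariates A and B help, C hurts, and
# A and B together earn a small bonus (numbers purely illustrative).
def toy_score(s):
    effects = {"A": 5.0, "B": 3.0, "C": -2.0}
    return sum(effects[c] for c in s) + (1.0 if {"A", "B"} <= s else 0.0)

sel, val = forward_select({"A", "B", "C"}, toy_score)
# Greedy search adds A, then B, and stops before the harmful C.
```

Backward elimination is the mirror image: start from the full covariate set and repeatedly drop the covariate whose removal hurts the score least, stopping when every removal costs more than a chosen threshold.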

RE: General question on modeling

From: Mark Sale Date: March 21, 2007 technical
Tobias, Thanks very much for your perspective. I especially appreciate your interest in evolutionary computation and machine learning, an area I think has a lot to contribute to our field. I don't know the reference you cite (but I will). My reading in evolutionary computation and machine learning (of which GA is one method) is that the "best" search algorithm depends on the assumptions one can make about the structure of the search space. Stepwise regression has its own set of assumptions, some of which are likely true in our field in most cases, and some of which are certainly not. But evolutionary computation and machine learning take a very different approach (and, IMHO, a more rigorous one) than what we currently do. Mark Sale MD Next Level Solutions, LLC www.NextLevelSolns.com
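[Editorial sketch] For readers who have not seen a genetic algorithm (GA) applied to model selection: each candidate model can be encoded as a string of inclusion flags, and a GA searches that space by selection, crossover, and mutation. The sketch below is purely illustrative; `ga_select`, `toy_fitness`, and all the numbers are invented for this example and are not part of any real model-building tool.

```python
import random

def ga_select(n_bits, fitness, pop_size=20, generations=30,
              p_mut=0.05, seed=0):
    """Minimal genetic algorithm over inclusion bit-strings: each
    candidate model is a tuple of 0/1 flags saying which covariate (or
    model feature) is included.  Tournament selection, one-point
    crossover, and bit-flip mutation; returns the fittest final model."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_bits))
           for _ in range(pop_size)]

    def tournament():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        new_pop = []
        while len(new_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_bits)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = tuple(1 - g if rng.random() < p_mut else g
                          for g in child)           # bit-flip mutation
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

# Hypothetical fitness: bits 0 and 1 are truly useful; every other
# included bit pays a parsimony penalty (an AIC-like stand-in).
def toy_fitness(bits):
    return 4.0 * bits[0] + 3.0 * bits[1] - 1.0 * sum(bits[2:])

best = ga_select(6, toy_fitness)
```

Unlike stepwise search, the GA evaluates whole models at once, so it can escape the path-dependence of adding or deleting one term at a time; the price is many more model evaluations per run.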