Re: Covariate modelling question
Dear Rob, Michael, Kasja
Many thanks for your very helpful replies, and apologies for the delay in my response. I had looked at plots, but I confess that was a while ago now! I will make sure to revisit them to ensure that the model we are building makes sense. Thanks for the Sherer reference, which I will also take a look at.
Best wishes
Fiona
Quoted reply history
On 26/02/2015 18:59, Bies, Robert R. wrote:
> Hi Fiona,
>
> I agree with Michael on this. It is not unusual to get models that are not feasible using this approach, as was demonstrated by Mark Sale and Eric Sherer; see Sherer et al., JPKPD 2012 Aug;39(4):393-414. In that paper, the authors show a simulation example (the main comparison is with the GA, but SCM, Lasso and others are tested against each other on a simulated dataset with different forward and backward thresholds). A key aspect is scientific plausibility when incorporating these effects (i.e., focusing on those that are likely or are part of your hypothesis test). I would add that an additional test could be an evaluation of the predictive capacity of the model with the additional covariates, either by predicting into held-out subsets of the dataset (cross-validation) or, ideally, with an external validation dataset, to evaluate the improvement in prediction with inclusion.
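
A minimal sketch of the cross-validation check described above, written in plain Python rather than NONMEM. Everything in it is an assumption made for illustration: the simulated weight and clearance values, the two toy models (`fit_mean` without the covariate, `fit_linear` with it) and the helper names. Each row stands for one subject, so the folds are split by subject. It only shows the mechanics of comparing out-of-sample prediction error with and without a covariate.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle the row indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(x, y):
    """Toy base model: ignores the covariate and predicts the training mean."""
    m = mean(y)
    return lambda xi: m


def fit_linear(x, y):
    """Toy covariate model: least-squares straight line in the covariate."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return lambda xi: my + slope * (xi - mx)


def cv_rmse(fit, x, y, k=5, seed=0):
    """Root mean squared prediction error over the held-out folds."""
    squared_errors = []
    for fold in k_fold_indices(len(x), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(x)) if i not in held_out]
        predict = fit([x[i] for i in train], [y[i] for i in train])
        squared_errors.extend((y[i] - predict(x[i])) ** 2 for i in fold)
    return mean(squared_errors) ** 0.5


# Simulated (hypothetical) data: clearance depends linearly on body weight.
rng = random.Random(1)
weight = [rng.uniform(40, 100) for _ in range(200)]
cl = [5 + 0.05 * (w - 70) + rng.gauss(0, 0.3) for w in weight]

rmse_base = cv_rmse(fit_mean, weight, cl)
rmse_cov = cv_rmse(fit_linear, weight, cl)
print(f"base model RMSE:      {rmse_base:.3f}")
print(f"covariate model RMSE: {rmse_cov:.3f}")
```

Because the simulated data contain a real weight effect, the covariate model predicts the held-out subjects better here. With a covariate that carries no signal, the two errors would be expected to be similar.
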
>
> Regards,
>
> Rob
>
> Robert R. Bies Pharm.D.Ph.D.
>
> Associate Professor of Medicine and Medical Genetics
>
> Division of Clinical Pharmacology
>
> Member
>
> Center for Computational Biology and Bioinformatics
>
> Indiana University School of Medicine
>
> Director, Disease and Therapeutic Response Modeling Program
>
> Indiana CTSI
>
> R2 Room E480
>
> 950 Walnut Street
>
> Indianapolis, IN 46202
>
> 317-274-2822 (office)
>
> *From:* [email protected] [ mailto: [email protected] ] *On Behalf Of *Michael Fossler
>
> *Sent:* Thursday, February 26, 2015 7:34 AM
> *To:* Fiona Vanobberghen; [email protected]
> *Subject:* RE: [NMusers] Covariate modelling question
>
> Hi Fiona;
>
> You didn’t state this, but I am assuming that you have looked at plots of partial residuals of each parameter with respect to each covariate and have determined whether a pattern exists which would help you decide whether a given covariate is worth including in the model? Also, I would assume that you’ve considered the ultimate purpose of the model, and have a pre-specified notion of which covariates you would like to test, based on some biological/medical rationale? My point being, you should not rely on p-values to select covariates: doing so will give you the situation you have just described, a large, overly complex model.
>
> Regardless of the technical details, if you can’t see a pattern in the residual plots with regard to a given covariate, it is unlikely to provide any meaningful reduction in the residual error of your parameter model.
>
> *Michael Fossler, Pharm. D., Ph. D., F.C.P.*
>
> *Senior Director*
>
> Clinical Pharmacology Modeling and Simulation
>
> RD Projects Clinical Platforms & Sciences
>
> *GSK*
>
> *Upper Merion West*
>
> *King of Prussia, PA*
>
> *Email [email protected] <mailto:[email protected]>*
>
> *Tel* +1 610 270 4797
>
> Cell 443-350-1194
>
> *From:* [email protected] < mailto: [email protected] > [ mailto: [email protected] ] *On Behalf Of *Fiona Vanobberghen
>
> *Sent:* Thursday, February 26, 2015 5:01 AM
> *To:* [email protected] <mailto:[email protected]>
> *Subject:* [NMusers] Covariate modelling question
>
> I posted this message a few days ago but it doesn't seem to have been sent to the list - so I'm resending without the example output.
>
> Best wishes
> Fiona
>
> --
> Dear all
>
> I am attempting to do some covariate modelling, using the scm wizard in Pirana. I have seen some results which I wasn't expecting and would be grateful if anyone could shed any light on it for me.
>
> Initially, I used a forward inclusion p-value of 0.1 and a backward elimination p-value of 0.05. This resulted in quite a complex (implausible) model (we do have a reasonably large dataset), so I decided to be more stringent, using p<0.05 for inclusion (and the same p>0.05 for elimination at the last step). As a shortcut, I could see from the output of the first attempt (with p<0.1) what I expected the final model to look like if I were to run it again with p<0.05, i.e. where the process would truncate.
>
> Just to double check (and verify that nothing would be eliminated at the last step), I re-ran the scm wizard with the more stringent p<0.05. And the results are not what I expected... Below I have pasted the output for the first few forward steps from each attempt. The results are essentially the same up until the third step, although we see some small differences in the OFV creeping in from the second step. However, at the fourth step, the results are completely different. This isn't what I was expecting, based on my understanding of the model selection process. Is this a known behaviour? Has anyone experienced this problem and/or know why these differences might occur? I'd be grateful for any advice.
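
To make the two inclusion thresholds concrete, here is a small sketch of a single forward step. It assumes the usual likelihood ratio test, where the drop in OFV is compared with a chi-square critical value, and that every candidate adds one parameter (1 degree of freedom). The function names and the OFV drops are hypothetical and are not taken from the scm output.

```python
from statistics import NormalDist


def ofv_drop_threshold(p_value):
    """Critical OFV drop for a likelihood ratio test with 1 degree of freedom.

    For 1 df, the chi-square critical value is the square of the
    two-sided standard normal critical value.
    """
    z = NormalDist().inv_cdf(1 - p_value / 2)
    return z * z


def forward_step(candidate_drops, p_value):
    """Pick the candidate with the largest OFV drop that passes the threshold.

    Returns None when no candidate is significant, i.e. the forward
    search stops.
    """
    threshold = ofv_drop_threshold(p_value)
    passing = {name: drop for name, drop in candidate_drops.items() if drop > threshold}
    if not passing:
        return None
    return max(passing, key=passing.get)


# Hypothetical OFV drops for two candidate covariate relations.
drops = {"WT on CL": 3.0, "AGE on V": 1.2}

print(round(ofv_drop_threshold(0.10), 2))  # 2.71
print(round(ofv_drop_threshold(0.05), 2))  # 3.84
print(forward_step(drops, 0.10))           # WT on CL
print(forward_step(drops, 0.05))           # None
```

On this logic alone, tightening the inclusion threshold from 0.1 to 0.05 would only stop the forward search earlier, which is the expectation described above. The sketch does not explain why the two runs diverged.
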
>
> Many thanks in advance for your help.
>
> Best wishes
> Fiona
>
> --
> *Fiona Vanobberghen (née Ewings), PhD*
> Swiss Tropical and Public Health Institute
> Socinstrasse 57, 4051, Basel, Switzerland
> Tel: +41 61 284 87 41