RE: SAEM and IMP

From: Joseph Standing
Date: May 15, 2014
Source: mail-archive.com
Dear Emmanuel and Pavel,

Further to Bob's answer, recall also that delta OFV in the likelihood ratio test is only asymptotically chi-squared distributed, and this is not the only reason why you should not get too hung up on OFV to help choose your models.

For example, Lavielle 2010 in Biometrics showed nicely how the SAEM algorithm can estimate parameters of complex differential equation models for joint HIV viral load and CD4 counts. Using OFV-based metrics, a latent model was chosen whereby the majority of circulating T-cells were infected with virus; this model also gave nice fits to the data. When immunologists look at CD4 cells of HIV-infected patients, however, they find that much less than 1% (closer to 0.01%) of circulating T-cells contain virus (most of the virus making up the latent reservoir is stuck to follicular cells in the periphery), so one would have to question the meaning of the parameters identified as the best fit by SAEM.

By all means use SAEM to fit ODE models that don't run/converge with other algorithms (I do), but choose models with parameters that make mechanistic sense rather than relying too heavily on OFV-based metrics. A nice VPC always goes down well too.

Joe

Joseph F Standing
MRC Fellow, UCL Institute of Child Health
Antimicrobial Pharmacist, Great Ormond Street Hospital
Tel: +44(0)207 905 2370
Mobile: +44(0)7970 572435
Quoted reply history
________________________________________
From: [email protected] [[email protected]] On Behalf Of Bob Leary [[email protected]]
Sent: 15 May 2014 19:22
To: Emmanuel Chigutsa; Pavel Belo; [email protected]
Subject: RE: [NMusers] SAEM and IMP

Hi Emmanuel,

While I am a strong advocate of using quasi-random rather than pseudo-random sequences for importance sampling in EM methods like IMP, there is a theoretical (and very real) problem with their use in the context you suggested in your message, namely with a multivariate t distribution as the importance sampling distribution. The 3S2 option implies you are using a Sobol quasi-random sequence, while DF=7 implies the use of a multivariate t distribution with 7 degrees of freedom.

The standard way of generating a p-dimensional multivariate t random variable with DF degrees of freedom is to generate a p-dimensional multivariate normal and then divide by an additional independent random variable, which is basically the square root of a 1-d chi-square random variable with DF degrees of freedom. Thus, to generate a p-dimensional importance sample, you actually need to use p+1 independent random variables.

If you simply use a (p+1)-dimensional Sobol vector as the base quasi-random draw, the nonlinear mapping from p+1 dimensions to the final p-dimensional result destroys the low-discrepancy property of the final sequence in the p-dimensional space and in fact introduces a significant amount of bias in the final result. The problem arises directly from the p+1 vs p dimensional mismatch. There is no problem if the final p-dimensional result can be generated from a p-dimensional quasi-random sequence, which is the case for multivariate normal importance samples.

So quasi-random sequences should really only be used for the DF=0 multivariate normal importance sampling distribution case, not the DF>0 multivariate t case.
I ran across this effect in testing the Sobol-based importance sampling EM algorithm QRPEM in Phoenix NLME. It is very real and the net effect is to introduce a significant bias. There is a partial fix that works but gives up some of the benefit of using low-discrepancy sequences, namely to use a p-dimensional quasi-random vector to generate the p-dimensional multivariate normal, but then use a 1-d pseudo-random sequence to generate the chi-square random variable.

From: [email protected] [mailto:[email protected]] On Behalf Of Emmanuel Chigutsa
Sent: Thursday, May 15, 2014 1:03 PM
To: Pavel Belo; [email protected]
Subject: Re: [NMusers] SAEM and IMP

Hi Pavel,

I have experienced a similar problem. In my case, the following code for IMP after SAEM (using NM7.3) greatly reduced the Monte Carlo OFV noise from variations of about +/- 60 points to variations of +/- 6 points (though still not good enough for covariate testing):

$EST METHOD=IMP LAPLACE INTER NITER=15 ISAMPLE=3000 EONLY=1 DF=7 IACCEPT=0.3 ISAMPEND=10000 STDOBJ=2 MAPITER=0 PRINT=1 SEED=123456 RANMETHOD=3S2

The settings are explained in the NM7.3 guide. If you are using NM7.3, you can also try IACCEPT=0.0, whereupon "NONMEM will determine the most appropriate IACCEPT level for each subject". Of course, the settings for DF and IACCEPT in the above code will depend on the type of data you have.

Which brings me to my own question. If I have both continuous and categorical DVs in the dataset (which would mean different optimal settings) and I am using F_FLAG accordingly, what would the 'right' values of DF and IACCEPT be? I have noticed that the DF automatically chosen by NONMEM for individuals in the dataset can vary from 0 to 8, and this appears to be random.
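
Editorial sketch (not part of the original messages): the partial fix Bob Leary describes in the quoted reply, with quasi-random draws for the p-dimensional normal and a pseudo-random draw for the chi-square divisor, might look roughly like the Python below. Several things here are assumptions made for illustration only: a Halton sequence stands in for Sobol so that the example needs nothing beyond the standard library, the function names (`van_der_corput`, `halton_point`, `mixed_t_samples`) are hypothetical, and the samples have zero location and identity scale (a real importance sampler would shift and scale them per subject). This is not how NONMEM or Phoenix NLME implement it.

```python
import math
import random
from statistics import NormalDist

# First few primes, used as Halton bases (one base per dimension).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

def van_der_corput(index, base):
    """Radical inverse of a positive integer index in the given base."""
    result, frac = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * frac
        frac /= base
    return result

def halton_point(index, dim):
    """The index-th point (index >= 1) of a dim-dimensional Halton sequence."""
    return [van_der_corput(index, PRIMES[d]) for d in range(dim)]

def mixed_t_samples(n, p, df, seed=123456):
    """Draw n p-dimensional t samples with df degrees of freedom.

    Assumption: the normal part comes from a p-dimensional quasi-random
    (Halton) point, while the chi-square divisor comes from a separate
    pseudo-random stream, so only p quasi-random coordinates are used.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    samples = []
    for i in range(1, n + 1):
        # Quasi-random uniforms mapped to standard normals by inverse CDF.
        z = [std_normal.inv_cdf(u) for u in halton_point(i, p)]
        # chi-square(df) is Gamma(shape=df/2, scale=2); drawn pseudo-randomly.
        chi2 = rng.gammavariate(df / 2.0, 2.0)
        scale = math.sqrt(chi2 / df)
        samples.append([zj / scale for zj in z])
    return samples

if __name__ == "__main__":
    draws = mixed_t_samples(n=5, p=3, df=7)
    for row in draws:
        print(["%.3f" % x for x in row])
```

Note that the divisor is the square root of the chi-square variable divided by DF, which is what "basically the square root of a chi-square" amounts to in the usual construction. Using all p+1 coordinates of a single quasi-random point instead would be the case the quoted reply warns against.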
May 15, 2014 Pavel Belo SAEM and IMP
May 15, 2014 Robert Bauer RE: SAEM and IMP
May 15, 2014 Brian Sadler RE: SAEM and IMP
May 15, 2014 Emmanuel Chigutsa Re: SAEM and IMP
May 15, 2014 Bob Leary RE: SAEM and IMP
May 15, 2014 Joseph Standing RE: SAEM and IMP
May 16, 2014 Robert Bauer RE: SAEM and IMP
May 16, 2014 Bob Leary RE: SAEM and IMP
May 19, 2014 Robert Bauer RE: SAEM and IMP