RE: SAEM and IMP

From: Bob Leary Date: May 15, 2014 technical Source: mail-archive.com
Hi Emmanuel,
While I am a strong advocate of using quasi-random rather than pseudo-random sequences for importance sampling in EM methods like IMP, there is a theoretical (and very real) problem with their use in the context you suggested in your message, namely with a multivariate t distribution as the importance sampling distribution. The 3S2 option implies you are using a Sobol quasi-random sequence, while DF=7 implies the use of a multivariate t distribution with 7 degrees of freedom.
The standard way of generating a p-dimensional multivariate t random variable with DF degrees of freedom is to generate a p-dimensional multivariate normal and then divide by an additional independent random variable, which is basically the square root of a 1-d chi-square random variable with DF degrees of freedom. Thus, to generate a p-dimensional importance sample, you actually need p+1 independent random variables.
If you simply use a (p+1)-dimensional Sobol vector as the base quasi-random draw, the nonlinear mapping from p+1 dimensions to the final p-dimensional result destroys the low-discrepancy property of the final sequence in the p-dimensional space and in fact introduces a significant amount of bias in the final result. The problem arises directly from the p+1 vs. p dimensional mismatch. There is no problem if the final p-dimensional result can be generated from a p-dimensional quasi-random sequence, which is the case for multivariate normal importance samples.
So quasi-random sequences should really only be used for the DF=0 multivariate normal importance sampling distribution, not for the DF>0 multivariate t case. I ran across this effect in testing the Sobol-based importance sampling EM algorithm QRPEM in Phoenix NLME. It is very real, and the net effect is to introduce a significant bias.
There is a partial fix that works but gives up some of the benefit of using low-discrepancy sequences: use a p-dimensional quasi-random vector to generate the p-dimensional multivariate normal, but then use a 1-d pseudo-random sequence to generate the chi-square random variable.
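A minimal Python sketch of that partial fix, for illustration only. It is not the NONMEM or Phoenix NLME implementation. To keep it standard-library-only, a Halton sequence stands in for the Sobol sequence as the low-discrepancy source, and the names `halton`, `PRIMES` and `mvt_draws` are hypothetical. It also assumes a standardized multivariate t (zero location, identity scale matrix) and that the divisor is sqrt(chi-square / DF); centering and scaling around each subject's conditional estimates is left out.

```python
import math
import random
from statistics import NormalDist

PRIMES = [2, 3, 5, 7, 11, 13]  # one Halton base per dimension (p <= 6 here)

def halton(index, base):
    """Radical inverse of a positive integer index in the given base, in (0, 1)."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result

def mvt_draws(n, p, df, seed=123456):
    """Draw n standardized p-dimensional multivariate t vectors.

    The p normal components come from a p-dimensional quasi-random point;
    the extra chi-square divisor comes from a separate pseudo-random stream,
    so the quasi-random sequence never has to cover p+1 dimensions.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    draws = []
    for i in range(1, n + 1):
        # Quasi-random part: map each Halton coordinate to a standard normal.
        z = [std_normal.inv_cdf(halton(i, PRIMES[k])) for k in range(p)]
        # Pseudo-random part: chi-square(df) is Gamma(shape=df/2, scale=2).
        chi_square = rng.gammavariate(df / 2.0, 2.0)
        divisor = math.sqrt(chi_square / df)
        draws.append([z_k / divisor for z_k in z])
    return draws

samples = mvt_draws(n=2000, p=2, df=7)
```

With DF=7 each coordinate should have mean near 0 and variance near DF/(DF-2) = 1.4, which gives a quick sanity check on the draws.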
Quoted reply history
From: [email protected] [mailto:[email protected]] On Behalf Of Emmanuel Chigutsa
Sent: Thursday, May 15, 2014 1:03 PM
To: Pavel Belo; [email protected]
Subject: Re: [NMusers] SAEM and IMP
Hi Pavel,
I have experienced a similar problem. In my case, the following code for IMP after SAEM (using NM7.3) greatly reduced the Monte Carlo OFV noise, from variations of about +/- 60 points to variations of +/- 6 points (though still not good enough for covariate testing):
$EST METHOD=IMP LAPLACE INTER NITER=15 ISAMPLE=3000 EONLY=1 DF=7 IACCEPT=0.3 ISAMPEND=10000 STDOBJ=2 MAPITER=0 PRINT=1 SEED=123456 RANMETHOD=3S2
The settings are explained in the NM7.3 guide. If you are using NM7.3, you can also try IACCEPT=0.0, whereupon "NONMEM will determine the most appropriate IACCEPT level for each subject". Of course, the settings for DF and IACCEPT in the above code will depend on the type of data you have.
Which brings me to my own question. If I have both continuous and categorical DVs in the dataset (which would mean different optimal settings) and I am using F_FLAG accordingly, what would the 'right' values of DF and IACCEPT be? I have noticed that the DF automatically chosen by NONMEM for individuals in the dataset can vary from 0 to 8, and this appears to be random.
May 15, 2014 Pavel Belo SAEM and IMP
May 15, 2014 Robert Bauer RE: SAEM and IMP
May 15, 2014 Brian Sadler RE: SAEM and IMP
May 15, 2014 Emmanuel Chigutsa Re: SAEM and IMP
May 15, 2014 Bob Leary RE: SAEM and IMP
May 15, 2014 Joseph Standing RE: SAEM and IMP
May 16, 2014 Robert Bauer RE: SAEM and IMP
May 16, 2014 Bob Leary RE: SAEM and IMP
May 19, 2014 Robert Bauer RE: SAEM and IMP