RE: SAEM and IMP
Bob, thanks for your reply and pointer to the source code for tdev2 . I have
looked at it but perhaps you can clarify what the source of tdev2 is
and how it is used (I cannot see any calls made to it in your source code, so I
am not really
sure how it is being used). I can see for DF =1, tdev2 returns a Cauchy
distributed random variable, which is the 1D t-distribution for DF=1, and for
t=2 it looks like it returns the 1D
t-distribution for DF=2. For the other DF values it is not so clear, but I am
guessing it returns a 1D t-distribution with the specified DF.
If you simply assemble a p-dimensional sample vector with components which are
independent 1D t random variables with a specified DF , indeed it will avoid
the dimensional mismatch
and work OK without bias in the QR case, but it will not be a true
multivariate t-distribution (the components of a multivariate t-distribution
are not independent).
The usual argument for using a multivariate t as an importance sampler is that
a) It has fatter tails than a multivariate normal
b) It preserves the radial symmetry of the multivariate normal in the case
where the covariance matrix is the unit matrix (i.e. the density is only a
function of the distance from the origin)
Assembling a p-dimensional sample vector whose individual components are 1D t
distributions with a specified DF satisfies a), but not b). It’s a perfectly
valid importance sampling distribution, and indeed has been suggested in the
literature as much simpler alternative to the multivariate t, particularly in
low dimensions, but it is not a multivariate t (which is the more usual meaning
of ‘t-distribution’ in the multivariate importance sampling context) . It is
not radially symmetric (which may or may not be important, depending on the
posterior multidimensional shape and
the dimensionality. There is a directionality bias which increases with
dimensionality)
The documentation is somewhat ambiguous –
DF=4
The proposal density is to be t distribution with 4 degrees of freedom.
Based on the code I have seen, I now guess this means that each component of
the proposal is an independent t random variable with DF=4.
But I am only guessing – perhaps you could clarify what distribution is really
being referred to by ‘t-distribution’ in the p-dimensional case – multivariate
p-dimensional t with specified DF or
Cartesian product of p independent 1D t random variables with specified DF?
Quoted reply history
From: [email protected] [mailto:[email protected]] On
Behalf Of Bauer, Robert
Sent: Thursday, May 15, 2014 9:57 PM
To: [email protected]
Subject: RE: [NMusers] SAEM and IMP
Bob:
Yes, when I was developing the code for using the Sobol method in NONMEM, I
too found that using the t-distribution with Sobol/Quasi normal method resulted
in biased assessments of the objective function, when I used the unmodified
standard technique of generating t-samples. So, I experimented with various
alternative methods like you did, and the final algorithms used in NONMEM 7.2
and 7.3 appear to avoid the bias, at least when I tried some simple models,
such as the example1 problem in the NONMEM ..\examples directory. For such a
problem, I tried the following DF settings, with and without using the SOBOL
method (RANMETHOD=3 (default), or RANMETHOD=3S2), DF=0, 1, 2, 3, 4, 5, 6, 7, 8,
10, 11, and 12. I think beyond 12 DF is sufficiently normal like, and not much
use.
In all cases, the OBJF was within 1 unit of each other, and always within a STD
of the Monte Carlo noise to the OBJF (I used EONLY=1, 20 iterations to obtain
OBJFV replicates for each setting). By the way, the .cnv file publishes the
mean and standard deviation of the last CITER objective function values (see
nm730.pdf manual on root.cnv).
Certainly it is worth assuring this lack of bias under other kinds of, and more
complex, models, just to make sure, so it is worth while for modelers to use
caution when mixing Sobol with DF>0.
I should point out that odd-number DF’s such as 7 makes the corrective
algorithm for Sobol with t-distributions work harder, whereas even-numbered
DF’s are a little cleaner. So I recommend even-numbered DF’s, or 1, up to 10
to work with Sobol, at least as far as my algorithm is concerned (so
DF=1,2,4,6,8,10 seem most efficient to use with Sobol).
The algorithm I use for t-distributions is in the subroutine TDEV2, and is in
..source\general.f90. I publish it because I borrowed freely from the
literature, and I consider it therefore “open-source”.
Just a note for orienting you to the code, IRANM=5 refers to the Sobol method.
I am afraid my theoretical expertise in the Sobol/Quasi random methods is not
as good as yours, so please feel free to review the code, and perhaps let me
and the modeling community know if there may be any potential pitfalls in using
this t-distribution algorithm with Sobol. I am also happy to consider
improvements you may come up with on this matter.
Robert J. Bauer, Ph.D.
Vice President, Pharmacometrics, R&D
ICON Development Solutions
7740 Milestone Parkway
Suite 150
Hanover, MD 21076
Tel: (215) 616-6428
Mob: (925) 286-0769
Email: [email protected]<mailto:[email protected]>
Web: http://www.iconplc.com/