RE: SAEM and IMP
Third attempt to send to nmusers. Ignore if you have already received this
message.
Bob:
First, my apologies if you are getting repeated e-mails from me. I am finding
that I need several attempts before the nmusers list-serv system publishes my
e-mails. It has something to do with messages exceeding 40000 characters, so I
need to cut off some of the trailing e-mail portions that have already been
published.
In my original investigations, the normal and t-distribution deviates were
generated with Box-Muller type methods, which of course did not work well, so
I switched to inverse CDF methods. For the normal distribution (DF=0), IMP
uses the modified GASDEV: when ISELECT=5 (Sobol), DINVNORM is used, and the
Box-Muller method is used when the Sobol method is not.
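A minimal sketch of that dispatch, for illustration only: it is not the GASDEV
source. The function names are hypothetical, a base-2 radical-inverse sequence
stands in for Sobol, and Python's NormalDist.inv_cdf stands in for DINVNORM.

```python
import math
import random
from statistics import NormalDist

def halton(index, base):
    """Radical-inverse value in (0, 1) for index >= 1; a stand-in for Sobol."""
    result, f = 0.0, 1.0 / base
    while index > 0:
        result += f * (index % base)
        index //= base
        f /= base
    return result

def normal_inverse_cdf(u):
    """Map one uniform in (0, 1) to one N(0,1) deviate (the DINVNORM role)."""
    return NormalDist().inv_cdf(u)

def normal_box_muller(u1, u2):
    """Map two uniforms to two N(0,1) deviates; requires 0 < u1 <= 1."""
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

def normal_deviates(n, quasi_random, seed=12345):
    """Return n N(0,1) deviates, choosing the transform by the uniform source."""
    if quasi_random:
        # One uniform in, one deviate out: the point order is carried over.
        return [normal_inverse_cdf(halton(i, 2)) for i in range(1, n + 1)]
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        u1 = 1.0 - rng.random()  # lies in (0, 1], which avoids log(0)
        out.extend(normal_box_muller(u1, rng.random()))
    return out[:n]
```

The inverse CDF branch is monotone in its single input, which is why it is the
natural partner for a quasi-random sequence. Box-Muller mixes two inputs
through a nonlinear map.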
Robert J. Bauer, Ph.D.
Vice President, Pharmacometrics, R&D
ICON Development Solutions
7740 Milestone Parkway
Suite 150
Hanover, MD 21076
Tel: (215) 616-6428
Mob: (925) 286-0769
Email: [email protected]<mailto:[email protected]>
Web: http://www.iconplc.com/
Quoted reply history
From: Bob Leary [mailto:[email protected]]
Sent: Monday, May 19, 2014 2:36 PM
To: Bauer, Robert; [email protected]<mailto:[email protected]>
Subject: RE: [NMusers] SAEM and IMP
Thanks, Bob. This indeed has been an interesting and thought provoking
discussion.
I too took another look at this. I thought, without really doing any analysis,
that the directionality bias from using independent univariate t-components
would get really bad as the dimensionality increased, because that is the way
it works with fat-tailed independent double exponentials (samples there tend
to fall disproportionately near the coordinate axes, and this gets arbitrarily
bad as the dimensionality increases without bound). But when I actually did
the analysis in the Student t case, it is not bad at all and simply goes to a
limit as the dimensionality increases. So in fact there may be some real
advantages, particularly in the Sobol quasi-random case, to doing it the way
you do with independent univariate t’s rather than using the multivariate t.
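For illustration, the two constructions can be compared with a small
simulation. This is only a sketch: the function names are hypothetical, and
the diagnostic (the mean of max|x_i| / ||x||) is an assumption chosen here to
show how strongly samples hug the coordinate axes. It is not the analysis
referred to above.

```python
import math
import random

def multivariate_t(rng, dim, df):
    """Radially symmetric: one shared chi-square scale for the whole vector."""
    w = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    s = math.sqrt(df / w)
    return [rng.gauss(0.0, 1.0) * s for _ in range(dim)]

def independent_t(rng, dim, df):
    """Each component is its own univariate t deviate with its own scale."""
    return [rng.gauss(0.0, 1.0) * math.sqrt(df / rng.gammavariate(df / 2.0, 2.0))
            for _ in range(dim)]

def axis_concentration(sampler, dim, df, n=5000, seed=1):
    """Mean of max|x_i| / ||x||: 1/sqrt(dim) is evenly spread, 1 is on an axis."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = sampler(rng, dim, df)
        norm = math.sqrt(sum(v * v for v in x))
        total += max(abs(v) for v in x) / norm
    return total / n

if __name__ == "__main__":
    for dim in (2, 8, 32):
        print(dim,
              axis_concentration(multivariate_t, dim, 2),
              axis_concentration(independent_t, dim, 2))
```

Printing the diagnostic for several dimensions shows how the independent
construction behaves relative to the radially symmetric one as the
dimensionality grows.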
So my original message to Emmanuel, discouraging him from using Sobol in
combination with the t-distribution, was based on the erroneous assumption
that IMP was using the multivariate t rather than independent univariate t’s.
So his suggestion is probably OK (except, as you noted, that an even value of
DF might be cleaner than DF=7 given the way tdev2 works). But there is still
the question of why you were seeing a bias with your original method before
going to tdev2: was it simply a different method of generating the same
distribution?
Along somewhat similar lines, I did notice that your routine GASDEV for
generating normal N(0,1) components (presumably used in the DF=0 case)
actually offers two different ways of doing it: Box-Muller and the inverse CDF
of a uniform 0-1. It is not clear which one you get when you simply specify
DF=0. The inverse CDF method is of course by definition OK for the
quasi-random case: the CDF of the random vectors generated with the Sobol
distribution using the inverse CDF method will simply be that Sobol
distribution. But for Box-Muller this certainly will not be the case, and it
is not clear what the discrepancy of the Box-Muller generated CDF will be
like. Box-Muller is obviously OK in the pseudorandom case, and will have the
same discrepancy as the inverse CDF method using pseudorandom uniform 0-1
inputs, but whether the CDF of a Box-Muller generated sequence of normal
random vectors using Sobol vector inputs is actually a low discrepancy
sequence (or even a relatively low discrepancy sequence) is not at all clear.
I know this has been looked at in 2 dimensions, and there Box-Muller
empirically looks OK, but I don’t know if anything has actually been proved.
The low discrepancy property tends to be fragile with respect to nonlinear
transformations. So it is possible that tdev2 actually does preserve this
property, or at least does relatively little damage to it compared to whatever
you were using before.
From: Bauer, Robert [mailto:[email protected]]
Sent: Monday, May 19, 2014 11:08 AM
To: Bob Leary; [email protected]<mailto:[email protected]>
Subject: RE: [NMusers] SAEM and IMP
Bob:
Yes, these are 1-dimensional t-distribution random deviates generated by TDEV2,
followed by scaling p sets of them with an offset vector and Cholesky matrix to
provide the mean and covariance. This provides the tails as you say (property
a), and is conveniently used in the Sobol process.
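As a rough sketch of that construction (not TDEV2 itself): the code below
assumes DF=2, where the t quantile has a closed form, and applies
x = mean + L z with a hand-written Cholesky factor L. All function names are
hypothetical. For DF=2 the variance is not finite, so the matrix acts as a
scale matrix rather than a true covariance.

```python
import math

def t2_inverse_cdf(u):
    """Closed-form quantile of Student t with 2 degrees of freedom, 0 < u < 1."""
    return (2.0 * u - 1.0) / math.sqrt(2.0 * u * (1.0 - u))

def cholesky_lower(a):
    """Lower-triangular L with L L^T = a, for symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def scaled_t_vector(uniforms, mean, chol):
    """x = mean + L z, where z holds one independent t deviate per dimension."""
    z = [t2_inverse_cdf(u) for u in uniforms]
    return [mean[i] + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i in range(len(mean))]

# Example: a 2-dimensional point built from one pair of uniforms.
chol = cholesky_lower([[4.0, 2.0], [2.0, 3.0]])
x = scaled_t_vector([0.9, 0.2], [1.0, -1.0], chol)
```

Because each dimension consumes exactly one uniform, a p-dimensional
quasi-random point maps directly to one p-dimensional sample.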
Your discussion on the properties of various t-distribution sample creation
techniques and their possible impact in importance sampling is interesting.
With that in mind, I just completed testing example6, which has an 8
dimensional eta space, and found no bias in the objective function evaluation,
or in parameter assessment when setting DF=2 or 4 (I have not tried others),
with or without using Sobol (RANMETHOD=3S2). Also, using the 1 dimensional
t-distribution random deviates retained the Sobol method’s stochastic noise
reduction ability in this example. When I used the multivariate t-distribution
algorithm, Sobol’s stochastic noise reduction was not as good, confirming what
you related earlier.
As you point out as well, property b (radial symmetry) may or may not be needed
depending on the posterior density. I am inclined to think that since the
posterior density is not going to be perfectly t-distributed or normally
distributed anyway, matching the sampling density to the posterior density
with respect to radial symmetry may be of less relevance. The more important
aspect may be to choose a sampling density that has long tails in cases where
the posterior density also has long tails, to promote the general efficiency
of the sampling density for that posterior density.
It is possible that because the sampling density is also dynamically scaled to
best fit the posterior density, this reduces any inefficiency in fitting the
posterior density that might occur from the sampling density not having radial
symmetry.
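To illustrate the point about tails, here is a toy 1-dimensional
importance-sampling comparison. Everything in it is an assumption made for
illustration: the long-tailed "posterior" is an unnormalized t density with
DF=4, the two sampling densities are N(0,1) and a t with DF=2, the samples come
from evenly spaced uniforms pushed through each inverse CDF, and efficiency is
measured by the effective sample size fraction (sum w)^2 / (n sum w^2).

```python
import math
from statistics import NormalDist

def t2_inverse_cdf(u):
    """Closed-form quantile of Student t with 2 degrees of freedom."""
    return (2.0 * u - 1.0) / math.sqrt(2.0 * u * (1.0 - u))

def t2_pdf(x):
    """Density of Student t with 2 degrees of freedom."""
    return (1.0 + x * x / 2.0) ** -1.5 / (2.0 * math.sqrt(2.0))

def normal_pdf(x):
    """Density of N(0,1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def target_unnormalized(x):
    """Hypothetical long-tailed posterior: t with DF=4, up to a constant."""
    return (1.0 + x * x / 4.0) ** -2.5

def ess_fraction(xs, proposal_pdf):
    """Effective sample size of the importance weights, as a fraction of n."""
    w = [target_unnormalized(x) / proposal_pdf(x) for x in xs]
    return sum(w) ** 2 / (len(w) * sum(v * v for v in w))

n = 20000
grid = [(i + 0.5) / n for i in range(n)]  # evenly spaced uniforms in (0, 1)
ess_normal = ess_fraction([NormalDist().inv_cdf(u) for u in grid], normal_pdf)
ess_t2 = ess_fraction([t2_inverse_cdf(u) for u in grid], t2_pdf)
print(ess_normal, ess_t2)
```

With the normal sampling density, the few samples far out in the tail carry
very large weights. The longer-tailed sampling density keeps the weights
bounded, which is the efficiency argument made above.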