RE: distribution assumption of Eta in NONMEM
I'd like to offer a slightly different point of view on the distributional
assumption question here.
When I hear people speak in terms of the distributional assumptions of some
estimation method, I think it's easy to jump to the conclusion that
the normal distribution assumption is just one of many possible, equally
justifiable distributional assumptions that could potentially be made, and
that if the normal distribution is the wrong one then the results from such
an estimation method would be wrong. This is what I used to think, but now I
believe this is wrong, and I'd like to keep others from wasting as much time
thinking along this path as I have.
From information theory, information is gained when entropy decreases. So if
you have data from some unknown distribution and you must make some
distributional assumption in order to analyze the data, you should choose the
highest-entropy distribution you can. This ensures that your initial
assumptions, the ones you make before you actually consider your data, are the
most uninformative you can make. This is the Principle of Maximum Entropy,
which is related to the Principle of Indifference and the Principle of
Insufficient Reason.
A normal distribution has the highest entropy of all real-valued distributions
that share the same mean and standard deviation. So if you assume your data
has some true SD, then the best distribution to assume is the normal
distribution. We should therefore not think of the normal distribution
assumption as one of many equally justifiable choices; it is really the
least-bad assumption we can make when we do not know the true distribution.
Even if normal is the wrong distribution, it still remains the best, by virtue
of being the least-bad, because it is the most uninformative assumption that
can be made (assuming some finite true variance).
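
As a rough check of the maximum-entropy claim above, here is a minimal Python
sketch that compares the closed-form differential entropies (in nats) of a few
common distributions, each scaled to the same standard deviation. The function
name entropies_at_sd and the particular set of comparison distributions are
only illustrative choices, not anything taken from NONMEM.

```python
import math

def entropies_at_sd(sd):
    """Closed-form differential entropies (nats) of several distributions,
    each parameterized so that its standard deviation equals `sd`."""
    return {
        # Normal: 0.5 * ln(2*pi*e*sd^2)
        "normal": 0.5 * math.log(2 * math.pi * math.e * sd**2),
        # Logistic with scale s has variance s^2*pi^2/3 and entropy ln(s) + 2
        "logistic": math.log(sd * math.sqrt(3) / math.pi) + 2,
        # Laplace with scale b has variance 2*b^2 and entropy 1 + ln(2*b)
        "laplace": 1 + math.log(2 * sd / math.sqrt(2)),
        # Uniform with width w has variance w^2/12 and entropy ln(w)
        "uniform": math.log(sd * math.sqrt(12)),
    }

result = entropies_at_sd(1.0)
# Print from highest to lowest entropy
for name, h in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} {h:.4f}")
```

For any SD you pass in, the normal entry should come out on top, since changing
the SD shifts every entry by the same ln(sd).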
In the real world we never know the true distribution, so it makes sense to
always assume a normal distribution unless we have some scientifically
justifiable reason to believe that some other distributional assumption would
be advantageous.
The Cauchy distribution is a different animal, though, since it has no finite
variance, and is therefore an even weaker assumption than the finite true SD of
a normal distribution. It would possibly be even better than a normal
distribution because its entropy is even higher (comparing the standard Cauchy
and standard normal). It would be very interesting if Cauchy distributions
could be used in NONMEM. Actually, the ratio of two independent N(0,1) random
variables is Cauchy distributed. Maybe this property could be used to trick
NONMEM into making a Cauchy (or nearly Cauchy) distributed random variable?
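
To illustrate the ratio property outside of NONMEM, here is a small Python
simulation sketch. It draws pairs of independent N(0,1) values, takes their
ratio, and compares the sample quartiles with those of the standard Cauchy
(-1, 0, 1). It also evaluates the two closed-form entropies, ln(4*pi) for the
standard Cauchy and 0.5*ln(2*pi*e) for the standard normal. The function name,
sample size, and seed are arbitrary assumptions made for the example.

```python
import math
import random
import statistics

def cauchy_from_normal_ratio(n, seed=0):
    """Return n draws of Z1/Z2, with Z1 and Z2 independent N(0,1)."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        num = rng.gauss(0.0, 1.0)
        den = rng.gauss(0.0, 1.0)
        if den != 0.0:  # guard against an (extremely unlikely) exact zero
            samples.append(num / den)
    return samples

samples = cauchy_from_normal_ratio(100_000)

# Quartiles are used because the Cauchy has no finite mean or variance,
# so the sample mean and SD are not meaningful summaries.
q1, q2, q3 = statistics.quantiles(samples, n=4)
print(f"sample quartiles: {q1:.3f}, {q2:.3f}, {q3:.3f} (Cauchy: -1, 0, 1)")

h_cauchy = math.log(4 * math.pi)                   # standard Cauchy, nats
h_normal = 0.5 * math.log(2 * math.pi * math.e)    # standard normal, nats
print(f"entropy: Cauchy {h_cauchy:.3f} vs normal {h_normal:.3f}")
```

This only illustrates the distributional fact; whether and how such a ratio
could be coded in a NONMEM control stream is not addressed here.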
Douglas Eleveld