Slow Gradient Method
Date: Thu, 31 May 2001 09:51:15 -0700 (PDT)
From: stuart@c255.ucsf.edu
>From Matt Hutmacher:
>I am trying to use the CENTERING option of the ESTIMATION statement for a
>mixture model. I get a statement at the end of the report file that says
>"CENTERED METHODS MUST USE SLOW GRADIENT METHOD WITH MIXTURE MODEL".
>Can someone tell me how to use this method and what it means/does?
As some may know, there are undocumented and unsupported features in NONMEM which are not intended for the general user and which should not interfere with the general use of the program. We make no apology for this.
There are other arcane features which will pop into view on rare occasions. Please feel quite free to contact the NONMEM User Support Group when this happens. It seems that Matt has stumbled on one of these occasions. The meaning of the "SLOW gradient method" is one which the NONMEM user can essentially ignore. However, Matt will need to respond to the message; he should simply include the option SLOW in the $ESTIMATION record. (Matt, please ask yourself once again why you wish to use the CENTERING option with a mixture model.)
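For illustration, a $ESTIMATION record of roughly the following form would answer the message. Only the CENTERING and SLOW options come from the discussion here; the METHOD, MAXEVAL, and PRINT settings are assumptions chosen for the example and are not taken from Matt's control stream.

    ; METHOD, MAXEVAL and PRINT values are illustrative assumptions
    $ESTIMATION METHOD=CONDITIONAL CENTERING SLOW MAXEVAL=9999 PRINT=5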
Commenting on some of the NM-Users discussion which ensued from Matt's question, in the order it seems to have been generated:
>From Bill Bachman:
>There are essentially two ways NONMEM obtains gradients needed for
>performing the pseudo-Newton minimization. One involves only numerical
>derivatives, the other involves a combination of analytical and numerical
>derivatives. As you might imagine, the first is often slower than the
>second. It is therefore used less often. It is used when NUMERICAL is
>specified.
This is correct. Note that here Bill is saying that there is a choice between numerical derivatives and analytical ones concerning the way gradients of the objective function surface are computed.
There are second derivatives with respect to eta which are a part of the Laplacian objective function itself. These can often be computed analytically. If the NUMERICAL option is included in the $ESTIMATION record, these second derivatives are computed numerically. Then, as Bill states, it so happens that the SLOW option is always also used. With NM-TRAN, however, this choice should be transparent to the user (there will be no message such as the one Matt experienced).
The NUMERICAL option is documented. It is necessary to use this option in certain cases. NM-TRAN will provide messages that indicate that the option should be used when it is mistakenly omitted. Unless one is using the option in the cases where it is necessary to do so, or unless one is simply experimenting with this option, there is no need to use it.
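As a sketch, a Laplacian run that requests numerical second eta-derivatives might be coded as follows. The METHOD and MAXEVAL settings are assumptions for the example. SLOW is deliberately not coded, on the understanding stated above that NM-TRAN makes that choice transparent to the user.

    ; illustrative only; MAXEVAL value is an assumption
    $ESTIMATION METHOD=CONDITIONAL LAPLACIAN NUMERICAL MAXEVAL=9999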
>From Stephen Duffull:
>Based on the discussions I am a little unsure what the value of the slow
>gradient method is. I would have thought that analytical derivatives would
>be more accurate and perhaps more stable than numerical - and therefore I am
>not sure why a potentially slower and perhaps less reliable method is of
>interest to us? Could you explain where the numerical method might be
>valuable?
Analytical derivatives can be more accurate, more stable, and faster to compute, as Stephen suggests. But in some cases, e.g. when NUMERICAL is used, and also in Matt's case, it just so happens that NONMEM does not use analytical derivatives to compute gradients of the objective function surface. This should be essentially of no concern to the user.
>I presume for situations where the model can only be described as ODEs then
>there might be little choice - but otherwise I can't see the advantage.
In fact, NONMEM is unaware when PREDPP is using DE's (differential equations), and NONMEM's choice as to whether or not to use analytical derivatives to compute gradients of the objective function surface is unaffected.
>From Niclas Jonsson:
>I don't know if the SLOW method uses numerical derivatives or not but it
>is perhaps important to point out that the SLOW option on the $ESTIMATION
>is not the same as the NUMERICAL option. The NUMERICAL option requests
>that the second derivatives for the LAPLACE method are computed
>numerically, which, I presume, is quicker and sometimes more tractable
>than analytical second derivatives.
Here, Niclas emphasizes the same distinction I have tried to make above between the SLOW and NUMERICAL options. He suggests moreover that the use of the NUMERICAL option can sometimes result in quicker computations. Indeed this can happen, but the circumstances when this can happen are rare, and I think the user can fairly safely assume that where possible, NUMERICAL should be avoided.
>As I recall it, the SLOW option gives you the version of FOCE that was
>implemented in NONMEM IV. In one of the beta versions of NONMEM V there
>was an improvement to the FOCE algorithm that made it about three times
>faster (my own, hardly remembered, benchmarks). The new method could,
>however, not handle certain cases, i.e. CENTERing, mixture model and when
>the NUMERICAL option is used.
Indeed, with NONMEM IV, the only choice was to use the SLOW gradient method, and so no distinction was made. The newer and faster method may be used in most situations, including ones where the option CENTERING is used, except when there is also a mixture model (Matt's situation). The newer and faster method is the default method.
Stuart Beal