RE: Error models for log-transformed data

From: Martin Bergstrand
Date: April 28, 2009
Source: mail-archive.com
Dear Kelong and NMusers,

We will try to answer the questions by Kelong Han and also make some additional comments on error models for log-transformed data.

>> 1. What is the rationale of fixing $SIGMA 1?

The error model suggested by Mats Karlsson could just as well be written with two estimated residual terms, ERR(1) and ERR(2) (i.e. estimating the variances of two SIGMAs). However, it is often convenient to fix the variance of SIGMA to 1 and instead estimate a scale factor for it. In the simple case of an additive error, this scale factor can be a single THETA(x). In the case of an additive + proportional error model on the normal scale, the scale factor is a function of two estimated parameters, THETA(x) and THETA(y), and the model prediction F:

    W = SQRT(THETA(x)**2 + THETA(y)**2/F**2)
    Y = LOG(F) + W*ERR(1)        [$SIGMA 1 FIX]

There are a number of practical reasons for estimating a scale factor for SIGMA (W) rather than SIGMA itself. The two reasons that come to mind immediately are to be able to calculate individual weighted residuals (see IWRES below) and to use the M3 or M4 methods for handling censored data (Beal SL, 2001).

    RES = DV - IPRED
    IWRES = RES/W

>> 2. What an error structure on the normal scale is this "double exponential error model" equivalent to?

The simple answer is that it is not equivalent to any model on the normal scale. It has some similarities to the combined additive and proportional error model, but it does not predict negative concentrations, and it assumes a slight bias in the model predictions due to the addition of M (Y = LOG(F+M)...). ERR(1) can be fairly well translated to a proportional error on the normal scale, but ERR(2) cannot be directly interpreted as an additive component.

>> 3. Compared to the simplest error model Y=LOG(F)+ERR(1), the two error models mentioned above contain additional THETA's. Are these additional THETA's accounted for in the calculation of the objective function value?
The addition of parameters to the error model seems to follow the chi-squared distribution (Silber HE et al, 2009), i.e. with the addition of one parameter, a drop in OFV of 3.84 corresponds to significance at the 5% level. The Karlsson model is nested with the simple additive error model (it reduces to Y = LOG(F) + THETA(x)*ERR(1) when THETA(y) = 0), and the "double exponential error model" could probably be said to be "almost nested". However, my personal opinion (Martin's) is that the development of a residual error model is best guided by the goodness-of-fit plots.

As a final remark to this answer, I would like to point out that all models suggested so far for approximating a combined additive and proportional error model (on the normal scale) for log-transformed data have drawbacks. I have already pointed out that the "double exponential error model" introduces a bias due to the addition of the parameter M. The model suggested by Mats Karlsson, on the other hand, is only a good approximation when F > THETA(y). If F << THETA(y), the approximation will give rise to increasing mean absolute error and unrealistic predictions (Y). From personal experience (Martin's), this is primarily a problem in simulations.

One might argue that the combined additive and proportional error model on the normal scale also has its drawbacks, especially since it is often applied to bioanalytical data for which non-random censoring of estimated negative concentrations is performed.

Kind regards,

Martin Bergstrand, Andrew Hooker and Joakim Nyberg

-----------------------------------------------
Department of Pharmaceutical Biosciences,
Uppsala University
-----------------------------------------------
P.O. Box 591
SE-751 24 Uppsala
Sweden
-----------------------------------------------
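
The behaviour of the scale factor W described in this message can be checked numerically. The following is a minimal Python sketch, not NONMEM code. The function names and the example values THETA(x) = 0.1 and THETA(y) = 1 are assumptions chosen purely for illustration. It evaluates W over a range of predictions F, computes an IWRES on the log scale, and confirms the 5% level quoted for a 3.84 drop in OFV with one added parameter.

```python
import math


def log_scale_sd(f, theta_x, theta_y):
    """W = SQRT(THETA(x)**2 + THETA(y)**2/F**2), the residual SD on the log scale."""
    return math.sqrt(theta_x ** 2 + theta_y ** 2 / f ** 2)


def iwres(log_dv, log_ipred, w):
    """IWRES = RES/W with RES = DV - IPRED, both already on the log scale."""
    return (log_dv - log_ipred) / w


def p_value_one_extra_parameter(delta_ofv):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(delta_ofv / 2.0))


# Hypothetical parameter values, for illustration only.
THETA_X = 0.1  # plays the role of THETA(x)
THETA_Y = 1.0  # plays the role of THETA(y), in the units of F

# W stays near THETA(x) for large F and grows rapidly once F falls below THETA(y).
for f in (100.0, 10.0, 1.0, 0.1):
    w = log_scale_sd(f, THETA_X, THETA_Y)
    print(f"F = {f:6.1f}  W = {w:8.4f}")

# IWRES for a hypothetical observation of 110 against a prediction of 100.
w = log_scale_sd(100.0, THETA_X, THETA_Y)
print("IWRES:", iwres(math.log(110.0), math.log(100.0), w))

# Significance level associated with a 3.84 drop in OFV for one added parameter.
print("p-value:", p_value_one_extra_parameter(3.84))
```

With these assumed values, W is close to THETA(x) when F is well above THETA(y), but at F = 0.1 it is about 10 on the log scale, which illustrates why simulated values become unrealistic when F << THETA(y).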
Quoted reply history
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Han, Kelong
Sent: 28 April 2009 00:16
To: [email protected]
Subject: [NMusers] Error models for log-transformed data

Dear NMusers,

I am working on a PK model using log-transformed data. I have read previous discussions on NMusers regarding this, and they are really helpful, but I am still a little bit confused about the following questions. I would greatly appreciate it if someone could make it clear:

1. Dr. Mats Karlsson suggested Y=LOG(F)+SQRT(THETA(x)**2+THETA(y)**2/F**2)*ERR(1) with $SIGMA 1 FIX as an equivalent error structure to the additive+proportional error model on the normal scale. What is the rationale of fixing $SIGMA 1?

2. Dr. Stu Beal and Dr. William Bachman suggested the "double exponential error model": Y = LOG(F+M) + (F/(F+M))*ERR(1) + (M/(F+M))*ERR(2) without fixing $SIGMA. The Goodness-of-Fit plot looks slightly better using this error model in my study. What an error structure on the normal scale is this "double exponential error model" equivalent to?

3. Compared to the simplest error model Y=LOG(F)+ERR(1), the two error models mentioned above contain additional THETA's. Are these additional THETA's accounted for in the calculation of the objective function value? This especially bothers me because the "double exponential error model" leads to a lower OFV compared to Y=LOG(F)+ERR(1) (also slightly better Goodness-of-Fit plot) in my study.

Sorry for the length. Would anyone please give me some explanations or references?

Thanks a lot!

Kelong Han
PhD Candidate
Apr 27, 2009 Kelong Han Error models for log-transformed data
Apr 28, 2009 Martin Bergstrand RE: Error models for log-transformed data
Apr 29, 2009 Kelong Han RE: Error models for log-transformed data
Apr 30, 2009 Martin Bergstrand RE: Error models for log-transformed data
May 01, 2009 Kelong Han RE: Error models for log-transformed data