Constrain PD values using a logistic transformation

5 messages · 3 people · Latest: Jul 06, 2010
Dear NMusers,

I am trying to model some PD data that have a lower bound of zero and an upper bound of 100. I was wondering how to implement this restriction, and whether it is possible to use the general logistic transformation in the $ERROR block shown below:

$ERROR
IPRE=A(1)
LT=LOG(IPRE/(100-IPRE))+ERR(1)
Y=(100*EXP(LT))/(1+EXP(LT))

If this is appropriate, do I understand correctly that this is NOT a transform-both-sides approach, i.e., DV stays in its original (natural) form?

Finally, the logistic transformation extends from -∞ to +∞. However, the dataset does contain a small number of values at exactly 0 and 100 (five zeros and a couple of 100s in a dataset of about 700 observations). Do these few extreme values cause problems when the LT term is back-transformed as above?

Any other methods, and references to papers that use these types of constraints, would be greatly appreciated.

Warm regards and thanks in advance...MNS
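[Editorial sketch, not part of the original thread: the $ERROR transform pair above can be checked numerically. A minimal Python version (the thread's actual model code is NM-TRAN) makes the boundary problem concrete.]

```python
import math

def logit_0_100(x):
    """Logit transform for data bounded on (0, 100); assumes 0 < x < 100."""
    return math.log(x / (100.0 - x))

def inv_logit_0_100(lt):
    """Back-transform from the real line to (0, 100), as in Y above."""
    return 100.0 * math.exp(lt) / (1.0 + math.exp(lt))

# Interior values round-trip cleanly:
print(inv_logit_0_100(logit_0_100(25.0)))  # ≈ 25.0
# Boundary values are the problem: logit_0_100(0.0) raises a math domain
# error (log of zero) and logit_0_100(100.0) divides by zero, which is why
# the handful of 0s and 100s in the dataset needs special handling.
```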
Hi Mahesh,

If you plan to use one of the approximate likelihood methods, e.g., FO or FOCE, you may prefer to transform the data and use an additive model. In other words, transform the data according to Yobs = LOG(Xobs/(100-Xobs)) and use Y = 100 * LOG(IPRE/(100-IPRE))+ERR(1), where Xobs is the observed data on the restricted range.

Since you have some data at the extremes, you may want to extend the range used for the extended logit to (-0.5, 100.5) or something similar. Otherwise you'll end up with under- or overflows.

Regarding other transformations, anything that transforms from a bounded interval to the real line is potentially fair game. For example, you could use probit or complementary log-log transformations extended to (0, 100). Another approach would be to use a beta distribution extended to (0, 100) instead of (0, 1) for the likelihood. Such an approach is described for a model of ADAS-cog scores as a function of time (see the Alzheimer's disease progression model at http://opendiseasemodels.org, specifically the model used for the "raw" scores).

Cheers,
Bill Gillespie
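[Editorial sketch, not part of the original thread: the extended-logit suggestion above, with the bounds -0.5 and 100.5 taken from the message, in minimal Python rather than NM-TRAN.]

```python
import math

L, U = -0.5, 100.5  # extended bounds suggested above

def extended_logit(x, lo=L, hi=U):
    """Map x in (lo, hi) onto the real line."""
    x_trans = (x - lo) / (hi - lo)
    return math.log(x_trans / (1.0 - x_trans))

def inv_extended_logit(z, lo=L, hi=U):
    """Back-transform from the real line to (lo, hi)."""
    return lo + (hi - lo) / (1.0 + math.exp(-z))

# Observed boundary values of 0 and 100 now map to finite numbers,
# so the handful of extreme observations no longer under/overflows:
print(extended_logit(0.0))    # finite (≈ -5.3)
print(extended_logit(100.0))  # finite (≈ +5.3)
print(inv_extended_logit(extended_logit(37.5)))  # ≈ 37.5, round-trips
```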
Dear Dr. Gillespie,

Your insight is greatly appreciated. I have three follow-up questions:

1. Is there a typo in the equation Y = 100 * LOG(IPRE/(100-IPRE))+ERR(1)? I guess it should be OK to write IPRE=LOG(A(1)/(100-A(1))) and Y=IPRE+ERR(1), so that I can compare IPRE vs. DV in the diagnostics.

2. By extended logit, do you mean IPRE=LOG((A(1)+0.5)/(100.5-A(1)))? I guess Yobs would also have to be computed using LOG((Xobs+0.5)/(100.5-Xobs)).

3. I have read some of your elegant work on the AD progression model. However, I am not sure how to implement the beta distribution in NONMEM. It appeared to me that a ratio of ADAS-cog divided by an ADAS-cog maximum of 70 was computed and then put in a logit transform. Could you kindly describe this method in a little more detail?

Interestingly, I haven't received any other responses from NMusers. I read a couple of papers in the past few days, and all of them appear to ignore this problem; the commonly used additive residual error model is usually applied to scores with limits as endpoints.

Thank you,
Mahesh
Hi Mahesh,
On Jul 4, 2010, at 12:20 AM, Samtani, Mahesh [PRDUS] wrote:

> Is there a typo in the equation Y = 100 * LOG(IPRE/(100-IPRE))+ERR(1)? I guess it should be OK to write IPRE=LOG(A(1)/(100-A(1))) and Y=IPRE+ERR(1), so that I can compare IPRE vs. DV in the diagnostics.

Yes, the multiplication by 100 was incorrect. Your approach looks OK to me.

> By extended logit, do you mean IPRE=LOG((A(1)+0.5)/(100.5-A(1)))? I guess Yobs would also have to be computed using LOG((Xobs+0.5)/(100.5-Xobs)).

A general form for an extended logit is:

logit(x, U, L) = log(xTrans / (1 - xTrans)), where xTrans = (x - L) / (U - L)

Substitute L = -0.5 and U = 100.5 for your case.

> I am not sure how to implement the beta distribution in NONMEM. It appeared to me that a ratio of ADAS-cog divided by an ADAS-cog maximum of 70 was computed and then put in a logit transform. Could you kindly describe this method in a little more detail?

In the ADAS-cog model, the residual variation in ADAS-cog / 70 is beta distributed, but the conditional mean of ADAS-cog / 70 is related to the otherwise unconstrained model using a logit link. In other words, ADAS-cog / 70 on the ith occasion in the jth patient (y_{ij}) is described by:

y_{ij} ~ Beta(mu_{ij} * tau, (1 - mu_{ij}) * tau)
logit(mu_{ij}) = f(x_{ij}, theta_j)

where

E(y_{ij}) = mu_{ij}
Var(y_{ij}) = mu_{ij} * (1 - mu_{ij}) / (tau + 1)
x_{ij} = independent variables, e.g., dose, time, ...
theta_j = parameter values for the jth patient
f = model function with range over the real line

I have not implemented the beta density in NONMEM; we used BUGS. I imagine a workable approach would be to use an approximation to the beta function, e.g., Stirling's approximation. The rest of the density is easily programmed. You would need the LIKELIHOOD option in the $ESTIMATION record. Others on the list may have more direct experience or better ideas.

> Interestingly, I haven't received any other responses from NMusers. I read a couple of papers in the past few days, and all of them appear to ignore this problem. The commonly used additive residual error model is usually applied to scores with limits as endpoints.

In cases where one's primary objective is parameter estimation or prediction of the population mean response, use of an unconstrained model is often a reasonable approach, particularly if most of the responses are far from the boundaries. So I wouldn't automatically criticize a modeler for failing to constrain their model. On the other hand, when the objective is to simulate individual patients, I prefer to use a constrained model, so that all simulated data are within the possible range and I don't have to resort to post-hoc truncation and the potential bias it introduces.
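[Editorial sketch, not part of the original thread: the beta parameterization above, Beta(mu*tau, (1-mu)*tau), implies E(y) = mu and Var(y) = mu*(1-mu)/(tau+1). A quick simulation with the Python standard library checks those moments; the values of mu and tau are arbitrary illustrations, not model estimates.]

```python
import random
import statistics

def sample_beta_mu_tau(mu, tau, n=200_000, seed=1):
    """Draw y ~ Beta(mu*tau, (1-mu)*tau) using the stdlib generator."""
    rng = random.Random(seed)
    return [rng.betavariate(mu * tau, (1.0 - mu) * tau) for _ in range(n)]

mu, tau = 0.3, 25.0            # illustrative values only
ys = sample_beta_mu_tau(mu, tau)

print(statistics.mean(ys))      # ≈ mu = 0.3
print(statistics.variance(ys))  # ≈ mu*(1-mu)/(tau+1) ≈ 0.00808
```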
Hello Mahesh,

Just a couple of other things to think about besides what Bill has described. In the statistics literature, such data are called bounded outcome scores (BOS). Sometimes the term "coarsened" is applied, generally referring to data that have many 'levels' (e.g., HAQ, various point scales, etc.). For VAS, the BOS is continuous, making things simpler.

As you have noted, continuous BOS data can take values at the upper and lower limits of the range. For this reason, these data often appear J- or U-shaped, and the means and medians (central tendency) might be consistently different across doses and times, which indicates a non-symmetric distribution.

Data not on the boundary can be transformed as Bill has indicated, and many transformations exist (as Bill also described): the logit, complementary log-log, probit, etc. The logit and complementary log-log are related through a transformation family called the Aranda-Ordaz family. Let Z be the data not on the boundary (scaled to (0, 1)); then this transformation is Z* = LOG[{(1-Z)^(-C) - 1}/C], where C=1 yields the logit and the limit C→0 yields the complementary log-log. The transformation is flexible, and the parameter C can be estimated from the data (see PAGE 18 (2009) Abstr 1463 [www.page-meeting.org/?abstract=1463]); this uncertainty can then be taken into account when computing uncertainty intervals back on the VAS scale. The flexibility helps with difficult distributions and will hopefully promote normality of the residual random effects, at least at the individual level. In the abstract above, the data supported a transformation that was different from both the logit and the complementary log-log.

Data on the boundary can be handled using a censored likelihood (see the abstract above), where the quantification limits used are the lowest and greatest observed non-boundary values. This avoids adding arbitrary constants to the data to expand the range so that a transformation can be applied to all the data. The problem with adding constants is that the choice of constant might influence the model fit through its interaction with the chosen transformation, over- or under-weighting certain data. Using censoring might be considered less arbitrary. If you do use a constant to expand the range, I would suggest a sensitivity analysis with respect to that choice.

In my opinion, the "transform both sides" method is not useful when modeling BOS data. I would model Z* = F + EPS(1) and not Z* = TRANSFORM(F) + EPS(1). Unlike biomarker or PK data, I do not see any a priori, pharmacologically based model that can be posited for the data on the VAS scale; no natural class of models presents itself such that parameter units need to be preserved relative to the VAS scale (unlike PK, with, e.g., CL/F in L/h). Transforming the model also induces extra nonlinearity, which makes the estimation problem harder to deal with. In addition, transforming only the data may allow the random effects to enter the model in a linear fashion, which has nice properties.

Hope that helps,
Matt
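[Editorial sketch, not part of the original thread: the Aranda-Ordaz family described above, in minimal Python. It assumes Z has already been scaled to (0, 1); C=1 recovers the logit exactly, and small C approaches the complementary log-log.]

```python
import math

def aranda_ordaz(z, c):
    """Aranda-Ordaz transform of z in (0, 1):
    Z* = log(((1 - z)**(-c) - 1) / c).
    c = 1 gives the logit; the limit c -> 0 gives the complementary log-log."""
    if c == 0.0:
        return math.log(-math.log(1.0 - z))  # complementary log-log limit
    return math.log(((1.0 - z) ** (-c) - 1.0) / c)

z = 0.7
print(aranda_ordaz(z, 1.0))   # equals log(z / (1 - z)), the logit
print(aranda_ordaz(z, 0.0))   # the complementary log-log
print(aranda_ordaz(z, 1e-6))  # nearly indistinguishable from c = 0
```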