Re: Bootstrap resampling!

From: Paul Williams Date: March 27, 2001 technical Source: cognigencorp.com
From: "Paul Williams" <pwilliams@uop.edu>
Subject: Re: Bootstrap resampling!
Date: Tue, 27 Mar 2001 09:18:14 -0800

Less synthetic percentile bootstrap:

There are two approaches to applying the bootstrap method. The first is the standard bootstrap, which assumes some type of distribution (usually the normal distribution) throughout the entire modeling process (both development and model checking) and therefore relies on the formulae used to calculate means, standard errors, 95% CIs, etc. The percentile bootstrap is less reliant on formulae that are a function of an assumed distribution because, in the end, it ranks the element(s) of interest, takes the 2.5th percentile element and the 97.5th percentile element, and constructs the 95% confidence interval for that element as the interval between these two.

For example, I have previously been interested in a population PK (PPK) model for an antifungal agent: Cl = theta1 * clcr + theta2. I was interested in the 95% CI for theta1. I constructed 1000 bootstrap data sets and estimated the model for each of them. Rather than plugging the 1000 values into a formula that assumes a normal distribution to calculate the standard error and then the 95% CI, I ranked the 1000 values for theta1 and took the 25th as the lower boundary of the 95% CI and the 975th as the upper boundary. It should be noted that when using the percentile method one must construct at least 1000 data sets and re-estimate the model on all 1000.

So I call this a "less synthetic" approach because (1) it is less reliant on formulae and underlying assumptions about distributions and (2) the intervals come directly from a ranking of the estimates, not from a series of calculations. The percentile bootstrap can have the advantage of avoiding nonsense estimates, which may sometimes come about when a normal distribution is assumed.
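The ranking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1000 "bootstrap estimates" are simulated from a lognormal distribution rather than obtained by refitting a model, and the function names `percentile_ci` and `normal_ci` and the lognormal parameters are hypothetical, not taken from the original analysis.

```python
import random
import statistics


def percentile_ci(estimates, level=0.95):
    """Percentile interval: rank the estimates and read off two of them."""
    ranked = sorted(estimates)
    n = len(ranked)
    alpha = 1.0 - level
    lower_rank = max(1, round(n * alpha / 2))        # 25 when n = 1000
    upper_rank = min(n, round(n * (1 - alpha / 2)))  # 975 when n = 1000
    # Ranks are 1-based, list indices are 0-based.
    return ranked[lower_rank - 1], ranked[upper_rank - 1]


def normal_ci(estimates):
    """Normal-theory interval: mean +/- 1.96 * SD of the bootstrap estimates."""
    mean = statistics.mean(estimates)
    sd = statistics.stdev(estimates)
    return mean - 1.96 * sd, mean + 1.96 * sd


# ASSUMPTION: stand-in for 1000 re-estimated values of a positive, skewed
# parameter; in practice each value comes from refitting the model to one
# bootstrap data set.
rng = random.Random(0)
boot_estimates = [rng.lognormvariate(-1.5, 1.0) for _ in range(1000)]

pct_low, pct_high = percentile_ci(boot_estimates)
norm_low, norm_high = normal_ci(boot_estimates)
print("percentile 95% CI:   ", pct_low, pct_high)
print("normal-theory 95% CI:", norm_low, norm_high)
```

With right-skewed positive estimates such as these, the normal-theory lower limit can fall below zero, while the percentile limits are always two of the bootstrap estimates themselves and so stay inside the range the estimates can take.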
For example, I have occasionally had results indicating that the lower boundary of a 95% CI for a coefficient of variation for inter-subject variability was less than 0, which does not make sense. This won't happen with the percentile bootstrap.

Why do I say internal validation?

I would divide model validation into two types: (1) external validation, which is the most stringent approach, and (2) internal validation methods such as the bootstrap or cross-validation. Internal validation methods are attractive when it is difficult to obtain a new data set for an external validation (such as in pediatrics or rare diseases) or when drug approval should be done expeditiously (such as treatments for AIDS). The FDA "Guidance for Industry: Population Pharmacokinetics" has called these methods internal validation (see page 16 of the Guidance) and has recognized their appropriateness. Although the process is intriguing, I don't want to go into the exact mechanism used to internally validate a model but would refer you to Efron and Gong, "A leisurely look at the bootstrap, the jackknife and cross-validation," The American Statistician, vol. 37, pp. 36-48, 1983, and Ene Ette's paper "Stability and performance of a population pharmacokinetic model," J Clin Pharmacology 1997;37:486-495. So, I call these internal validation because the validation process comes from the data that were originally used to estimate the model.

Bias Correction: please see Efron's text "An Introduction to the Bootstrap", chapter 10 [ISBN = 0-412-04231-2].

Why would I ask these questions? I was being eclectic and interested in Ganesh's approach to the bootstrap.

A comment for the good and welfare of all: It does not seem to me that bootstrapping residuals is the appropriate approach for population PK or PD modeling. I have looked at this, and the within-subject residuals are correlated for population models. The exception would be if cross-sectional sampling was done.
So it seems to me that one is restructuring the entire data set when the residuals from subject A are assigned to subject B. Also, sampling of residuals assumes that we know the population model(s) with certainty. I am not sure one can make such an assumption. The safe approach is to randomly sample individuals (with all of the data associated with each individual) with replacement to create bootstrap data sets.

Cheers to all!
Paul
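The subject-level resampling recommended in the message can be sketched as follows. This is a minimal illustration, not the procedure actually used in the analysis: the record layout (`id`, `time`, `conc`), the toy data, and the function name `resample_subjects` are all hypothetical. Each drawn subject keeps every one of its observations and receives a fresh ID, so a subject drawn twice is treated as two separate individuals in the bootstrap data set.

```python
import random
from collections import defaultdict


def resample_subjects(records, rng):
    """Draw subjects with replacement, keeping each subject's records together."""
    by_subject = defaultdict(list)
    for rec in records:
        by_subject[rec["id"]].append(rec)

    subject_ids = list(by_subject)
    # Same number of subjects as the original data set, sampled with replacement.
    drawn = rng.choices(subject_ids, k=len(subject_ids))

    boot = []
    for new_id, old_id in enumerate(drawn, start=1):
        for rec in by_subject[old_id]:
            # Relabel so repeated draws of one subject stay distinct individuals.
            boot.append({**rec, "id": new_id, "orig_id": old_id})
    return boot


# ASSUMPTION: toy concentration-time data for three subjects.
data = [
    {"id": "A", "time": 1.0, "conc": 4.2},
    {"id": "A", "time": 4.0, "conc": 2.1},
    {"id": "B", "time": 1.0, "conc": 5.0},
    {"id": "B", "time": 4.0, "conc": 3.3},
    {"id": "B", "time": 8.0, "conc": 1.4},
    {"id": "C", "time": 2.0, "conc": 3.9},
]

boot_data = resample_subjects(data, random.Random(1))
for row in boot_data:
    print(row)
```

Because whole individuals are resampled, the within-subject correlation of the observations is carried into every bootstrap data set unchanged, and no residuals are moved from one subject to another.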
Mar 23, 2001 Ganesh R Iyer Bootstrap resampling!
Mar 23, 2001 Paul Williams Re: Bootstrap resampling!
Mar 24, 2001 Nick Holford Re: Bootstrap resampling!
Mar 24, 2001 Harry Mager Hm Antwort: Bootstrap resampling!
Mar 27, 2001 Paul Williams Re: Bootstrap resampling!
Mar 27, 2001 Nick Holford Re: Bootstrap resampling!
Mar 27, 2001 Stephen Duffull RE: Bootstrap resampling!
Mar 28, 2001 Ludger Banken RE: Bootstrap resampling!
Mar 29, 2001 Nick Holford Re: Bootstrap resampling!
Mar 29, 2001 Leonid Gibiansky RE: Bootstrap resampling!
Mar 29, 2001 Diane Mould Re: Bootstrap resampling!
Mar 29, 2001 Leonid Gibiansky RE: Bootstrap resampling!
Mar 29, 2001 Kenneth G. Kowalski RE: Bootstrap resampling!
Mar 29, 2001 Jogarao Gobburu Re: Bootstrap resampling!
Mar 29, 2001 Nick Holford Re: Bootstrap resampling! -> Randomization test
Mar 29, 2001 Leonid Gibiansky RE: Bootstrap resampling! -> Randomization test
Mar 29, 2001 Kenneth G. Kowalski RE: Bootstrap resampling! -> Randomization test
Mar 29, 2001 Jogarao Gobburu Re: Bootstrap resampling! -> Randomization test
Mar 29, 2001 Kenneth G. Kowalski RE: Bootstrap resampling! -> Randomization test
Mar 30, 2001 Stephen Duffull RE: Bootstrap resampling! -> Randomization test
Mar 30, 2001 Nick Holford Re: Bootstrap resampling! -> Randomization test
Mar 30, 2001 Stephen Duffull RE: Bootstrap resampling! -> Randomization test
Mar 30, 2001 Mats Karlsson Re: Bootstrap resampling!
Mar 30, 2001 Lewis B. Sheiner Re: Bootstrap resampling! -> Randomization test
Apr 07, 2001 Smith Brian P Re: Bootstrap resampling! -> Randomization test