Re: Confidence intervals of PsN bootstrap output
Dear Norman,
As long as you say which bootstrap runs you used in your calculations, and can motivate why, it's really up to you - there's no gold standard as such. Personally, as a rule of thumb, I only use bootstrap runs that have minimized successfully when calculating the percentile confidence intervals. One could also choose to exclude runs that fail the covariance step, have estimates close to boundaries, or have zero gradients (etc.), but it really depends on what you would like to show. For example, the proportion of runs that fail the covariance step can sometimes be a useful indicator of model stability (although opinions vary on the subject).
PsN is configurable in terms of which runs it uses - have a look at the bootstrap documentation (specifically, the various "skip" flags). You could accept its output at face value, provided you know how it was arrived at, or you could choose to do your own calculations from raw_results1.csv.
I would also echo the advice of others regarding Excel - it's not really the best tool for the job. You'd be better off with something like R - it's well worth the time spent getting to know it.
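To make the "do your own calculations" route a little more concrete, here is a minimal sketch in Python (R would do equally well). It assumes that the first data row of raw_results1.csv holds the estimates from the original dataset, that there is a column named minimization_successful coded as 1 for successful runs, and that the parameter of interest sits in a column called CL. Those names are assumptions for illustration, so check them against your own file. The two interpolation rules shown correspond to Excel's PERCENTILE/PERCENTILE.INC and PERCENTILE.EXC, which is one reason hand calculations can differ slightly between tools.

```python
import csv
import math


def percentile(values, p, method="inclusive"):
    """Return the p-th quantile (0 < p < 1) by linear interpolation.

    "inclusive" uses rank h = (n - 1) * p + 1 (Excel PERCENTILE and
    PERCENTILE.INC, R quantile type 7). "exclusive" uses h = (n + 1) * p
    (Excel PERCENTILE.EXC, R quantile type 6). Ranks outside 1..n are
    clamped to the smallest/largest value, as R does; Excel returns an
    error there instead.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("no values to summarise")
    h = (n - 1) * p + 1 if method == "inclusive" else (n + 1) * p
    h = min(max(h, 1), n)
    lo = math.floor(h)
    hi = min(lo + 1, n)
    # Ranks are 1-based, list indices are 0-based.
    return xs[lo - 1] + (h - lo) * (xs[hi - 1] - xs[lo - 1])


def percentile_ci(lines, param, lower=0.05, upper=0.95,
                  method="inclusive", flag_column="minimization_successful"):
    """Percentile interval for one parameter from raw-results CSV lines.

    Assumptions (check against your own file): the first data row is the
    original-dataset fit and is dropped; flag_column is "1" for runs that
    minimized successfully; param is the name of the parameter column.
    Returns (lower bound, upper bound, number of runs kept).
    """
    rows = list(csv.DictReader(lines))
    samples = rows[1:]  # drop the original-dataset estimates
    kept = [float(r[param]) for r in samples
            if r[flag_column].strip() == "1"]
    return (percentile(kept, lower, method),
            percentile(kept, upper, method),
            len(kept))
```

Opening the file with open("raw_results1.csv", newline="") and passing the handle to percentile_ci(f, "CL") would give the 5% and 95% bounds together with the number of runs kept, which is worth reporting alongside the interval. Switching method to "exclusive" shows how much of any discrepancy is down to the interpolation rule alone.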
Best
Justin
Quoted reply history
On 7/6/11 4:48 PM, Norman Z wrote:
> Hi Martin,
>
> Thanks for pointing out that the first row is the original estimation. By excluding this row, I recalculated the values, and the Excel calculation is a lot closer to the PsN output. For most parameters, the difference between the two calculations could be due to rounding error, yet major differences can still be found for a few parameters.
>
> The wiki page suggested by Douglas is also very helpful; there we can see that the underlying mathematics can differ between software packages.
>
> Can anyone suggest what the normal practice is when reporting bootstrap values for publication or regulatory submission? Is the PsN output used directly, or is recalculation with other software a must?
>
> Thanks,
>
> Norman
>
> On Tue, Jul 5, 2011 at 7:20 PM, Martin Johnson <[email protected]> wrote:
>
> Hi Norman,
>
> The first line (row) in raw_results1.csv is from the original
> dataset (please refer to the manual), and as I see from your Excel
> function... there are 501 bootstrap samples.
> My guess would be that the median calculated with the Excel function
> includes the estimates from the original dataset plus the 500
> bootstrap samples (1+500), which accounts for the difference in the
> medians.
>
> Additionally, the percentile function in Excel can give different
> results from other statistical software. Please try the Excel
> function PERCENTILE.EXC (available in Excel 2010/2011), which will
> give you similar (or the same) results.
>
> Hope this helps a bit
>
> Regards,
> Martin Johnson
> PhD Student
> University of Groningen
> The Netherlands
>
> On Jul 5, 2011, at 9:48 PM, Norman Z wrote:
>
> > Hello everyone,
> >
> > I am using PsN to do some bootstrap and have some questions
> > regarding the PsN output.
> >
> > 1. There are two confidence intervals (CIs) reported in the output
> > file "bootstrap_results.csv":
> > standard.error.confidence.intervals
> > percentile.confidence.intervals
> > I wonder which one should be used in the publication or report,
> > and what is the difference between them.
> >
> > 2. When I use the Excel functions
> > "=PERCENTILE(T5:T505,5%)" and "=PERCENTILE(T5:T505,95%)" to
> > calculate the 5% and 95% percentiles of a parameter from the data in
> > "raw_results1.csv", the result is different from both
> > "standard.error.confidence.intervals" and
> > "percentile.confidence.intervals".
> > The same happens with the Excel function "=MEDIAN(T5:T505)" result
> > and the "medians" in "bootstrap_results.csv".
> > Does anyone know why this is the case, and which value I should use?
> >
> > bootstrap_results.csv
> > medians
> > 423.5635
> > standard.error.confidence.intervals
> > 5% 419.73239
> > 95% 428.26761
> > percentile.confidence.intervals
> > 5% 419.56165
> > 95% 427.9239
> >
> > Excel calculation from raw_results1.csv
> > Median 423.578
> > 5% percentile 419.593
> > 95% percentile 427.922
> >
> > Thanks,
> >
> > Norman
--
Justin Wilkins
--------------------
Räfiser Feld 10
CH-9470 Buchs
Switzerland
--------------------
[email protected]