Re: Confidence intervals of PsN bootstrap output
Hi Martin,
Thanks for pointing out that the first row is the original estimation. After
excluding this row, I recalculated the values, and the Excel calculation is a
lot closer to the PsN output. For most parameters, the remaining difference
between the two calculations could be due to rounding error, yet major
differences can still be found for a few parameters.
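
To illustrate the effect of that first row, here is a minimal Python sketch. The numbers are made up for illustration (they are not from raw_results1.csv), and the function name `column_median` is hypothetical. The only assumption carried over from the thread is that the first entry of the column is the estimate from the original dataset and the remaining entries are bootstrap estimates.

```python
import statistics


def column_median(values, skip_first=True):
    """Median of one parameter column.

    With skip_first=True the first entry (assumed to be the estimate
    from the original dataset) is dropped before taking the median.
    """
    data = values[1:] if skip_first else values
    return statistics.median(data)


# Made-up column: original-data estimate first, then four bootstrap estimates.
column = [500.0, 420.0, 422.0, 424.0, 426.0]

with_original = column_median(column, skip_first=False)  # original row included
bootstrap_only = column_median(column)                   # bootstrap rows only

print(with_original, bootstrap_only)
```

With only a handful of samples the shift is exaggerated; with 500 bootstrap samples one extra row would move the median much less, which fits the small differences discussed in this thread.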
The wiki page suggested by Douglas is also very helpful; there we can see
that the underlying mathematics can differ between software packages.
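
As a rough sketch of how two common percentile definitions diverge, the Python below implements the inclusive rule (rank based on n - 1, as in Excel's PERCENTILE) and the exclusive rule (rank based on n + 1, as in Excel's PERCENTILE.EXC). The function names and the data are hypothetical, and I am not claiming that either rule is what PsN uses internally; that should be checked against the PsN documentation.

```python
import math


def percentile_inc(values, p):
    """Inclusive definition: zero-based rank h = p * (n - 1), linear interpolation."""
    xs = sorted(values)
    h = p * (len(xs) - 1)
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


def percentile_exc(values, p):
    """Exclusive definition: one-based rank h = p * (n + 1), linear interpolation.

    Only defined when 1 <= h <= n, i.e. for p between 1/(n+1) and n/(n+1).
    """
    xs = sorted(values)
    n = len(xs)
    h = p * (n + 1)
    if h < 1 or h > n:
        raise ValueError("p is outside the range supported by the exclusive rule")
    lo = math.floor(h)
    hi = min(lo + 1, n)
    return xs[lo - 1] + (h - lo) * (xs[hi - 1] - xs[lo - 1])


# Made-up data, not taken from raw_results1.csv.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

for p in (0.2, 0.8):
    print(p, percentile_inc(data, p), percentile_exc(data, p))
```

For this small sample the two rules give noticeably different tail percentiles while agreeing on the median. The gap shrinks as the number of samples grows, so with 500 bootstrap samples it may look like rounding error for most parameters.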
Can anyone suggest what the normal practice is when reporting bootstrap
values for publication or regulatory submission? Is the PsN output
used directly, or is recalculation by other software a must?
Thanks,
Norman
Quoted reply history
On Tue, Jul 5, 2011 at 7:20 PM, Martin Johnson <[email protected]> wrote:
> Hi Norman,
>
> The first line (row) in raw_results1.csv is from the original dataset
> (please refer to the manual), and as I see from your Excel function, there
> are 501 bootstrap samples.
> My guess would be that the median calculated by the Excel function includes
> the estimate from the original dataset plus the 500 bootstrap samples
> (1+500), which makes the difference in the medians.
>
> Additionally, the PERCENTILE function in Excel can give different results than
> other statistical software; please try the Excel function PERCENTILE.EXC
> (available in Excel 2010/2011), which should give you similar (same) results.
>
> Hope this helps a bit
>
> Regards,
> Martin Johnson
> PhD Student
> University of Groningen
> The Netherlands
>
>
> On Jul 5, 2011, at 9:48 PM, Norman Z wrote:
>
> Hello everyone,
>
> I am using PsN to do some bootstrap and have some questions regarding the
> PsN output.
>
> 1. There are two confidence intervals (CIs) reported in the output file
> "bootstrap_results.csv":
> standard.error.confidence.intervals
> percentile.confidence.intervals
> I wonder which one should be used in a publication or report, and what
> the difference between them is.
>
> 2. When I use the Excel functions
> "=PERCENTILE(T5:T505,5%)" and "=PERCENTILE(T5:T505,95%)" to calculate the
> 5% and 95% percentiles of a parameter from the data in "raw_results1.csv",
> the result is different from both "standard.error.confidence.intervals"
> and "percentile.confidence.intervals".
> The same happens with the result of the Excel function "=MEDIAN(T5:T505)"
> and the "medians" in "bootstrap_results.csv".
>
> Does anyone know why it is the case, and which value I should use?
>
> bootstrap_results.csv
> medians
> 423.5635
> standard.error.confidence.intervals
> 5% 419.73239
> 95% 428.26761
> percentile.confidence.intervals
> 5% 419.56165
> 95% 427.9239
>
> Excel calculation from raw_results1.csv
> Median 423.578
> 5% percentile 419.593
> 95% percentile 427.922
>
> Thanks,
>
> Norman
>
>
>