Dear all,
I have a question regarding visual predictive checks (VPCs).
Most VPCs in use today include lines representing the median and the 5th and
95th percentiles of the data values, and an area around each of those
percentiles that is commonly defined as the 95% confidence interval (of the
simulations). But is it correct, from a statistical point of view, to call this
area a confidence interval? And if not, how should we define it?
Thanks,
Elena Soto
Elena Soto, PhD
Pharmacometrician
Pharmacometrics, Global Clinical Pharmacology
Global Product Development
Pfizer R&D UK Limited, IPC 096
CT13 9NJ, Sandwich, UK
Phone : +44 1304 644883
________________________________
Unless expressly stated otherwise, this message is confidential and may be
privileged. It is intended for the addressee(s) only. Access to this e-mail by
anyone else is unauthorised. If you are not an addressee, any disclosure or
copying of the contents of this e-mail or any action taken (or not taken) in
reliance on it is unauthorised and may be unlawful. If you are not an
addressee, please inform the sender immediately.
Pfizer R&D UK Limited is registered in England under No. 11439437 with its
registered office at Ramsgate Road, Sandwich, Kent CT13 9NJ
VPCs confidence intervals?
6 messages
6 people
Latest: Mar 18, 2019
Hi Elena,
The intervals in VPCs are more accurately called prediction intervals, not
confidence intervals.
The difference is that a prediction interval shows what you would expect
for the next individual in a study while a confidence interval shows what
you would expect for the result of a statistic (often confidence intervals
of a mean are shown). With many VPCs, the confidence interval of the
median and the confidence interval of the 5th and 95th percentiles are
shown.
Also, when the lines indicate the median, 5th, and 95th percentiles of the
simulations, that is the 90% prediction interval since it is the middle 90%
of the data (not the 95% confidence interval).
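
As a rough illustration of the two kinds of interval described above, the sketch below computes both for a single time bin. Everything in it is assumed for illustration only: the log-normal model, the sample sizes, and names such as `simulate_trial`. It is not taken from any particular VPC tool.

```python
import numpy as np

rng = np.random.default_rng(2019)

N_SUBJECTS = 50      # hypothetical number of observations in one time bin
N_REPLICATES = 1000  # hypothetical number of simulated trials


def simulate_trial(rng, n):
    """Simulate one replicate of the bin from an assumed log-normal model."""
    return np.exp(rng.normal(loc=np.log(10.0), scale=0.3, size=n))


# One row per simulated trial, one column per simulated observation.
sims = np.stack([simulate_trial(rng, N_SUBJECTS) for _ in range(N_REPLICATES)])

# 90% prediction interval: the middle 90% of all simulated observations.
pi_low, pi_median, pi_high = np.percentile(sims, [5, 50, 95])

# Percentiles computed within each simulated trial (shape: 3 x N_REPLICATES).
per_trial = np.percentile(sims, [5, 50, 95], axis=1)

# Shaded areas: middle 95% of each per-trial percentile across the replicates.
bands = np.percentile(per_trial, [2.5, 97.5], axis=1)  # shape: 2 x 3

for name, lo, hi in zip(["5th", "50th", "95th"], bands[0], bands[1]):
    print(f"{name} percentile: 95% interval {lo:.2f} to {hi:.2f}")
print(f"90% prediction interval: {pi_low:.2f} to {pi_high:.2f}")
```

The lines of a VPC correspond to `pi_low`, `pi_median` and `pi_high`; the areas drawn around them correspond to the columns of `bands`.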
Thanks,
Bill
Hi All,
I know what Bill is trying to say but it is not quite accurate the way he
states it.
A prediction interval makes inference on a statistic based on a future sample
such as a sample mean of a future set of data. In contrast, a confidence
interval makes inference on a parameter such as the population mean which is a
fixed number. A prediction interval takes into account both the uncertainty in
the existing data used to estimate the population parameter as well as the
sampling variation to make inference on a sample statistic (e.g., sample mean
for a future trial). A confidence interval only takes into account the
uncertainty in the existing data used to estimate the parameter. Based on the
Law of Large Numbers, the population mean can be thought of as taking the
sample mean of an infinite sample size (i.e., sampling the entire population).
For this reason, a prediction interval with an infinite sample size will
collapse to a confidence interval.
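
For the textbook case of independent, normally distributed data (an assumption made here only to illustrate the point), the collapse can be written out explicitly. Let $\bar{x}$ and $s$ be the mean and standard deviation of the existing $n$ observations, and let $\bar{Y}_m$ be the mean of a future sample of size $m$:

```latex
\begin{align*}
\text{CI for } \mu: &\quad
  \bar{x} \pm t_{n-1,\,1-\alpha/2}\; s\,\sqrt{\frac{1}{n}} \\
\text{PI for } \bar{Y}_m: &\quad
  \bar{x} \pm t_{n-1,\,1-\alpha/2}\; s\,\sqrt{\frac{1}{n} + \frac{1}{m}}
\end{align*}
```

The $1/n$ term carries the uncertainty from the existing data and the $1/m$ term carries the sampling variation of the future sample mean. As $m \to \infty$ the second term vanishes and the prediction interval reduces to the confidence interval.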
An interval based on VPCs is more akin to a prediction interval since it takes
into account the sampling variation based on a finite sample size. However, one
cannot assign a valid coverage probability (confidence level) to this interval
unless it also takes into account the parameter uncertainty. With VPCs applied
to existing data (i.e., an internal VPC) it is customary to not take into
account this parameter uncertainty so many refer to such prediction intervals
as degenerate as they place 100% certainty on the model parameter estimates
used to obtain the VPC predictions. One could potentially call these
intervals ‘degenerate prediction intervals’ but I tend to just call them ‘VPC
intervals’ (e.g., a 90% VPC interval) so as to avoid misperception that these
prediction intervals have a statistically valid coverage probability. However,
when VPCs are applied to an independent dataset not used in the development of
the model, it is often advised to take into account the parameter uncertainty
when performing the VPCs to essentially reflect the trial-to-trial uncertainty
of the independent data not used in the estimation of the model (i.e., refitting
the same model to a new set of trial data will not give the same set of
estimates and hence reflects trial-to-trial variation). In this setting, where
the VPCs take into account both the parameter uncertainty and sampling
variation to predict on an independent (e.g., future) dataset, then one is on
more solid ground to refer to these VPC intervals as prediction intervals with
valid coverage probabilities.
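
The effect of ignoring parameter uncertainty can be sketched with a toy simulation. The model, the estimates, the standard error, and the choice to draw the parameter from a normal sampling distribution are all assumptions made for illustration; a real VPC would use the fitted model and its own uncertainty estimates.

```python
import numpy as np

rng = np.random.default_rng(42)

N_SUBJECTS = 30      # hypothetical size of the new trial
N_REPLICATES = 2000  # hypothetical number of simulated trials

# Assumed estimates for a toy model y ~ Normal(mu, sigma). The standard error
# of mu is likewise an invented number standing in for estimation output.
MU_HAT, MU_SE, SIGMA = 10.0, 1.0, 3.0


def simulated_medians(rng, with_uncertainty):
    """Return the sample median of each simulated trial."""
    medians = np.empty(N_REPLICATES)
    for i in range(N_REPLICATES):
        # Either fix mu at its estimate (the 'degenerate' case) or draw a new
        # value from its assumed sampling distribution for every trial.
        mu = rng.normal(MU_HAT, MU_SE) if with_uncertainty else MU_HAT
        medians[i] = np.median(rng.normal(mu, SIGMA, size=N_SUBJECTS))
    return medians


def interval_width(values, level=90):
    """Width of the central `level`% interval of `values`."""
    tail = (100 - level) / 2
    lo, hi = np.percentile(values, [tail, 100 - tail])
    return hi - lo


width_fixed = interval_width(simulated_medians(rng, with_uncertainty=False))
width_uncertain = interval_width(simulated_medians(rng, with_uncertainty=True))

print(f"90% interval width, parameters fixed:   {width_fixed:.2f}")
print(f"90% interval width, parameters sampled: {width_uncertain:.2f}")
```

With the parameter fixed, the interval reflects sampling variation only; sampling the parameter as well widens it, which is the trial-to-trial variation referred to above.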
Kind regards,
Ken
Kenneth G. Kowalski
Kowalski PMetrics Consulting, LLC
Email: [email protected]
Cell: 248-207-5082
Hi Elena and Bill,
I think this has been discussed before in this forum. The VPC's central metrics
are the predictions of data percentiles. If you focus on the difference between,
e.g., the 5th and 95th percentiles of the simulated data, you will have a
prediction interval, as Bill states. If you focus on an individual
percentile, but consider the imprecision with which it is derived, often shown
as a shaded area, then it is, like other metrics of imprecision, a confidence
interval.
Best regards,
Mats
This is a great example of the kind of terminology debates that the ASA / ISOP
Statistics and Pharmacometrics special interest group (SxP) is trying to tackle.
As Mats and Bill point out, the common usage within our community is to say
that the percentiles (5th, 95th) are “prediction intervals” and the interval
estimates / uncertainty around these percentiles are “confidence intervals”.
But as Ken points out, these terms do not strictly correspond to the
statistical definition of each if you take into account what the VPC procedure
is actually doing.
The VPC is a model diagnostic procedure for the observed data and provides a
visual check of whether the model is capturing central tendencies and
dispersion in our data. (BTW, I *know* there are debates about the usefulness
or otherwise of VPC plots. I’m not going to address that here and I suggest we
don’t disappear down *that* rabbit hole.) We are NOT trying to make
probabilistic statements about the likelihood of observed percentiles being
within the intervals around these. So if the question arises from some
reviewer based on our use of statistically woolly terms like “prediction
interval” or “confidence interval” we should be ready to put up our hands and
admit that the terms we are using do not imply those statistical properties.
We could advocate changing the terminology used, but that may not have traction
in the community after this length of time. But we *should* be cognizant about
what these things are, what they’re for, what the formal, statistical
terminology implies and what our use (or maybe misuse) is or isn’t implying.
The ASA / ISOP SxP group has had a session accepted at this year’s ACOP meeting
where we hope to surface a few of these thorny issues and debate between our
use of terminology in pharmacometrics, the statistical interpretation of that
terminology and whether it *really* matters. If you’re interested, please come
along and be prepared to engage in the discussion!
Best regards,
Mike
(co-chair of ASA / ISOP SxP SIG)
Quoted reply history
From: [email protected] On Behalf Of Ken Kowalski
Sent: 14 March 2019 21:02
To: 'Bill Denney' <[email protected]>; [email protected]; Soto, Elena
<[email protected]>
Subject: [EXTERNAL] RE: [NMusers] VPCs confidence intervals?
Hi All,
I know there is a lot of confusion about the distinction between a confidence
interval and a prediction interval. Here is a layperson’s way of making the
distinction.
A confidence interval makes inference on a population parameter which is fixed
(never changes) regardless of any sample data that is collected to estimate the
parameter (if you repeatedly sampled an infinite number of observations to
obtain the population value by definition you would get the same population
value for each sample with an infinite sample size). Thus, the confidence
interval only reflects the uncertainty in the estimate of that parameter.
In contrast, a prediction interval makes inference on a statistic for a future
sample set of data. That statistic will vary from sample to sample and hence
must also take into account the sampling variation as well as the parameter
uncertainty. A prediction interval can be thought of as a confidence interval
of the prediction of some statistic from a future sample. That is, both a
confidence interval and a prediction interval have a confidence level
associated with them. In the case of the confidence interval, the confidence
level is the coverage probability that the interval will contain the true
value of the population parameter if one were to repeat the experiment an
infinite number of times. In the case of the prediction interval, the
confidence level is the coverage probability that the interval will contain the
future sample mean (of a finite sample size) if one were to repeat the
experiment an infinite number of times.
There is another type of statistical interval in addition to confidence and
prediction intervals and that is a tolerance interval. A tolerance interval
can be thought of as a confidence interval that a specified proportion of the
individual responses will be contained within the interval. For example, we
can calculate a 95% tolerance interval to contain 90% of the observed data
(i.e., we are 95% confident that the interval will contain 90% of the
individual observations). Tolerance intervals are more common in a
manufacturing setting where it is important to produce an item to some
specification within some tolerance limits. Nevertheless, there is a certain
VPC plot that we often generate that is somewhat akin to a tolerance interval.
When we summarize our simulated data for VPCs and summarize the 5th and 95th
percentiles of the individual responses this is more akin to a tolerance
interval to contain 90% of the observed individual data. In contrast, when we
summarize the sample mean or median from say 1000 simulated trials and
calculate the 5th and 95th percentiles across the 1000 trials that is more akin
to a prediction interval for that statistic (e.g., sample mean or sample
median). Note however, the intervals obtained as percentiles of a sample
statistic across trials (i.e., prediction interval) or sample observations
across individual subjects (i.e., tolerance interval) don’t have valid coverage
probabilities for repeated experiments unless they take into account parameter
uncertainty.
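
The last point can be checked numerically with an analogy rather than a real VPC. The sketch below takes the plain 5th and 95th percentiles of a small sample as an interval meant to hold 90% of individual values, and counts how often it really does across repeated experiments. The standard normal population and the sample size are assumptions chosen only for illustration.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(7)
population = NormalDist(mu=0.0, sigma=1.0)  # assumed 'true' distribution

N_OBS = 20            # hypothetical sample size
N_EXPERIMENTS = 5000  # hypothetical number of repeated experiments
TARGET_CONTENT = 0.90

covered = 0
for _ in range(N_EXPERIMENTS):
    sample = rng.normal(population.mean, population.stdev, size=N_OBS)
    lo, hi = np.percentile(sample, [5, 95])
    # Proportion of the true population that falls inside this interval.
    content = population.cdf(hi) - population.cdf(lo)
    covered += int(content >= TARGET_CONTENT)

confidence = covered / N_EXPERIMENTS
print(f"Interval holds at least 90% of the population in "
      f"{100 * confidence:.1f}% of experiments")
```

Because the interval ignores the uncertainty in the estimated percentiles, the achieved confidence falls well short of the 95% a proper tolerance interval would claim.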
Kind regards,
Ken
Hi Elena,
Thanks to Ken and Bill for explaining some of the statistical issues but they
only discuss the lower and upper prediction percentiles (typically 5%ile and
95%ile are used). Mike has also mentioned the central tendency.
An arguably more important percentile for model evaluation in a VPC is the
50%ile (the median). This gives you the clearest idea of how well the model
predicts the central tendency of the observations and can give you direct
insight into model mis-specification and how this might be addressed. See the
tutorial by Nguyen et al (2017) for examples.
VPCs are most easily evaluated by comparing the observation percentile with its
corresponding prediction percentile. Unfortunately some commonly used VPC tools
do not include the prediction percentile by default so users are left having to
guess how well the observed and predicted percentiles agree. Hint to VPC tool
developers – please help users by including the prediction percentiles by
default.
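
A minimal sketch of such a comparison is given below. The toy model, the sampling times, and the choice of the median across replicates as the "prediction percentile" are assumptions for illustration, and `observed` is simulated here as a stand-in for real data.

```python
import numpy as np

rng = np.random.default_rng(1)

TIMES = np.array([1.0, 2.0, 4.0, 8.0])  # hypothetical nominal sampling times
N_SUBJECTS = 40
N_REPLICATES = 500
PERCENTILES = [5, 50, 95]


def simulate(rng, n_trials):
    """Toy mono-exponential model with log-normal residual error (assumed)."""
    typical = 100.0 * np.exp(-0.2 * TIMES)
    noise = rng.normal(0.0, 0.25, size=(n_trials, N_SUBJECTS, TIMES.size))
    return typical * np.exp(noise)


observed = simulate(rng, 1)[0]           # stand-in for data: subjects x times
simulated = simulate(rng, N_REPLICATES)  # replicates x subjects x times

# Observed percentiles at each time: shape (3, n_times).
obs_pct = np.percentile(observed, PERCENTILES, axis=0)

# Percentiles within each replicate, shape (3, replicates, n_times), then the
# median across replicates as the single predicted value for each percentile.
sim_pct = np.percentile(simulated, PERCENTILES, axis=1)
pred_pct = np.median(sim_pct, axis=1)

for j, t in enumerate(TIMES):
    for k, p in enumerate(PERCENTILES):
        print(f"t={t:>4}: {p:>2}th percentile observed {obs_pct[k, j]:7.2f}, "
              f"predicted {pred_pct[k, j]:7.2f}")
```

Plotting `obs_pct` and `pred_pct` together against `TIMES` gives the direct observed-versus-predicted comparison for each percentile.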
Best wishes,
Nick
Nguyen TH, Mouksassi MS, Holford N, Al-Huniti N, Freedman I, Hooker AC, et al.
Model Evaluation of Continuous Data Pharmacometric Models: Metrics and
Graphics. CPT: pharmacometrics & systems pharmacology. 2017;6(2):87-109.
--
Nick Holford, Professor Clinical Pharmacology
Dept Pharmacology & Clinical Pharmacology, Bldg 503 Room 302A
University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand
office:+64(9)923-6730 mobile:NZ+64(21)46 23 53 FR+33(6)62 32 46 72
email: [email protected]
http://holford.fmhs.auckland.ac.nz/
http://orcid.org/0000-0002-4031-2514
Read the question, answer the question, attempt all questions