RE: Observed (yaxis) vs Predicted (xaxis) Diagnostic Plot - Scientific basis.
Thanks Joga for raising the issue of so-called diagnostic plots, and Martin for
the reminder that they are not reliable as diagnostics.
The gold standard tool for model evaluation, which may also help diagnose model
problems, is the VPC. Martin - it is not a “for example” method -- it is the
primary model evaluation tool.
Comparison of the median observed percentile with the median predicted
percentile is the first step in using a VPC. Unfortunately, there are still
VPCs being produced that show only the observed percentiles without the
corresponding predicted percentiles.
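As a minimal sketch of that first step, the Python below tabulates observed
percentiles next to simulated (predicted) percentiles for each time bin. The
mono-exponential model, its parameter values, the sampling times, and the names
simulate_trial and vpc_table are all illustrative assumptions, and the
"observed" data here are themselves simulated. It is not the output of any
particular VPC tool.

```python
import math
import random

TIMES = [1.0, 2.0, 4.0, 8.0]  # hypothetical sampling times (h)


def percentile(values, q):
    """Percentile q (0-100) with linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def simulate_trial(rng, n_subjects=200):
    """Hypothetical model: C(t) = 10 * exp(-0.2 t) with log-normal variability."""
    return {
        t: [10.0 * math.exp(-0.2 * t) * math.exp(rng.gauss(0.0, 0.3))
            for _ in range(n_subjects)]
        for t in TIMES
    }


def vpc_table(observed, n_replicates=200, seed=2):
    """For each time bin and percentile, pair the observed percentile with the
    median and 95% interval of the same percentile across simulated trials."""
    rng = random.Random(seed)
    sims = [simulate_trial(rng) for _ in range(n_replicates)]
    table = {}
    for t in TIMES:
        row = {}
        for q in (5, 50, 95):
            sim_q = [percentile(sim[t], q) for sim in sims]
            row[q] = {
                "observed": percentile(observed[t], q),
                "pred_lo": percentile(sim_q, 2.5),
                "pred_mid": percentile(sim_q, 50),
                "pred_hi": percentile(sim_q, 97.5),
            }
        table[t] = row
    return table


observed = simulate_trial(random.Random(1))  # stand-in for real observations
table = vpc_table(observed)
for t in TIMES:
    med = table[t][50]
    print(t, round(med["observed"], 2),
          round(med["pred_lo"], 2), round(med["pred_hi"], 2))
```

Each printed row shows the observed median followed by the interval of
simulated medians for that time bin. A plot that drew only the first of these
numbers would give nothing to compare it against.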
All so-called diagnostic plots and VPCs that do not show observed AND predicted
percentiles belong in the bin.
Best wishes,
Nick
--
Nick Holford, Professor Emeritus Clinical Pharmacology, MBChB, FRACP
mobile:NZ+64(21)46 23 53 ; FR+33(6)62 32 46 72
email: [email protected]
web: http://holford.fmhs.auckland.ac.nz/
Quoted reply history
From: [email protected] <[email protected]> On Behalf Of
Martin Bergstrand
Sent: Friday, August 18, 2023 9:48 AM
To: Gobburu, Joga <[email protected]>
Cc: [email protected]
Subject: Re: [NMusers] Observed (yaxis) vs Predicted (xaxis) Diagnostic Plot -
Scientific basis.
Dear Joga and all,
Joga makes a valuable point that all pharmacometricians should be aware of.
Standard methodology for regression assumes that the x-variable is without
error (loess, linear regression etc.). Note that it is the same for NLME models
i.e. we generally assume that our independent variables e.g. time, covariates
etc. are without error.
For DV vs. PRED plots it is common practice, even among those that do not know
why, to plot PRED on the x-axis and DV on the y-axis. A greater problem with
these plots is the commonly held expectation that for a "good model" a smooth
or regression line should align with the line of unity. Though this seems
intuitive, it is a flawed assumption. This issue was clearly pointed out by Mats
Karlsson and Rada Savic in their 2007 paper titled "Diagnosing Model
Diagnostics". For simple well-behaved examples you will see an alignment
around the line of unity for DV vs. PRED plots. However, there are several
factors that lead to an expected deviation from this alignment:
(1) Censoring (e.g. censoring of observations < LLOQ)
- In this case DVs are capped at LLOQ but PRED values are not. This makes it
perfectly expected that there will be a deviation from alignment around the
line of unity in the lower range.
(2) Strong non-linearities
- The more nonlinear the modelled system is, the greater the expected deviation
from the line of unity. Especially in combination with significant ETA
correlations.
(3) High variability
- Higher between/within-subject variability (e.g. IIV and RUV) that isn't
normally distributed (e.g. exponential distributions) will result in an
expected deviation from the line of unity. Note: this is a form of
non-linearity so it may fall under the above category.
(4) Adaptive designs (e.g. TDM dosing)
- Listed in the original paper by Karlsson & Savic but I have not been able to
recreate an issue in this case.
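Factor (1) is easy to illustrate with a toy simulation. The sketch below is
only that: a correctly specified additive-error model with made-up values for
the LLOQ, the residual SD and the prediction range (all assumptions, not taken
from any dataset in this thread). It compares the mean of DV - PRED in the low
prediction range before and after capping DV at the LLOQ.

```python
import random

LLOQ = 1.0  # hypothetical lower limit of quantification


def mean(values):
    return sum(values) / len(values)


def low_range_bias(n=20000, sigma=1.0, upper=2.0, seed=3):
    """Mean of DV - PRED for PRED < upper, without and with capping at LLOQ.

    The model is correct by construction: DV = PRED + N(0, sigma^2).
    """
    rng = random.Random(seed)
    raw, capped = [], []
    for _ in range(n):
        pred = rng.uniform(0.5, 10.0)
        dv = pred + rng.gauss(0.0, sigma)
        if pred < upper:
            raw.append(dv - pred)
            # DV is capped at LLOQ, PRED is not
            capped.append(max(dv, LLOQ) - pred)
    return mean(raw), mean(capped)


bias_raw, bias_capped = low_range_bias()
print("mean DV - PRED, no censoring:", round(bias_raw, 3))
print("mean DV - PRED, DV capped at LLOQ:", round(bias_capped, 3))
```

Without capping, the mean difference is close to zero. With capping, DV sits
above PRED on average in the lower range even though the model is correct, so
a smooth through such a plot would be expected to leave the line of unity
there.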
I am rather sure that many thousands of hours have been spent on modeling
trying to correct for perceived model misspecifications that are not really
there. This is why I recommend relying primarily on simulation-based model
diagnostics (e.g. VPCs) and as far as possible account for censoring that
affects the original dataset. As pointed out by Karlsson & Savic a
simulation/re-estimation based approach can also be used to investigate the
expected behavior for DV vs. PRED plots for a particular model and dataset
(e.g. mirror plots in Xpose). Note that to my knowledge there is as yet no
automated way to handle censoring in this context (clearly doable if anyone
wants to develop a nifty implementation of that).
If we leave the DV vs. PRED plot case, there are many other instances where we
use scatter plots where it is much less clear what can be considered the
independent variable and yet other cases where the assumption that the
x-variable is without error is violated in a way that makes the results hard to
interpret. One instance of the latter is when exposure-response is studied by
plotting observed PD response versus observed trough plasma concentrations.
This is already a way too long email so I will not deep dive into that problem
as well.
Best regards,
Martin Bergstrand, Ph.D.
Principal Consultant
Pharmetheus AB
[email protected]
http://www.pharmetheus.com
On Thu, Aug 17, 2023 at 12:44 PM Gobburu, Joga <[email protected]> wrote:
Dear Friends – Observations versus population predicted is considered a
standard diagnostic plot in our field. I used to place observations on the
x-axis and predictions on the y-axis. Then I was pointed to a publication from
ISOP
(https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5321813/figure/psp412161-fig-0001/)
which recommended plotting predictions on the x-axis and observations on the
y-axis. To the best of my knowledge, there was no justification provided. It did
make me question my decades-old practice, so I did some thinking and digging. I
thought I would share it here so others might benefit from it. If this is
obvious to you all, then I can say I am caught up!
1. We write our models as observed = predicted + random error, which can be
interpreted as having the form y = f(x) + random error, although technically it
is not of that form. On that reading, predicted goes on the x-axis, as it is
free of random error. The plot is also considered a correlation plot, which
makes plotting either way acceptable. This point is not as critical as the next
one.
2. However, there is a statistical reason why it is important to keep
predictions on the x-axis. We invariably add a loess trend line to these
diagnostic plots. To demonstrate the impact, I took a simple iv bolus single
dose dataset and compared both approaches. The results are available at this
link:
https://github.com/jgobburu/public_didactic/blob/main/iv_sd.html.pdf.
I used Pumas software, but the scientific underpinning is agnostic to
software. See the two plots on Pages 5 and 6. The two approaches lead to
different interpretations of the bias. This is the statistical reason why it
matters to plot predictions on the x-axis.
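The effect in point 2 can be reproduced with a toy example. The sketch below
is not the Pumas analysis linked above. It assumes a correctly specified
model, observed = predicted + normal error, with made-up values for the
prediction range and error SD, and it uses an ordinary least-squares line as a
simple stand-in for the loess trend.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line for y regressed on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=5000, sigma=2.0, seed=1):
    """Hypothetical data: predictions are error-free, observations are not."""
    rng = random.Random(seed)
    pred = [rng.uniform(1.0, 10.0) for _ in range(n)]
    obs = [p + rng.gauss(0.0, sigma) for p in pred]
    return pred, obs


pred, obs = simulate()
slope_obs_on_pred = ols_slope(pred, obs)  # predictions on the x-axis
slope_pred_on_obs = ols_slope(obs, pred)  # observations on the x-axis
print("obs vs pred slope:", round(slope_obs_on_pred, 3))
print("pred vs obs slope:", round(slope_pred_on_obs, 3))
```

With predictions on the x-axis the slope is close to 1. With observations on
the x-axis the slope is well below 1 because the error now sits in the
x-variable, which could be misread as model bias even though the model is
correct by construction.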
Joga Gobburu
University of Maryland