Maximizing coefficient of determination
From: Erik Olofsen <E.Olofsen@lumc.nl>
Subject: Maximizing coefficient of determination
Date: Thu, 8 Feb 2001 15:32:55 +0100 (CET)
Dear NONMEM users,
Instead of maximizing the likelihood of a set of observations I would like to maximize the coefficient of determination, given by:
r2 = 1 - sum((yi-yhati)^2)/sum((yi-mean(y))^2) eq.(1)
where yi is a measured variable and yhati is its prediction. Both yi and yhati are given by models that contain parameters to be estimated, so simply minimizing the sum of squares would lead to parameter values that give the optimal but meaningless solution yi = yhati = constant.
Now note that the rightmost part of eq.(1) can be written as
sum((yi-yhati)^2)/N/sigma^2 eq.(2)
where sigma^2 = sum((yi-mean(y))^2)/N is the variance of the yi (with divisor N),
and the log-likelihood function for normally distributed observations is
LL = -N log(sigma) - N/2 log(2pi) - 1/2/sigma^2 sum((yi-yhati)^2) eq.(3)
so maximizing the coefficient of determination is equivalent to maximizing LL if we drop the first (and second) term and let sigma be given by eqs.(1) and (2), rather than treating it as an estimable parameter for the residual standard deviation.
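To make the equivalence concrete, here is a minimal Python sketch (an illustration only, not the NONMEM implementation). It assumes sigma^2 is the variance of the yi with divisor N, and it uses made-up numbers for y and yhat. Under that assumption the rightmost term of eq.(3), with its sign flipped, should equal N/2 * (1 - r2), so minimizing it maximizes r2.

```python
import math


def r_squared(y, yhat):
    """Coefficient of determination as in eq.(1)."""
    ybar = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, yhat))
    sst = sum((a - ybar) ** 2 for a in y)
    return 1.0 - sse / sst


def truncated_objective(y, yhat):
    """Rightmost term of eq.(3) with the sign flipped.

    Assumption: sigma^2 is fixed to the variance of y (divisor N)
    instead of being an estimated parameter.
    """
    n = len(y)
    ybar = sum(y) / n
    sigma2 = sum((a - ybar) ** 2 for a in y) / n
    sse = sum((a - b) ** 2 for a, b in zip(y, yhat))
    return 0.5 * sse / sigma2


# Hypothetical data, chosen only for illustration.
y = [1.0, 2.0, 4.0, 7.0]
yhat = [1.5, 2.5, 3.5, 6.5]

n = len(y)
r2 = r_squared(y, yhat)
obj = truncated_objective(y, yhat)

# The two quantities differ only by the factor N/2 and the constant 1.
assert math.isclose(obj, n / 2 * (1.0 - r2))
print(r2, obj)
```

Note that the factor N/2 is constant for a single individual, so it does not change the location of the optimum there; it only starts to matter once several individuals with different N are combined.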
I've implemented this in NONMEM using the LIKELIHOOD option of the $ESTIMATION record and it works. At the moment I combine observations of a population by taking the rightmost part of eq.(3) where sigma may depend on the individual, but I'm not sure yet how to incorporate the fact that N is not the same for each individual.
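The role of N in the pooled objective can be illustrated with a short Python sketch. It is hedged throughout: the subjects and their data are hypothetical, sigma^2 is assumed to be each individual's own variance with divisor N, and the second, per-observation variant is only one conceivable alternative, not a recommendation. Summing the rightmost part of eq.(3) over individuals gives sum(N_i/2 * (1 - r2_i)), so individuals with more observations carry more weight; dividing each term by N_i would instead weight every individual equally.

```python
import math


def sums_of_squares(y, yhat):
    """Return (sse, sst) for one individual."""
    ybar = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, yhat))
    sst = sum((a - ybar) ** 2 for a in y)
    return sse, sst


def individual_term(y, yhat):
    """Rightmost term of eq.(3), sign flipped, with the individual's own
    sigma^2 = sst / N (an assumption of this sketch)."""
    sse, sst = sums_of_squares(y, yhat)
    return 0.5 * sse / (sst / len(y))


# Hypothetical individuals with different numbers of observations.
subjects = {
    "A": ([1.0, 2.0, 4.0, 7.0], [1.5, 2.5, 3.5, 6.5]),
    "B": ([2.0, 3.0, 5.0, 6.0, 9.0, 10.0], [2.0, 3.5, 4.5, 6.5, 8.5, 10.0]),
}

# Variant 1: plain sum of the individual terms (weights proportional to N_i).
pooled = sum(individual_term(y, yhat) for y, yhat in subjects.values())

# Variant 2 (hypothetical alternative): divide each term by N_i so that
# every individual contributes (1 - r2_i) / 2 regardless of N_i.
pooled_equal = sum(
    individual_term(y, yhat) / len(y) for y, yhat in subjects.values()
)

# Check variant 1 against the closed form sum(N_i / 2 * (1 - r2_i)).
closed_form = 0.0
for y, yhat in subjects.values():
    sse, sst = sums_of_squares(y, yhat)
    closed_form += len(y) / 2 * (sse / sst)
assert math.isclose(pooled, closed_form)

print(pooled, pooled_equal)
```

Which weighting is appropriate depends on whether the goal is a good fit per observation or per individual; the sketch only shows where N enters.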
I would like to ask you to comment on this procedure and whether it affects NONMEM parameter estimation and hypothesis testing using the minimum value of the objective function in ways I might overlook.
Erik Olofsen
Department of Anesthesiology
Leiden University Medical Center
The Netherlands