Re: RE: Akaike information criterion
From: cng@imap.unc.edu
Subject: Re: RE: Akaike information criterion
Date: Fri, 13 Jul 2001 10:45:14 -0400 (Eastern Daylight Time)
If I understand correctly, the single-sample statistics for linear models
(AIC, SBC, MDL, FPE, Mallows' Cp, etc.) can only be used as crude estimates
of generalization error in nonlinear models when you have a "large" training
set. Why use AIC? Has anyone tried SBC or MDL (the Minimum Description Length
principle)? Among the simple generalization estimators that do not require the
noise variance to be known, SBC often works well (at least in neural networks).
Shao (1995) showed that, in linear models at least, SBC provides consistent
subset selection, while AIC does not. That is, SBC will choose the "best"
subset with probability approaching one as the size of the training set goes to
infinity. AIC has an asymptotic probability of one of choosing a good subset,
but less than one of choosing the best subset (Stone 1979). Many
simulation studies have also found that AIC overfits badly in small samples,
and that SBC works well. MDL has been shown to be closely related to SBC.
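For concreteness, the difference between the two criteria is just the penalty term: under a Gaussian likelihood, AIC = n*ln(RSS/n) + 2k while SBC = n*ln(RSS/n) + k*ln(n), so SBC penalizes each extra parameter by ln(n) instead of 2 and favors smaller subsets once n > e^2. A minimal sketch in Python (the simulated data, number of candidate predictors, and coefficients are my own illustrative assumptions, not from Shao's study):

```python
import numpy as np

def aic_sbc(y, X):
    """Gaussian-likelihood AIC and SBC (BIC) for an OLS fit of y on X."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    aic = n * np.log(rss / n) + 2 * k          # penalty: 2 per parameter
    sbc = n * np.log(rss / n) + k * np.log(n)  # penalty: ln(n) per parameter
    return aic, sbc

# Simulate: the true model uses only the first 2 of 5 candidate predictors.
rng = np.random.default_rng(0)
n = 200
X_full = rng.normal(size=(n, 5))
y = 1.5 * X_full[:, 0] - 2.0 * X_full[:, 1] + rng.normal(size=n)

# Score the nested subsets X[:, :k] for k = 1..5; SBC's heavier penalty
# (ln(200) ~ 5.3 vs 2) makes it harder for noise predictors to enter.
for k in range(1, 6):
    aic, sbc = aic_sbc(y, X_full[:, :k])
    print(f"k={k}  AIC={aic:8.1f}  SBC={sbc:8.1f}")
```

With the criteria written this way, SBC - AIC = k*(ln(n) - 2), which is exactly the stronger penalty that drives Shao's consistency result.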
Does anyone know of a study that compares these model selection criteria (e.g.,
SBC, AIC) in NONMEM model selection? Thanks.
Chee Ng