RE: question about incorporating genotyping data in disease progression model
Hi Kehua,
If I understand you correctly, you screened hundreds of thousands of genotypes
to find the roughly 60 that appeared to be the most promising predictors?
Were the asthma patients in your NONMEM analysis part of the data you used
for the GWAS screen, or is the NONMEM analysis based on external data from other
patients who were not part of the initial screen?
I was also not quite clear on whether the subsequent NONMEM analysis was based
on genotype or gene expression.
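Either way, if the NONMEM analysis did not use an external set of patients,
that would likely explain why you find so many significant covariates: the 60
genotypes were selected because they looked associated with the outcome in
those same patients, so they will tend to look significant again when tested
on the same data.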
Apologies if this is a trivial answer that is not relevant to your work, but
there are many examples of this in the field of data mining, where multiple
testing has not been taken into account when declaring significance or claiming
that a highly predictive model has been established.
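To illustrate the point, here is a minimal Python sketch, not a reconstruction of your analysis. It simulates markers that have no effect at all, screens them on one cohort, and then tests the top hits both on the screening cohort and on an independent one. The function names, the cohort size (100 subjects), the number of markers (2000 rather than 500,000, to keep it fast), and the use of a simple correlation test with a normal-approximation cutoff in place of a likelihood ratio test are all assumptions made for illustration.

```python
import math
import random


def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant vector: treat as no association
    return sxy / math.sqrt(sxx * syy)


def simulate(n_subjects=100, n_markers=2000, n_top=60, seed=1):
    """Return (hits on screening data, hits on external data) for null markers."""
    rng = random.Random(seed)

    def cohort():
        # Outcome and genotypes (coded 1/2/3) are independent: no true effects.
        y = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
        g = [[rng.choice((1, 2, 3)) for _ in range(n_subjects)]
             for _ in range(n_markers)]
        return y, g

    y_screen, g_screen = cohort()
    y_ext, g_ext = cohort()

    # Approximate two-sided 5% cutoff for |r| under the null (normal approx.).
    r_crit = 1.96 / math.sqrt(n_subjects)

    # Screening step: rank all markers by |r| and keep the strongest n_top.
    r_screen = [abs(correlation(g, y_screen)) for g in g_screen]
    top = sorted(range(n_markers), key=lambda j: r_screen[j], reverse=True)[:n_top]

    # Re-test the selected markers on the same data and on the external cohort.
    same = sum(r_screen[j] > r_crit for j in top)
    external = sum(abs(correlation(g_ext[j], y_ext)) > r_crit for j in top)
    return same, external


if __name__ == "__main__":
    same, external = simulate()
    print("significant on screening data:", same)
    print("significant on external data:", external)
```

Under these assumptions I would expect nearly all of the 60 screened markers to pass the 5% cutoff on the screening data, but only around 5% of them (about 3) on the external cohort, even though none has any real effect.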
Best regards
Jakob
Quoted reply history
From: [email protected] [mailto:[email protected]] On
Behalf Of kehua wu
Sent: 29 August 2012 17:21
To: [email protected]
Subject: [NMusers] question about incorporating genotyping data in disease
progression model
Dear NONMEM users,
I am working with data from asthma patients and trying to build a model of FEV1,
which is a measure of lung function.
I have genotyping data for 500,000 genotypes. First, I screened them by
running a GWAS to identify the potentially relevant ones, which gave me about 60
genotypes. Then I tried adding these 60 genotypes to the model to find out
whether the progression of FEV1 is related to gene expression.
But the problem is that too many genotypes were associated with a significant
change in OFV, which does not sound reasonable to me. I was hoping to find that
only a few (2-3) genotypes are associated with the progression of lung function.
I have tried including the genotype as a discrete covariate (if
genotype = 1 then parameter = theta(1); if genotype = 2 then
parameter = theta(2); if genotype = 3 then parameter = theta(3)) and as a power
function (genotype**(theta)).
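For concreteness, the two codings can be sketched in Python (not NONMEM syntax) roughly as follows. The function names and the multiplier `theta_base` in the power form are assumptions for illustration; the description above gives only genotype**(theta).

```python
def param_categorical(genotype, thetas):
    """Discrete coding: a separate typical value for each genotype (1, 2, 3)."""
    if genotype not in (1, 2, 3):
        raise ValueError("genotype must be coded 1, 2, or 3")
    return thetas[genotype - 1]


def param_power(genotype, theta_base, theta_exp):
    """Power coding: theta_base scaled by genotype**theta_exp.

    theta_base is a hypothetical baseline value, not taken from the original post.
    """
    if genotype not in (1, 2, 3):
        raise ValueError("genotype must be coded 1, 2, or 3")
    return theta_base * genotype ** theta_exp


if __name__ == "__main__":
    for g in (1, 2, 3):
        print(g, param_categorical(g, [1.0, 1.5, 3.0]), param_power(g, 2.0, 0.7))
```

With genotypes coded 1/2/3, the power form returns `theta_base` for genotype 1 and forces the parameter to change monotonically across genotypes using one extra parameter, whereas the discrete form estimates each genotype's value freely at the cost of two extra parameters compared with a model without the covariate.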
Did I do something wrong when including the genotyping data in the model as
a covariate?
Thanks a lot in advance!
Kehua