Genotype data missing in some individuals

7 messages 7 people Latest: Nov 19, 2014

Genotype data missing in some individuals

From: 이소정 Date: November 19, 2014 technical
Dear all,

I have analyzed a tacrolimus PopPK dataset in pediatric patients. As you know, CYP3A5 genotype can change tacrolimus PK significantly. CYP3A5 genotyping was performed in the study; however, the genotype data are missing for 20% of the subjects.

How can I appropriately reflect the CYP3A5 genotype effect in the tacrolimus population model? Is there any solution?

Best regards, SoJeong Yi
Dear SoJeong,

First you might want to answer the question of whether that phenotype is indeed important in your dataset. With the initial popPK model you could plot posthoc clearance against bodyweight and/or inspect the posthoc clearances for evidence of multiple peaks in the distribution. You may also see the impact of phenotype in stratified concentration versus time plots. Depending on the dataset, with its sampling scheme, number of subjects (perhaps a low number) and distribution across age, the effect could be masked.

If the impact is clear, however, it might be beneficial to try to include the subjects with missing genotype. With a clear effect, you might be able to develop a mixture model. The mixture approach would describe the different populations in your dataset corresponding to the different phenotypes. The genotype would then inform the mixture as a covariate; for subjects with missing genotype the model would fall back to the pure mixture approach. As a warning, this approach is quite difficult. I would advise you to read up on this in the NONMEM guides ($MIX) and look in the literature for examples. The Karlsson group has published about it, most recently this one (it contains code): http://link.springer.com/article/10.1208/s12248-009-9093-4. A search in the literature gives you additional background such as http://www.page-meeting.org/pdf_assets/9595-PAGE2007_3.pdf and http://link.springer.com/article/10.1007/s10928-006-9038-9.

If the impact is not clear, a more empirical approach might be called for. In this case a subset analysis of the covariate relationship, i.e. one where you exclude the subjects with missing genotype, might be all that you could achieve. If there is no impact at all, you do not need the genotype, of course.

Hope this helps!

Best regards, Jeroen
http://pd-value.com [email protected] @PD_value +31 6 23118438 -- More value out of your data!

RE: Genotype data missing in some individuals

From: Dinko Rekic Date: November 19, 2014 technical
Dear SoJeong,

I agree with everything Jeroen proposed. In addition, you may want to code the subjects with missing genotype as genotype 99 (or something similar) and then estimate genotype as a categorical covariate on CL. This approach is not elegant, but it is quick and often useful for an initial analysis.

Kind regards, Dinko

"The contents of this message are mine personally and do not necessarily reflect any position of the Government or the Food and Drug Administration."
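The "genotype 99" coding can be sketched in a NONMEM $PK block as follows. This is a minimal sketch, not code from the thread: the data item name GENO, the coding 1/2/99, the THETA numbering and the multiplicative form of the covariate effect are all assumptions made for illustration.

```
$PK
; GENO is a hypothetical data item:
;   1 = reference genotype, 2 = other genotype, 99 = genotype missing
GENE = 1                            ; reference group
IF (GENO.EQ.2)  GENE = THETA(2)     ; CL ratio of genotype 2 relative to reference
IF (GENO.EQ.99) GENE = THETA(3)     ; separate CL ratio for the "missing" level
TVCL = THETA(1) * GENE              ; typical clearance
CL   = TVCL * EXP(ETA(1))           ; log-normal inter-individual variability
```

With this parameterization THETA(2) is the quantity that would be reported as the genotype effect, while THETA(3) only absorbs the subjects with missing genotype.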

Re: Genotype data missing in some individuals

From: Leonid Gibiansky Date: November 19, 2014 technical
I would use a mixture model only if there is a very large (several-fold) difference in PK parameters between the two genotypes. If the difference is comparable to the inter-subject variability within a genotype, I would introduce a category "missing" to remove the effect of those subjects on the covariate effect estimate. So if the genotype is binary (YES/NO), you introduce a new third level "missing", treat genotype as a 3-level categorical covariate, and report the difference between NO and YES as the genotype effect on PK. As a consistency check, you may want to verify that the estimate of the PK parameter for the "missing" level lies somewhere between the estimates for the "NO" and "YES" levels, closer to the value for the level with the higher prevalence in your dataset.

Regards, Leonid
-------------------------------------- Leonid Gibiansky, Ph.D. President, QuantPharm LLC web: www.quantpharm.com e-mail: LGibiansky at quantpharm.com tel: (301) 767 5566

RE: Genotype data missing in some individuals

From: Bill Denney Date: November 19, 2014 technical
Hi SoJeong,

I agree with Leonid here on the value of the mixture model. With potentially subtle changes, mixture models can be very difficult. One approach that has worked for me in a similar situation is to make "unknown genotype" a separate category and then to fit a parameter that is the fraction "yes" (similar to a mixture model, but without specifying a genotype for a subject). Something like:

  ; GENOTYPE1, GENOTYPE2 and GENOTYPEUNK are taken here to be 0/1 indicator columns
  G1 = THETA(1)
  G2 = THETA(2)
  FRA = 1/(1+EXP(-THETA(3)))        ; logit transform keeps FRA between 0 and 1
  IF (GENOTYPE1.EQ.1) GENE = G1
  IF (GENOTYPE2.EQ.1) GENE = G2
  IF (GENOTYPEUNK.EQ.1) GENE = G1*FRA+G2*(1-FRA)

You can then compare FRA to the expected genotypic distribution in the population.

Thanks, Bill
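A matching $THETA record for the code above might look like the sketch below. The initial estimates are placeholders chosen for illustration, not values from the thread. Because THETA(3) is on the logit scale, it needs no bounds, and a starting value close to 0 puts FRA close to 0.5.

```
$THETA
(0, 1)     ; THETA(1): G1, value for genotype 1 (placeholder initial estimate)
(0, 1)     ; THETA(2): G2, value for genotype 2 (placeholder initial estimate)
0.1        ; THETA(3): logit of FRA, unbounded; 0.1 gives FRA of about 0.52
```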

RE: Genotype data missing in some individuals

From: Mats Karlsson Date: November 19, 2014 technical
Hi,

I would use:

  IF (GENOTYPE.EQ.1) GENE = THETA(1)
  IF (GENOTYPE.EQ.2) GENE = THETA(2)
  IF (GENOTYPE.EQ.-99.AND.MIXNUM.EQ.1) GENE = THETA(1)
  IF (GENOTYPE.EQ.-99.AND.MIXNUM.EQ.2) GENE = THETA(2)

  $MIX
  P(1)=THETA(3)
  P(2)=1-P(1)
  ..

There are different options for handling THETA(3).

If I believe that missingness is completely at random (MCAR): THETA(3) can be fixed to the frequency of GENOTYPE=1 in the population you are studying, if this frequency is known. If it is not known, I would fix it to the fraction of GENOTYPE=1 in your sample. If I were really ambitious, I would take into account that your sample is small and may therefore not perfectly reflect the proportion in your population. In that case you could use the prior functionality.

If you believe the data are missing at random (MAR), another approach could be implemented. [MAR in this case could mean that more genotypes are missing in one ethnic group than in another, but you know the ethnicity of everyone.] You would only modify one line in the code above:

  $MIX
  P(1)=THETA(3)
  IF(ETHNICITY.EQ.2) P(1)= THETA(4)

If you believe that the data could be missing not at random (MNAR), for example that genotyping failed more often for subjects with true GENO=1, then using the top code but estimating THETA(3) would be the appropriate thing to do.

There are other options too. Two recent articles on this, with a comparison between methods, are listed below. They also describe a multiple imputation routine that we recently implemented in PsN.

Comparison of methods for handling missing covariate data. Johansson ÅM, Karlsson MO. AAPS J. 2013 Oct;15(4):1232-41. doi: 10.1208/s12248-013-9526-y.

Multiple imputation of missing covariates in NONMEM and evaluation of the method's sensitivity to η-shrinkage. Johansson ÅM, Karlsson MO. AAPS J. 2013 Oct;15(4):1035-42. doi: 10.1208/s12248-013-9508-0.
Best regards, Mats Mats Karlsson, PhD Professor of Pharmacometrics Dept of Pharmaceutical Biosciences Faculty of Pharmacy Uppsala University Box 591 75124 Uppsala Phone: +46 18 4714105 Fax + 46 18 4714003 www.farmbio.uu.se/research/researchgroups/pharmacometrics/
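The MCAR and MNAR options for THETA(3) described above differ only in the $THETA record. The sketch below shows one way this could be written. The value 0.3 and the other initial estimates are placeholders for illustration, not numbers from the thread or from any dataset.

```
$THETA
(0, 1)         ; THETA(1): GENE for GENOTYPE = 1 (placeholder initial estimate)
(0, 1)         ; THETA(2): GENE for GENOTYPE = 2 (placeholder initial estimate)
0.3 FIX        ; THETA(3), MCAR case: fixed to the genotype-1 fraction (0.3 is a placeholder)
;(0, 0.3, 1)   ; THETA(3), MNAR case: estimated between 0 and 1 (use instead of the FIX line)
```

For the MAR variant a THETA(4) row of the same form would be added for the second ethnic group. Note that a complete $MIX block also has to set NSPOP=2 for two subpopulations.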

RE: Genotype data missing in some individuals

From: Sebastian Frechen Date: November 19, 2014 technical
Hey Bill & SoJeong,

In your suggestion (Bill), you would estimate a fixed effect for the unknown genotype that weights between G1 and G2. But let's assume that the status "genotype unknown" is completely at random, so some subjects are of type 1 and some are of type 2. Wouldn't it then be possible to implement something like this (not tested, just an idea!!):

-----------------------
; GENOTYPE1, GENOTYPE2 and GENOTYPEUNK are taken here to be 0/1 indicator columns
G1 = THETA(1)
G2 = THETA(2)
FRA = PHI(ETA(1)) ; with PHI() as the cdf of the standard normal distribution
IF (GENOTYPE1.EQ.1) GENE = G1
IF (GENOTYPE2.EQ.1) GENE = G2
IF (GENOTYPEUNK.EQ.1) GENE = G1*FRA+G2*(1-FRA)
...
...
$OMEGA 1 FIX
-----------------------

By fixing the variance of ETA(1) to 1, NONMEM can obtain standard normally distributed random effects in the conditional estimation step. Evaluating these etas with the cdf of the standard normal distribution (PHI), we obtain uniformly distributed effects (between 0 and 1). Hence, for each subject with GENOTYPEUNK, we obtain a probability (weight) for G1 and G2, respectively, based on the current parameter estimates of the model. This might be a quasi mixture model approach.
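The claim that FRA = PHI(ETA(1)) is uniformly distributed when ETA(1) is standard normal follows from the probability integral transform. A short derivation, assuming only that η has the N(0,1) distribution implied by $OMEGA 1 FIX and writing Φ for the standard normal cdf:

```latex
% eta ~ N(0,1); Phi is the standard normal cdf, which is strictly increasing
\[
  P\bigl(\Phi(\eta) \le u\bigr)
  = P\bigl(\eta \le \Phi^{-1}(u)\bigr)
  = \Phi\bigl(\Phi^{-1}(u)\bigr)
  = u, \qquad 0 < u < 1 .
\]
% Hence Phi(eta) ~ Uniform(0,1).
```

This holds for the assumed population distribution of η. The individual posthoc estimates of ETA(1) are shaped by each subject's data, so their transformed values need not look uniform.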