TY - JOUR
T1 - Integration of single nucleotide variants and whole-genome DNA methylation profiles for classification of rheumatoid arthritis cases from controls
AU - Amiri Roudbar, Mahmoud
AU - Mohammadabadi, Mohammad Reza
AU - Ayatollahi Mehrgardi, Ahmad
AU - Abdollahi-Arpanahi, Rostam
AU - Momen, Mehdi
AU - Morota, Gota
AU - Brito Lopes, Fernando
AU - Gianola, Daniel
AU - Rosa, Guilherme J.M.
N1 - Publisher Copyright: © 2020, The Author(s), under exclusive licence to The Genetics Society.
PY - 2020/5/1
Y1 - 2020/5/1
AB - This study evaluated the use of multiomics data for classification accuracy of rheumatoid arthritis (RA). Three approaches were used and compared in terms of prediction accuracy: (1) whole-genome prediction (WGP) using SNP marker information only, (2) whole-methylome prediction (WMP) using methylation profiles only, and (3) whole-genome/methylome prediction (WGMP) with combining both omics layers. The number of SNP and of methylation sites varied in each scenario, with either 1, 10, or 50% of these preselected based on four approaches: randomly, evenly spaced, lowest p value (genome-wide association or epigenome-wide association study), and estimated effect size using a Bayesian ridge regression (BRR) model. To remove effects of high levels of pairwise linkage disequilibrium (LD), SNPs were also preselected with an LD-pruning method. Five Bayesian regression models were studied for classification, including BRR, Bayes-A, Bayes-B, Bayes-C, and the Bayesian LASSO. Adjusting methylation profiles for cellular heterogeneity within whole blood samples had a detrimental effect on the classification ability of the models. Overall, WGMP using Bayes-B model has the best performance. In particular, selecting SNPs based on LD-pruning with 1% of the methylation sites selected based on BRR included in the model, and fitting the most significant SNP as a fixed effect was the best method for predicting disease risk with a classification accuracy of 0.975. Our results showed that multiomics data can be used to effectively predict the risk of RA and identify cases in early stages to prevent or alter disease progression via appropriate interventions.
UR - http://www.scopus.com/inward/record.url?scp=85081269377&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081269377&partnerID=8YFLogxK
U2 - 10.1038/s41437-020-0301-4
DO - 10.1038/s41437-020-0301-4
M3 - Article
C2 - 32127659
AN - SCOPUS:85081269377
SN - 0018-067X
VL - 124
SP - 658
EP - 674
JO - Heredity
JF - Heredity
IS - 5
ER -