
胡学海 (Hu Xuehai)

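Doctoral supervisor
Master's supervisor
Name: 胡学海
Name in pinyin: huxuehai
Position: Head of the Department of Big Data
Academic title: Professor
Education: Doctoral graduate
Degree: Doctor of Science
Office: Yifu Building (逸夫楼), Room C609
Email:
Alma mater: Wuhan University
Department: College of Informatics
Affiliation: College of Informatics
Disciplines: Statistics (other specialties); Bioinformatics
Other contact information
Publications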
A directed learning strategy integrating multiple omic data improves genomic prediction
Posted: 2021-04-30

DOI: 10.1111/pbi.13117

Journal: Plant Biotechnology Journal

Keywords: directed learning, genetic features, genomic prediction, LASSO, multiple omic data

Abstract: Genomic prediction (GP) aims to construct a statistical model for predicting phenotypes using genome-wide markers and is a promising strategy for accelerating molecular plant breeding. However, current progress of phenotype prediction using genomic data alone has reached a bottleneck, and previous studies on transcriptomic and metabolomic predictions ignored genomic information. Here, we designed a novel strategy of GP called multilayered least absolute shrinkage and selection operator (MLLASSO) by integrating multiple omic data into a single model that iteratively learns three layers of genetic features (GFs) supervised by observed transcriptome and metabolome. Significantly, MLLASSO learns higher order information of gene interactions, which enables us to achieve a significant improvement of predictability of yield in rice from 0.1588 (GP alone) to 0.2451 (MLLASSO). In the prediction of the first two layers, some genes were found to be genetically predictable genes (GPGs) as their expressions were accurately predicted with genetic markers. Interestingly, we made three dramatic discoveries for the GPGs: (i) GPGs are good predictors for highly complex traits like yield; (ii) GPGs are mostly eQTL genes (cis or trans); and (iii) trait-related transcriptional factor families are enriched in GPGs. These findings support the notion that learned GFs not only are good predictors for traits but also have specific biological implications regarding regulation of gene expressions. To differentiate the new method from conventional GP models, we called MLLASSO a directed learning strategy supervised by intermediate omic data. This new prediction model appears to be more reliable and more robust than conventional GP models.
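
The abstract describes learning genetic features from markers under the supervision of intermediate omic data and then using those features to predict a trait. The sketch below is only a simplified, two-stage illustration of that general idea and is not the paper's MLLASSO implementation: it assumes a plain coordinate-descent LASSO without intercept, a single intermediate layer instead of three, and one shared penalty. All function names (`lasso_fit`, `directed_fit`, `directed_predict`) and the toy data are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the LASSO proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_fit(X, y, penalty, n_iter=200):
    """Coordinate-descent LASSO without intercept.

    Minimizes (1 / 2n) * ||y - X b||^2 + penalty * ||b||_1.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlate column j with the residual that excludes its own fit.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_beta = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


def predict(X, beta):
    """Linear prediction X b for each row of X."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


def learned_features(markers, layer1):
    """Intermediate omic values as predicted from markers alone."""
    columns = [predict(markers, beta) for beta in layer1]
    return [list(row) for row in zip(*columns)]


def directed_fit(markers, omic, trait, penalty):
    """Hypothetical two-stage fit: markers -> omic layer -> trait."""
    n_omic = len(omic[0])
    # Stage 1: one LASSO per omic variable, supervised by observed values.
    layer1 = [lasso_fit(markers, [row[g] for row in omic], penalty)
              for g in range(n_omic)]
    # Stage 2: regress the trait on the marker-predicted omic features.
    layer2 = lasso_fit(learned_features(markers, layer1), trait, penalty)
    return layer1, layer2


def directed_predict(markers, layer1, layer2):
    """Predict the trait for new samples from markers only."""
    return predict(learned_features(markers, layer1), layer2)


# Toy data (made up): 4 samples, 2 markers, 1 expression level, 1 trait.
markers = [[1, 0], [0, 1], [1, 0], [0, 1]]
expression = [[2.0], [3.0], [2.0], [3.0]]
trait = [4.0, 6.0, 4.0, 6.0]
layer1, layer2 = directed_fit(markers, expression, trait, penalty=0.0)
print(directed_predict([[1, 0]], layer1, layer2))
```

In this sketch, new samples need only marker data at prediction time, because the intermediate omic values are themselves predicted from markers. The penalty is set to zero in the toy call only to make the fit easy to check by hand; a real analysis would choose it by cross-validation.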

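Paper type: Journal article

Discipline category: Science

First-level discipline: Biology

Volume: 17

Pages: 2011–2020

Translated work:

Publication date: 2019-01-01

Indexed in: SCI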