Hu Xuehai (胡学海)

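Name: Hu Xuehai (胡学海)
Name (pinyin): huxuehai
Position: Head of the Department of Big Data
Professional title: Professor
Education: Doctoral graduate
Degree: Doctor of Science
Office location: Hubei Hongshan Laboratory, Room C308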
Graduated from: Wuhan University
College: College of Informatics (信息学院)
Unit: College of Informatics (信息学院)
Discipline: Statistics (other specialties); Bioinformatics
Publications
Improved Prediction of Regulatory Element Using Hybrid Abelian Complexity Features with DNA Sequences
Release time: 2021-04-30

Impact factor: 4.556

DOI: 10.3390/ijms20071704

Journal: International Journal of Molecular Sciences

Keywords: regulatory element; enhancer; abelian complexity; prediction

Summary: Deciphering the code of cis-regulatory element (CRE) is one of the core issues of current biology. As an important category of CRE, enhancers play crucial roles in gene transcriptional regulations in a distant manner. Further, the disruption of an enhancer can cause abnormal transcription and, thus, trigger human diseases, which means that its accurate identification is currently of broad interest. Here, we introduce an innovative concept, i.e., abelian complexity function (ACF), which is a more complex extension of the classic subword complexity function, for a new coding of DNA sequences. After feature selection by an upper bound estimation and integration with DNA composition features, we developed an enhancer prediction model with hybrid abelian complexity features (HACF). Compared with existing methods, HACF shows consistently superior performance on three sources of enhancer datasets. We tested the generalization ability of HACF by scanning human chromosome 22 to validate previously reported super-enhancers. Meanwhile, we identified novel candidate enhancers which have supports from enhancer-related ENCODE ChIP-seq signals. In summary, HACF improves current enhancer prediction and may be beneficial for further prioritization of functional noncoding variants.
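The two complexity notions named in the summary can be illustrated with a minimal Python sketch. It uses the standard combinatorics-on-words definitions: subword complexity counts the distinct length-n substrings, while abelian complexity counts the distinct letter-count (Parikh) vectors among them. The function names and the toy sequence are assumptions made for illustration; the sketch does not reproduce the paper's feature selection, its upper bound estimation, or the HACF model itself.

```python
ALPHABET = "ACGT"  # assumed DNA alphabet for this illustration


def subword_complexity(seq: str, n: int) -> int:
    """Count the distinct substrings of length n in seq."""
    return len({seq[i:i + n] for i in range(len(seq) - n + 1)})


def abelian_complexity(seq: str, n: int) -> int:
    """Count the distinct letter-count vectors over length-n substrings.

    Two substrings are abelian-equivalent when they contain each base
    the same number of times, regardless of order.
    """
    vectors = set()
    for i in range(len(seq) - n + 1):
        window = seq[i:i + n]
        vectors.add(tuple(window.count(base) for base in ALPHABET))
    return len(vectors)


if __name__ == "__main__":
    toy = "ACCA"  # hypothetical toy sequence, not from the paper
    print(subword_complexity(toy, 2), abelian_complexity(toy, 2))
```

For the toy sequence, the length-2 substrings AC and CA are different words but contain the same bases, so they count once under abelian complexity and twice under subword complexity.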

Publication type: Journal article

Discipline: Science

First-level discipline: Biology

Volume: 20

Pages: 1704

Publication date: 2019-01-01

Journal index: SCI