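Hu Xuehai (胡学海)

Doctoral supervisor
Master's supervisor
Name: 胡学海
Name in pinyin: huxuehai
Position: Head of the Department of Big Data
Title: Professor
Education: Doctoral graduate
Degree: Doctor of Science
Office: Yifu Building, C609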
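Alma mater: Wuhan University
Department: College of Informatics (信息学院)
Affiliation: College of Informatics (信息学院)
Disciplines: Statistics (other specialties); Bioinformatics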
Publications
Improved prediction of DNA-binding proteins using chaos game representation and random forest
Posted: 2021-04-30

Journal: Current Bioinformatics

Keywords: DNA-binding proteins, chaos game representation, fractal dimension, random forest

Abstract: DNA-binding proteins (DNA-BPs) play an important role in many biological processes. Next-generation sequencing technologies are now widely used to obtain the genomes of many organisms, so accurate and rapid identification of DNA-BPs would significantly help genome annotation. Chaos game representation (CGR) can reveal information hidden in protein sequences, and the fractal dimension is a vital index for measuring the compactness of complex, irregular geometric objects. In this research, to extract the intrinsic correlation with the DNA-binding property from the protein sequence, the CGR algorithm and fractal dimension, together with amino acid composition, are applied to formulate the protein samples. We employ a random forest as the classifier to predict DNA-BPs from sequence-derived features comprising amino acid composition and fractal dimension. The resulting predictor is compared with three important existing methods, DNA-Prot, iDNA-Prot and DNAbinder, on the same datasets. On two benchmark datasets from DNA-Prot and iDNA-Prot, the average accuracies (ACC) reach 82.07% and 84.91%, respectively, and the average Matthews correlation coefficients (MCC) reach 0.6085 and 0.6981, respectively. The point-to-point comparisons demonstrate that our fractal approach shows some improvements.
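
The feature pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how residues are mapped to CGR coordinates or how the fractal dimension is estimated. The regular 20-gon layout, the contraction ratio of 0.5, the box-counting estimator, and all function names below are assumptions chosen for illustration. The resulting features (20 composition values plus one dimension estimate) could then be passed to any random forest implementation. The `mcc` helper shows the standard Matthews correlation coefficient used to report results.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def amino_acid_composition(seq):
    """Return the 20 relative residue frequencies of a protein sequence."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]

# Assumption: each residue is one vertex of a regular 20-gon on the unit circle.
VERTICES = {
    a: (math.cos(2 * math.pi * i / 20), math.sin(2 * math.pi * i / 20))
    for i, a in enumerate(AMINO_ACIDS)
}

def cgr_points(seq, ratio=0.5):
    """Chaos game: step a fraction `ratio` of the way toward each residue's vertex."""
    x, y = 0.0, 0.0
    points = []
    for c in seq.upper():
        if c not in VERTICES:
            continue  # skip non-standard residues such as X or B
        vx, vy = VERTICES[c]
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        points.append((x, y))
    return points

def box_counting_dimension(points, scales=(2, 4, 8, 16, 32)):
    """Estimate the fractal dimension as the least-squares slope of
    log N(s) against log s, where N(s) is the number of occupied cells
    in an s-by-s grid covering the square [-1, 1] x [-1, 1]."""
    if not points:
        raise ValueError("no points to measure")
    log_s, log_n = [], []
    for s in scales:
        occupied = {
            (max(0, min(int((px + 1) / 2 * s), s - 1)),
             max(0, min(int((py + 1) / 2 * s), s - 1)))
            for px, py in points
        }
        log_s.append(math.log(s))
        log_n.append(math.log(len(occupied)))
    mean_s = sum(log_s) / len(log_s)
    mean_n = sum(log_n) / len(log_n)
    num = sum((a - mean_s) * (b - mean_n) for a, b in zip(log_s, log_n))
    den = sum((a - mean_s) ** 2 for a in log_s)
    return num / den

def feature_vector(seq):
    """21 features: amino acid composition plus one box-counting dimension."""
    return amino_acid_composition(seq) + [box_counting_dimension(cgr_points(seq))]

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Calling `feature_vector` on a protein sequence string returns a list of 21 floats. Short sequences yield few CGR points, so the dimension estimate is only meaningful for reasonably long sequences.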

Paper type: Journal article

Volume: 11

Publication date: 2016-01-01

Indexed in: SCI