
胡学海

Name: 胡学海 (Hu Xuehai)
Position: Head of the Department of Big Data
Professional rank: Professor
Education: Doctoral studies (with certificate of graduation)
Degree: Doctoral Degree in Science
Office location: Hubei Hongshan Laboratory (湖北洪山实验室), Room C308
Graduated from: Wuhan University (武汉大学)
College: College of Informatics
Unit: College of Informatics (信息学院)
Disciplines: Other specialties in Statistics; Bioinformatics

Paper Publications
Improved prediction of DNA-binding proteins using chaos game representation and random forest
Release time: 2021-04-30

Journal: Current Bioinformatics

Keywords: DNA-binding proteins, chaos game representation, fractal dimension, random forest

Abstract: DNA-binding proteins (DNA-BPs) play an important role in many biological processes. Next-generation sequencing technologies are now widely used to obtain the genomes of many organisms, so identifying DNA-BPs accurately and rapidly would be a significant help in genome annotation. Chaos game representation (CGR) can reveal information hidden in protein sequences, and the fractal dimension is a key index for measuring the compactness of complex, irregular geometric objects. In this research, to extract the intrinsic correlation between a protein sequence and its DNA-binding property, the CGR algorithm and fractal dimension, together with amino acid composition, are used to represent the protein samples. A random forest is employed as the classifier to predict DNA-BPs from these sequence-derived features (amino acid composition and fractal dimension). The resulting predictor is compared with three important existing methods, DNA-Prot, iDNA-Prot and DNAbinder, on the same datasets. On the two benchmark datasets from DNA-Prot and iDNA-Prot, the average accuracies (ACC) reach 82.07% and 84.91%, respectively, and the average Matthews correlation coefficients (MCC) reach 0.6085 and 0.6981, respectively. The point-to-point comparisons show that the fractal approach yields some improvement.
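For readers unfamiliar with the two metrics reported in the abstract, the sketch below shows how accuracy (ACC) and the Matthews correlation coefficient (MCC) are conventionally computed from a binary confusion matrix. It is a minimal illustration and not the authors' evaluation code; the function names and the example counts are assumptions made here, and the counts are not data from the paper.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator vanishes, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    tp, tn, fp, fn = 40, 45, 5, 10
    print(f"ACC = {accuracy(tp, tn, fp, fn):.4f}")
    print(f"MCC = {mcc(tp, tn, fp, fn):.4f}")
```

Unlike ACC, MCC uses all four cells of the confusion matrix, which makes it a more balanced summary when the positive and negative classes differ in size.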

Thesis type: Journal paper

Volume: 11

Translation: No

Publication date: 2016-01-01

Journal index: SCI