Xuehai Hu (胡学海)
Impact factor: 5.61
DOI: 10.1093/bioinformatics/btaa1100
Journal: Bioinformatics
Abstract: Motivation: Both the scarcity of experimental data on transcription factor binding sites (TFBSs) in plants and the independent evolution of plant TFs have left computational approaches for identifying plant TFBSs lagging behind the corresponding human research. Observing that TFs are highly conserved among plant species, we first employ a deep convolutional neural network (DeepCNN) to build 265 Arabidopsis TFBS prediction models from available DAP-seq (DNA affinity purification sequencing) datasets, and then transfer them to homologous TFs in other plants. Results: DeepCNN not only outperforms gkm-SVM and MEME on Arabidopsis TFBS prediction but also learns the known motif for most Arabidopsis TFs, as well as cooperative TF motifs supported by protein–protein interaction evidence, which provides biological interpretability. Following the idea of transfer learning, trans-species prediction performance on ten TFs from three other plants (Oryza sativa, Zea mays and Glycine max) demonstrates the feasibility of this strategy. Availability and implementation: The 265 trained Arabidopsis TFBS prediction models are packaged in a Docker image named TSPTFBS, which is freely available on DockerHub at https://hub.docker.com/r/vanadiummm/tsptfbs. Source code and documentation are available on GitHub at https://github.com/liulifenyf/TSPTFBS. Contact: huxuehai@mail.hzau.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
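Paper type: Journal article
Discipline category: Science
First-level discipline: Biology
Volume: 37
Issue: 2
Pages: 260-262
Translated work: No
Publication date: 2021-01-01
Indexed in: SCI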