i6mA-Fuse: improved and robust prediction of DNA 6 mA sites in the Rosaceae genome by fusing multiple feature representation
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hasan, MM | - |
dc.contributor.author | Manavalan, B | - |
dc.contributor.author | Shoombuatong, W | - |
dc.contributor.author | Khatun, MS | - |
dc.contributor.author | Kurata, H | - |
dc.date.accessioned | 2022-10-24T05:53:44Z | - |
dc.date.available | 2022-10-24T05:53:44Z | - |
dc.date.issued | 2020 | - |
dc.identifier.issn | 0167-4412 | - |
dc.identifier.uri | http://repository.ajou.ac.kr/handle/201003/22375 | - |
dc.description.abstract | DNA N6-methyladenine (6 mA) is one of the most vital epigenetic modifications and is involved in controlling the expression levels of various genes. With the avalanche of DNA sequences deposited in numerous databases, the accurate identification of 6 mA sites plays an essential role in understanding molecular mechanisms. Because experimental approaches are time-consuming and costly, it is desirable to develop a computational model for rapidly and accurately identifying 6 mA sites. To the best of our knowledge, we propose the first computational model, named i6mA-Fuse, to predict 6 mA sites in the Rosaceae genomes, especially in Rosa chinensis and Fragaria vesca. We implemented five encoding schemes, i.e., mononucleotide binary (MBE), dinucleotide binary, k-space spectral nucleotide, k-mer (Kmer), and electron-ion interaction pseudopotential (EIIP) compositions, to build five single-encoding random forest (RF) models. i6mA-Fuse uses a linear regression model to combine the predicted probability scores of the five single-encoding RF models. The resulting species-specific i6mA-Fuse models achieved high performance, with AUCs of 0.982 and 0.978 and MCCs of 0.869 and 0.858 on the independent datasets of Rosa chinensis and Fragaria vesca, respectively. In the F. vesca-specific i6mA-Fuse, MBE and EIIP contributed 75% and 25% of the total prediction; in the R. chinensis-specific i6mA-Fuse, Kmer, MBE, and EIIP contributed 15%, 65%, and 20% of the total prediction, respectively. To assist high-throughput prediction of DNA 6 mA sites, i6mA-Fuse is publicly accessible at https://kurata14.bio.kyutech.ac.jp/i6mA-Fuse/. | - |
dc.language.iso | en | - |
dc.subject.MESH | Adenine | - |
dc.subject.MESH | Algorithms | - |
dc.subject.MESH | Binding Sites | - |
dc.subject.MESH | Computational Biology | - |
dc.subject.MESH | DNA, Plant | - |
dc.subject.MESH | Datasets as Topic | - |
dc.subject.MESH | Machine Learning | - |
dc.subject.MESH | Models, Genetic | - |
dc.subject.MESH | Rosaceae | - |
dc.title | i6mA-Fuse: improved and robust prediction of DNA 6 mA sites in the Rosaceae genome by fusing multiple feature representation | - |
dc.type | Article | - |
dc.identifier.pmid | 32140819 | - |
dc.subject.keyword | DNA 6 mA | - |
dc.subject.keyword | Feature encoding | - |
dc.subject.keyword | Machine learning | - |
dc.subject.keyword | Sequence analysis | - |
dc.contributor.affiliatedAuthor | Manavalan, B | - |
dc.type.local | Journal Papers | - |
dc.identifier.doi | 10.1007/s11103-020-00988-y | - |
dc.citation.title | Plant molecular biology | - |
dc.citation.volume | 103 | - |
dc.citation.number | 1-2 | - |
dc.citation.date | 2020 | - |
dc.citation.startPage | 225 | - |
dc.citation.endPage | 234 | - |
dc.identifier.bibliographicCitation | Plant molecular biology, 103(1-2): 225-234, 2020 | - |
dc.embargo.liftdate | 9999-12-31 | - |
dc.embargo.terms | 9999-12-31 | - |
dc.identifier.eissn | 1573-5028 | - |
dc.relation.journalid | J001674412 | - |
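
The abstract describes encoding each sequence in several ways, scoring each encoding with its own random forest, and combining the scores linearly. The following is a minimal Python sketch of two of those encodings and of the fusion step, not the authors' implementation. The function names, the A/C/G/T bit ordering, and the reading of the reported contribution percentages as linear weights are all assumptions made for illustration; the random forest training itself is omitted, and the per-model scores in the usage lines are made up.

```python
from itertools import product

# One-hot codes for mononucleotide binary encoding (MBE).
# The A, C, G, T ordering is an assumption, not taken from the paper.
MBE_CODES = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def mbe_encode(seq):
    """Concatenate a 4-bit one-hot vector for every nucleotide in seq."""
    return [bit for base in seq.upper() for bit in MBE_CODES[base]]


def kmer_composition(seq, k=2):
    """Return normalized frequencies of all 4**k k-mers in lexicographic order."""
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(total):
        counts[seq[i:i + k]] += 1
    return [counts[m] / total for m in kmers]


def fuse_scores(scores, weights):
    """Weighted linear combination of per-model probability scores."""
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical usage: the weights mirror the F. vesca contributions quoted in
# the abstract (an assumed interpretation); the scores are illustrative only.
fvesca_weights = {"MBE": 0.75, "EIIP": 0.25}
example_scores = {"MBE": 0.8, "EIIP": 0.4}
fused = fuse_scores(example_scores, fvesca_weights)
```

In practice each score would come from a classifier trained on one encoding, and the combination weights would be fitted on training data rather than fixed by hand.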