Integrative machine learning framework for the identification of cell-specific enhancers from the human genome

Authors
Basith, S | Hasan, MM | Lee, G | Wei, L | Manavalan, B
Citation
Briefings in bioinformatics, 22(6): bbab252, 2021
Journal Title
Briefings in bioinformatics
ISSN
1467-5463; 1477-4054
Abstract
Enhancers are deoxyribonucleic acid (DNA) fragments that, when bound by transcription factors, enhance the transcription of related genes. Owing to their sporadic distribution and similar fractions, identifying enhancers in the human genome is a daunting task. Compared with traditional experimental approaches, computational methods with easy-to-use platforms can be applied efficiently to annotate enhancers' functions and physiological roles. To this end, several bioinformatics tools have been developed to identify enhancers. Despite their strong performance, existing methods have certain limitations, including the use of fixed-length sequences for model development and the neglect of cell specificity. A novel predictor that addresses these issues would benefit genome-wide enhancer prediction. In this study, we constructed new datasets for eight different cell types. Using these data, we proposed an integrative machine learning (ML)-based framework called Enhancer-IF for identifying cell-specific enhancers. Enhancer-IF comprehensively explores a wide range of heterogeneous features with five commonly used ML methods (random forest, extremely randomized tree, multilayer perceptron, support vector machine and extreme gradient boosting). Specifically, these five classifiers were trained with seven encodings, yielding 35 baseline models. The outputs of these baseline models were integrated and fed into the five classifiers again to construct five meta-models. Finally, integrating the five meta-models through ensemble learning improved model robustness. The proposed approach showed excellent prediction performance compared with the baseline models on both training and independent datasets in different cell types, highlighting its advantage in identifying enhancers. We anticipate that Enhancer-IF will be a valuable tool for screening and identifying potential enhancers from human DNA sequences.
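The three-layer structure described in the abstract (35 baseline models, five meta-models, one final ensemble) can be sketched roughly as follows. This is only an illustration of the data flow, not the authors' implementation. The encoding names are placeholders because the abstract does not list the seven encodings, the two `*_probability` functions are hypothetical stand-ins for trained models, and averaging the meta-model outputs is an assumed integration rule.

```python
from itertools import product
from statistics import mean

# Classifier names come from the abstract; the encoding names are placeholders,
# because the abstract does not list the seven encodings.
CLASSIFIERS = ["RF", "ERT", "MLP", "SVM", "XGB"]
ENCODINGS = [f"encoding_{i}" for i in range(1, 8)]

# Layer 1: one baseline model per (classifier, encoding) pair.
BASELINE_MODELS = list(product(CLASSIFIERS, ENCODINGS))


def baseline_probability(classifier, encoding, sequence):
    """Hypothetical stand-in for a trained baseline model.

    Returns a deterministic pseudo-probability in [0, 1] so the sketch runs
    without training data; a real model would predict from the encoded sequence.
    """
    seed = sum(ord(ch) for ch in classifier + encoding + sequence)
    return (seed % 101) / 100


def meta_features(sequence):
    """Layer 1 output: the 35 baseline probabilities for one sequence."""
    return [baseline_probability(c, e, sequence) for c, e in BASELINE_MODELS]


def meta_probability(classifier, features):
    """Hypothetical stand-in for a trained meta-model (layer 2).

    Assumption: each meta-model is a fixed weighted mean of its 35 inputs.
    """
    offset = CLASSIFIERS.index(classifier)
    weights = [1 + ((i + offset) % 5) for i in range(len(features))]
    return sum(w * f for w, f in zip(weights, features)) / sum(weights)


def enhancer_score(sequence):
    """Layer 3: average the five meta-model outputs (assumed integration rule)."""
    features = meta_features(sequence)
    return mean(meta_probability(c, features) for c in CLASSIFIERS)


if __name__ == "__main__":
    seq = "ACGTGGCTAACGT"  # toy sequence, not taken from the paper's datasets
    print(len(BASELINE_MODELS))     # 35
    print(len(meta_features(seq)))  # 35
    print(round(enhancer_score(seq), 3))
```

In a real pipeline the stand-ins would be replaced by fitted classifiers, and the meta-features would normally come from cross-validated predictions so that the meta-models are not trained on outputs the baseline models produced for their own training data.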
DOI
10.1093/bib/bbab252
PMID
34226917
Appears in Collections:
Journal Papers > School of Medicine / Graduate School of Medicine > Physiology
Ajou Authors
Balachandran, Manavalan | Basith, Shaherin | Lee, Gwang (이, 광)