UIJRT » United International Journal for Research & Technology

Genomics, High Performance Computing and Machine Learning

Vaidehi Thakre, Shreyas Vedpathak, and Sejal Sawarkar

Thakre, V., Vedpathak, S. and Sawarkar, S., 2021. Genomics, High Performance Computing and Machine Learning. United International Journal for Research & Technology (UIJRT), 2(8), pp.149-155.


Genomic data has the potential to improve healthcare in several ways, including disease prevention, improved diagnosis, and better treatment. Although Machine Learning has revolutionized many fields, its application in Genomics is still new. Machine Learning is currently being applied and tested in many genomic processes, but these applications have not all been clinically validated. Hence, we are still far from providing Machine Learning or Deep Learning models for -omics data that can be implemented in practice. This paper explains in simple terms what genomics is, where high-performance computing and machine learning come into the picture, and how machine learning is currently applied in genomics, and it discusses the potential future scope of machine learning in genomics.

Keywords: Deep Learning, Genomics, High-Performance Computing, Machine Learning, Mass Spectrometry, Next-Generation Sequencing.

