Integrating Refined Clustering and Deep Neural Networks for Biomedical Named Entity Recognition in Textual Data

Dr. Matteo Ricci; Prof. Linh Tran

Authors

Dr. Matteo Ricci, Department of Computer Science, University of Bologna, Italy
Prof. Linh Tran, School of Information and Communication Technology, Hanoi University of Science and Technology, Vietnam

Keywords:

Biomedical Named Entity Recognition (Bio-NER), Deep Learning, Cluster Analysis, BiLSTM-CRF, Attention Mechanisms, Natural Language Processing, Information Extraction

Abstract

Accurate extraction of named entities from the rapidly growing body of biomedical literature is fundamental to accelerating research and discovery in the life sciences. Biomedical text, however, is marked by highly specialized terminology, widespread use of synonyms, and complex entity structures, all of which pose significant challenges for traditional Named Entity Recognition (NER) systems. This study introduces a methodology that combines an enhanced cluster merging strategy with a deep neural network architecture to improve the identification of biomedical entity names in text corpora. The approach first applies a cluster refinement process that semantically links and consolidates fragmented or variant mentions of the same biomedical entity across the corpus. Information derived from the refined clusters is then supplied as an auxiliary feature to a Bidirectional Long Short-Term Memory (BiLSTM) network, which is augmented with an attention mechanism and topped with a Conditional Random Field (CRF) layer. Experimental validation on the widely used GENIA corpus indicates that the integrated framework outperforms existing state-of-the-art Bio-NER methods. Combining context-aware clustering with deep sequence modeling offers an effective way to handle the intricacies of biomedical text and supports more precise and comprehensive information extraction for biological and clinical applications.
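The cluster step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the merging criterion, so everything specific here is an assumption made for illustration: the token-set Jaccard similarity, the 0.5 threshold, and the hypothetical helper names `normalize`, `jaccard`, and `merge_mentions`. It shows only the general idea of consolidating variant mentions into shared cluster ids, not the authors' refinement procedure.

```python
from itertools import combinations


def normalize(mention):
    """Lowercase a mention and split it into a set of alphanumeric tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in mention)
    return frozenset(cleaned.split())


def jaccard(a, b):
    """Token-set Jaccard similarity; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_mentions(mentions, threshold=0.5):
    """Group mentions whose token overlap reaches `threshold` (assumed criterion).

    Returns a dict mapping each distinct mention to an integer cluster id.
    Merging is transitive because it relies on a union-find structure.
    """
    unique = list(dict.fromkeys(mentions))  # de-duplicate, keep first-seen order
    parent = list(range(len(unique)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    tokens = [normalize(m) for m in unique]
    for i, j in combinations(range(len(unique)), 2):
        if jaccard(tokens[i], tokens[j]) >= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[max(ri, rj)] = min(ri, rj)

    # Relabel cluster roots as consecutive ids in first-seen order.
    ids = {}
    return {m: ids.setdefault(find(i), len(ids)) for i, m in enumerate(unique)}


# Illustrative mentions only; not taken from the GENIA corpus.
mentions = ["IL-2 receptor", "IL-2 receptor alpha", "NF-kappa B", "NF-kappa B", "T cells"]
clusters = merge_mentions(mentions)
```

In a pipeline like the one the abstract outlines, a cluster id of this kind could be embedded and concatenated with each token's word embedding before the BiLSTM layer. That wiring is likewise an assumption, since the abstract says only that cluster information is integrated as an auxiliary feature.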

Published

2024-12-22

How to Cite

Dr. Matteo Ricci, & Prof. Linh Tran. (2024). Integrating Refined Clustering and Deep Neural Networks for Biomedical Named Entity Recognition in Textual Data. Frontiers in Emerging Computer Science and Information Technology, 1(1), 23–28. Retrieved from https://irjernet.com/index.php/fecsit/article/view/6

Issue

Vol. 1 No. 1 (December 2024)

Section

CS & IT

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors retain the copyright of their articles published in this journal. All articles are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly cited.
