Frontiers in Emerging Computer Science and Information Technology

Integrating Refined Clustering and Deep Neural Networks for Biomedical Named Entity Recognition in Textual Data

Authors

  • Dr. Matteo Ricci Department of Computer Science, University of Bologna, Italy
  • Prof. Linh Tran School of Information and Communication Technology, Hanoi University of Science and Technology, Vietnam

Keywords:

Biomedical Named Entity Recognition (Bio-NER), Deep Learning, Cluster Analysis, BiLSTM-CRF, Attention Mechanisms, Natural Language Processing, Information Extraction

Abstract

Accurate extraction of named entities from the rapidly growing biomedical literature is fundamental to accelerating research and discovery in the life sciences. Biomedical text, however, is characterized by highly specialized terminology, widespread synonymy, and complex entity structures, all of which challenge traditional Named Entity Recognition (NER) systems. This study introduces a methodology that combines an enhanced cluster merging strategy with a deep neural network architecture to improve the identification of biomedical entity names in text corpora. The approach first applies a cluster refinement process that semantically links and consolidates fragmented or variant mentions of the same biomedical entity across the corpus. Information derived from the refined clusters is then supplied as an auxiliary feature to a Bidirectional Long Short-Term Memory (BiLSTM) network, which is augmented with an attention mechanism and topped with a Conditional Random Field (CRF) layer. Experimental validation on the widely used GENIA corpus shows that the integrated framework outperforms existing state-of-the-art Bio-NER methods. Combining context-aware clustering with deep learning offers an effective way to handle the intricacies of biomedical text, supporting more precise and comprehensive information extraction for biological and clinical applications.
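To make the first stage of the abstract's pipeline concrete, the following is a minimal sketch of how variant mentions of one entity might be consolidated into clusters and turned into an auxiliary feature. It is an illustration under stated assumptions, not the authors' method. The abstract does not specify the similarity measure, the merge rule, or the feature encoding, so the character-trigram Jaccard similarity, the single-link merge with a 0.5 threshold, and the one-hot encoding are all assumed here. The function names are hypothetical.

```python
from itertools import combinations


def trigrams(text):
    """Character trigrams of a lowercased, space-padded mention string."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_clusters(mentions, threshold=0.5):
    """Single-link merging (an assumed stand-in for the paper's refinement):
    two mentions end up in one cluster when their trigram similarity
    reaches the threshold, directly or through a chain of such pairs."""
    parent = list(range(len(mentions)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    grams = [trigrams(m) for m in mentions]
    for i, j in combinations(range(len(mentions)), 2):
        if jaccard(grams[i], grams[j]) >= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[max(ri, rj)] = min(ri, rj)

    # Relabel roots as consecutive cluster ids in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(mentions))]


def one_hot(cluster_id, n_clusters):
    """Assumed auxiliary feature: a one-hot vector for the cluster id."""
    return [1.0 if k == cluster_id else 0.0 for k in range(n_clusters)]


mentions = ["IL-2 receptor", "IL-2 receptors", "NF-kappa B", "NF-kappaB", "T cells"]
cluster_ids = merge_clusters(mentions)
features = [one_hot(c, max(cluster_ids) + 1) for c in cluster_ids]
```

In a full system, a vector like `features[i]` would be concatenated with the embeddings of the tokens in the corresponding mention before they enter the BiLSTM. That wiring is likewise an assumption about how the cluster feature could be integrated.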

Published

2024-12-22

How to Cite

Ricci, M., & Tran, L. (2024). Integrating Refined Clustering and Deep Neural Networks for Biomedical Named Entity Recognition in Textual Data. Frontiers in Emerging Computer Science and Information Technology, 1(1), 23–28. Retrieved from https://irjernet.com/index.php/fecsit/article/view/6