Refining Word Sense Disambiguation via Multilevel Clustering of Lexical Representations
Keywords:
Word sense disambiguation, multilevel clustering, lexical representations, natural language processing
Abstract
Word Sense Disambiguation (WSD), the task of identifying the correct meaning of a word in a given context, remains a pivotal challenge in Natural Language Processing (NLP). This article explores the application of multilevel clustering techniques to improve WSD accuracy and to model context more clearly. The approach applies hierarchical analysis across several kinds of linguistic features, including word embeddings, lexical networks, and syntactic patterns, with the aim of capturing fine-grained semantic relationships. We discuss the methodological framework for integrating clustering at different analytical levels and synthesize the potential benefits, particularly in addressing polysemy and homonymy. The proposed multilevel clustering paradigm offers a pathway for refining sense assignments, with the potential to improve performance in downstream NLP applications.
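As a rough illustration of what clustering at two analytical levels could look like, the sketch below groups toy context vectors for the ambiguous word "bank" first coarsely (separating unrelated homonyms) and then more finely within each coarse group (separating related polysemous uses). The vectors, context labels, thresholds, and function names are all assumptions made for illustration and are not taken from the article; single-link grouping by cosine similarity stands in here for whichever clustering method the full framework uses.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def single_link_clusters(items, vectors, threshold):
    """Group items into connected components: two items share a cluster
    when a chain of pairwise similarities >= threshold links them."""
    clusters = []
    for item in items:
        # Existing clusters that contain at least one sufficiently similar member.
        linked = [
            c for c in clusters
            if any(cosine(vectors[item], vectors[other]) >= threshold for other in c)
        ]
        merged = [item]
        for c in linked:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(sorted(merged))
    return sorted(clusters)


def multilevel_clusters(vectors, coarse=0.5, fine=0.95):
    """Level 1: loose threshold (coarse senses). Level 2: strict threshold
    applied inside each coarse cluster (fine-grained senses)."""
    coarse_groups = single_link_clusters(sorted(vectors), vectors, coarse)
    return [single_link_clusters(group, vectors, fine) for group in coarse_groups]


# Hypothetical 3-dimensional context vectors for occurrences of "bank".
toy_vectors = {
    "fin_inst_1": [0.0, 1.0, 0.1],   # bank as a financial institution
    "fin_inst_2": [0.1, 0.9, 0.1],
    "fin_build_1": [0.0, 0.6, 0.8],  # bank as a building
    "river_1": [1.0, 0.0, 0.0],      # bank of a river
    "river_2": [0.9, 0.1, 0.0],
}

hierarchy = multilevel_clusters(toy_vectors)
for coarse_id, fine_groups in enumerate(hierarchy):
    print(coarse_id, fine_groups)
```

With these toy values the coarse level separates the river contexts from the financial ones, and the fine level further separates the "institution" and "building" uses inside the financial group. In a realistic setting the vectors would come from contextual embeddings and the thresholds would need tuning on held-out data.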
License
Copyright (c) 2025 Prof. Omar F. Abdelrahman, Laila K. Mostafa

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain the copyright of their articles published in this journal. All articles are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly cited.