Hindi Query Expansion based on Semantic Importance of Hindi WordNet Relations and Fuzzy Graph Connectivity Measures

Amita Jain, Sonakshi Vij, Oscar Castillo

Abstract


Query expansion is the process of adding terms to a given query to improve the performance of information retrieval (IR). A query may contain polysemous terms, which usually lower overall IR performance. To resolve this issue, we propose a fuzzy-graph-based approach to Hindi query expansion. To identify additional terms for a query, we consider the relative semantic importance of the relations present in Hindi WordNet. The query is represented by a sub-graph extracted from the Hindi WordNet graph. Hindi WordNet is semantically richer than other WordNets because it contains a greater number of semantic relations. Each of the 16 semantic relations present in Hindi WordNet is assigned a relative significance score proportional to its semantic relatedness. These scores act as edge weights in the Hindi WordNet graph, which is thereby represented as a fuzzy graph. This assignment brings more semantically related words closer together and moves less semantically related words farther apart in Hindi WordNet. The significant terms to be used for query expansion are selected using local and global fuzzy graph connectivity measures. The proposed method is evaluated on the Forum for Information Retrieval Evaluation (FIRE) datasets for three consecutive years, and the results show that it provides better results than state-of-the-art approaches.
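The idea of weighting WordNet relations and ranking terms by fuzzy connectivity can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the relation names and weights in `RELATION_WEIGHT`, the toy edges (English placeholders standing in for Hindi terms), and the choice of fuzzy degree as the local measure and max-min strength of connectedness as the global one. The abstract does not state the actual scores or which measures are used.

```python
# Hypothetical relation weights in [0, 1]; the actual scores for the
# 16 Hindi WordNet relations are not given in the abstract.
RELATION_WEIGHT = {
    "synonymy": 1.0,
    "hypernymy": 0.8,
    "meronymy": 0.6,
    "antonymy": 0.3,
}

# Toy query sub-graph: (term_a, term_b, relation). Illustrative only.
EDGES = [
    ("bank", "river", "meronymy"),
    ("bank", "shore", "synonymy"),
    ("shore", "river", "hypernymy"),
    ("bank", "loan", "antonymy"),
]


def build_fuzzy_graph(edges):
    """Return an undirected fuzzy graph as {node: {neighbour: membership}}."""
    graph = {}
    for a, b, relation in edges:
        mu = RELATION_WEIGHT[relation]
        graph.setdefault(a, {})[b] = mu
        graph.setdefault(b, {})[a] = mu
    return graph


def fuzzy_degree(graph, node):
    """Local measure: sum of membership values of edges incident to node."""
    return sum(graph[node].values())


def connectedness_strength(graph):
    """Global measure: for every pair of nodes, the maximum over all paths
    of the minimum edge membership on the path (max-min closure)."""
    nodes = list(graph)
    s = {u: {v: graph[u].get(v, 0.0) for v in nodes} for u in nodes}
    for k in nodes:
        for u in nodes:
            for v in nodes:
                s[u][v] = max(s[u][v], min(s[u][k], s[k][v]))
    return s


def rank_expansion_terms(graph, query_terms):
    """Rank non-query terms by fuzzy degree, highest first."""
    candidates = [n for n in graph if n not in query_terms]
    return sorted(candidates, key=lambda n: fuzzy_degree(graph, n), reverse=True)


graph = build_fuzzy_graph(EDGES)
strength = connectedness_strength(graph)
ranked = rank_expansion_terms(graph, {"bank"})
```

In this toy graph the indirect path from "bank" to "river" through "shore" is stronger than the direct meronymy edge, which shows how highly weighted relations pull related terms closer together.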


Keywords


Fuzzy graph connectivity measures, information retrieval, natural language processing, query expansion, word sense disambiguation
