LexAN: Lexical Association Networks
Abstract
This paper presents Lexical Association Networks (LexAN), a mathematical model of the words in a text corpus in which the connections between word tokens are represented as weighted edges of an undirected graph. Constructing a LexAN involves six stages: 1) lemmatization, 2) multi-word expressions, 3) stopword removal, 4) co-occurrence graph, 5) word co-occurrence norms, and 6) LexAN construction. We built our graphs from a medical text corpus containing 574,011 words. To assess LexAN, we used these graphs in a tool that addresses the lexical access problem by functioning as a reverse dictionary. The results were promising.
Keywords
Network, co-occurrence, lexical access
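
As an illustration of the early stages listed in the abstract (lemmatization, multi-word expressions, stopword removal, and the co-occurrence graph), the following is a minimal sketch and not the authors' implementation. It assumes sentence-level co-occurrence and raw counts as edge weights; the lookup tables `LEMMAS`, `MWES`, and `STOPWORDS` and the function names are hypothetical stand-ins for resources that are not specified here. The co-occurrence norms and the final LexAN construction are not covered.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy resources (assumptions for illustration only).
LEMMAS = {"causes": "cause", "infections": "infection"}
MWES = {("blood", "pressure"): "blood_pressure"}
STOPWORDS = {"the", "a", "of", "and", "is"}


def preprocess(sentence):
    """Lemmatize, merge two-word expressions, and drop stopwords."""
    tokens = [LEMMAS.get(t, t) for t in sentence.lower().split()]
    merged, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in MWES:
            # Treat the multi-word expression as a single node.
            merged.append(MWES[pair])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return [t for t in merged if t not in STOPWORDS]


def cooccurrence_graph(sentences):
    """Return undirected weighted edges as {(u, v): count} with u < v."""
    edges = Counter()
    for sentence in sentences:
        # Sorting makes (u, v) and (v, u) map to the same edge key.
        words = sorted(set(preprocess(sentence)))
        for u, v in combinations(words, 2):
            edges[(u, v)] += 1
    return edges


if __name__ == "__main__":
    corpus = [
        "High blood pressure causes headache",
        "The infection causes high fever",
    ]
    for edge, weight in sorted(cooccurrence_graph(corpus).items()):
        print(edge, weight)
```

Storing each edge under a sorted word pair is one simple way to keep the graph undirected; a windowed co-occurrence scheme or a different weighting could be substituted without changing the overall structure.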