Survey of Word Co-occurrence Measures for Collocation Detection

Authors

  • Olga Kolesnikova, Instituto Politécnico Nacional, Superior School of Computing Sciences (ESCOM)

DOI:

https://doi.org/10.13053/cys-20-3-2456

Keywords:

Word co-occurrence measure, association measure, collocation, statistical language model, rule-based language model, hybrid approach to model word co-occurrence.

Abstract

This paper presents a detailed survey of word co-occurrence measures used in natural language processing. Word co-occurrence information is vital for accurate computational text treatment: it is important to distinguish words that combine freely with other words from words whose ability to form phrases is restricted. The latter words, together with their typical co-occurring companions, are called collocations. To detect collocations, many word co-occurrence measures, also called association measures, are used to identify the high degree of cohesion between words in collocations as opposed to the low degree of cohesion in free word combinations. We describe such association measures, grouping them into classes according to the approaches and mathematical models used to formalize word co-occurrence.
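
As a rough illustration of what an association measure computes, the sketch below scores adjacent word pairs with pointwise mutual information (PMI), PMI(x, y) = log2(P(x, y) / (P(x) P(y))). PMI is chosen here only because it is widely known; the abstract does not single it out. The function name `pmi_scores`, the `min_count` parameter, and the toy token list are all hypothetical, and probabilities are plain relative frequencies with no smoothing.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=1):
    """Score adjacent word pairs by pointwise mutual information.

    Illustrative sketch only: probabilities are relative frequencies,
    and pairs seen fewer than min_count times are skipped.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi          # joint probability of the pair
        p_x = unigrams[w1] / n_uni   # marginal probability of the first word
        p_y = unigrams[w2] / n_uni   # marginal probability of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy token list (made up for illustration, not taken from any corpus).
toy_tokens = "she drank strong tea and he drank strong coffee".split()
for pair, score in sorted(pmi_scores(toy_tokens).items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

A higher score suggests that the two words occur together more often than their individual frequencies would predict, which is the kind of cohesion that separates collocations from free word combinations. On real corpora PMI tends to overrate rare pairs, so a frequency threshold such as `min_count` is commonly applied.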

Author Biography

Olga Kolesnikova, Instituto Politécnico Nacional, Superior School of Computing Sciences (ESCOM)

She holds an M.Sc. in Linguistics and a Ph.D. in Computer Science. She is a full-time professor and researcher at the Superior School of Computer Science of the National Polytechnic Institute, Mexico, and a member of the National System of Researchers of Mexico (SNI 1). Her interests include computational linguistics and natural language processing, semantic analysis of collocations and other types of restricted lexical co-occurrence, comparative phonetics, and intelligent tutoring systems. She has authored various publications, including a book, a chapter, and articles in international journals. She leads and participates in research projects on natural language processing and serves as a reviewer for international journals and conferences.

Published

2016-09-30