A Rule-Based Meronymy Extraction Module for Portuguese

Authors

  • Ilia Markov Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN)
  • Nuno Mamede Universidade de Lisboa
  • Jorge Baptista Universidade do Algarve/FCHS and CECL

DOI:

https://doi.org/10.13053/cys-19-4-2255

Keywords:

whole-part relation, meronymy, body-part noun, disease noun, Portuguese

Abstract

In this article, we improve the extraction of semantic relations between textual elements as currently performed by STRING, a hybrid statistical and rule-based Natural Language Processing (NLP) chain for Portuguese, by targeting whole-part relations (meronymy), that is, a semantic relation in which one entity is perceived as a constituent part of another entity or as a member of a set. We focus on the type of meronymy involving human entities and body-part nouns (Nbp) (e.g., O Pedro partiu uma perna 'Pedro broke a leg': WHOLE-PART(Pedro,perna) 'WHOLE-PART(Pedro,leg)'). To extract this type of whole-part relation, a rule-based meronymy extraction module has been built and integrated into the grammar of the STRING system. The module was evaluated with promising results.
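The kind of output named in the abstract can be illustrated with a toy sketch in Python. This is not the STRING grammar or its rule formalism: the tag set, the mini-lexicon `BODY_PART_NOUNS`, and the function `extract_whole_part` are all hypothetical, and the single rule (pair a body-part noun with the nearest preceding human entity) is an assumption made only for illustration.

```python
# Toy illustration only: NOT the STRING system or its actual rules.
from typing import List, Tuple

# Hypothetical mini-lexicon of Portuguese body-part nouns (Nbp).
BODY_PART_NOUNS = {"perna", "braço", "mão", "cabeça"}

# A token is (word form, coarse tag); the tags here are assumed, hand-assigned labels.
Token = Tuple[str, str]


def extract_whole_part(tokens: List[Token]) -> List[str]:
    """Pair each body-part noun with the nearest preceding human entity."""
    relations = []
    last_human = None
    for word, tag in tokens:
        if tag == "HUMAN":
            last_human = word
        elif tag == "NOUN" and word.lower() in BODY_PART_NOUNS and last_human:
            relations.append(f"WHOLE-PART({last_human},{word})")
    return relations


# "O Pedro partiu uma perna" ('Pedro broke a leg'), tagged by hand.
sentence = [
    ("O", "DET"),
    ("Pedro", "HUMAN"),
    ("partiu", "VERB"),
    ("uma", "DET"),
    ("perna", "NOUN"),
]
print(extract_whole_part(sentence))  # ['WHOLE-PART(Pedro,perna)']
```

A real rule-based module would operate on parsed dependencies rather than on token order, so this sketch only shows the shape of the extracted relation.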

Author Biographies

Ilia Markov, Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN)

received his Bachelor's degree in Computer Engineering in 2001 from the Kaliningrad State Technical University, Russia. He obtained his Master's degree in Language Sciences in 2012 from the University of Algarve, Portugal. He is currently a PhD student at Instituto Politécnico Nacional, Center for Computing Research, Mexico. His main research interests include natural language processing, computational linguistics, and information retrieval.

Nuno Mamede, Universidade de Lisboa

graduated in Electrotechnical and Computer Engineering from the Instituto Superior Técnico (IST), Lisbon, in 1981, and received his MSc and PhD degrees in Electrotechnical and Computer Engineering from the same university in 1985 and 1992, respectively. He started as a lecturer in 1982, and since 2006 he has held the position of Associate Professor at Instituto Superior Técnico, where he has taught Digital Systems, Object-Oriented Programming, Programming Languages, Knowledge Representation, and Natural Language Processing. He has been a researcher at INESC-ID Lisboa, in Lisbon, since its creation in 1980. He participated in the foundation of L2F, where he holds a position on the Executive Board. His activities have been in the areas of Written Natural Language Processing, namely Syntactic Processing, Named Entity Recognition, and Natural Language Interfaces to Databases. He has authored a significant number of scientific papers. He is a member of AAAI, ACM, and ACL.

Jorge Baptista, Universidade do Algarve/FCHS and CECL

received his Bachelor's and Master's degrees in Linguistics from the Faculty of Letters of the University of Lisbon, in 1990 and 1995, respectively. He has a PhD in Linguistics (syntax) from the University of Algarve (2001). He is an Associate Professor at the University of Algarve and an invited researcher at L2F, INESC-ID Lisbon. His main research interests are in computational and theoretical linguistics (syntax, grammar, large-coverage lexica, corpus linguistics, machine translation).

Published

2015-12-18