Proposal for Named Entities Recognition and Classification (NERC) and the Automatic Generation of Rules on Mexican News

Orlando Ramos Flores, David Pinto

Abstract


In this paper we introduce a proposal for extracting facts from news published by Mexican online newspapers through their RSS (Really Simple Syndication) feeds. We address this problem with automatic named entity recognition and classification (NERC), together with the extraction of semantic relations among entities, so that a database of facts and rules can be built automatically from the entities obtained. The final aim is to infer new rules by combining the constructed knowledge bases with an inference engine. To build the NER model, we perform a manual annotation of corpora with different tags, including the baseline tags (person names, organizations, locations, dates and numerals). The proposed idea is presented with an example scenario, together with the procedure employed for solving the problem of automatic inference of new rules.
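
As an illustration of the last step, the following is a minimal sketch of how an inference engine might derive new facts from a facts base and a set of rules by forward chaining. It is not the authors' implementation: the triple representation, the `?x` variable syntax, the relation names (`works_for`, `located_in`, `works_in`), the entities, and the function names are all assumptions made for this example.

```python
# Hypothetical facts base: (relation, subject, object) triples such as those
# a NERC plus relation-extraction step might produce. All names are invented.
facts = {
    ("works_for", "Ana", "Acme"),
    ("located_in", "Acme", "Puebla"),
}

# Hypothetical rule: works_for(?x, ?y) and located_in(?y, ?z) => works_in(?x, ?z)
rules = [
    (
        (("works_for", "?x", "?y"), ("located_in", "?y", "?z")),
        ("works_in", "?x", "?z"),
    ),
]


def unify(pattern, fact, bindings):
    """Match one premise against one fact; return extended bindings or None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for term, value in zip(pattern[1:], fact[1:]):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def match_all(premises, known, bindings):
    """Yield every set of bindings that satisfies all premises."""
    if not premises:
        yield bindings
        return
    for fact in known:
        extended = unify(premises[0], fact, bindings)
        if extended is not None:
            yield from match_all(premises[1:], known, extended)


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Materialize matches first so the set is not modified mid-iteration.
            for bindings in list(match_all(premises, known, {})):
                derived = (conclusion[0],) + tuple(
                    bindings.get(term, term) for term in conclusion[1:]
                )
                if derived not in known:
                    known.add(derived)
                    changed = True
    return known


inferred = forward_chain(facts, rules) - facts
```

Under these assumptions, `inferred` holds only the facts that were not in the original facts base, which is the kind of new knowledge the proposal aims to obtain from the extracted entities and relations.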

Keywords


NERC, semantic relations, facts-base, rules, Spanish news
