Proposal for Named Entities Recognition and Classification (NERC) and the Automatic Generation of Rules on Mexican News
Abstract
In this paper we present a proposal for extracting facts from news published by Mexican online newspapers through their RSS (Really Simple Syndication) feeds. We address this problem with automatic named entity recognition and classification (NERC), together with the extraction of semantic relations among entities, so that a database of facts and rules can be built automatically from the recognized entities. The final aim is to infer new rules by combining the constructed knowledge bases with an inference engine. To build the NER model, we perform a manual annotation of corpora with a set of tags that includes the baseline tags (person names, organizations, locations, dates and numerals). The proposal is illustrated with an example scenario, together with the procedure employed for the automatic inference of new rules.
Keywords
NERC, Semantic relations, facts-base, rules, Spanish news
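
To make the intended pipeline more concrete, the following minimal sketch shows how facts obtained from recognized entities and their relations could be combined with a rule by a simple forward-chaining inference engine. Everything in it is an assumption made for illustration: the relation names (`works_for`, `located_in`, `works_in`), the example entities, the rule, and the functions `unify` and `forward_chain` are hypothetical and are not taken from the proposed system. The sketch also derives new facts from a hand-written rule, which is a simpler task than the automatic inference of new rules pursued in this proposal.

```python
# Hypothetical facts, as a relation extractor might emit them:
# (relation, subject, object).
facts = {
    ("works_for", "Ana Torres", "Banco Ejemplo"),
    ("located_in", "Banco Ejemplo", "Monterrey"),
}

# One hypothetical rule:
#   works_for(X, Y) and located_in(Y, Z)  =>  works_in(X, Z)
# Strings starting with "?" are variables.
rules = [
    (
        [("works_for", "?x", "?y"), ("located_in", "?y", "?z")],
        ("works_in", "?x", "?z"),
    ),
]


def unify(pattern, fact, bindings):
    """Extend bindings so that pattern matches fact; return None on mismatch."""
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # Bind the variable, or check it against its earlier binding.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def match_all(premises, known, bindings):
    """Yield every set of bindings that satisfies all premises."""
    if not premises:
        yield bindings
        return
    first, rest = premises[0], premises[1:]
    for fact in known:
        extended = unify(first, fact, bindings)
        if extended is not None:
            yield from match_all(rest, known, extended)


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        snapshot = list(known)  # iterate over a copy while adding to known
        for premises, conclusion in rules:
            for bindings in match_all(premises, snapshot, {}):
                derived = tuple(bindings.get(t, t) for t in conclusion)
                if derived not in known:
                    known.add(derived)
                    changed = True
    return known


if __name__ == "__main__":
    for fact in sorted(forward_chain(facts, rules)):
        print(fact)
```

Representing facts as plain tuples keeps the sketch independent of any particular knowledge-base or rule-engine library; a real system would likely rely on a dedicated inference engine instead.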