Extract Reliable Relations from Wikipedia Texts for Practical Ontology Construction

Authors

  • Jin-Xia Huang Electronics and Telecommunications Research Institute - Chonbuk National University
  • Kyung Soon Lee Chonbuk National University
  • Key-Sun Choi School of Computing
  • Young-Kil Kim Electronics and Telecommunications Research Institute

DOI:

https://doi.org/10.13053/cys-20-3-2454

Keywords:

Information classification, information extraction, feature-based, relatedness information, ontology building.

Abstract

This paper presents a feature-based relation classification approach that aims to extract relation candidates from Wikipedia texts. A probabilistic relatedness feature and a semantic relatedness feature are employed together with other linguistic information for this purpose. The experiments show that relation classification using the proposed relatedness features with surface information, such as words and part-of-speech tags, is competitive with, or even outperforms, classification using deep syntactic information. In addition, an approach is proposed to distinguish reliable relation candidates from the others, so that the reliable results can be accepted for knowledge building without human verification. The experiments show that, with the relation classification approach presented in this paper, more than 40% of the classification results are reliable, which means that at least 40% of the human and time costs can be saved in practice.

Author Biographies

Jin-Xia Huang, Electronics and Telecommunications Research Institute - Chonbuk National University

Received her M.S. degree in Computer Science from KAIST in 2000. She is now a senior researcher in the Natural Language Processing Research Section, ETRI.

Kyung Soon Lee, Chonbuk National University

Received her Master's and PhD degrees in Computer Science from KAIST in 1997 and 2001, respectively. She is now a full professor in the Division of Computer Science and Engineering, Chonbuk National University.

Key-Sun Choi, School of Computing

Received his Master's and PhD degrees in Computer Science from KAIST in 1980 and 1986, respectively. He joined KAIST in 1988 and is now a full professor in its Computer Science Department.

Young-Kil Kim, Electronics and Telecommunications Research Institute

Received his Master's and PhD degrees in Computer Science from Hanyang University in 1995 and 1997, respectively. He is now a Principal Researcher and Director of the Natural Language Processing Research Section, ETRI.

Published

2016-09-30