Extract Reliable Relations from Wikipedia Texts for Practical Ontology Construction
Abstract
This paper presents a feature-based relation classification approach aimed at extracting relation candidates from Wikipedia texts. A probabilistic and a semantic relatedness feature are employed together with other linguistic information for this purpose. The experiments show that relation classification using the proposed relatedness features with surface information, such as words and part-of-speech tags, is competitive with, or even outperforms, classification using deep syntactic information. In addition, an approach is proposed to distinguish reliable relation candidates from the others, so that the reliable results can be accepted for knowledge building without human verification. The experiments show that, with the relation classification approach presented in this paper, more than 40% of the classification results are reliable, which means that at least 40% of the human and time costs can be saved in practice.