A Supervised Approach for Reconstructing Thread Structure in Comments on Blogs and Online News Agencies

Authors

  • Ali Balali, School of ECE, College of Engineering, University of Tehran
  • Hesham Faili, School of ECE, College of Engineering, University of Tehran
  • Masoud Asadpour, School of ECE, College of Engineering, University of Tehran
  • Mostafa Dehghani, School of ECE, College of Engineering, University of Tehran

DOI:

https://doi.org/10.13053/cys-17-2-1525

Keywords:

Reconstructing thread structure, reply structure, information extraction, blogs and online news agencies, machine learning, information management.

Abstract

Online environments such as forums, chats, and blogs contain a great deal of knowledge. When a page carries a large volume of comments on different subjects, the actual conversation streams become hard to follow, because the reply structure of the comments is generally not publicly accessible. Automatically reconstructing the thread structure of comments addresses this problem. This work focuses on reconstructing thread structures in the comment sections of blogs and online news agencies. First, we define a set of textual and non-textual features. Then, we use a learning algorithm to combine the extracted features. The proposed method is evaluated on three datasets, two in Persian and one in English, and its accuracy is compared with that of three baseline algorithms. The results show that the proposed method achieves higher accuracy than the baseline methods on all three datasets.
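The two steps named in the abstract, feature extraction followed by a learned combination, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three features (word overlap, author mention, time closeness), the perceptron-style ranking update used as the learning algorithm, the `Comment` fields, and the toy data. Entry 0 stands in for the article itself, so a comment may reply either to the article or to an earlier comment.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    cid: int
    author: str
    text: str
    minute: int  # posting time, in minutes after the article (hypothetical field)


def features(child, parent):
    """Illustrative feature vector for the candidate pair 'child replies to parent'."""
    a = set(child.text.lower().split())
    b = set(parent.text.lower().split())
    overlap = len(a & b) / len(a | b) if a | b else 0.0                    # textual
    mention = 1.0 if parent.author.lower() in child.text.lower() else 0.0  # textual
    closeness = 1.0 / (1.0 + child.minute - parent.minute)                 # non-textual
    return [overlap, mention, closeness]


def score(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))


def predict_parent(w, comments, i):
    """Choose the highest-scoring earlier entry as the parent of comments[i]."""
    child = comments[i]
    return max(comments[:i], key=lambda p: score(w, features(child, p))).cid


def train(comments, parent_of, epochs=10):
    """Perceptron-style ranking: on a mistake, move the weights toward the true
    parent's features and away from the wrongly predicted parent's features."""
    w = [0.0, 0.0, 0.0]
    by_id = {c.cid: c for c in comments}
    for _ in range(epochs):
        for i in range(1, len(comments)):
            child = comments[i]
            gold = parent_of[child.cid]
            guess = predict_parent(w, comments, i)
            if guess != gold:
                fg = features(child, by_id[gold])
                fp = features(child, by_id[guess])
                w = [wi + g - p for wi, g, p in zip(w, fg, fp)]
    return w


# Toy, made-up thread: entry 0 is the article, the rest are comments in time order.
comments = [
    Comment(0, "editor", "city council approves new bike lanes downtown", 0),
    Comment(1, "sara", "bike lanes downtown are long overdue", 5),
    Comment(2, "omid", "sara the lanes will make traffic worse", 9),
    Comment(3, "lina", "the council vote was too rushed", 12),
]
parent_of = {1: 0, 2: 1, 3: 0}  # known reply structure used as training labels

w = train(comments, parent_of)
predicted = {comments[i].cid: predict_parent(w, comments, i)
             for i in range(1, len(comments))}
print(predicted)
```

The sketch scores every earlier entry as a candidate parent and keeps the best one, which guarantees the output is a tree rooted at the article. A realistic evaluation would train and test on separate threads and report accuracy as the fraction of comments whose parent is predicted correctly.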

Published

2013-06-29