Multi-document Text Summarization through Features Relevance Calculation

Verónica Neri-Mendoza, Yulia Ledeneva, René Arnulfo García-Hernández, Ángel Hernández-Castañeda

Abstract


Multi-document text summarization is the task of extracting relevant information from a set of documents that describe the same topic. Determining which sentences should be presented as the summary is difficult, so it is necessary to use features that help distinguish informative sentences from uninformative ones. However, separating significant features from insignificant ones is itself a challenging task. In this study, we introduce a method to assess the impact of 19 linguistic and statistical features derived from human-written reference summaries. We tested them on the DUC01 dataset at two summary lengths (50 and 100 words). The results demonstrate that the proposed method outperforms state-of-the-art approaches and heuristics as measured by the ROUGE-1 metric.

Keywords


Text features, Summarization, Multi-Document, Contribution
