An Ensemble of Automatic Keyword Extractors: TextRank, RAKE and TAKE

Tayfun Pay, Stephen Lucci, James L. Cox


We construct an ensemble method for automatic keyword extraction from single documents. The ensemble is built from three unsupervised automatic keyword extractors: TextRank, RAKE and TAKE. These three approaches supply candidate keywords to the ensemble method without applying their respective threshold functions. The ensemble method combines these candidate keywords and recomputes their scores after applying pruning heuristics. It then extracts keywords by employing dynamic threshold functions. We analyze the performance of our ensemble method using all parts of the Inspec data set. Our ensemble method achieved better overall performance than the automatic keyword extractors used in its development, as well as some recent automatic keyword extraction methods.
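
To make the pipeline above concrete, the following is a minimal Python sketch of its general shape: combine candidates, prune, rescore, then apply a threshold. It is not the method evaluated in this paper. The abstract does not specify the pruning heuristics, the rescoring rule, or the dynamic threshold functions, so the sketch substitutes simple stand-ins that are assumptions: per-extractor max normalization, a hypothetical `min_votes` agreement rule for pruning, summed normalized scores for rescoring, and the mean combined score as the threshold. The function names and all input scores are likewise illustrative.

```python
from statistics import mean


def normalize(scores):
    """Scale one extractor's candidate scores into [0, 1] by its maximum."""
    top = max(scores.values(), default=0.0)
    if top <= 0:
        return {phrase: 0.0 for phrase in scores}
    return {phrase: value / top for phrase, value in scores.items()}


def ensemble_keywords(extractor_outputs, min_votes=2):
    """Combine candidate keywords from several extractors.

    extractor_outputs: list of dicts mapping candidate phrase -> raw score,
    one dict per extractor, taken before any extractor-specific threshold.
    min_votes: ASSUMED pruning rule; a candidate must be proposed by at
    least this many extractors to survive.
    """
    combined = {}  # candidate -> sum of normalized scores
    votes = {}     # candidate -> number of extractors proposing it
    for scores in extractor_outputs:
        for phrase, score in normalize(scores).items():
            combined[phrase] = combined.get(phrase, 0.0) + score
            votes[phrase] = votes.get(phrase, 0) + 1

    # ASSUMED pruning heuristic: drop candidates with too little agreement.
    kept = {p: s for p, s in combined.items() if votes[p] >= min_votes}
    if not kept:
        return []

    # ASSUMED dynamic threshold: the cutoff depends on this document's
    # own score distribution rather than on a fixed number of keywords.
    threshold = mean(kept.values())
    selected = [p for p, s in kept.items() if s >= threshold]

    # Highest combined score first; ties broken alphabetically.
    return sorted(selected, key=lambda p: (-kept[p], p))
```

Each input dictionary would hold one extractor's candidates and raw scores. Because the cutoff is derived from the surviving candidates' scores, the number of keywords returned varies from document to document, which is the property the word "dynamic" is meant to capture here.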


Data mining, text mining, text analysis, ensemble methods

