TOM: Twitter Opinion Mining

Fernando Manuel Rodríguez, Sara E. Garza

Abstract


We present an opinion mining approach for sentiment classification of microblogs in Spanish. Because we use the Twitter microblog as a case study, the approach is named Twitter Opinion Mining, or TOM. To classify a comment as positive, negative, or neutral, TOM uses a term-counting strategy that sums the individual polarities of the words and phrases contained in the comment. These polarities are obtained from an opinion lexicon that consists of weighted terms and valence shifters. Our lexicon includes not only generic terms translated from an English repository but also more specific vocabulary from Twitter; this vocabulary is extracted by detecting adjectives and nouns from tweets with emoticons and trigrams that follow the “is-a” pattern. To assess TOM’s quality, we measured precision, recall, and F1 on a set of manually classified tweets. Our results show high averages for each of these metrics, which were also used to compare TOM against Sentitext, a tool for opinion mining in Spanish. This comparison shows that our approach outperforms this state-of-the-art method.
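
The term-counting strategy described above can be illustrated with a minimal sketch. This is not TOM's implementation: the lexicon entries, their weights, the shifter multipliers, and the rule that a valence shifter scales the next polar term are all assumptions made for illustration, and the names LEXICON, SHIFTERS, and classify are hypothetical.

```python
# Hypothetical toy lexicon; the weights are illustrative, not taken from TOM.
LEXICON = {"bueno": 1.0, "excelente": 2.0, "malo": -1.0, "terrible": -2.0}

# Hypothetical valence shifters: a multiplier applied to the next polar term.
SHIFTERS = {"no": -1.0, "muy": 2.0, "poco": 0.5}


def classify(comment: str) -> str:
    """Sum the polarities of known terms; a shifter scales the next polar term."""
    score = 0.0
    factor = 1.0  # pending multiplier accumulated from preceding shifters
    for token in comment.lower().split():
        if token in SHIFTERS:
            factor *= SHIFTERS[token]
        elif token in LEXICON:
            score += factor * LEXICON[token]
            factor = 1.0  # the shifter is consumed by this term
        # Assumption: other tokens are skipped and keep the pending shifter.
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Under these assumptions, classify("no es bueno") yields "negative" because the negation flips the polarity of "bueno", while a comment with no lexicon terms is classified as "neutral". A real system would also need tokenization that handles punctuation, hashtags, and emoticons, which this sketch omits.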

Keywords


Opinion mining, sentiment analysis, lexicon, Twitter, Spanish
