Tunisian Dialect Sentiment Analysis: A Natural Language Processing-based Approach

Authors

  • Hala Mulki, Selcuk University, Department of Computer Engineering
  • Hatem Haddad, Université Libre de Bruxelles, Department of Computer & Decision Engineering (CoDE)
  • Chedi Bechikh Ali, Université Libre de Bruxelles, Department of Computer & Decision Engineering (CoDE)
  • Ismail Babaoğlu, Selcuk University

DOI:

https://doi.org/10.13053/cys-22-4-3009

Keywords:

Tunisian sentiment analysis, text preprocessing, named entities.

Abstract

Social media platforms have been witnessing a significant increase in posts written in the Tunisian dialect since the uprising in Tunisia at the end of 2010. Most of the posted tweets or comments reflect the impressions of the Tunisian public towards major social, economic and political events. These opinions have been tracked, analyzed and evaluated through sentiment analysis systems. In the current study, we investigate the impact of several preprocessing techniques on sentiment analysis using two sentiment classification models: supervised and lexicon-based. These models were trained on three Tunisian datasets of different sizes and multiple domains. Our results emphasize the positive impact of the preprocessing phase on the evaluation measures of both sentiment classifiers, as the baseline was significantly outperformed when stemming, emoji recognition and negation detection tasks were applied. Moreover, integrating named entities with these tasks enhanced the lexicon-based classification performance in all datasets and that of the supervised model in medium and small sized datasets.
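
As a rough illustration of how emoji recognition and negation detection can feed a lexicon-based classifier, the following minimal sketch may help. It is not the authors' implementation: the names `LEXICON`, `EMOJI_TAGS`, `NEGATORS`, `preprocess` and `classify` are hypothetical, the toy entries are English placeholders rather than Tunisian dialect resources, and stemming and named-entity handling are omitted.

```python
import re

# Hypothetical toy resources for illustration only; these are NOT the
# lexicon, emoji list, or negation words used in the paper.
LEXICON = {"good": 1, "great": 1, "bad": -1, "terrible": -1}
EMOJI_TAGS = {"😀": "good", "😡": "bad"}
NEGATORS = {"not", "no"}


def preprocess(text):
    """Replace known emoji with sentiment words, lowercase, and tokenize."""
    for emoji, word in EMOJI_TAGS.items():
        text = text.replace(emoji, f" {word} ")
    return re.findall(r"\w+", text.lower())


def classify(text):
    """Sum lexicon polarities, flipping the token that follows a negator."""
    score = 0
    negate = False
    for token in preprocess(text):
        if token in NEGATORS:
            negate = True
            continue
        polarity = LEXICON.get(token, 0)
        score += -polarity if negate else polarity
        negate = False  # negation scope is a single token in this sketch
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Under these assumptions, `classify("not good")` is labeled negative because the negator flips the polarity of the following word, and an emoji contributes through the sentiment word it is mapped to. A real system would need dialect-specific resources and a wider negation scope.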

Author Biographies

Hala Mulki, Selcuk University, Department of Computer Engineering

Ismail Babaoğlu, Selcuk University, Department of Computer Engineering

Published

2018-12-30