Tunisian Dialect Sentiment Analysis: A Natural Language Processing-based Approach

Hala Mulki; Hatem Haddad; Chedi Bechikh Ali; Ismail Babaoğlu

doi:10.13053/cys-22-4-3009

Authors

Hala Mulki Selcuk University, Department of Computer Engineering
Hatem Haddad Université Libre de Bruxelles, Department of Computer & Decision Engineering (CoDE)
Chedi Bechikh Ali Université Libre de Bruxelles, Department of Computer & Decision Engineering (CoDE)
Ismail Babaoğlu Selcuk University, Department of Computer Engineering

DOI:

https://doi.org/10.13053/cys-22-4-3009

Keywords:

Tunisian sentiment analysis, text preprocessing, named entities

Abstract

Social media platforms have witnessed a significant increase in posts written in the Tunisian dialect since the uprising in Tunisia at the end of 2010. Most of the posted tweets and comments reflect the impressions of the Tunisian public towards major social, economic, and political events. These opinions have been tracked, analyzed, and evaluated through sentiment analysis systems. In the current study, we investigate the impact of several preprocessing techniques on sentiment analysis using two sentiment classification models: supervised and lexicon-based. These models were trained on three Tunisian datasets of different sizes and multiple domains. Our results emphasize the positive impact of the preprocessing phase on the evaluation measures of both sentiment classifiers, as the baseline was significantly outperformed when stemming, emoji recognition, and negation detection tasks were applied. Moreover, integrating named entities with these tasks enhanced the lexicon-based classification performance on all datasets and that of the supervised model on the medium- and small-sized datasets.
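
To make the preprocessing tasks named in the abstract more concrete, the following is a minimal sketch of how emoji recognition, stemming, and negation detection might feed a lexicon-based classifier. It is not the authors' implementation: the lexicon, emoji tags, negators, suffix list, and all function names are hypothetical, and English tokens stand in for Tunisian dialect words. Named entity integration is not covered.

```python
import re

# Toy resources, invented for illustration only (not taken from the paper).
LEXICON = {"good": 1, "love": 1, "bad": -1, "hate": -1}
EMOJI_TAGS = {"🙂": "emo_pos", "😡": "emo_neg"}
EMOJI_SCORES = {"emo_pos": 1, "emo_neg": -1}
NEGATORS = {"not", "no"}
SUFFIXES = ("ing", "ed", "s")  # crude stand-in for a real stemmer


def stem(token):
    """Strip one known suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, replace emojis with tags, tokenize, and stem."""
    text = text.lower()
    for emoji, tag in EMOJI_TAGS.items():
        text = text.replace(emoji, f" {tag} ")
    tokens = re.findall(r"\w+", text)
    # Emoji tags are kept as-is; every other token is stemmed.
    return [t if t in EMOJI_SCORES else stem(t) for t in tokens]


def lexicon_score(tokens):
    """Sum token polarities, flipping the sign of the token right after a negator."""
    score, negate = 0, False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        polarity = LEXICON.get(token, EMOJI_SCORES.get(token, 0))
        score += -polarity if negate else polarity
        negate = False  # negation only affects the next token
    return score


def classify(text):
    """Map the summed polarity to a sentiment label."""
    score = lexicon_score(preprocess(text))
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"
```

In this sketch, each preprocessing step can be switched off independently (for example, by returning tokens unstemmed), which is one simple way to compare a baseline against individual preprocessing tasks as the study describes.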

Author biographies

Hala Mulki, Selcuk University, Department of Computer Engineering

Ismail Babaoğlu, Selcuk University, Department of Computer Engineering

Downloads

PDF (English)

Published

2018-12-30

Issue

Vol. 22 No. 4 (2018): Topic Trends in Computing Research (Guest Editors: A. Aguilar-Meléndez, E. Moya-Sánchez)

Section

Articles of the Thematic Section

License

I transfer exclusively to the journal “Computación y Sistemas”, published by the Centro de Investigación en Computación (CIC), the copyright of the aforementioned article. I likewise accept that it will not be transferred to any other publication, in any format, language, or medium, whether existing (including electronic and multimedia) or yet to be developed.

I certify that the article has not been previously disclosed or simultaneously submitted to another publication, and that it does not contain material whose publication would violate the copyright or other proprietary rights of any person, company, or institution. I further certify that I have authorization from the institution or company where I work or study to publish this Work.

The author or representative accepts responsibility for the publication of the Work on behalf of each and every one of the authors.

This Transfer is subject to the following reservations:

The authors retain all proprietary rights (such as patent rights) to this Work, except for the publication rights transferred to the CIC by means of this document.
The authors retain the right to publish the Work in whole or in part in any book of which they are authors or editors, and to make personal use of this work in lectures, courses, personal web pages, etc.
