Multi-class Sentiment Analysis of COVID-19 Tweets by Machine Learning and Deep Learning Approaches
Abstract
COVID-19 is a virus that has spread rapidly over the globe. The condition has repercussions beyond the realm of public health. Twitter is one platform where people post reactions to events during the outbreak. User-generated information, like tweets, presents unique challenges for sentiment analysis on Twitter data. With that in mind, this work employs four methods for analyzing Twitter data in terms of sentiment: the vector space model (TF-IDF) with three different ensemble machine learning models (voting, bagging, and stacking) and BERT (Bidirectional Encoder Representations from Transformers). Experiments showed that BERT outperformed the other three techniques, with an F1-score of 74%, a precision of 74%, and a recall of 74% for categorizing five sentiment classes on data from a Kaggle competition (Coronavirus tweets NLP-Text Classification).
Keywords
Ensemble machine learning; Deep learning; Voting; Bagging; Stacking; BERT