Comparative Analysis of Machine Learning and Deep Learning Models for Harassment and Discrimination Detection in Text

Ana Laura Lezama Sánchez, Mireya Tovar Vidal

Abstract


Harassment and discrimination affect both
workplace environments and online platforms. To
address this issue, we focus on automatically detecting
such behaviors in textual data to help create safer
digital spaces. In this article, we compare traditional
machine learning and deep learning models for detecting
harassment and discrimination. We evaluate four
approaches: TF-IDF with logistic regression, BERT-based
classification, a CNN with GloVe embeddings, and a
GRU model enhanced with attention mechanisms and
capsule networks. For all experiments, we rely on the
Everyday Sexism Project dataset, which groups the texts
into five categories: Workplace Harassment, Harassment,
Discrimination, Sexism, and Other. We measure model
performance using accuracy, precision, recall, and F1 score.
The results show that deep learning models
outperform traditional methods in identifying complex
linguistic patterns in abusive content.
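
As an illustration of the evaluation metrics named above, the following minimal sketch computes accuracy, precision, recall, and F1 for a five-class labeling task. It is not the authors' evaluation code: the function name `evaluate`, the toy labels in the usage note, and the choice of macro averaging are assumptions made for this example, since the abstract does not state an averaging scheme.

```python
from collections import Counter

# The five categories named in the abstract.
LABELS = ["Workplace Harassment", "Harassment", "Discrimination", "Sexism", "Other"]


def evaluate(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption of this sketch; a class with no
    predictions (or no true examples) contributes 0.0 to the average.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]  # times the class was predicted
        actual = tp[label] + fn[label]     # times the class was the true label
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

For example, `evaluate(["Sexism", "Other"], ["Sexism", "Sexism"])` scores two toy predictions against all five categories. Because macro averaging weights every class equally, rare categories influence the score as much as frequent ones, which may or may not match the scheme used in the article.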

Keywords


Machine learning, deep learning, harassment, discrimination
