Enhancing the Detection of Sexist Messages Through a Multi-Profile-Based Ensemble Approach
Abstract
Sexism in language perpetuates harmful stereotypes, especially in cultures with deeply ingrained traditional gender roles, such as Mexico. While detection of misogynistic content in English has advanced, detecting sexist language in Spanish is less explored. This study uses the EXIST corpus, annotated by various demographic groups, to examine differing perceptions of sexism across genders and ages. Our analysis finds significant perception discrepancies, with 25% of texts showing disagreements between male and female annotators. We propose an ensemble classification model that integrates outputs from gender-specific and age-specific models based on ROBERTuito, achieving an F1 score of 0.854. To gain insights into our best classifier’s decision-making, we present an error analysis based on the visualization of attention weights, which helps us identify the most relevant words in the detection of subtle sexism. Additionally, we leverage ChatGPT’s capabilities to model language nuances, generating potential interpretations of texts associated with the classifications provided by our approach. This study underscores the importance of demographic considerations in sexist language detection and demonstrates that combining diverse perspectives with advanced techniques can enhance detection in Spanish social media.
Keywords
Sexism, hierarchical attention networks, transformers, social media, ensemble classification, sexism detection