Author Profiling on Social Media using New Weighting Schemes that Emphasize Personal Information
Abstract
This paper summarizes the thesis "Identificación del perfil de autores en redes sociales usando nuevos esquemas de pesado que enfatizan información de tipo personal", whose main idea is that terms located in phrases exposing personal information are highly valuable for the author profiling (AP) task. First, a study of the relevance of this information to the task is presented. Second, a novel approach is proposed that emphasizes the value of this type of term through two contributions: a feature selection method and a term weighting scheme. Both are based on a novel measure called personal expression intensity, which estimates the quantity of personal information revealed by each term. The approach was evaluated on age and gender prediction across different social media. The results are encouraging, with average improvements of about 7.34% and 5.76% for age and gender identification, respectively, over the best state-of-the-art results. These results support the conclusion that personal information plays an important role in the task.
Keywords
Author profiling, personal information, weighting schemes, PEI, DPP, EXPEI