Mining a Trending Topic: U.S. Immigration on the Context of Social Media

Authors

  • Esteban Castillo Instituto Tecnológico y de Estudios Superiores de Monterrey
  • Ofelia Cervantes Universidad de las Américas Puebla

DOI:

https://doi.org/10.13053/cys-28-2-4574

Keywords:

Text mining, Statistics, Graph mining, Social network analysis, Natural language processing, Big data

Abstract

This paper presents a text mining approach for extracting valuable patterns from social media documents in the context of U.S. immigration. The paper focuses on uncovering statistical features alongside linguistic elements using graph techniques. Graphs provide rich data structures for representing lexical and syntactic aspects of texts, allowing the discovery of complex patterns that, in the hands of experts, could provide valuable insight. The proposed method is applied to a Twitter-Reddit dataset that comprises English and Spanish samples from 2016 to 2019. Experimental results show that our interpretation of classic statistical techniques provides a baseline understanding of the topic, while a more robust graph-based analysis makes it possible to uncover and predict hidden patterns across a large number of samples. In particular, a co-occurrence graph helped to obtain relevant words, phrases, and sentences, while a user-interaction graph made it possible to detect important users, communities, and the interactions among them.
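
To make the idea of a co-occurrence graph concrete, the following is a minimal sketch and not the authors' implementation. It assumes whitespace tokenization, a sliding window of two following tokens, and ranking by weighted degree. The function names `build_cooccurrence_graph` and `weighted_degree` are hypothetical, and the two sample posts are invented for illustration; they are not drawn from the Twitter-Reddit dataset.

```python
from collections import Counter


def build_cooccurrence_graph(documents, window=2):
    """Map each unordered word pair to how often the two words co-occur.

    Two words co-occur when the second appears within `window` tokens
    after the first in the same document (an assumed definition).
    """
    edges = Counter()
    for text in documents:
        tokens = text.lower().split()  # naive whitespace tokenization
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:  # skip self-loops
                    edges[tuple(sorted((word, other)))] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights touching each word; higher means more central."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Invented sample posts, for illustration only.
sample_posts = [
    "immigration policy debate",
    "immigration reform debate",
]
graph = build_cooccurrence_graph(sample_posts)
ranking = weighted_degree(graph).most_common()
```

Words with the highest weighted degree in `ranking` are candidates for the "relevant words" the abstract mentions. A fuller analysis would likely use proper tokenization, stop-word removal, and a graph library rather than plain counters.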

Author Biographies

Esteban Castillo, Instituto Tecnológico y de Estudios Superiores de Monterrey

Full-time professor

Ofelia Cervantes, Universidad de las Américas Puebla

Full-time professor

Published

2024-06-12

Section

Articles