Gender Recognition of Teen and Adult Voices in Non-Tonal and Tonal Languages in Uncontrolled Environments

Authors

  • Enrique Díaz-Ocampo, Colegio Universitario Científico de Datos
  • Areli Karina Martínez-Tapia, Colegio Universitario Científico de Datos
  • Andrea Magadán-Salazar, Tecnológico Nacional de México
  • Raúl Pinto-Elías, Tecnológico Nacional de México
  • Máximo López-Sánchez, Tecnológico Nacional de México
  • Yael Bensoussan, University of South Florida

DOI:

https://doi.org/10.13053/cys-29-1-4495

Keywords:

Voice gender recognition, fundamental frequency, vocal tract length, tonal language, Spanish language

Abstract

Voice gender recognition systems automate the detection of a speaker's gender from the acoustic voice signal. These systems can be trained on recordings from uncontrolled environments, whose audio contains varied types of noise and speaker characteristics. However, current systems are biased toward their training language, which is most often English. The present work focuses on gender recognition of adult and teen voices in a group of tonal languages and in Spanish under uncontrolled environments. Nine features were used: seven derived from pitch, plus the mean of the fourth formant and the vocal tract length. Two scenarios were built: a training-test scenario on one dataset, and a second validation scenario using the other dataset. The metrics used were accuracy, recall, F1-score, and area under the ROC curve. The algorithms used were Multilayer Perceptron and Random Forest. Despite the bias in the datasets, the biological features and the algorithms were robust to language change.
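As an illustration of how a vocal tract length feature can be tied to a formant, the sketch below uses the textbook uniform-tube (quarter-wavelength resonator) approximation, in which the n-th formant satisfies F_n = (2n - 1)c / (4L). This is an assumption made for illustration only: the abstract does not state which estimator the authors used, and the function name, the default speed of sound, and the example frequency are all hypothetical.

```python
# Minimal sketch: vocal tract length (VTL) from a single formant, assuming
# the vocal tract behaves like a uniform tube closed at the glottis and open
# at the lips. Not necessarily the estimator used in the paper.

SPEED_OF_SOUND_CM_S = 35000.0  # assumed speed of sound in warm, humid air


def vocal_tract_length_cm(formant_hz: float,
                          formant_index: int = 4,
                          speed_cm_s: float = SPEED_OF_SOUND_CM_S) -> float:
    """Estimate VTL in cm from the n-th formant: L = (2n - 1) * c / (4 * F_n)."""
    if formant_hz <= 0:
        raise ValueError("formant_hz must be positive")
    if formant_index < 1:
        raise ValueError("formant_index must be >= 1")
    return (2 * formant_index - 1) * speed_cm_s / (4.0 * formant_hz)


if __name__ == "__main__":
    # Hypothetical mean fourth formant of 3500 Hz.
    print(f"{vocal_tract_length_cm(3500.0):.1f} cm")
```

Under this model a higher mean fourth formant implies a shorter vocal tract, which is the kind of anatomical relationship that makes such a feature plausible for use across languages.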

Author Biographies

Enrique Díaz-Ocampo, Colegio Universitario Científico de Datos

Full-time Professor

Areli Karina Martínez-Tapia, Colegio Universitario Científico de Datos

Full-time Professor

Andrea Magadán-Salazar, Tecnológico Nacional de México

Coordinator of the Artificial Intelligence research line

Raúl Pinto-Elías, Tecnológico Nacional de México

Full-time Professor

Máximo López-Sánchez, Tecnológico Nacional de México

Full-time Professor

Yael Bensoussan, University of South Florida

Assistant Professor; Chief, Division of Laryngology

Published

2025-03-25

Section

Articles