WavLM-Based Automatic Pronunciation Assessment for Yuhmu Speech: A Low-Resource Language

Authors

  • Eric Ramos-Aguilar, Benemérita Universidad Autónoma de Puebla
  • Arturo Olvera-López, Benemérita Universidad Autónoma de Puebla
  • Ivan Olmos-Pineda, Benemérita Universidad Autónoma de Puebla

DOI:

https://doi.org/10.13053/cys-29-3-5913

Keywords:

Low-resource languages, Yuhmu language, supervised learning, speech analysis

Abstract

This paper presents an approach to classifying correct and incorrect pronunciation in Yuhmu, an endangered Indigenous minority language, using acoustic embeddings combined with SVM and MLP models. Unlike typical low-resource language tasks focused on automatic speech recognition (ASR) or machine translation, this work employs deep acoustic representations to detect phonetic quality, achieving high accuracy and consistency across different embedding sizes. The results highlight the potential of leveraging labeled audio data and advanced speech models such as WavLM to provide phonetic feedback and support language revitalization. This research establishes a foundation for deeper computational phonetic analysis in Yuhmu and opens avenues for future exploration of direct audio-to-audio translation, automatic phonetic segmentation, and detailed phoneme-level evaluation, contributing to the documentation and preservation of underrepresented languages.
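The pipeline the abstract describes (per-utterance acoustic embeddings fed to an SVM for a correct/incorrect pronunciation decision) can be sketched as follows. This is a minimal illustration, not the authors' implementation: random vectors stand in for real WavLM frame embeddings, the mean-pooling step and all hyperparameters are assumptions, and only the SVM branch (not the MLP) is shown.

```python
# Hedged sketch: mean-pool frame-level embeddings into one vector per
# utterance, then train a binary SVM (correct vs. incorrect pronunciation).
# Random Gaussians stand in for WavLM outputs; the paper's actual data,
# model variant, and hyperparameters are not specified in the abstract.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
EMB_DIM = 768  # hidden size of WavLM Base; Large variants use 1024

def utterance_embedding(frames: np.ndarray) -> np.ndarray:
    """Collapse (n_frames, EMB_DIM) frame embeddings into one utterance vector."""
    return frames.mean(axis=0)

def fake_utterance(label: int) -> np.ndarray:
    """Synthetic stand-in for WavLM frame embeddings of one utterance."""
    n_frames = int(rng.integers(50, 150))
    return rng.normal(loc=0.3 * label, scale=1.0, size=(n_frames, EMB_DIM))

# Toy corpus: 100 "incorrect" (0) and 100 "correct" (1) utterances.
labels = [0, 1] * 100
X = np.stack([utterance_embedding(fake_utterance(y)) for y in labels])
y = np.array(labels)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
acc = accuracy_score(y_te, clf.predict(X_te))
print(f"toy accuracy: {acc:.2f}")
```

On real data, the `fake_utterance` step would be replaced by running a pretrained WavLM model over each recording; the mean-pooled vectors then train and evaluate the classifier exactly as above.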

Published

2025-09-28

Section

Articles of the Thematic Section