Etiquetado fonético automático al nivel palabra usando la dinámica de cambio de los vectores del libro código

Sergio Suárez Guerra; José Luis Oropeza Rodríguez

doi:10.13053/cys-24-2-3229

Automatic Phonetic Labeling at Word Level Using the Dynamics of Changing Codebook Vectors

Authors

Sergio Suárez Guerra, Instituto Politécnico Nacional, Centro de Investigación en Computación
José Luis Oropeza Rodríguez, Instituto Politécnico Nacional, Centro de Investigación en Computación

DOI:

https://doi.org/10.13053/cys-24-2-3229

Keywords:

Phonetic labeling, voice recognition

Abstract

This paper describes an alternative approach to the phonetic labeling of a set of words pronounced by a speaker. The approach can be applied to any language, subject to the needs and characteristics of the proposal. The procedure tracks the dynamics of change of the Mel-frequency cepstral coefficient (MFCC) vectors that make up the codebook (LC, from the Spanish "libro código") extracted from the word to be labeled. This analysis identifies where a transition from one MFCC vector of the LC to another occurs, as well as the disturbances that phonetic concatenation causes in the transition zone. Metrics are established to account for coarticulation noise and to locate the phonetic separation boundary. Two methods are used to evaluate the dynamics of vector change and to deliver the most accurate labeling. The rate of recognition and correct labeling obtained with this application is 97.9%, which is 1.06% lower than the recognition rate obtained on the same word corpus with manual labeling. Most importantly, labeling the speech corpus automatically takes significantly less time than the estimated time for manual labeling, and it eliminates personal subjectivity from the labeling work.
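The abstract does not specify the metrics or the two evaluation methods, so the following is only a minimal sketch of the general idea, under stated assumptions: each frame is mapped to its nearest codebook vector by Euclidean distance, short runs of an index are treated as coarticulation noise, and a boundary is placed at the midpoint of the noisy zone between two stable runs. The function names (`quantize`, `runs`, `find_boundaries`) and the `min_run` threshold are hypothetical and are not taken from the paper.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def quantize(frames, codebook):
    """Map each feature vector to the index of its nearest codebook vector."""
    return [min(range(len(codebook)), key=lambda k: dist(f, codebook[k]))
            for f in frames]


def runs(indices):
    """Collapse an index sequence into [index, start, length] runs."""
    out = []
    for pos, idx in enumerate(indices):
        if out and out[-1][0] == idx:
            out[-1][2] += 1
        else:
            out.append([idx, pos, 1])
    return out


def find_boundaries(indices, min_run=3):
    """Return frame positions where the stable codebook index changes.

    Assumption: runs shorter than min_run are transition noise, and the
    boundary sits at the midpoint of the noisy zone between stable runs.
    """
    stable = [r for r in runs(indices) if r[2] >= min_run]
    boundaries = []
    for prev, nxt in zip(stable, stable[1:]):
        if prev[0] == nxt[0]:
            continue  # same codeword on both sides: noise inside one segment
        end_prev = prev[1] + prev[2]  # first frame after the previous stable run
        boundaries.append((end_prev + nxt[1]) // 2)
    return boundaries


# Toy data: two 2-D "codewords" and ten frames with a noisy transition zone.
codebook = [(0.0, 0.0), (1.0, 1.0)]
frames = [(0.1, 0.0)] * 4 + [(0.9, 1.0), (0.0, 0.1)] + [(1.0, 0.9)] * 4
indices = quantize(frames, codebook)
boundaries = find_boundaries(indices)
print(indices, boundaries)
```

In this toy run the index sequence flickers between the two codewords in frames 4 and 5 before settling, and the sketch places a single boundary inside that zone. Real MFCC frames would have more dimensions and a larger codebook, and the paper's own metrics may place the boundary differently.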

Author Biography

Sergio Suárez Guerra, Instituto Politécnico Nacional, Centro de Investigación en Computación

Research professor at CIC-IPN since 1998. Doctor of Technical Sciences in Informatics (Dr. en Ciencias Técnicas de la Informática), graduated in Russia in 1979. Head of the Digital Signal Processing Laboratory at CIC-IPN.

Published

2020-06-23

Issue

Vol. 24 No. 2 (2020): Thematic Issue on Language & Knowledge Engineering (Guest editors: D. Pinto, B. Beltrán, A. Vázquez)

Section

Articles

License

Hereby I transfer exclusively to the Journal "Computación y Sistemas", published by the Computing Research Center (CIC-IPN), the Copyright of the aforementioned paper. I also accept that these rights will not be transferred to any other publication, in any other format, language or other existing means of developing.

I certify that the paper has not been previously disclosed or simultaneously submitted to any other publication, and that it does not contain material whose publication would violate the Copyright or other proprietary rights of any person, company or institution. I certify that I have the permission from the institution or company where I work or study to publish this work.

The representative author accepts the responsibility for the publication of this paper on behalf of each and every one of the authors.

This transfer is subject to the following conditions:

The authors retain all ownership rights (such as patent rights) of this work, except for the publishing rights transferred to the CIC through this document.
Authors retain the right to publish the work in whole or in part in any book of which they are the authors or publishers. They may also use this work in conferences, courses, personal web pages, and so on.
Authors may include the work as part of their thesis, for non-profit distribution only.