Part-Of-Speech Tagging for Mizo Language Using Conditional Random Field

Morrel VL Nunsanga; Partha Pakray; C. Lallawmsanga; L. Lolit Kumar Singh

doi:10.13053/cys-25-4-4044

Authors

Morrel VL Nunsanga Mizoram University
Partha Pakray National Institute of Technology Silchar
C. Lallawmsanga Mizoram University
L. Lolit Kumar Singh Mizoram University

DOI:

https://doi.org/10.13053/cys-25-4-4044

Keywords:

Mizo POS tagging, conditional random field, Mizo part-of-speech tagger, computational linguistics

Abstract

Part-of-speech (POS) tagging assigns a class, or tag, to each token in a sentence. The tag is usually the word's part of speech, though it may be any other class of interest. Many Natural Language Processing (NLP) applications require POS tagging as a prerequisite. This study presents a POS tagger for the under-resourced Mizo language built on a stochastic model, the Conditional Random Field (CRF). A CRF is a discriminative probabilistic classifier that considers both the context of a given word and the tag transition probabilities in the training dataset. A corpus of approximately 30,000 words was collected and manually annotated with the proposed tagset for system evaluation. Across training and test sets of various sizes, the tagger achieved 89.46% accuracy, 89.3% F1-score, 89.42% precision, and 89.48% recall.
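To make the two ingredients named in the abstract concrete, the following is a minimal sketch of how a linear-chain CRF combines word-context scores with tag-transition scores when decoding a sentence. It is not the authors' implementation: the three-tag tagset, the English toy sentence, the feature templates, and every weight are hypothetical values chosen by hand for illustration, whereas a real tagger would learn the weights from an annotated corpus.

```python
from typing import Dict, List, Tuple

# Hypothetical toy tagset (NOT the tagset proposed in the paper).
TAGS: List[str] = ["DET", "NOUN", "VERB"]


def token_features(sent: List[str], i: int) -> List[str]:
    """Context features for token i: the word itself and its two neighbours."""
    feats = [f"w={sent[i].lower()}"]
    feats.append(f"prev={sent[i - 1].lower()}" if i > 0 else "BOS")
    feats.append(f"next={sent[i + 1].lower()}" if i < len(sent) - 1 else "EOS")
    return feats


# Hand-set weights standing in for learned parameters (assumptions).
STATE_W: Dict[Tuple[str, str], float] = {
    ("w=the", "DET"): 2.0,
    ("w=dog", "NOUN"): 1.5,
    ("prev=the", "NOUN"): 1.0,
    ("w=barks", "NOUN"): 1.2,  # in isolation, "barks" looks slightly noun-like
    ("w=barks", "VERB"): 1.0,
}
TRANS_W: Dict[Tuple[str, str], float] = {
    ("DET", "NOUN"): 1.0,
    ("NOUN", "VERB"): 1.0,
    ("NOUN", "NOUN"): -0.5,
    ("DET", "VERB"): -1.0,
}


def state_score(sent: List[str], i: int, tag: str) -> float:
    """Sum the weights of all context features that fire for (token i, tag)."""
    return sum(STATE_W.get((f, tag), 0.0) for f in token_features(sent, i))


def viterbi(sent: List[str]) -> List[str]:
    """Return the tag sequence with the highest total state + transition score."""
    if not sent:
        return []
    best = [{t: state_score(sent, 0, t) for t in TAGS}]
    back: List[Dict[str, str]] = []
    for i in range(1, len(sent)):
        scores: Dict[str, float] = {}
        ptrs: Dict[str, str] = {}
        for t in TAGS:
            # Best previous tag for reaching tag t at position i.
            prev = max(TAGS, key=lambda p: best[-1][p] + TRANS_W.get((p, t), 0.0))
            scores[t] = (best[-1][prev] + TRANS_W.get((prev, t), 0.0)
                         + state_score(sent, i, t))
            ptrs[t] = prev
        best.append(scores)
        back.append(ptrs)
    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: best[-1][t])
    path = [tag]
    for ptrs in reversed(back):
        tag = ptrs[tag]
        path.append(tag)
    return path[::-1]


sentence = ["The", "dog", "barks"]
print(list(zip(sentence, viterbi(sentence))))
```

With these toy weights, the context features alone score "barks" slightly higher as a noun than as a verb, but the transition weights from the preceding noun tip the decoded sequence toward a verb. This is the sense in which a CRF uses both the word's context and the tag transitions.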

Author biographies

Morrel VL Nunsanga, Mizoram University

Department of Information Technology

Partha Pakray, National Institute of Technology Silchar

Department of Computer Science and Engineering

C. Lallawmsanga, Mizoram University

Department of Information Technology

L. Lolit Kumar Singh, Mizoram University

Department of Electronics and Communication Engineering

Published

2021-11-29

Issue

Vol. 25 No. 4 (2021): 25(4) 2021: Thematic Issue on Artificial Intelligence for Industry 4.0 (Guest Editors: O. Vergara et al.)

Section

Articles

License

I transfer exclusively to the journal “Computación y Sistemas”, published by the Centro de Investigación en Computación (CIC), the copyright of the aforementioned article, and I likewise agree that it will not be transferred to any other publication, in any format, language, or medium, whether existing (including electronic and multimedia) or yet to be developed.

I certify that the article has not been previously disclosed or simultaneously submitted to another publication, and that it contains no material whose publication would violate the copyright or other proprietary rights of any person, company, or institution. I further certify that I have authorization from the institution or company where I work or study to publish this Work.

The author, as representative, accepts responsibility for the publication of the Work on behalf of each and every one of the authors.

This Transfer is subject to the following reservations:

The authors retain all proprietary rights (such as patent rights) in this Work, except for the publication rights transferred to the CIC through this document.
The authors retain the right to publish the Work, in whole or in part, in any book of which they are authors or editors, and to make personal use of this Work in conferences, courses, personal web pages, etc.
