Exploiting Bishun to Predict the Pronunciation of Chinese

Authors

  • Chenggang Mi, Xinjiang Technical Institute of Physics & Chemistry - Xinjiang Key Laboratory of Minority Speech and Language Information Processing
  • Yating Yang, Xinjiang Technical Institute of Physics & Chemistry - Xinjiang Key Laboratory of Minority Speech and Language Information Processing - Institute of Acoustics
  • Xi Zhou, Xinjiang Technical Institute of Physics & Chemistry - Xinjiang Key Laboratory of Minority Speech and Language Information Processing
  • Lei Wang, Xinjiang Technical Institute of Physics & Chemistry - Xinjiang Key Laboratory of Minority Speech and Language Information Processing
  • Xiao Li, Xinjiang Technical Institute of Physics & Chemistry - Xinjiang Key Laboratory of Minority Speech and Language Information Processing
  • Tonghai Jiang, Xinjiang Technical Institute of Physics & Chemistry - Xinjiang Key Laboratory of Minority Speech and Language Information Processing

DOI:

https://doi.org/10.13053/cys-20-3-2451

Keywords:

Pronunciation prediction, Bishun, language model, translation model, error tolerant.

Abstract

Learning to pronounce Chinese characters is usually considered one of the hardest parts of studying Chinese for foreigners. At the beginning, learners must memorize thousands of Chinese characters, including their pronunciation, meanings, Bishun (stroke order), etc., which is time-consuming and tedious. In this paper, we propose a novel method based on a translation model to predict the pronunciation of Chinese characters automatically. We first convert each Chinese character into its Bishun; then we train the pronunciation prediction model (a translation model) on the Bishun and their corresponding Pinyin sequences. To make our model practical, we also introduce some error-tolerant strategies. Experimental results show that our method can predict the pronunciation of Chinese characters effectively.
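
As an illustration only, the data-preparation step described in the abstract (character to Bishun, paired with Pinyin) might look like the sketch below. The abstract does not give the paper's actual encoding, so everything here is an assumption: the five-way stroke coding (1 horizontal, 2 vertical, 3 left-falling, 4 dot or right-falling, 5 turning), the letter-level Pinyin tokens with a trailing tone digit, and the names `BISHUN`, `PINYIN`, and `to_parallel_pair` are all hypothetical.

```python
# Hypothetical lookup tables for a handful of simple characters.
# Stroke codes (assumed convention): 1 horizontal, 2 vertical,
# 3 left-falling, 4 dot/right-falling, 5 turning.
BISHUN = {"十": "12", "人": "34", "大": "134", "木": "1234"}
# Pinyin with a tone digit appended (assumed representation).
PINYIN = {"十": "shi2", "人": "ren2", "大": "da4", "木": "mu4"}


def to_parallel_pair(char):
    """Return (source, target) token strings for one character.

    Source tokens are stroke codes and target tokens are Pinyin
    letters plus the tone digit, space-separated in the style of a
    parallel corpus for a translation model (an assumption).
    """
    source = " ".join(BISHUN[char])
    target = " ".join(PINYIN[char])
    return source, target


# Build a tiny parallel corpus from the lookup tables.
pairs = [to_parallel_pair(c) for c in BISHUN]

for src, tgt in pairs:
    print(f"{src} ||| {tgt}")
```

Pairs of this kind could then be passed to a translation-model trainer; which toolkit and which error-tolerant strategies the authors used is not stated in the abstract.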

Author Biographies

Chenggang Mi, Xinjiang Technical Institute of Physics & Chemistry - Xinjiang Key Laboratory of Minority Speech and Language Information Processing

Received his PhD in Natural Language Processing and Machine Translation from the University of Chinese Academy of Sciences. Currently, he is an assistant professor at the Xinjiang Technical Institute of Physics and Chemistry of the Chinese Academy of Sciences. His research interests are machine translation, natural language processing, and machine learning.

Yating Yang, Xinjiang Technical Institute of Physics & Chemistry - Xinjiang Key Laboratory of Minority Speech and Language Information Processing - Institute of Acoustics

Received her PhD from the Graduate University of Chinese Academy of Sciences. Currently, she is an associate professor at the Xinjiang Technical Institute of Physics and Chemistry of the Chinese Academy of Sciences. Her research interest is machine translation.

Xi Zhou, Xinjiang Technical Institute of Physics & Chemistry - Xinjiang Key Laboratory of Minority Speech and Language Information Processing

Received his PhD from the University of Chinese Academy of Sciences. Currently, he is a professor at the Xinjiang Technical Institute of Physics and Chemistry of the Chinese Academy of Sciences. His research interest is multilingual processing.

Lei Wang, Xinjiang Technical Institute of Physics & Chemistry - Xinjiang Key Laboratory of Minority Speech and Language Information Processing

Received his PhD from the University of Chinese Academy of Sciences. Currently, he is a professor at the Xinjiang Technical Institute of Physics and Chemistry of the Chinese Academy of Sciences. His research interest is multilingual processing.

Xiao Li, Xinjiang Technical Institute of Physics & Chemistry - Xinjiang Key Laboratory of Minority Speech and Language Information Processing

Is currently a professor at the Xinjiang Technical Institute of Physics and Chemistry of the Chinese Academy of Sciences. His research interest is multilingual processing.

Tonghai Jiang, Xinjiang Technical Institute of Physics & Chemistry - Xinjiang Key Laboratory of Minority Speech and Language Information Processing

Is currently a professor at the Xinjiang Technical Institute of Physics and Chemistry of the Chinese Academy of Sciences. His research interest is multilingual processing.

Published

2016-09-30