Kazakh Text Summarization using Fuzzy Logic

Authors

  • Altanbek Zulkhazhav, L.N.Gumilyov Eurasian National University, Faculty of Information Technologies
  • Zhanibek Kozhirbayev, Nazarbayev University, National Laboratory Astana
  • Zhandos Yessenbayevy, L.N.Gumilyov Eurasian National University, Faculty of Information Technologies
  • Altynbek Sharipbay, L.N.Gumilyov Eurasian National University, Faculty of Information Technologies

DOI:

https://doi.org/10.13053/cys-23-3-3239

Keywords:

Extractive text summarization, natural language processing, fuzzy logic

Abstract

In this paper we present an extractive summarization method for the Kazakh language based on fuzzy logic. The method extracts important sentences from the source text and concatenates them to obtain a shorter version. With the rapid growth of information on the Internet, there is a demand for efficient and cost-effective summarization, so the creation of automatic summarization methods is considered a very important task in natural language processing. Our approach first preprocesses the sentences by applying morphological analysis and pronoun resolution, in order to avoid rejecting them early. Afterwards, we determine the features of the processed sentences that are needed to apply fuzzy logic methods. Since no data was available for this task, we collected and manually annotated our own dataset from various Internet resources in the Kazakh language for the experiments. We also applied our method to the CNN/Daily Mail dataset. ROUGE-N scores were calculated to assess the quality of the proposed method. The ROUGE-L F-score of the proposed method with pronoun resolution is 0.40 on the former dataset and 0.38 on the latter.
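For readers unfamiliar with the reported metric, the following is a minimal sketch of a sentence-level ROUGE-L F-score, computed from the longest common subsequence (LCS) of tokens shared by a candidate summary and a reference summary. It is not the authors' evaluation code: the abstract does not say which ROUGE implementation was used, and the whitespace tokenization, lowercasing, balanced F-score (precision and recall weighted equally), and the function names `lcs_length` and `rouge_l_f` are all assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f(candidate, reference):
    """Balanced ROUGE-L F-score for two strings (hypothetical helper).

    Assumes lowercased whitespace tokenization, which is a simplification;
    a morphologically rich language such as Kazakh would likely need
    stemming or lemmatization first.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge_l_f("the cat sat on the mat", "the cat is on the mat"))
```

In this example the two sentences share the five-token subsequence "the cat on the mat" out of six tokens each, so precision and recall are both 5/6. Scores reported over a whole dataset are typically averaged over all candidate and reference pairs.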

Published

2019-09-25