Social Media – Processing Romanian Chat and Discourse Analysis

Authors

  • Cătălina Mărănduc, Al. I. Cuza University - Academic Institute of Linguistics Iorgu Iordan
  • Cenel-Augusto Perez, Al. I. Cuza University
  • Radu Simionescu, Al. I. Cuza University

DOI:

https://doi.org/10.13053/cys-20-3-2453

Keywords:

Conversational particularities, dependency treebank, discourse analysis, processing non-standard texts, social-media communication.

Abstract

To obtain a balanced corpus, a sub-corpus of 2,576 sentences illustrating contemporary social media language has been added to the Dependency Treebank for Romanian. The texts were taken from chat conversations. This paper describes the second step of processing these non-standard texts with a hybrid POS-tagger for Romanian and with a Malt parser, both of which had until now been trained on standard language and on other styles of communication. The results show that the UAIC tools are comparable with tools for other languages trained on similar corpora. A further purpose is to develop the Dependency Treebank for Romanian not only quantitatively, doubling its size in a year, but also by converting it to a new format compatible with similar foreign corpora and by adding new, more complex annotation layers. A semantic layer and a discourse annotation layer will be added, permitting the study of discursive and conversational particularities. Finally, examples illustrating discursive particularities of chat communication are discussed.

Author Biographies

Cătălina Mărănduc, Al. I. Cuza University - Academic Institute of Linguistics Iorgu Iordan

Received her first PhD in general linguistics from the University of Bucharest, Romania. She is now a PhD student at the Faculty of Computer Science at the Al. I. Cuza University of Iași, Romania.

Cenel-Augusto Perez, Al. I. Cuza University

Received his PhD in Computational Linguistics from the Al. I. Cuza University of Iași, Romania, in 2014.

Radu Simionescu, Al. I. Cuza University

Will receive his PhD in Computer Science from the Al. I. Cuza University of Iași, Romania, in October 2016.

Published

2016-09-30