Rule-Based Spanish Multiple Question Reformulation and their Classification using a Convolutional Neuronal Network

Alberto Iturbe Herrera, Noé Alejandro Castro Sánchez, Dante Mújica Vargas


Question reformulation creates different forms of the same question in order to identify the best answer. However, as question length and complexity increase, reformulation becomes harder, and so does retrieval of the corresponding information. This research presents a method for reformulating multiple questions in Spanish as part of the pre-processing stage of a question-answering system. The lexical category of each word, Named Entities, and Multi-Word Terms were used to reformulate multiple questions into new individual questions, and a Convolutional Neural Network was then used to classify them, making it possible to find or build adequate answers and thereby improve the quality of the results, which is fundamental in QA systems. Because no existing dataset of multiple questions could be found, one was created to evaluate the reformulation method. To evaluate the question classification model, we used the TREC, Simple Questions, Web Questions, Wiki Movies, and Curated TREC datasets, translated into Spanish. Both tasks achieved promising results for further work.


Question reformulation, question classification, convolutional neural networks
