Text Classification using gated fusion of n-gram features and semantic features

Authors

  • Ajay Nagar, Samsung R&D Institute India - Bangalore
  • Anmol Bhasin, Samsung R&D Institute India - Bangalore
  • Gaurav Mathur, Samsung R&D Institute India - Bangalore

DOI:

https://doi.org/10.13053/cys-23-3-3278

Keywords:

Text classification, convolutional neural network, universal sentence encoder, BiLSTM

Abstract

We introduce a novel method for text classification based on gated fusion of n-gram features and semantic features of the text. A parallel CNN captures n-gram relations between words, primarily short-distance multi-word relations, with the n-gram span determined by the filter size. Semantic relationships are captured by either the universal sentence encoder or a BiLSTM. Gated fusion then combines the n-gram and semantic features. The model is evaluated on four commonly used benchmark datasets (MR, TREC, AG-News and SUBJ), which include sentiment analysis and question classification tasks. The proposed method surpasses existing state-of-the-art DNN architectures for text classification on these datasets.
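
The abstract does not give the gate equation, so the following is only a minimal sketch of one common form of gated fusion, assuming a sigmoid gate computed from the concatenated feature vectors and an element-wise convex combination of the two. The function name `gated_fusion`, the parameters `w_gate` and `b_gate`, and the toy vectors are hypothetical and are not taken from the paper; in a real model the two inputs would be the outputs of the CNN branch and of the sentence encoder or BiLSTM branch.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(ngram_feat, semantic_feat, w_gate, b_gate):
    """Fuse two equal-length feature vectors with a learned gate.

    Assumed form (not confirmed by the source):
        g     = sigmoid(W_g [ngram; semantic] + b_g)
        fused = g * ngram + (1 - g) * semantic
    w_gate has one row per output dimension and 2 * dim columns.
    """
    concat = ngram_feat + semantic_feat  # list concatenation, length 2 * dim
    fused = []
    for j in range(len(ngram_feat)):
        z = sum(w * x for w, x in zip(w_gate[j], concat)) + b_gate[j]
        g = sigmoid(z)
        fused.append(g * ngram_feat[j] + (1.0 - g) * semantic_feat[j])
    return fused


# Toy stand-ins for the CNN (n-gram) and encoder (semantic) outputs.
ngram_feat = [0.5, -1.0, 2.0]
semantic_feat = [1.5, 0.0, -2.0]
dim = len(ngram_feat)

# With zero gate parameters every gate value is 0.5, so the fusion
# reduces to a plain average of the two feature vectors.
w_gate = [[0.0] * (2 * dim) for _ in range(dim)]
b_gate = [0.0] * dim

fused = gated_fusion(ngram_feat, semantic_feat, w_gate, b_gate)
print(fused)  # [1.0, -0.5, 0.0]
```

Under this assumed form, a gate value near 1 passes the n-gram feature through and a value near 0 passes the semantic feature, so the model can weight the two views per dimension rather than simply concatenating them.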

Published

2019-09-25