Text Classification using Gated Fusion of n-gram Features and Semantic Features

Ajay Nagar, Anmol Bhasin, Gaurav Mathur

Abstract


We introduce a novel method for text classification based on gated fusion of n-gram features and semantic features of the text. A parallel CNN captures n-gram relations between words, primarily short-distance multi-word relations, with the span determined by the filter size. Semantic relationships are captured by a universal sentence encoder or a BiLSTM. Gated fusion then combines the n-gram and semantic features. The model is evaluated on four commonly used benchmark datasets (MR, TREC, AG-News and SUBJ), which cover tasks including sentiment analysis and question classification. The proposed method surpasses existing state-of-the-art DNN architectures for text classification on these datasets.
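The abstract does not give the exact form of the gate, so the following is only a minimal sketch of one common formulation of gated fusion, not the authors' implementation. It assumes both feature vectors have already been projected to the same length, and that a sigmoid gate computed from their concatenation mixes them element-wise. The names `gated_fusion`, `weights`, and `bias` are hypothetical and chosen for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(ngram_feat, semantic_feat, weights, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Assumed formulation (not taken from the paper):
        gate  = sigmoid(W . [ngram; semantic] + b)
        fused = gate * ngram + (1 - gate) * semantic

    weights has one row per output dimension; each row has
    len(ngram_feat) + len(semantic_feat) entries.
    """
    if len(ngram_feat) != len(semantic_feat):
        raise ValueError("feature vectors must have the same length")

    concat = list(ngram_feat) + list(semantic_feat)
    fused = []
    for i, (n, s) in enumerate(zip(ngram_feat, semantic_feat)):
        # Pre-activation of the gate for dimension i.
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(z)
        # Gate near 1 favours the n-gram feature, near 0 the semantic one.
        fused.append(g * n + (1.0 - g) * s)
    return fused


# Toy inputs standing in for CNN (n-gram) and BiLSTM/encoder (semantic) outputs.
ngram = [1.0, 2.0]
semantic = [3.0, 4.0]
zero_weights = [[0.0] * 4 for _ in range(2)]

# With zero weights and zero bias every gate is 0.5, so the result is the mean.
balanced = gated_fusion(ngram, semantic, zero_weights, [0.0, 0.0])

# A large positive bias pushes the gate towards 1, favouring the n-gram features.
ngram_heavy = gated_fusion(ngram, semantic, zero_weights, [50.0, 50.0])
```

In a trained model the gate parameters would be learned jointly with the CNN and the semantic encoder, letting the classifier decide per dimension how much to rely on local n-gram evidence versus sentence-level semantics.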

Keywords


Text classification, convolutional neural network, universal sentence encoder, BiLSTM
