Enhancing Text Classification Using BERT: A Transfer Learning Approach

Haider Zaman-Khan, Muddasar Naeem, Raffaele Guarasci, Umamah Bint-Khalid, Massimo Esposito, Francesco Gargiulo

Abstract


This paper investigates Natural Language Processing (NLP) techniques for improving performance on document-level classification tasks. The study uses a Transformer-based Neural Language Model (NLM), specifically BERT, and combines it with cross-validation to exploit transfer learning for classification. The approach was tested on two versions of the widely known 20 Newsgroups benchmark dataset using pre-trained BERT models refined through cross-validation, reaching accuracies of 92.29% on the preprocessed dataset without noise and 90.08% on the raw filtered dataset. These encouraging results confirm the effectiveness of combining transfer learning, cross-validation, and NLMs in NLP, with a particular focus on the state-of-the-art performance achieved by pre-trained BERT models in real-world text classification tasks.
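The cross-validation loop the abstract refers to can be sketched as follows. This is an illustrative outline only. The abstract does not state the number of folds, whether the splits are stratified, or how BERT is fine-tuned, so `stratified_folds`, `cross_validate`, and the `train_and_score` callback are hypothetical names, and `majority_baseline` is a trivial stand-in for the step that would fine-tune a pre-trained BERT model on each training split.

```python
import random
from collections import Counter, defaultdict
from statistics import mean


def stratified_folds(labels, k, seed=0):
    """Split example indices into k folds with roughly equal class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class's examples round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def cross_validate(texts, labels, k, train_and_score, seed=0):
    """Hold out each fold once, train on the rest, and average the scores."""
    folds = stratified_folds(labels, k, seed)
    scores = []
    for held_out in range(k):
        val_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        score = train_and_score(
            [texts[i] for i in train_idx], [labels[i] for i in train_idx],
            [texts[i] for i in val_idx], [labels[i] for i in val_idx],
        )
        scores.append(score)
    return mean(scores), scores


def majority_baseline(train_texts, train_labels, val_texts, val_labels):
    """Placeholder scorer (assumption): predicts the most frequent training label.

    In the setting the abstract describes, this is where a pre-trained BERT
    classifier would be fine-tuned on the training split and its accuracy
    measured on the validation split.
    """
    majority = Counter(train_labels).most_common(1)[0][0]
    return sum(1 for y in val_labels if y == majority) / len(val_labels)
```

Calling `cross_validate(texts, labels, 5, majority_baseline)` returns the mean accuracy and the per-fold accuracies. Swapping `majority_baseline` for a function that fine-tunes and evaluates a BERT classifier leaves the splitting logic unchanged.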

Keywords


NLMs, Transfer Learning, Text Classification, BERT
