Enhancing Text Classification Using BERT: A Transfer Learning Approach

Authors

  • Haider Zaman-Khan, Quaid-i-Azam University
  • Muddasar Naeem, Giustino Fortunato University
  • Raffaele Guarasci, National Research Council
  • Umamah Bint-Khalid, Quaid-i-Azam University
  • Massimo Esposito, National Research Council
  • Francesco Gargiulo, National Research Council

DOI:

https://doi.org/10.13053/cys-28-4-5290

Keywords:

NLMs, Transfer Learning, Text Classification, BERT

Abstract

This paper investigates the application of Natural Language Processing (NLP) techniques for enhancing the performance of document-level classification tasks. The study focuses on leveraging a Transformer-based Neural Language Model (NLM), particularly BERT, combined with cross-validation to exploit transfer learning for classification tasks. The approach has been tested on two variants of the widely-known 20 Newsgroups benchmark dataset using pre-trained BERT models refined through cross-validation, achieving notable accuracy rates of 92.29% on the preprocessed dataset without noise and 90.08% on the raw filtered dataset. These encouraging results confirm the effectiveness of combining transfer learning, cross-validation, and NLMs in NLP, with a particular focus on the state-of-the-art performance achieved by pre-trained BERT models in real-world text classification tasks.
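The following is a minimal sketch of the kind of pipeline the abstract describes: fine-tuning a pre-trained BERT model on the 20 Newsgroups dataset with stratified k-fold cross-validation, using Hugging Face transformers and scikit-learn. The model choice, hyperparameters, and preprocessing shown here are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch: fine-tuning pre-trained BERT on 20 Newsgroups with k-fold
# cross-validation. Hyperparameters are illustrative, not the paper's setup.
import numpy as np
import torch
from torch.utils.data import Dataset
from sklearn.datasets import fetch_20newsgroups
from sklearn.model_selection import StratifiedKFold
from transformers import (BertTokenizerFast, BertForSequenceClassification,
                          Trainer, TrainingArguments)

class NewsgroupsDataset(Dataset):
    """Tokenized 20 Newsgroups documents with integer class labels."""
    def __init__(self, texts, labels, tokenizer, max_len=256):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_len)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

# "Raw filtered" variant: strip headers, footers and quotes; the cleaned
# (noise-free) variant would apply further preprocessing before tokenization.
data = fetch_20newsgroups(subset="all",
                          remove=("headers", "footers", "quotes"))
texts, labels = data.data, np.array(data.target)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_acc = []

for fold, (train_idx, val_idx) in enumerate(skf.split(texts, labels)):
    # Re-initialize from the pre-trained checkpoint for each fold
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=len(data.target_names))

    train_ds = NewsgroupsDataset([texts[i] for i in train_idx],
                                 labels[train_idx], tokenizer)
    val_ds = NewsgroupsDataset([texts[i] for i in val_idx],
                               labels[val_idx], tokenizer)

    args = TrainingArguments(output_dir=f"bert-20ng-fold{fold}",
                             num_train_epochs=3,
                             per_device_train_batch_size=16,
                             learning_rate=2e-5,
                             logging_steps=100)
    trainer = Trainer(model=model, args=args,
                      train_dataset=train_ds, eval_dataset=val_ds)
    trainer.train()

    # Fold accuracy on the held-out split
    preds = trainer.predict(val_ds)
    acc = (preds.predictions.argmax(-1) == labels[val_idx]).mean()
    fold_acc.append(acc)
    print(f"fold {fold}: accuracy = {acc:.4f}")

print(f"mean cross-validated accuracy: {np.mean(fold_acc):.4f}")
```

Averaging accuracy over the folds gives a more stable estimate of generalization than a single train/test split, which is the role cross-validation plays in the reported results.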

Published

2024-12-03

Issue

Section

Articles of the Thematic Section