Predicting the Future of Text: A Hybrid Approach to Next-Word Prediction

Sanjit Kumar Dash, Parameswari Khatua, Muktikanta Sahu

Abstract


Text input has become an integral part of modern communication, spanning everyday conversations to formal content creation. However, manual typing is often slow and error-prone, which has driven the need for efficient text prediction models that improve user experience and productivity. By anticipating the next likely word in a sequence, next-word prediction systems contribute significantly to faster and more accurate text composition. Early approaches such as N-grams established the foundational concepts but were limited in their ability to capture complex, wide-reaching context. In recent years, the field has been dominated by large-scale Transformer architectures, which have set new benchmarks in language understanding. However, their significant computational demands often create a barrier to deployment in resource-constrained environments such as smartphones or embedded systems. This paper addresses that challenge by introducing a hybrid deep learning model that balances predictive accuracy with computational efficiency. The proposed architecture merges convolutional neural networks (CNNs) with bidirectional long short-term memory (Bi-LSTM) networks: CNNs are highly effective at extracting local features from text, while Bi-LSTMs excel at learning long-range sequential dependencies. By training this model on the classic Sherlock Holmes dataset, we demonstrate its ability to achieve nearly 76% contextual accuracy, showing it to be a powerful and viable alternative for real-world applications. This work validates the effectiveness of hybrid models in creating intelligent text generation systems for tools such as smart keyboards and assistive writing technologies.
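As a concrete illustration of the hybrid design described in the abstract, the sketch below stacks a CNN front end before a Bi-LSTM back end in Keras. It is a minimal sketch, not the paper's reported implementation: the vocabulary size, embedding width, filter counts, and LSTM size are illustrative assumptions, and a dense Embedding layer stands in for the one-hot encoding mentioned in the keywords.

import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 8000  # assumed vocabulary size (not given in the abstract)
EMBED_DIM = 128    # assumed embedding width

model = models.Sequential([
    # Dense embeddings as an efficient stand-in for one-hot input vectors.
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    # Conv1D extracts local, n-gram-like patterns from the context window.
    layers.Conv1D(64, kernel_size=3, padding="same", activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    # The Bi-LSTM reads the pooled feature sequence in both directions,
    # capturing longer-range sequential dependencies.
    layers.Bidirectional(layers.LSTM(128)),
    # Softmax over the vocabulary yields the next-word distribution.
    layers.Dense(VOCAB_SIZE, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

At inference time, feeding a tokenized, index-encoded context window to model.predict and taking the argmax of the output distribution gives the predicted next word.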

Keywords


Next-word prediction, CNN-LSTM hybrid architecture, natural language processing, text generation, sequential data modeling, one-hot encoding, Sherlock Holmes corpus, language modeling, neural networks
