Towards Inclusive Fact-Checking: Claim Verification in English, Hindi, Bengali, and Code-Mixed Languages
DOI:
https://doi.org/10.13053/cys-29-3-5914
Keywords:
Claim verification, fact checking, low-resourced language, prompt engineering
Abstract
Automated claim verification has gained significant attention in recent years due to the widespread dissemination of misinformation across digital platforms. While substantial progress has been made for high-resource languages such as English, claim verification for low-resource languages, and for Code-Mixed text in particular, remains largely unexplored in a multilingual country like India. In the present work, we introduce a novel multilingual dataset for claim verification covering English, Hindi, Bengali, and Hindi-English Code-Mixed text. The dataset was developed using large language models (LLMs) together with human annotators. It contains claims, evidence passages, and veracity labels (SUPPORTS or REFUTES) for news headlines collected from three domains: Politics, Healthcare, and Law and Order. We propose a rule-based baseline algorithm and a dual-encoder framework based on transformer models to verify claims across these languages. Our results show that XLM-RoBERTa achieves the best performance for English and Code-Mixed text, while IndicBERTv2 performs best for Hindi and Bengali. This study highlights the challenges and opportunities in multilingual and Code-Mixed claim verification, offering a step towards inclusive, language-diverse fact-checking systems, even in low-resource settings.
Published
2025-09-28
Issue
Section
Articles of the Thematic Section
License
I hereby transfer exclusively to the Journal "Computación y Sistemas", published by the Computing Research Center (CIC-IPN), the Copyright of the aforementioned paper. I also accept that these rights will not be transferred to any other publication, in any other format, language or other existing means of developing.
I certify that the paper has not been previously disclosed or simultaneously submitted to any other publication, and that it does not contain material whose publication would violate the Copyright or other proprietary rights of any person, company or institution. I certify that I have the permission from the institution or company where I work or study to publish this work.
The representative author accepts the responsibility for the publication of this paper on behalf of each and every one of the authors.
This transfer is subject to the following conditions:
- The authors retain all ownership rights (such as patent rights) of this work, except for the publishing rights transferred to the CIC, through this document.
- Authors retain the right to publish the work in whole or in part in any book of which they are the authors or publishers. They can also make use of this work in conferences, courses, personal web pages, and so on.
- Authors may include the work as part of their thesis, for non-profit distribution only.