Neural-Combinatorial Classifiers for Arabic Decomposable Word Recognition

Authors

  • Zeineb Zouaoui, University of Tunis
  • Imen Ben-Cheikh, University of Tunis
  • Mohamed Jemni, University of Tunis

DOI:

https://doi.org/10.13053/cys-28-3-4368

Keywords:

Convolutional neural network, combinatorial optimization, simulated annealing, morphological characteristics, Levenshtein distance

Abstract

Recognition tools and techniques for Arabic script are still under development because of the topological ambiguities of the script and the inflectional nature of the language. This paper presents an approach to Arabic word recognition based on a combinatorial optimization technique that incorporates convolutional neural networks. We handle a wide vocabulary of Arabic decomposable words, organized in a design that resembles a molecular cloud, with words structured according to their roots and patterns. This design fits well with the Arabic linguistic principle of building words from their roots. Each sub-vocabulary thus forms a sub-cloud of neighboring words derived from the same root and following different patterns (schemes) and forms of derivation, inflection, and agglutination (proclitic and enclitic). As a first step, we used a recognition approach based on the simulated annealing (SA) metaheuristic. In a second work, we integrated linguistic knowledge into the SA algorithm. Extending this work, we integrate a convolutional neural network into the recognition process of the SA algorithm to benefit from the advantages of both methods. Our experiments, which yielded promising results, use a corpus of Arabic words that includes samples and agglutinated words from the APTI database.
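
To make the idea of searching a root-structured vocabulary with simulated annealing more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tiny hypothetical vocabulary of transliterated words grouped by root (one sub-cloud per root), uses the Levenshtein distance between an observed string and a candidate word as the energy, and replaces the convolutional neural network with a plain noisy string as input. The names VOCAB, neighbor, and anneal, the move probabilities, and the cooling schedule are all illustrative assumptions.

```python
import math
import random


def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Hypothetical toy vocabulary: transliterated words grouped by root.
# Each root's word list plays the role of one sub-cloud.
VOCAB = {
    "ktb": ["kataba", "kaatib", "maktuub", "kitaab"],
    "drs": ["darasa", "daaris", "madrasa", "mudarris"],
    "fth": ["fataha", "faatih", "maftuuh", "miftaah"],
}


def neighbor(state, rng):
    """Propose a move: usually stay in the same sub-cloud, sometimes change root."""
    root, _ = state
    if rng.random() < 0.7:  # assumed probability of a within-root move
        return (root, rng.randrange(len(VOCAB[root])))
    new_root = rng.choice(sorted(VOCAB))
    return (new_root, rng.randrange(len(VOCAB[new_root])))


def anneal(observed, steps=2000, t0=3.0, cooling=0.995, seed=0):
    """Simulated annealing over (root, word index) states; energy = edit distance."""
    rng = random.Random(seed)
    state = (sorted(VOCAB)[0], 0)
    energy = levenshtein(observed, VOCAB[state[0]][state[1]])
    best, best_energy = state, energy
    t = t0
    for _ in range(steps):
        cand = neighbor(state, rng)
        e = levenshtein(observed, VOCAB[cand[0]][cand[1]])
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if e <= energy or rng.random() < math.exp((energy - e) / t):
            state, energy = cand, e
            if energy < best_energy:
                best, best_energy = state, energy
        t = max(t * cooling, 1e-3)  # geometric cooling with a small floor
    root, idx = best
    return VOCAB[root][idx], root, best_energy


if __name__ == "__main__":
    # "madrsa" stands in for a noisy recognizer output (one missing letter).
    print(anneal("madrsa"))
```

In this sketch the within-root move keeps the search inside one sub-cloud most of the time, which loosely mirrors the idea of exploring neighboring words that share a root before jumping to another root. A real system would derive the energy from image features or network scores instead of a transliterated string.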

Published

2024-09-12

Section

Articles