Resurrection: The Khazar Language Reconstruction Using Computer Science Technologies

Elina Makipova, Iskander Akhmetov, Alexander Gelbukh


Decrypting or reconstructing extinct languages is challenging, especially when the objective is to reconstruct a language with no or very few texts left, such as the Khazar language or early Slavic and Ugric languages. In this paper, we lay out the historical perspective of the Khazar people, their language, and contemporary descendant ethnic groups, namely the Chuvash and Tatar people. Then we discuss ways computer science can help researchers in language reconstruction and decryption. Finally, we pilot an approach to find Khazar/Bulgar word candidates in the Chuvash and Tatar languages by (1) normalizing the words of the two languages, (2) comparing them while accounting for semantic concepts to solve the homonymy problem, and (3) excluding common Turkic words and borrowings from the Russian language.
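
The three-step pilot can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `normalize` and `find_candidates`, the diacritic-stripping normalization rule, the exact-match comparison, and the toy word lists (invented placeholder strings, not real Chuvash or Tatar vocabulary) are all assumptions made for illustration only.

```python
import unicodedata


def normalize(word: str) -> str:
    """Lowercase a word and drop combining diacritics.

    Assumption: a crude stand-in for whatever normalization a real
    pipeline would apply to the two languages.
    """
    decomposed = unicodedata.normalize("NFKD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_candidates(lexicon_a, lexicon_b, excluded):
    """Return sorted (form, concept) pairs present in both lexicons.

    Each lexicon is a list of (word, concept) pairs. Matching on the pair
    rather than on the form alone keeps homonyms with different meanings
    apart. Forms listed in `excluded` (for example, common Turkic words
    or Russian borrowings) are filtered out.
    """
    keys_a = {(normalize(w), c) for w, c in lexicon_a}
    keys_b = {(normalize(w), c) for w, c in lexicon_b}
    excluded_forms = {normalize(w) for w in excluded}
    return sorted(k for k in keys_a & keys_b if k[0] not in excluded_forms)


# Hypothetical toy data: invented strings, not real vocabulary.
lexicon_a = [("Zábu", "stone"), ("mírek", "river"), ("tolan", "house")]
lexicon_b = [("zabu", "stone"), ("mirek", "bird"), ("tolan", "house")]
excluded = ["tolan"]

candidates = find_candidates(lexicon_a, lexicon_b, excluded)
print(candidates)
```

In this toy run, the second pair is rejected because the normalized forms match but the concepts differ, and the third is rejected because its form is on the exclusion list. A realistic comparison would likely need fuzzy matching or sound-correspondence rules rather than exact equality.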


Khazar, language reconstruction, extinct languages, historical linguistics

