Corpus of Russian spoken in Daghestan

Access to the corpus is restricted due to the sensitivity of the data*

About the project

The Daghestanian Russian Project (DagRus) is a project of the Linguistic Convergence Laboratory of the Higher School of Economics (HSE). The aim of the project is to create a corpus of regional variants of Russian spoken in Daghestan.

Another aim of the project is to use the corpus for the systematic study of the morphosyntactic characteristics of Daghestanian Russian.

The project is conducted within the framework of the Basic Research Program at the National Research University "Higher School of Economics" (HSE) and supported as part of the Russian Academic Excellence Project '5-100'.

What is Daghestanian Russian

The Republic of Daghestan is linguistically and ethnically the most diverse area of the Russian Federation. Some forty languages spoken here belong to the Nakh-Daghestanian (East Caucasian) family, but there are also speakers of a few Turkic languages (Kumyk, Nogai, Azerbaijani) and one Iranian language (Tat).

Russian is one of the fourteen official languages of Daghestan, but functionally its position is exceptional. In multilingual Daghestan it serves as the interethnic lingua franca. It is the language of higher education and career advancement. In most parts of Daghestan, there are no visible signs of the loss of endemic languages, and knowledge of a local native language is usually combined with a good command of Russian. At the same time, the influence of the local languages on the variety of Russian spoken in Daghestan is quite strong.

Daghestanian Russian is understudied, and we hope that our corpus will provide a useful tool for researchers to investigate this interesting variety of Russian.


How to cite the corpus

If you use data from the Daghestanian Russian Corpus in your research, please cite as follows:

Nina Dobrushina, Michael Daniel, Ruprecht von Waldenfels, Timur Maisak, Anastasia Panova. 2018. Corpus of Russian spoken in Daghestan. Moscow: Linguistic Convergence Laboratory, NRU HSE. (Available online at, accessed on .)

*How to get access to the corpus

Write an email to