Daghestanian loans database

Authors: Ilya Chechuro, Michael Daniel, and Samira Verhees.

The DagLoans database contains the results of eliciting a list of 160 lexical meanings across the languages of Daghestan. Data collection was aimed at assessing the amount of lexical transfer between these languages. The database includes data on 23 languages (15 collected in the field, 7 provided by experts, 19 based on dictionaries). The data were collected in 38 villages across Daghestan and 5 villages in the Qax district of Azerbaijan.

The general objective of the DagLoans project is to study lexical borrowing in the languages of Daghestan at a level of granularity that is sensitive to differences between village varieties. For this purpose, we developed a method for obtaining comparable lexical data by eliciting a relatively short (146 concepts) wordlist that serves as litmus paper, a quick field probe for the amount of lexical transfer. Using a fixed list makes it possible to discover quantitative correlates of sociolinguistic differences between areas, such as the spread of a certain lingua franca or the presence and degree of contact with particular languages. In combination with the sociolinguistic data on multilingualism in Daghestan, our data show that the conditions and degree of language contact vary from village to village and correlate with the bilingualism rates reported in another of our projects, the Atlas of Multilingualism in Daghestan.

The table shows the concepts and their translations into the target languages. Translations are grouped into similarity sets: sets of words that look similar and were used as translations of the same concept. Whenever a similarity set spans different language families or sufficiently distant branches, we take this as an indication that the lexical item might have been shared through language contact. Metadata includes the name of the village where the word was recorded and its location, the language spoken in the village, and the list ID. The ID corresponds to a particular speaker or, in some cases, to a written dictionary source.
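The cross-family criterion described above can be illustrated with a minimal sketch. It assumes each record carries a concept, a language, a language family, a similarity set label, and a word. The family field and all values below are hypothetical placeholders, not actual DagLoans data, and the sketch checks only for different families, not for "sufficiently distant branches" within one family.

```python
from collections import defaultdict

# Hypothetical rows: (concept, language, family, similarity_set, word).
# All values are placeholders, not real DagLoans data.
rows = [
    ("concept_1", "LangA", "FamilyX", "set_1", "word_a"),
    ("concept_1", "LangB", "FamilyX", "set_1", "word_a2"),
    ("concept_1", "LangC", "FamilyY", "set_1", "word_a3"),
    ("concept_1", "LangD", "FamilyY", "set_2", "word_b"),
    ("concept_2", "LangA", "FamilyX", "set_3", "word_c"),
    ("concept_2", "LangB", "FamilyX", "set_3", "word_c2"),
]


def cross_family_sets(rows):
    """Map (concept, set) to its families, keeping sets found in 2+ families."""
    families = defaultdict(set)
    for concept, _language, family, set_id, _word in rows:
        families[(concept, set_id)].add(family)
    # A set attested in a single family is not flagged as a contact candidate.
    return {key: sorted(fams) for key, fams in families.items() if len(fams) > 1}


candidates = cross_family_sets(rows)
print(candidates)
```

In this toy input only `set_1` is flagged, because it is the only similarity set attested in both placeholder families. A flagged set is a candidate for contact-induced sharing, not proof of borrowing.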

The DagLoans database has been compiled by Ilia Chechuro and Samira Verhees. The data are copyrighted by Linguistic Convergence Laboratory, HSE University, Moscow, and may be used in other academic projects (see How to cite).

The project was funded by the Basic Research Program at the National Research University Higher School of Economics (HSE) and supported within the framework of a subsidy by the Russian Academic Excellence Project ‘5-100.’

Contents:

target_words: 30970
languages: 23

How to cite this project

If you use data from the database in your research, please cite as follows:

Chechuro I., Daniel M., and Verhees S. 2019. Daghestanian loans. Linguistic Convergence Laboratory, HSE. DOI.

Publications

The following articles have been published as part of the DagLoans project:

Daniel, M., Chechuro, I., Verhees, S., & Dobrushina, N. (2021). Lingua francas as lexical donors: Evidence from Daghestan. Language 97(3), 520-560.

Chechuro, I., Daniel, M., & Verhees, S. (2021). Small-scale multilingualism through the prism of lexical borrowing. International Journal of Bilingualism, 25(4), 1019-1039.

Chechuro, I. Y. (2021). Lexical convergence reflects complex historical processes. In: Grenoble, L. A., & Forker, D. (2021). Language Contact in the Territory of the Former Soviet Union, 35-57.

The database

For now, the table shows source Concepts and target Words. Each target word is assigned to a similarity Set, a set of words that have the same meaning and look similar. Data on borrowing sources will be added in the future. Metadata includes the name of the Village where the word was recorded, the administrative District it is part of, the Language spoken there, and the List_ID; these IDs correspond to a particular speaker or, in some cases, to a written source such as a dictionary.

The table below can be sorted and filtered; the resulting subset can be downloaded by pressing the “CSV” button.
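A downloaded subset can be processed with standard tools. The sketch below assumes the CSV headers match the column names mentioned above (Concept, Word, Set, Village, District, Language, List_ID); the exact header spelling in the exported file is an assumption, and the inline sample rows are placeholders rather than real data. It counts how many distinct lists were collected in each village.

```python
import csv
import io
from collections import defaultdict

# Stand-in for a downloaded CSV file; headers are assumed and values are placeholders.
sample_csv = """Concept,Word,Set,Village,District,Language,List_ID
concept_1,word_a,set_1,VillageP,DistrictM,LangA,list_01
concept_2,word_c,set_3,VillageP,DistrictM,LangA,list_01
concept_1,word_a2,set_1,VillageP,DistrictM,LangA,list_02
concept_1,word_b,set_2,VillageQ,DistrictN,LangB,list_03
"""


def lists_per_village(csv_file):
    """Count the distinct List_ID values recorded for each village."""
    lists = defaultdict(set)
    for row in csv.DictReader(csv_file):
        lists[row["Village"]].add(row["List_ID"])
    return {village: len(ids) for village, ids in lists.items()}


counts = lists_per_village(io.StringIO(sample_csv))
print(counts)
```

To run this on an actual export, replace `io.StringIO(sample_csv)` with a file object opened with `open(path, newline="", encoding="utf-8")`, where `path` is the location of the downloaded file.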


Version: 2022-02-06. For questions or comments contact .


Map of the surveyed villages

Hover over or click on a dot on the map to learn more. The color of the dots corresponds to the number of lists collected in a village. Orange = dictionary data.

