The corpus contains spoken texts in Meadow Mari (also known as Eastern Mari or, formerly, Cheremis; Meadow Mari: олык марий, Russian: луговой марийский язык; ISO 639-3: mhr).
The texts were recorded in 2000–2004 and in 2018 in the village of Staryj Torjal (Novyj Torjal district, Mari El Republic) and belong to the Sernur-Morkin dialect. This dialect forms the basis of literary Meadow Mari (Bartseva et al., 2012), so the corpus texts are close to the literary language but display some non-standard lexical and morphological properties.
The corpus texts are written in the Meadow Mari alphabet, which is based on the Cyrillic alphabet with a few additional letters (see Table 1). To facilitate searching the corpus, the orthography of the texts was normalized as far as possible, so it does not reflect phonetic dialectal variation. For instance, the dialectal form kəlm-en-ut /freeze-pst2-3pl/, with the dialectal marker -ut instead of the literary Meadow Mari -ət, is spelled кылменыт.
The mapping between the Meadow Mari alphabet, the IPA transcription and the transcription adopted in Kuznetsova et al. (2012) is shown in Table 1.
For a narrow phonetic transcription, we recommend consulting the audio files of the corpus.
|letter||sound (phone): IPA, [Kuznetsova et al., 2012]||letter||sound (phone): IPA, [Kuznetsova et al., 2012]||letter||sound (phone): IPA, [Kuznetsova et al., 2012]|
|А а||a||Л л||l||Ф ф*||f|
|Б б||b||М м||m||Х х||x|
|В в||v||Н н||n||Ц ц||ts, c|
|Г г||g||Ҥ ҥ||ŋ||Ч ч||t̠ʃ, č|
|Д д||d||О о||o||Ш ш||ʃ, š|
|Е е||e (after a consonant), je (in the beginning of a word, after a vowel, after ъ and ь)||Ӧ ӧ||ɵ, ö||Щ щ*||t͡ɕ|
|Ё ё*||o (after palatalized consonants), jo (in the beginning of a word, after a vowel, after ъ and ь)||П п||p||Ъ ъ||-|
|Ж ж||ʒ, ž||Р р||r||Ы ы||ə|
|З з||z||С с||s||Ь ь||-; in the digraphs ль and нь, marks palatalization (/l’/, /n’/)|
|И и||i||Т т||t||Э э||e|
|Й й||j||У у||u||Ю ю||u (after palatalized consonants), ju (in the beginning of a word, after a vowel, after ъ and ь)|
|К к||k||Ӱ ӱ||y, ü||Я я||a (after palatalized consonants), ja (in the beginning of a word, after a vowel, after ъ and ь)|
An asterisk (*) marks letters that are used only in Russian loanwords.
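The correspondences in Table 1 can be approximated in code. The following is a minimal Python sketch, not part of the corpus tooling: the function name `transliterate` and the dictionary names are hypothetical, and it assumes that, where Table 1 gives two values, the second one is the Kuznetsova et al. (2012) symbol. It covers only the positional rules stated in the table (е, ё, ю, я and the digraphs ль, нь) and yields a broad transcription, not a narrow phonetic one.

```python
import unicodedata

# One-to-one letter values from Table 1. Where the table lists two
# symbols, the second one is used (assumed to be the Kuznetsova et al.
# 2012 symbol). For щ the table gives only the IPA value.
SIMPLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ж": "ž",
    "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "ҥ": "ŋ", "о": "o", "ӧ": "ö", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ӱ": "ü", "ф": "f", "х": "x",
    "ц": "c", "ч": "č", "ш": "š", "щ": "t͡ɕ", "ы": "ə", "э": "e",
}

# Letters whose value depends on position: a plain vowel after a
# consonant, j + vowel word-initially, after a vowel, or after ъ / ь.
POSITIONAL = {"е": "e", "ё": "o", "ю": "u", "я": "a"}

VOWELS = set("аеёиоӧуӱыэюя")


def transliterate(text: str) -> str:
    """Convert Meadow Mari Cyrillic text to a broad Latin transcription."""
    # NFC turns sequences such as о + combining diaeresis into single letters.
    text = unicodedata.normalize("NFC", text).lower()
    out = []
    prev = ""  # previous input character; empty at the start of the text
    for ch in text:
        if ch in POSITIONAL:
            word_initial = not prev.isalpha()
            with_j = word_initial or prev in VOWELS or prev in {"ъ", "ь"}
            out.append(("j" if with_j else "") + POSITIONAL[ch])
        elif ch == "ь":
            # Table 1 assigns ь a value only in the digraphs ль and нь.
            if prev in {"л", "н"}:
                out.append("’")
        elif ch == "ъ":
            pass  # no sound value of its own
        else:
            # Characters outside the table (spaces, punctuation) pass through.
            out.append(SIMPLE.get(ch, ch))
        prev = ch
    return "".join(out)
```

With these assumptions, `transliterate("кылменыт")` returns `kəlmenət` and `transliterate("еҥ")` returns `jeŋ`.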
As Table 1 shows, the letters е, ю, я and ё are used for iotated vowels. However, ё is used only in Russian loanwords.
In words that are not recent Russian loanwords, iotated vowels are spelled either with е, ю, я (еҥ ‘man’, юлгыжаш ‘twinkle’, янлык ‘beast’) or with a combination of й and a vowel (йылме ‘language’, йӧраташ ‘love’, йӱлаташ ‘burn’).
Cf. ёлка ‘fir-tree’, which contrasts with йолташ ‘friend’.
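Since the asterisked letters of Table 1 (ё, ф, щ) occur only in Russian loanwords, their presence can serve as a rough filter when working with search results. The sketch below is an illustration under that assumption; the function name `has_loanword_only_letter` is hypothetical. It only detects these three letters, so a result of `False` does not show that a word is native.

```python
import unicodedata

# Letters marked with an asterisk in Table 1.
LOANWORD_ONLY = frozenset("ёфщ")


def has_loanword_only_letter(word: str) -> bool:
    """Return True if the word contains a letter used only in Russian loanwords."""
    # NFC makes sure ё is a single character, not е + combining diaeresis.
    normalized = unicodedata.normalize("NFC", word).lower()
    return any(ch in LOANWORD_ONLY for ch in normalized)
```

For the pair above, the function returns `True` for ёлка and `False` for йолташ.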