The issues of lexical borrowing for comparative reconstruction throughout Indo-Iranian languages (part 1)

Comparative reconstruction and subgrouping proceed as if languages separate neatly into clearly demarcated dialects that have no subsequent contact, either with each other or with unrelated languages. When applying the comparative method, however, one must be aware not only of sound correspondences and inherited changes within a language or subgroup, but also of changes caused by external forces.

Yaron Matras describes borrowing as the replication of ‘linguistic matter’, that is, ‘concrete, identifiable sound-shapes of words and morphs’ (Matras, 2009). A replicated item is a complex unit, comprising a phonological form, a meaning, and a distinct status as a lexical item. Any of these features, or any mix of them, can be borrowed. Matter differs from patterns, the modes of organizing units of speech, which can likewise influence other languages, creating even more confusion. The term ‘borrowing’, however, has come under fire from various scholars, who prefer terms such as ‘loaning’ and ‘copying’; Matras uses the term ‘replication’. This article outlines the problems that replication poses for comparative reconstruction and subgrouping.

Firstly, replication makes it difficult to reconstruct proto-forms for words that have been replaced in the daughter languages. The etymology of the English word ‘boy’ remains an enigma to this day, as no secure cognate has been established in any other Germanic language. A possible explanation is that it was replicated from a language that no longer exists, meaning that no inherited Old English antecedent of ‘boy’ can be reconstructed. Replication can also do the opposite, leading linguists to reconstruct words that never existed, since loans are often used to fill gaps in the recipient language’s lexicon. Indo-Aryan languages replicated Persian mazhab ‘religion’, since Indo-Aryan dharm was ill-suited to this sense: it could also simply mean ‘duty’.

The two main hypotheses for what motivates replication are gaps in the recipient language’s lexicon (as in the Indo-Aryan example above) and prestige. Unlike gap-fillers, prestige loans usually exist alongside the original terms, but they can still enrich the lexicon of a language by evoking associated contexts; at times, though, even these replace native terms. Such coexistence is extensive in the Indo-Aryan lexicon: loanwords from Farsi, the language of prestige in Mughal times, still exist alongside Sanskrit-derived forms in colloquial Hindi and Urdu.

Through typological sampling, many scholars have arrived at a general consensus on which words are borrowed more readily than others. Languages are more likely to copy cultural vocabulary than core vocabulary, and content words more than grammatical words. The hierarchy offered by Matras is: nouns, conjunctions > verbs > discourse markers > adjectives > interjections > adverbs > other particles, adpositions > numerals > pronouns > derivational affixes > inflectional affixes (Matras, 2009). This is significant in that the forms which can be most easily reconstructed are also the ones that are most often replicated. Verbs can come with inflections, which may undergo a plethora of modifications according to the inflectional paradigm of the recipient language, whereas the very existence of adjectives and adverbs in some languages is debatable. Syntactic and morphosyntactic categories such as word order and inflectional morphology, from the hierarchy offered by Thomason and Kaufman (see Thomason and Kaufman YEAR), and particles, adpositions and affixes, from that offered by Matras, cannot be straightforwardly reconstructed in the first place, and attract controversy whenever an attempt is made.
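For readers who like to see such a hierarchy made concrete, the ranking quoted from Matras can be sketched as an ordered list of tiers. This is only an illustration: the function names `rank` and `more_borrowable` are invented here, and treating categories separated by a comma (such as nouns and conjunctions) as sharing a tier is an assumption about how the notation should be read.

```python
# Borrowability hierarchy as quoted in the text (Matras, 2009).
# Assumption: categories separated by a comma in the text share one tier.
HIERARCHY = [
    {"nouns", "conjunctions"},
    {"verbs"},
    {"discourse markers"},
    {"adjectives"},
    {"interjections"},
    {"adverbs"},
    {"other particles", "adpositions"},
    {"numerals"},
    {"pronouns"},
    {"derivational affixes"},
    {"inflectional affixes"},
]


def rank(category):
    """Return the 0-based tier of a category (0 = most readily replicated)."""
    for position, tier in enumerate(HIERARCHY):
        if category in tier:
            return position
    raise KeyError(f"unknown category: {category}")


def more_borrowable(a, b):
    """True if category a sits strictly higher on the hierarchy than b."""
    return rank(a) < rank(b)


print(more_borrowable("nouns", "pronouns"))      # True
print(more_borrowable("numerals", "verbs"))      # False
print(more_borrowable("nouns", "conjunctions"))  # False (same tier)
```

The hierarchy expresses a tendency rather than a rule, so a comparison like this says only which category is expected to be replicated more readily, not that a given word was.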

Our next problem is that loans are often inflected and modified according to the patterns of the recipient language. For example, the Swahili word kitabu ‘book’ is a replication of Arabic kitab ‘book’. Swahili speakers, however, have reinterpreted the first syllable ki as the Swahili singular nominal prefix, which is also ki. The counterpart plural marker is vi, giving us vitabu ‘books’. Languages with grammatical gender also assign gender to loanwords, but the assignment may not match that of the donor language. For example, taʔrīx ‘history’ is masculine in Arabic, whereas the Kurdish replication tarîx is feminine. Not only morphosyntactic patterns but also semantic content can be remodelled on the patterns of the recipient language. Romani tajśa is a replication of Greek taixiá ‘tomorrow’. In Romani, however, tajśa can mean both ‘tomorrow’ and ‘yesterday’, as it replaced the original kal(iko), which had both meanings and is cognate with its Indo-Aryan counterpart kal. Here the phonology and semantics have been partially replicated, but the word has also acquired a meaning that the Greek source lacks.
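The Swahili reanalysis can be spelled out as a tiny sketch. It assumes the singular form kitabu ‘book’, which underlies the plural vitabu, and it models only the ki-/vi- pairing mentioned above; the helper names `reanalyse` and `pluralise` are hypothetical and do not describe Swahili noun classes in general.

```python
# Only the ki-/vi- noun-class pairing described in the text is modelled here.
SINGULAR_PREFIX = "ki"
PLURAL_PREFIX = "vi"


def reanalyse(form):
    """Split a form into (prefix, stem) if it starts with the singular prefix.

    Returns None when the form does not begin with ki-.
    """
    if form.startswith(SINGULAR_PREFIX):
        return SINGULAR_PREFIX, form[len(SINGULAR_PREFIX):]
    return None


def pluralise(form):
    """Swap the reanalysed singular prefix for its plural counterpart."""
    parsed = reanalyse(form)
    if parsed is None:
        raise ValueError(f"{form!r} does not begin with {SINGULAR_PREFIX}-")
    _, stem = parsed
    return PLURAL_PREFIX + stem


# The first syllable of the Arabic loan is treated as if it were a prefix.
print(reanalyse("kitabu"))  # ('ki', 'tabu')
print(pluralise("kitabu"))  # vitabu
```

The point of the sketch is that the resulting ‘stem’ tabu exists only through reanalysis: nothing in the Arabic source corresponds to it, which is exactly what makes the loan look native.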

Other languages use their own lexical verbs to integrate loan verb stems. This is common in the area spanning the Caucasus to South Asia, where the distinction between intransitivity and transitivity is made by using the verbs ‘to be’ and ‘to do’, respectively, as auxiliaries:
‘Adyghe (Höhlig 1997) uses its verbs ṡliin ‘do’ and xin ‘become’ with the Russian verb in the infinitive: mešat’ sesli ‘I disturb’, realizovat’ xin ‘to fulfil oneself’. Kurdish likewise has temam kirin ‘to complete something’ from Arabic tamm ‘to complete’ via Persian temmam, with Kurdish kirin ‘to do’, and xilas bun ‘to end’ from Arabic xallas ‘end’, with Kurdish bun ‘to become’. Persian shows elam kardan ‘to announce’ and elam shodan ‘to be announced’, from Arabic iʕlam ‘announcement’, with Persian kardan ‘to do’ and shodan ‘to become’… Hindi has taqsim karna ‘to divide’ from Arabic/Persian taqsim ‘division’ and Hindi karna ‘to do’’ (Matras, 2009).
Although at first it seems that this would make loanwords easier to recognise, these constructions are also used with the language’s native words, as in Hindi nach karna ‘to dance’, kam karna ‘to work’ and aish karna ‘to have fun’. Domari, an Indo-Aryan language that has seen intense contact with Kurdish and Arabic, takes this even further and is grammaticalising its light verbs into affixes and augments:
‘The verbs –kar– ‘to do’ and -(h)o– ‘to be/to become’ often appear in a contracted form when integrating Arabic verb loans: stri-k-ami ‘I buy’ alongside stri-kar-ami; skunn-o-ndi ‘they reside’ alongside skunn-ho-ndi.  Here… the semantic opposition signifies the subject that is experiencer rather than agent-initiator.  As in other languages that employ light verbs, in Domari too they are equally employed to derive verbs from indigenous nouns: qayis ‘food’, qayisk(ar)ami ‘I prepare food’.’ (Matras, 2009)
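The light-verb strategy described in these passages amounts to a simple lookup, which can be sketched as follows. The Persian and Kurdish forms are taken from the Matras quotation above; the table layout and the function name `integrate` are assumptions made for illustration, and real verb integration involves far more than concatenation.

```python
# Light verbs from the Matras (2009) examples quoted in the text.
LIGHT_VERBS = {
    "Persian": {"transitive": "kardan", "intransitive": "shodan"},
    "Kurdish": {"transitive": "kirin", "intransitive": "bun"},
}


def integrate(language, loan, transitive):
    """Pair a loan element with the recipient language's light verb.

    'to do' marks the transitive reading, 'to become' the intransitive one.
    """
    valency = "transitive" if transitive else "intransitive"
    return f"{loan} {LIGHT_VERBS[language][valency]}"


print(integrate("Persian", "elam", True))    # elam kardan
print(integrate("Persian", "elam", False))   # elam shodan
print(integrate("Kurdish", "xilas", False))  # xilas bun
```

Because the same lookup would apply equally to a native noun, the output alone gives no clue as to whether the first element is a loan, which is the difficulty the text describes.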

Adjectives too successfully disguise themselves in the recipient language. Languages whose adjectives do not inflect avoid carrying over the donor language’s inflection, as is the case with adjectives replicated into Persian and Kurdish, which do not inflect for gender or number (gender inflections do not exist at all in Persian). If both the recipient language and the donor language inflect, however, the recipient language integrates the adjective into its own inflectional system. Adjectives from Arabic follow the conventional Iranian attributive construction, which places the adjective after the noun, and simply retain the Arabic masculine singular form, as in Kurdish cirok-en teqlidi (stories-ATTR.PL traditional) ‘traditional stories’. With inanimate plural agreement, the Arabic adjective would be taqlidi-yya.

Sounds from replicated words can also enter the recipient language’s phonemic inventory, a process known as direct phonological diffusion. Bilingual speakers often try to imitate the pronunciation of the donor language. This spreads to monolingual speakers, and new phonemes can enter the language. This is the case with Kurdish and Arabic, a contact situation with a high degree of bilingualism. Kurdish did not originally have the pharyngeal fricatives /ħ/ and /ʕ/, which now occur in Arabic loans such as Kurdish [ħajˈwɑːn] ‘animal’ and [sæˈʕæt] ‘hour’. Diffusion can also replace native phonemes entirely, a phenomenon known as convergence, in which new phonemes diffuse ‘backwards’ and replace inherited ones. In Kurdish varieties of southeastern Turkey and northern Iraq, /ħ/ is not only found in Arabic loans but also replaces the glottal fricative /h/ in some inherited words, such as [ħæʃt] ‘eight’. Speakers may also select only certain qualities of a phoneme: Kurdish [ħajˈwɑːn] imitates only the length of the Arabic vowel /aː/, while the quality of the vowel differs from that of the original [ħajˈwaːn]. Phonological change arising from the integration of loans can be confused with inherited phonological change, especially since the former occurs only in certain dialects.
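A short sketch shows why diffused phonemes are an unreliable diagnostic for loans. It uses only the three Kurdish forms cited above (with stress marks omitted), and the heuristic `looks_borrowed` is a deliberately naive, hypothetical test rather than a method any of the cited authors propose.

```python
# Phonemes the text describes as entering Kurdish through Arabic loans.
DIFFUSED_PHONEMES = {"ħ", "ʕ"}

# (transcription, gloss, origin) from the examples in the text;
# stress marks are omitted for simplicity.
WORDS = [
    ("ħajwɑːn", "animal", "loan"),
    ("sæʕæt", "hour", "loan"),
    ("ħæʃt", "eight", "inherited"),
]


def looks_borrowed(transcription):
    """Naive heuristic: flag any word containing a diffused phoneme."""
    return any(phoneme in transcription for phoneme in DIFFUSED_PHONEMES)


# Inherited words that the heuristic wrongly flags as loans.
false_positives = [
    gloss
    for transcription, gloss, origin in WORDS
    if looks_borrowed(transcription) and origin == "inherited"
]
print(false_positives)  # ['eight']
```

Once /ħ/ has spread ‘backwards’ into inherited vocabulary, the heuristic misclassifies ‘eight’, so the presence of a borrowed sound cannot by itself separate loans from inherited words.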

The above examples are all instances in which loanwords have successfully disguised themselves in their respective recipient languages; they should not be mistaken for inherited words when subgrouping.

(continued in part 2)



Campbell, L. (2013). Historical Linguistics: An Introduction (3rd ed.). Edinburgh: Edinburgh University Press.

Crowley, T., & Bowern, C. (2010). An Introduction to Historical Linguistics. Oxford: Oxford University Press.

Klimas, A. (1967). Balto-Slavic or Baltic and Slavic? (The Relationship of Baltic and Slavic Languages). Lituanus: Lithuanian Quarterly Journal of Arts and Sciences, 13(2).

Masica, C. P. (1991). The Indo-Aryan Languages (Cambridge Language Surveys). Cambridge: Cambridge University Press.

Matras, Y. (2009). Language Contact. Cambridge, UK: Cambridge University Press.

Mayer, H. E. (1981). Two Linguistic Myths: Balto-Slavic and Common Baltic. Lituanus: Lithuanian Quarterly Journal of Arts and Sciences, 27(1).

Sussex, R., & Cubberley, P. (2006). The Slavic Languages (Cambridge Language Surveys). Cambridge: Cambridge University Press.
