Deeply embedded clauses in Finno-Ugric

A pilot study on Estonian and Moksha Mordvin


  • Edyta Jurkiewicz-Rohrbacher (Universität Hamburg, Universität Regensburg)
  • Petar Kehayov (University of Tartu)



Keywords: recursion in language, complex sentences, deep clausal embeddings, complement, relative, adverbial clauses, time reference, Estonian, Moksha Mordvin


Complex sentences often contain clauses embedded in clauses that are themselves embedded. The properties of such deeply embedded clauses (DECs) and their relations to other parts of the sentence remain poorly studied. We address this research gap by studying printed texts from two structurally different Finno-Ugric languages: Estonian and Moksha Mordvin. We investigate the relationship between embedding depth and the type of the embedded clause, its position relative to the superordinate clause, and its temporal reference. Combining these variables, we observe associations between specific depths (first-, second-, third-order embedding), clause types (complement, relative, adverbial), positions (left-, right-, center-embedding), and temporal reference (absolute, relative). We show that DECs do not behave entirely like first-order embeddings, i.e., that embedding depth is a factor influencing the grammar of subordinate clauses, and we conclude that assessing DECs is crucial to the description of clausal subordination in a language.






How to Cite

Jurkiewicz-Rohrbacher, E., & Kehayov, P. (2025). Deeply embedded clauses in Finno-Ugric: A pilot study on Estonian and Moksha Mordvin. Finnish Journal of Linguistics, 37, 105–133.


