The Karjalan Sanomat Corpus

Features of Petrozavodsk (Translated) Finnish

Authors

  • Jukka Mäkisalo, University of Eastern Finland
  • Hannu Kemppanen, University of Eastern Finland
  • Anna Saikonen, Petrozavodsk State University

DOI:

https://doi.org/10.61200/mikael.129453

Keywords:

Finnish language, corpus, minority language, translated language

Abstract

This article introduces a minority language corpus, the Newspaper Corpus of Karelian Finnish, and presents preliminary results of a lexical and grammatical analysis based on it. The corpus was compiled from the newspaper Karjalan Sanomat and consists of two sub-corpora: texts originally written in Finnish and texts translated into Finnish from Russian. This is the first attempt to analyse a minority language variant of Finnish spoken outside Finland by comparing it to Standard Written Finnish (SWF) using corpus linguistic methods. The two sub-corpora of Karjalan Sanomat are compared to the newspaper Karjalainen, published in Eastern Finland, with the aim of obtaining a lexical and grammatical cross-section of the variety. The analysed features include quantitative lexical indicators (word length and the standardised type-token ratio, sTTR), the distribution of syntactic categories, and keywords (lexical content and morphosyntactic categories). The comparison shows that the two variants of Karelian Finnish resemble each other more than either resembles SWF. Furthermore, it is theoretically interesting that the translated minority language variety is lexically richer (less repetitive) than the originally written minority language variety, a result opposite to that of earlier comparisons between translated and non-translated language variants within majority languages.
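
For readers unfamiliar with the lexical indicators named in the abstract, the following is a minimal sketch of how a standardised type-token ratio and mean word length might be computed. It is not the authors' procedure: the function names, the whitespace tokenisation, and the chunk size of 1000 tokens are all assumptions made for illustration, since the abstract does not state them.

```python
def tokenise(text):
    """Hypothetical tokeniser: lowercase and split on whitespace (an assumption)."""
    return text.lower().split()


def sttr(tokens, chunk_size=1000):
    """Standardised type-token ratio.

    Computes the type-token ratio (distinct tokens / tokens) for each
    consecutive full chunk of `chunk_size` tokens and returns the mean.
    A trailing partial chunk is ignored. The default of 1000 is an
    assumed value, not one taken from the article.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ratios = []
    for start in range(0, len(tokens) - chunk_size + 1, chunk_size):
        chunk = tokens[start:start + chunk_size]
        ratios.append(len(set(chunk)) / chunk_size)
    if not ratios:
        raise ValueError("text is shorter than one chunk")
    return sum(ratios) / len(ratios)


def mean_word_length(tokens):
    """Average token length in characters."""
    if not tokens:
        raise ValueError("no tokens")
    return sum(len(t) for t in tokens) / len(tokens)
```

Under this definition a higher sTTR indicates a less repetitive text, which is the sense in which the abstract calls one variety lexically richer than another. Averaging over fixed-size chunks keeps the measure comparable between sub-corpora of different sizes.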

Published

2016-04-01