Annotating learner language corpora: the processes, problems, and solutions of ICLFI annotation as an example


  • Jarmo Harri Jantunen, University of Jyväskylä
  • Sisko Brunni, University of Oulu
  • Liisa-Maria Lehto, University of Oulu
  • Valtteri Airaksinen, University of Oulu


corpus study, learner language corpora, annotation, error annotation


This article illustrates the grammatical and error annotation of learner language, using the International Corpus of Learner Finnish (ICLFI) as an example. In particular, we focus on the issues that arise when a morphologically rich language is processed with at least semi-automatic methods. What sets this corpus apart from, for example, English-language material is the frequent variation in word forms and the errors related to them, both of which stem from the rich morphology of the target language. The article begins with a description of how the grammatical annotation and the error annotation were designed and implemented, followed by a brief introduction to the material for which the annotations were designed. Finally, we outline some of the problems that arose during the annotation process and how they were solved.