Keyword analysis in the study of an annotated learner language corpus: Preliminary observations (Avainsana-analyysi annotoidun oppijankieliaineiston tutkimisessa: Alustavia havaintoja)

Authors

  • Jarmo Harri Jantunen, University of Oulu

Keywords:

corpus-driven analysis, keywords, annotation, learner corpora, learner Finnish

Abstract

This paper reports preliminary findings from a study in which corpus-driven keyword analysis is used to investigate a lemmatised and annotated learner language corpus. Keyword analysis is seldom applied to grammatically annotated data and, to my knowledge, has never been applied to tagged learner data. The article illustrates the kinds of over- and underused items that keyword analysis can reveal in learner corpus data: grammatical tags, content keywords, and tentative learner language keywords. The analysis shows that annotated data yield a more complete picture of the atypical frequencies of linguistic items in learner language. The article also discusses the role of other methodological choices, such as the criterion used to define proficiency level (learning hours vs. CEFR).

Section

Articles

Published

2011-10-13