How is a sign language corpus created and for what?
Keywords:
annotation, corpus, Finnish sign language, Finland-Swedish sign language, lexical database, Signbank, sign language corpus
Abstract
This article describes the construction of the corpora of Finnish sign language and Finland-Swedish sign language in the CFINSL project (Corpus project of Finland’s Sign Languages). Sign languages have no written form, so building a corpus of them calls for a different approach than for spoken languages that do have one. The article presents the corpora constructed at the Sign Language Centre of the University of Jyväskylä: the collection of the material, the technical processing of the videos, the collection and processing of metadata, the annotation of the recorded material, and the storage and publication of the material. In addition to the corpora, a lexical database, Signbank, has been created. It facilitates the annotation process and supports the use of the corpora in research and teaching. The corpora also document the sign languages used in Finland, both for today's language communities and for future generations.