Automatic Subject Indexing in the Radio and Television Programme Database Ritva
Keywords:
automatic content description [http://www.yso.fi/onto/yso/p27440], subject indexing [http://www.yso.fi/onto/yso/p26984], content description [http://www.yso.fi/onto/yso/p13380], machine learning [http://www.yso.fi/onto/yso/p21846], programme subtitling [http://www.yso.fi/onto/yso/p25451], memory organisations [http://www.yso.fi/onto/yso/p21159], audiovisual material [http://www.yso.fi/onto/yso/p6545]
Abstract
The radio and television archive of the National Audiovisual Institute (KAVI) started a joint project with the Finnish Broadcasting Company (Yle) and the National Library of Finland to develop automated indexing that uses programme subtitles as its source. The project relies on the Annif tool, originally developed by Osma Suominen. Annif is built on a combination of existing natural language processing and machine learning tools. It is designed to be multilingual, can support any subject vocabulary, and can use several different backends. During the spring and summer of 2019, KAVI and Yle jointly annotated 313 Yle programmes for testing Annif. The results were analysed using cross-validation. The tests showed that a television programme may be produced in such a way that its central theme is not mentioned at all. When a brief programme description was included, the results improved. The results and their quality were promising, and the project will continue.
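The cross-validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the project's actual evaluation code: the fold count of 10, the F1 measure, and the function names `k_fold_splits` and `f1_score` are assumptions made for illustration only, since the abstract states just that 313 annotated programmes were evaluated with cross-validation.

```python
import random


def k_fold_splits(n_items, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Stride slicing spreads items evenly: the first n_items % k folds
    # receive one extra item each.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def f1_score(suggested, gold):
    """F1 between suggested subjects and the human-assigned (gold) subjects."""
    suggested, gold = set(suggested), set(gold)
    hits = len(suggested & gold)
    if hits == 0:
        return 0.0
    precision = hits / len(suggested)
    recall = hits / len(gold)
    return 2 * precision * recall / (precision + recall)


# 313 annotated programmes (from the abstract); k=10 is an assumed value.
splits = list(k_fold_splits(313, k=10))
```

In each round, a model would be trained on the `train` programmes and its suggested subjects for the `test` programmes compared against the human annotations, so that every programme is used for testing exactly once.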
How to cite
Copyright (c) 2020 Tommi Lehtonen, Juha Piukkula
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License.