Depending on the corpus, the corpora in the Language Bank can be accessed via a web interface (Korp) or with command-line tools in CSC’s computing environment. Some corpora can also be downloaded via the Download service.
Selected versions of the downloadable corpora licensed under CLARIN PUB or ACA are also available in CSC’s Puhti computing environment in the directory /appl/data/kielipankki/. They are marked “Puhti” in the “Location” column in the list below. To use CSC’s servers, you need a CSC account.
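As a minimal sketch of how to see what is available on Puhti, the following Python snippet lists the entries directly under the directory mentioned above. The function name `list_corpus_entries` is hypothetical and chosen only for illustration, and the layout of the directory below its top level is not assumed here. The path exists only on Puhti; elsewhere the function simply returns an empty list.

```python
from pathlib import Path

# Directory named in the text above; it is expected to exist only on Puhti.
KIELIPANKKI_DIR = Path("/appl/data/kielipankki")


def list_corpus_entries(base: Path = KIELIPANKKI_DIR) -> list[str]:
    """Return the sorted names of the entries directly under `base`.

    Hypothetical helper for illustration; returns an empty list if
    `base` is not a directory (for example, when not run on Puhti).
    """
    if not base.is_dir():
        return []
    return sorted(entry.name for entry in base.iterdir())


if __name__ == "__main__":
    for name in list_corpus_entries():
        print(name)
```

Run on a Puhti login node with a valid CSC account, this should print one name per line; which names correspond to which corpora is best checked against the “Location” column in the list below.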
Some corpora are directly available; others can be accessed by signing in or by applying for individual access rights. To access a protected corpus, follow the link in the Apply column.
Our language resources have three different levels of support.
A: The resource is under active development. The Language Bank of Finland fixes any issues as soon as possible.
B: The resource is developed only upon user request. The Language Bank of Finland aims to fix issues concerning the resource, but external contributions may be required.
C: The resource is available “as is”. The Language Bank of Finland neither fixes nor develops the resource.
Corpus-specific citation instructions and citations in other papers are available by clicking the quote link in the Cite column.
If you are looking for a corpus not listed here, please have a look in COMEDI or the CLARIN Virtual Language Observatory (VLO).
For an overview of all our resources sorted by resource family, see the resource families of FIN-CLARIN.
Shortname | Name and metadata | License | Location | Cite | Resource group and help | Apply | Publication year | Support level |
---|---|---|---|---|---|---|---|---|