The University of Helsinki Language Corpus Server (UHLCS) is a multilingual data bank founded in the late 1980s. The UHLCS collection includes text corpora of more than 50 languages, including minority languages and various text types. There are also tools specifically developed for analyzing the UHLCS corpora. The use of most corpora is restricted to research and teaching.
Subcorpora:

| Subcorpus | Information | Access |
| --- | --- | --- |
| Chuvash Corpus (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| English Corpus (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Corpus of Erzya and Moksha Mordvin Literature and Journals and Komi Zyrian Literature (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Erzya and Moksha Mordvin Word List Corpus (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Estonian Corpus 1 (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Estonian Corpus 2 (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Finnish Corpus (Bibles) (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Finnish Corpus (Literature) (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| The Helsinki Korp Version of the Finland-Swedish Text Corpus (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Korp |
| The Finland-Swedish Text Corpus (UHLCS), source | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Ingrian Corpus (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Khanty Corpus (North Khanty, Corpora and Translations) (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Komi Zyrian Corpus (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Latin Corpus (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Lude (Ludian) Corpus (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Nenets Corpus (Tundra Nenets) (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| North Saami Corpus (Literature) (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| North Saami Corpus (Sámikultuvradoaibmagotti smiehttamush) (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Quantifiers and Quantification in Finnish and Languages Spoken in the Central Volga–Kama Region (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Somali Corpus (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| The Susanne Corpus (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Ume Saami Corpus (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Uralic, Turkic, Indo-Iranian and Mongol languages; languages of Siberia and Caucasia (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Uzbek-English Dictionary (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |
| Lists of Words Corpus (UHLCS) | Metadata and license; Attribution instructions | Apply for access rights; Access the corpus in Puhti |

The University of Helsinki Language Corpus Server (UHLCS) is a multilingual data bank founded in the late 1980s and maintained by the Department of General Linguistics at the University of Helsinki until September 2007. When the old server was taken out of use, the UHLCS corpora were moved to servers maintained by CSC – IT Center for Science, and the corpora were made available via the Language Bank of Finland.
At present, the UHLCS collection includes text corpora of more than 50 languages, including samples of minority languages and extensive corpora representing different text types. There are also tools specifically developed for analyzing the UHLCS corpora.
The use of most corpora is restricted to research and teaching. Resource-specific information and license conditions can be found in the metadata record of the corpus in question.
In 2000, the corpora of the Uralic, Turkic, Tungusic, Mongolic, Chukotko-Kamchatkan, Iranian and North-East Caucasian languages were edited for public use with the financial support of the Max Planck Institute for Evolutionary Anthropology, Leipzig. In summer 2003, the basis for the metadata descriptions of the corpora was prepared with the financial support of the ECHO project (ECHO = European Cultural Heritage Online).
Last updated: 28.2.2024
This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2023030901