Tools

The tools and services maintained by the Language Bank may be accessible via a web interface, or they can be installed via download from e.g. GitHub or Korp. You can also find other tools developed by member organizations of FIN-CLARIN / CLARIN ERIC.

Service levels

Our language resources have three different levels of support.

A: The resource is under active development. The Language Bank of Finland fixes any issues as soon as possible.
B: The resource is developed only upon user request. The Language Bank of Finland aims to fix issues concerning the resource, but external contributions may be required.
C: The resource is available "as is". The Language Bank of Finland neither fixes nor develops the resource.

If you are looking for a tool not listed here, please look in COMEDI or the CLARIN Virtual Language Observatory (VLO).

An overview of all our resources, sorted by resource family, is available on the Resource families Fin-Clarin page.


Start	Name (and metadata)	Description	Instructions	Install	Info	Service level
	Korp	A web-based concordance tool that can be used for corpus queries based on morphosyntactic analysis and various other features.	Instructions			A
Download	Download service	Download certain corpora.				A
Aalto-ASR	Aalto University Automatic Speech Recognition System	An automatic speech recognition toolkit that can be used in the CSC computing environment.	Instructions	Install (GitHub)
ANEE Lexical Networks	ANEE Lexical Networks	A graphical semantic dictionary represented as a network. The portal can be used to explore the meanings of individual Akkadian words visually.
Annif	Annif	Annif is a tool for automated subject indexing and classification, developed at the National Library of Finland.
	CLARIN Federated Content Search	Run a centralized query from all the resources provided by CLARIN centers.
Demo	Demo tools at the Language Bank of Finland	Demos of tools that are in development at the Language Bank of Finland: FinTag and FiNER, FinParse, FinnSentiment, FinnWordNet, HFST POS taggers, HFST morphological analyzers, Lemmamatch, etc. (In Finnish)				C
Dictionary of Contemporary Finnish	Dictionary of Contemporary Finnish	Dictionary of standard Finnish made by the Institute for the Languages of Finland.
digi.kansalliskirjasto.fi	Digi – Digital collections of the National Library of Finland	A search and download service for digital collections from the National Library of Finland. In addition to newspapers and magazines, the collections include, e.g., books, pictures and maps. Note that a large proportion of the newspapers and magazines can also be used via the Korp service in the Language Bank (see KLK).
	ELAN	ELAN is a program for transcribing and annotating audio and video files. It can also be used for searching locally stored collections of annotated material.	Instructions	Install
FinBERT	FinBERT	BERT model trained from scratch on Finnish.		Install (GitHub)
Finland Swedish Online	Finland Swedish Online	A platform offering online courses for learners of Finland Swedish.
FinMeter	FinMeter – Tools for analyzing poetry in Finnish	FinMeter is a library for analyzing poetry in Finnish. It handles typical rhyming such as alliteration, assonance and consonance, Japanese meters and Kalevala meter. It can also be used to hyphenate Finnish and analyse meter. In addition, it can do semantic clustering, metaphor interpretation, concreteness scoring and sentiment analysis.
TDPP	Finnish dependency parser developed by TurkuNLP (TDPP)	An open source dependency parsing pipeline developed by the TurkuNLP group for analyzing Finnish text.		Install (GitHub)
FinTag	Finnish Tagtools	A part-of-speech and morphology tagger and a named entity recogniser for Finnish.		Install Use via Docker		A
FinnONTO	FinnONTO	Finnish and international ontologies, vocabularies and thesauri needed for publishing content cost-efficiently on the Semantic Web.
finnsurveytext	finnsurveytext	Tool set for social science researchers to be able to analyse and understand responses to open-ended questions within their surveys.	Instructions	Install (GitHub)
Gephi	Gephi	A program for network analysis and visualization.		Install
GiellaLT	GiellaLT	GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages.
Giellatekno	Giellatekno - Dictionaries and tools	Dictionaries and tools for the analysis of Saami and other morphologically-rich languages.
HeLI-OTS	HeLI-OTS 2.0	HeLI off-the-shelf language identifier with language models for 220 languages.		Install (Zenodo)
	INCEpTION	Text annotation tool (a newer version of WebAnno).	User Guide	Standalone installation		A
Kotus digital collections	Kotus digital collections	The web page offers links to the Institute’s corpora and material available online free of charge.
	Lääketutka	Lääketutka, "the Medicine Radar", provides analytics about health, medicine and symptom-related discussions in the Suomi24 discussion forum.				C
Murre	Murre	The Murre library normalizes non-standard Finnish (puhekieli) to standard Finnish (kirjakieli). The library is maintained by Mika Hämäläinen.
nimiarkisto.fi	Nimiarkisto	Nimiarkisto.fi is a portal with the most important digital resources of names and named entities collected from and archived in Finland.
Nordic Tweet Stream (NTS)	Nordic Tweet Stream (NTS) search & visualization interface	A multilingual monitor corpus of geolocated tweets and associated metadata from the Nordic region.
	openSMILE	Toolkit and library for audio feature extraction, especially for analysing and classifying speech and music signals.	Hands-on tutorial	Installation instructions		C
	OPUS	An interface for open source parallel corpora.
	Praat	Praat is a comprehensive toolkit for annotating, processing, analyzing and visualizing speech. Praat includes a scripting language.	Instructions	Install
	Proto-Indo-European Lexicon	A generative etymological dictionary of Indo-European languages.
Sanat	Sanat	A platform for publishing lexica and word lists.				B
	Signbank	Lexical database of Finnish Sign Language.				A
	Sparv	A multilingual toolkit provided by the Swedish Språkbanken for parsing and annotating text in various languages.	User manual (GUI)	Installation and setup
Finnish Internet Parsebank: SETS	Syntax-based search (SETS) from the Finnish Internet Parsebank	Syntax-based search (SETS) from parts of the Finnish Internet Parsebank.	Documentation
tekstiks.ee	tekstiks.ee – Speech recognition: speech to text	Automated speech transcription service for Estonian and Finnish speech and a user interface for transcription editing.
Terminology Forum	Terminology Forum	A collection of links to special-field glossaries (University of Vaasa).
textreuse.sls.fi	Text reuse in the Swedish-language press, 1645-1918	A search engine for searching and analyzing clusters of text reuse in the Swedish-language press from 1645 to 1918.
Texthammer	Texthammer	A search and analysis toolkit for parallel corpora provided by the University of Tampere.	Documentation (PDF)
	The Helsinki Term Bank for the Arts and Sciences	A multidisciplinary project that aims to gather a permanent terminological database for all fields of research in Finland.				A
	Transkribus	A toolkit for transcribing and managing historical documents (e.g., images and scanned text).	Instructions (PDF)	Install
TDPP-LBF	Turku Dependency Parser Pipeline, Kielipankki version (TDPP-LBF)	Finnish Dependency Parsing Pipeline, adapted by the Language Bank of Finland.		Install (GitHub)
Turku Neural Parser Pipeline	Turku Neural Parser Pipeline	A tool developed by the Turku NLP group for parsing Finnish text.		Install (GitHub) Demo
TNPP-LBF	Turku Neural Parser Pipeline, Kielipankki version (TNPP-LBF)	Turku Neural Parsing Pipeline, adapted by the Language Bank of Finland.		Access via Puhti Install (Docker)
TurkuNLP word embedding	TurkuNLP word embedding demo (word2vec)	A tool developed for analyzing the semantic similarity of words.
UDPipe	UDPipe	UDPipe is a trainable pipeline for tokenization, tagging, lemmatization and dependency parsing of CoNLL-U files.		Install (GitHub)
UDPipe-LBF	UDPipe Kielipankki version	UDPipe is a trainable pipeline for tokenization, tagging, lemmatization and dependency parsing of CoNLL-U files, installed at Kielipankki.		Access via Puhti
UralicNLP	UralicNLP - Natural language processing for many languages	UralicNLP can produce morphological analyses, generate morphological forms, lemmatize words and give lexical information about words in Uralic and other languages. The functionality originates mainly in FST tools and dictionaries developed in the GiellaLT infrastructure and Apertium.
VRT Tools	VRT Tools	Command-line tools for manipulating segmented and annotated text by using VRT (verticalized text) as an interchange format. VRT is related to Corpus WorkBench (used in the backend of the Korp concordancer tool).		GitHub
Wanca	Wanca	Wanca is a portal for websites in Uralic languages.				A
WebMAUS	WebMAUS	A set of tools for automatic segmentation and labelling of speech.	Instructions
Whisper	Whisper	Whisper is a general-purpose speech recognition model trained on a large dataset of diverse audio. Whisper can perform multilingual speech recognition, speech translation, and language identification. Whisper can be used in the CSC computing environment, also in SD Desktop.	Tutorial (CSC)	GitHub: Whisper (OpenAI) and WhisperDO for calling Whisper (by Nicholas G. Cotton)	Tutorial (CSC)	A
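
The UDPipe entries in the table above mention CoNLL-U files. As a rough illustration of what that format looks like, here is a minimal sketch in plain Python that reads CoNLL-U text into sentences of token dictionaries. It is not part of UDPipe or of any tool listed here: the function name `read_conllu`, the field names in `CONLLU_FIELDS`, and the sample sentence are all assumptions made for this example, and the sample was written by hand rather than produced by a parser.

```python
from typing import Dict, Iterator, List

# The ten tab-separated columns of a CoNLL-U token line, in order.
# The lowercase key names are chosen for this sketch only.
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]


def read_conllu(text: str) -> Iterator[List[Dict[str, str]]]:
    """Yield one list of token dicts per sentence in CoNLL-U text."""
    sentence: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Comment lines carry sentence-level metadata; skipped here.
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}: {line!r}")
        sentence.append(dict(zip(CONLLU_FIELDS, columns)))
    if sentence:
        # Handle input that does not end with a blank line.
        yield sentence


# Hand-written illustrative sample, not real tool output.
SAMPLE = (
    "# text = Koira haukkuu.\n"
    "1\tKoira\tkoira\tNOUN\t_\tCase=Nom|Number=Sing\t2\tnsubj\t_\t_\n"
    "2\thaukkuu\thaukkua\tVERB\t_\tMood=Ind|Number=Sing|Person=3|Tense=Pres\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
)

for tokens in read_conllu(SAMPLE):
    for token in tokens:
        print(token["form"], token["lemma"], token["upos"])
```

The sketch keeps every column as a string and ignores comment lines. A real workflow would more likely rely on an established CoNLL-U library or on the tools' own output handling.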

