Shortname | Name and metadata | License | Location | Cite | Resource group and help | Apply | Publication year | Support level |
---|---|---|---|---|---|---|---|---|
These resource versions are not yet available in the Language Bank of Finland.
Shortname | Name and metadata | License | Formats | Support level | Contact Person | Resource group and help | Location | Other information |
---|---|---|---|---|---|---|---|---|
The Helsinki Korp Europarl Bilingual Corpora are:
The Helsinki Korp Europarl Finnish-English Corpus
The Helsinki Korp Europarl Finnish-Swedish Corpus
The Helsinki Korp Europarl Finnish-German Corpus
The Helsinki Korp Europarl Finnish-French Corpus
The Helsinki Korp Europarl Finnish-Spanish Corpus
The Helsinki Korp Europarl Finnish-Estonian Corpus
The corpora contain texts of the Europarl Parallel Corpus v7.
The Europarl parallel corpus is extracted from the proceedings of the European Parliament. The goal of the extraction and processing was to generate sentence-aligned text for statistical machine translation systems. For this purpose, matching items were extracted and labeled with corresponding document IDs. Sentence boundaries were then identified with a preprocessor, and the data was sentence-aligned using a tool based on the Gale and Church algorithm, which pairs sentences across languages by comparing their lengths.
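To illustrate the idea behind length-based sentence alignment, here is a minimal Python sketch. It is not the tool used to build Europarl, and it does not implement the probabilistic cost of the published algorithm (usually cited as Gale and Church). As a simplifying assumption, it scores each candidate pairing by the absolute difference in character length plus a fixed penalty for skipped or merged sentences. The function name `align` and the parameters `skip_penalty` and `merge_penalty` are hypothetical and chosen only for this example.

```python
def align(src, tgt, skip_penalty=50, merge_penalty=10):
    """Align two lists of sentences by character length.

    Returns a list of (source_indices, target_indices) pairs.
    Simplified cost: |length difference| + a fixed penalty per
    non-1:1 pairing (an assumption, not the published cost model).
    """
    # Allowed pairings: (source sentences, target sentences, penalty).
    beads = [
        (1, 1, 0),              # one sentence matches one sentence
        (1, 0, skip_penalty),   # source sentence has no counterpart
        (0, 1, skip_penalty),   # target sentence has no counterpart
        (2, 1, merge_penalty),  # two source sentences match one target
        (1, 2, merge_penalty),  # one source sentence matches two targets
    ]
    n, m = len(src), len(tgt)
    inf = float("inf")
    # cost[i][j]: cheapest way to align the first i source sentences
    # with the first j target sentences.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0

    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj, penalty in beads:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                len_src = sum(len(s) for s in src[i:ni])
                len_tgt = sum(len(t) for t in tgt[j:nj])
                candidate = cost[i][j] + abs(len_src - len_tgt) + penalty
                if candidate < cost[ni][nj]:
                    cost[ni][nj] = candidate
                    back[ni][nj] = (i, j)

    # Walk the back-pointers from the end to recover the pairing.
    result = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        result.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    result.reverse()
    return result
```

Called with a list of Finnish sentences and a list of English sentences from the same document, `align` returns index pairs such as `([1], [1, 2])`, meaning that the second source sentence corresponds to the second and third target sentences together. Real aligners replace the simple length difference with a statistical model of how sentence lengths relate across languages.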
For more information on the Europarl Parallel Corpus see http://urn.fi/urn:nbn:fi:lb-20140730195 and http://www.statmt.org/europarl/
This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021052403
Last modified on 2025-05-14