The corpus comprises circa 0.4 million words (0.5 million tokens) of early Scottish correspondence by male and female writers dating from the period 1540–1750. The corpus consists of transcripts of original letter manuscripts, which reproduce the text without any modernisation, normalisation or emendation. Language-external variables such as date, region, gender, addressee, hand and script type have been coded into the database. The writers originate from fifteen different regions of Scotland; these can be grouped to represent the areas of North, North-East, Central, South-East, and South-West. In addition, there are two categories of informants that have not been defined by geographical origin: representatives of the court and professional people such as members of the clergy. The proportion of female informants in the corpus is 21 per cent.
Latest versions/subcorpora:
The Parsed Corpus of Scottish Correspondence, source (Metadata and license; Attribution instructions): Will be available for download soon
Helsinki Corpus of Scottish Correspondence (1540-1750) (Metadata and license; Attribution instructions): Select the corpus in Korp
Helsinki Corpus of Scottish Correspondence (1540-1750), VRT (Metadata and license; Attribution instructions): Download the resource
ScotsCorr is available in the Korp concordance service of Kielipankki (the Language Bank of Finland); direct link: http://urn.fi/urn:nbn:fi:lb-2016121607. Note that you will need to log in to Korp and have access rights to ScotsCorr. For more information, please see the section Accessing ScotsCorr of the ScotsCorr Korp Guide.
ScotsCorr data in VRT format is available in the download service of Kielipankki, the Language Bank of Finland, at www.kielipankki.fi/download. Note that you will need to have access rights to ScotsCorr.
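For readers who have not worked with VRT files before, the following is a minimal Python sketch of how such a file can be read. It assumes the general VRT convention used by Korp and the IMS Corpus Workbench: one token per line with tab-separated annotation columns, and XML-like tags on their own lines marking structures such as texts and sentences. The element names, attribute names and token columns in the sample (`text`, `sentence`, `year`, `gender`, the second column) are invented for illustration and are not taken from the ScotsCorr documentation; consult the resource's metadata for the actual ones.

```python
import re

# Patterns for structural lines; anything else is treated as a token line.
ATTR_RE = re.compile(r'([\w.-]+)="([^"]*)"')
OPEN_RE = re.compile(r'<([\w.-]+)((?:\s+[\w.-]+="[^"]*")*)\s*>$')
CLOSE_RE = re.compile(r'</([\w.-]+)>$')


def iter_vrt_tokens(lines):
    """Yield (fields, context) for every token line.

    fields  -- list of the tab-separated columns of the token line
    context -- dict mapping each enclosing element name to its attributes
    """
    stack = []  # currently open structures as (name, attribute dict)
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip() or line.startswith("<!--"):
            continue  # skip blank lines and comment lines
        closing = CLOSE_RE.match(line)
        if closing:
            if stack and stack[-1][0] == closing.group(1):
                stack.pop()
            continue
        opening = OPEN_RE.match(line)
        if opening:
            attrs = dict(ATTR_RE.findall(opening.group(2)))
            stack.append((opening.group(1), attrs))
            continue
        yield line.split("\t"), {name: attrs for name, attrs in stack}


# Invented sample data; NOT an excerpt from ScotsCorr.
sample = [
    '<text year="1600" gender="female">',
    '<sentence id="1">',
    'word1\tanno1',
    'word2\tanno2',
    'word3\tanno3',
    '</sentence>',
    '</text>',
]

tokens = list(iter_vrt_tokens(sample))
for fields, context in tokens:
    print(fields[0], context["text"]["year"])
```

Keeping the enclosing attributes alongside each token makes it straightforward to filter tokens by language-external variables such as date or gender, provided the file exposes them as structural attributes.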
The following documentation has been written by Anneli Meurman-Solin:
In addition, you may find it helpful to consult the on-line Dictionary of the Scots Language.
For The Parsed Corpus of Scottish Correspondence, the original resource, produced by Anneli Meurman-Solin in 2017, was syntactically parsed and annotated in the Penn Parsed Corpora of Historical English (PPCHE) format by Lisa Gotthard in 2024.
More information on the format, as well as the annotation manual, can be found here: https://www.ling.upenn.edu/hist-corpora/annotation/index.html
The same information, as well as information on known issues, can be found here: https://www.lisagotthard.com/the-pcsc
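As a rough illustration of what working with PPCHE-style annotation involves, the Python sketch below reads labelled bracketing into nested lists and extracts the tagged words. It assumes the general Penn convention of parenthesised trees whose terminal nodes have the form `(TAG word)` and which carry an `(ID ...)` node; the sample tree and its ID are invented for illustration and do not come from the parsed corpus. The annotation manual linked above remains the authority on the actual labels.

```python
import re

# A token is an opening bracket, a closing bracket, or a run of other characters.
TOKEN_RE = re.compile(r'\(|\)|[^\s()]+')


def parse_bracketed(text):
    """Parse labelled bracketing into nested lists, one list per tree."""
    tokens = TOKEN_RE.findall(text)
    pos = 0

    def parse_node():
        nonlocal pos
        pos += 1  # consume "("
        node = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(parse_node())
            else:
                node.append(tokens[pos])
                pos += 1
        pos += 1  # consume ")"
        return node

    trees = []
    while pos < len(tokens):
        trees.append(parse_node())
    return trees


def leaves(node):
    """Yield (tag, word) pairs for terminal nodes of the form [tag, word]."""
    if len(node) == 2 and all(isinstance(item, str) for item in node):
        yield tuple(node)
    else:
        for child in node:
            if isinstance(child, list):
                yield from leaves(child)


# Invented example tree; NOT taken from the parsed corpus.
sample = "( (IP-MAT (NP-SBJ (PRO I)) (VBP pray) (NP-OB2 (PRO you))) (ID SAMPLE,1.1))"

tree = parse_bracketed(sample)[0]
words = [(tag, word) for tag, word in leaves(tree) if tag != "ID"]
print(words)
```

The sketch does no error handling: unbalanced brackets raise an `IndexError`. For serious work, an established treebank reader or a dedicated search tool for Penn-style corpora is likely a better choice.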
This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-202104191