Kielipankki – The Language Bank of Finland is a service for researchers using language resources. Maija Saviniemi, a university lecturer at the University of Oulu, tells us how she uses the resource Iijoki, the University of Oulu Päätalo collection, Kielipankki Korp version in her research.
I am Maija Saviniemi, a university lecturer from Oulu and a researcher of the Finnish language. I defended my doctoral thesis on language planning discourses in 2015, and my research interests include a range of sociolinguistic topics.
Before describing the topic, I would like to say something about the path that led to the present situation. Already during my doctoral research I found myself thinking about how many different approaches electronic data could open up, starting from getting to know the nature of the data and continuing particularly in the analysis phase. A couple of years ago, when writing the history of our subject, I spent some time learning about the creation of the Oulu corpus, which dates from 1967 onwards. This set me thinking about how I could promote the tradition of using electronic resources for research in our department. A year ago, Sari Keskimaa defended her doctoral thesis at the University of Oulu on Kalle Päätalo’s Iijoki collection as a linguistic biography.
These three originally independent facts somehow intertwined in my mind, and I found myself suggesting to both the Language Bank of Finland team and the rights holders of Kalle Päätalo’s works that the whole Iijoki series be published as an electronic corpus. The Language Bank was immediately on board. Both the author’s family and Gummerus Publishers also took a positive view of the project from the very beginning. Since Gummerus had already published the 26 novels that form the Iijoki series as e-books, the work of converting the data into electronic format had in practice already been done. Iijoki, University of Oulu Päätalo Collection was published just before the 100th anniversary of the author’s birth in November 2019.
For almost 2.5 years I have been preparing the symposium “Kalle Päätalo tutkijoiden silmin [Research Perspectives on Kalle Päätalo]”, organized in November 2019 by the Faculty of Humanities at the University of Oulu in collaboration with the Oulu City Library. The father of the idea is Professor Harri Mantila. The symposium presented papers on contemporary Päätalo research in various fields. My own presentation naturally focused on the Päätalo corpus published in Kielipankki and on its use, with my ongoing research serving as an example of how the corpus can be used. I am currently working on the first discourse-analytic research project on Päätalo that is supported by corpus examples. I am interested in the reverse humour mentioned by Päätalo, which I am trying to capture through one of the affective features of language, namely swear words. In the future I am also interested in various kinds of keyword analyses in which I can compare the Iijoki corpus with other literature corpora.
There are practically no limits to the research topics that can be found around Päätalo. The Iijoki series contains extensive metalanguage and dialect, so the corpus lends itself to research on, for example, phonetic or morphological features, or on language use in Finland’s different dialects, White Sea Karelian, or Finnish as spoken by the Skolt Sámi. Idiolects of fictional characters are readily available: in addition to the main character, the Iijoki series presents around 2,000 minor characters. The corpus offers research settings not only in linguistics but also in many other fields. In his works, Päätalo makes ethnographic observations on working practices and folk medicine. In addition to folklore studies, the data can also be studied as a description of the history of independent Finland: the Iijoki series covers the author’s life history from the 1910s to the 1990s.
My research on Iijoki focuses on the swear words in the lines of Hermanni Päätalo in the novel Loimujen aikaan, which I compare with the other swear words in the work. I aim for a functional, syntactic and etymological analysis of the swear words. I now have the option of widening my close reading of one work by comparing my findings with the other works in the Iijoki series. For example, I have lately been contemplating the swear words siivatta and ketehen.
This series of literary works, with around 17,000 pages, will definitely offer subjects for research for a long time to come. Linguistic material can be searched more easily now that the data has been published in electronic form. The corpus Iijoki, the University of Oulu Päätalo collection, Kielipankki Korp version, which comprises the Iijoki series, has 5,280,750 tokens and 494,614 sentences. To my knowledge, this is an exceptionally large text corpus for the literary works of a single author. The data naturally contains a lot of dialect words, and processing them automatically is not straightforward. However, even at its current stage the corpus makes various research projects based on discourse analysis possible. A person familiar with Päätalo’s works recently predicted that Päätalo will still be read in 50 years’ time. Maybe the works will become popular again when enough time has passed since the life they describe. In any case, the corpus is now ready.
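To put the size figures quoted above in proportion, here is a minimal Python sketch that derives a few averages from them. Only the token, sentence, page and novel counts come from the interview; the helper names `per_unit` and `per_million` are illustrative, and the hit count used at the end is a made-up placeholder, not a result from the corpus.

```python
# Figures quoted in the interview for the Korp version of the Iijoki corpus.
TOKENS = 5_280_750
SENTENCES = 494_614
PAGES = 17_000   # "around 17,000 pages", so an approximation
NOVELS = 26


def per_unit(total: int, units: int) -> float:
    """Average of `total` per unit, rounded to one decimal place."""
    return round(total / units, 1)


def per_million(hits: int, corpus_tokens: int) -> float:
    """Relative frequency per million tokens, rounded to one decimal place.

    Normalising raw counts this way makes frequencies comparable between
    corpora of different sizes.
    """
    return round(hits / corpus_tokens * 1_000_000, 1)


tokens_per_sentence = per_unit(TOKENS, SENTENCES)  # about 10.7
tokens_per_page = per_unit(TOKENS, PAGES)          # about 310.6
tokens_per_novel = per_unit(TOKENS, NOVELS)        # about 203,105.8

# HYPOTHETICAL hit count, chosen only to demonstrate the normalisation.
example_hits = 528
example_rate = per_million(example_hits, TOKENS)

if __name__ == "__main__":
    print("tokens per sentence:", tokens_per_sentence)
    print("tokens per page:", tokens_per_page)
    print("tokens per novel:", tokens_per_novel)
    print("example rate per million tokens:", example_rate)
```

A per-million rate of this kind is one common way to compare how often a word occurs in the Iijoki corpus and in another literature corpus of a different size.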
The papers from the symposium “Kalle Päätalo tutkijoiden silmin” will be published around 2021 as a peer-reviewed collection of academic articles with the working title “Iijoelta akatemiaan [From Iijoki to Academia]”. My article on corpus-based research on Päätalo will be part of this publication.
The FIN-CLARIN consortium consists of a group of Finnish universities along with CSC – IT Center for Science and the Institute for the Languages of Finland (Kotus). FIN-CLARIN helps researchers in Finland to use, refine, preserve and share their language resources. The Language Bank of Finland is the collection of services that provides language materials and tools for the research community.
All previously published Language Bank researcher interviews are stored in the Researcher of the Month archive.