Kielipankki – The Language Bank of Finland is a service for researchers using language resources. Jussi Ylikoski tells us about his research on the grammatical properties of Finnish and other Uralic languages.
I am Jussi Ylikoski, a linguist. I have been working at the University of Oulu for five years as a professor of Saami language, but starting in the autumn of 2022, I will be a professor of Finno-Ugric languages at the University of Turku. So, I do research on quite a few languages, including Finnish.
I have worked on quite a large number of research topics on Finnish and other Uralic languages, and partly outside the Uralic family, too. I have mainly focused on grammars (morphology and syntax) of both better- and lesser-known languages, and occasionally also on etymology. When describing present-day languages, I often can’t help looking at them also from a diachronic perspective, and when I study the historical development of these languages, I tend to pay quite a lot of attention to the actual use of modern languages in the light of real text corpora.
I have used the corpora available in the Language Bank of Finland particularly as a researcher of Finnish grammar. As early as 2003, I published an article in which I used the Finnish Text Collection in the Language Bank to show that the so-called fifth infinitive (-maisillaan/-mäisillään, ‘on the verge of doing something’) can be used in many other ways in addition to the periphrastic construction with the verb olla (‘to be’), contrary to what had been regularly stated in grammars. For instance, the ‘forehead veins’ (otsasuonet) may ‘be on the verge of bursting’ (olla repeämäisillään), but they might also be ‘bulging on the verge of bursting’ (pullistella repeämäisillään), or someone may be afraid and ‘waiting (for something) with his/her forehead veins on the verge of bursting’ (odottaa otsasuonet repeämäisillään).
In recent years, I have been fascinated by the ever larger text corpora, containing billions of words, that are available through the Language Bank of Finland and other CLARIN services. In my research, I have used, for example, the Korp version of the University of Helsinki E-thesis collection, the Finnish subcorpus of the Newspaper and Periodical Corpus of the National Library of Finland, the Suomi 24 Corpus, the Ylilauta Corpus, and the Corpus of Finnish Magazines and Newspapers from the 1990s and 2000s, version 2. With the help of large corpora, it has been possible to discover what are, in a way, new morphological cases even in a well-known and well-described language like Finnish. Among other things, I have studied the syntactic properties of forms traditionally known as the prolative, and I have found them to be used in ways that are much more similar to case forms than previous research literature has suggested. Prolatives are not always only individual adverbs (e.g., maitse ‘by land’ and meritse ‘by sea’), but these forms can also be modified by subordinate clauses (e.g., mailitse jossa on helpompi kaunistella asioita ‘by email where it is easier to embellish facts’ and tekstiviestitse joihin turhan harva vastaa ‘by text messages that tend to be answered by too few’).
I have made my most exciting observations when studying forms that were previously considered clear-cut derivations, such as lauantaisin ‘on Saturdays’ and viikonloppuisin ‘on weekends’ or kunnittain ‘by/across municipalities’ and aihealueittain ‘by/across thematic areas’. In the multi-billion word corpora searchable through the Korp interface of the Language Bank of Finland, it is possible to find hundreds or even thousands of relatively natural sentences in which even these kinds of forms can have various modifiers that make them look like noun inflections: elokuun lauantaisin ‘on August Saturdays’, joka lauantaisin ‘on every Saturday’, satunnaisin viikonloppuisin ‘on random weekends’ or, e.g., Suomen kunnittain ‘by the municipalities of Finland’, eri maittain ‘by different countries’ and tietyin aihealueittain ‘by certain thematic areas’. Since these kinds of temporal and distributive expressions look like case-inflected noun phrases, I have playfully called them “dwarf cases”, by analogy with Pluto, which was formerly known as a planet but is now called a dwarf planet.
After working on the hazy boundary between derivation and inflection, I have also ended up studying the abessive case in Finnish (rahatta ‘without money’, internetittä ‘without Internet’, etc.) and the so-called t accusative (minut ‘me’, meidät ‘us’, etc.) more thoroughly than before. Even though I personally like to observe and to describe forms and syntactic structures largely by means of descriptive linguistics, the tools of the Language Bank do also offer a lot of opportunities for those who are interested in quantitative analysis.
In addition to the corpora in the Language Bank of Finland, I have also used the corpora of Saami languages and many other Uralic minority languages that have been produced by the language technologists in Tromsø, Norway. The corpora are available via the Korp service maintained by Giellatekno, i.e., the user interface is similar to that of the Korp service in the Language Bank of Finland. Those who are interested also in other Uralic languages besides Finnish can access the corpora in the Tromsø Korp service, http://gtweb.uit.no/korp/ (Saami) and http://gtweb.uit.no/u_korp/ (other languages). With 63 million words of annotated Mari, what more can a Uralicist wish for?
Ylikoski, Jussi. 2003. Havaintoja suomen ns. viidennen infinitiivin käytöstä. Sananjalka 45. 7–44. [Summary: Remarks on the use of the proximative verb form (the so-called 5th infinitive) in Finnish.] https://doi.org/10.30673/sja.86640
Ylikoski, Jussi. 2018. Prolatiivi ja instrumentaali: suomen -(i)tse ja -teitse kieliopin ja leksikon rajamailla. Sananjalka 60. 7–27. [Summary: On Finnish prolatives and instrumentals: -(i)tse and -teitse in between grammar and lexicon.] https://doi.org/10.30673/sja.69978
Ylikoski, Jussi. 2020. Kielemme kääpiösijoista: prolatiivi, temporaali ja distributiivi. Virittäjä 124. 529–554. [Summary: On Finnish dwarf cases: prolative, temporal and distributive.] https://doi.org/10.23982/vir.76971
Ylikoski, Jussi. 2021. Abessiivin apologia. Puhe ja kieli 41. 139–157. [Summary: Apologia of the Finnish abessive case.] https://doi.org/10.23997/pk.110924
Ylikoski, Jussi. 2021. Mistä voisin löytää sen entisen sinut? Suomen kielen akkusatiivi- ja pronominioppia. – Leena Maria Heikkola, Geda Paulsen, Katarzyna Wojciechowicz & Jutta Rosenberg (toim.), Språkets funktion. Juhlakirja Urpo Nikanteen 60-vuotispäivän kunniaksi. Festskrift till Urpo Nikanne på 60-årsdagen. Festschrift for Urpo Nikanne in honor of his 60th birthday. Åbo: Åbo Akademis förlag. 220–243. https://urn.fi/URN:ISBN:978-952-12-4062-1
The FIN-CLARIN consortium consists of a group of Finnish universities along with CSC – IT Center for Science and the Institute for the Languages of Finland (Kotus). FIN-CLARIN helps researchers in Finland to use, refine, preserve and share their language resources. The Language Bank of Finland is the collection of services that provides language materials and tools for the research community.
All previously published Language Bank researcher interviews are stored in the Researcher of the Month archive. This article is also published on the website of the Faculty of Humanities of the University of Helsinki.