Project: FIN-CLARIAH
Grant agreement: Academy of Finland no. 345610
Start date: 01-01-2022
Duration: 24 months
WP 4.1: Report on Harmonization code
Date of reporting: 20-02-2022
Report author: Leo Lahti (University of Turku)
Contributors: Pyry Kantanen (University of Turku)
Deliverable location: Internal
Open source software and algorithms for harmonizing raw bibliographic data, downloaded in XML format from the Finnish National Library OAI-PMH API, are essential for building open and replicable infrastructures around the Finnish National Bibliography (FNB) metadata and other similar datasets. For our specific use case, the Fennica FNB dataset, we have written harmonization code in the R programming language using some of its openly available libraries. The code is accompanied by documentation for the individual functions as well as for running the scripts that perform the data harmonization procedures for each field.
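
To illustrate what field-wise harmonization of an OAI-PMH response can look like, the following is a minimal sketch in Python. It is not the project's R code. It assumes the records are delivered as MARC 21 XML inside the OAI-PMH envelope; the sample record, the function names and the cleaning rules are hypothetical and chosen only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Standard namespaces for OAI-PMH 2.0 and MARC 21 XML ("slim" schema).
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "marc": "http://www.loc.gov/MARC21/slim",
}

# Hypothetical, hand-written response; not real Fennica data.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>example:1</identifier></header>
      <metadata>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <datafield tag="245" ind1="1" ind2="0">
            <subfield code="a">Esimerkki  teos /</subfield>
          </datafield>
          <datafield tag="260" ind1=" " ind2=" ">
            <subfield code="c">[1789].</subfield>
          </datafield>
        </record>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def subfield(marc_record, tag, code):
    """Return the text of the first matching MARC subfield, or None."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = marc_record.find(path, NS)
    return node.text if node is not None else None


def harmonize_title(raw):
    """Collapse whitespace and drop trailing cataloguing punctuation."""
    if raw is None:
        return None
    cleaned = re.sub(r"\s+", " ", raw).strip()
    return cleaned.rstrip(" /:;,").strip() or None


def harmonize_year(raw):
    """Extract the first four-digit year (1000-2099) as an integer."""
    if raw is None:
        return None
    match = re.search(r"\b(1[0-9]{3}|20[0-9]{2})\b", raw)
    return int(match.group(1)) if match else None


def harmonize_records(xml_text):
    """Parse an OAI-PMH response and return one harmonized dict per record."""
    root = ET.fromstring(xml_text)
    rows = []
    for rec in root.iterfind(".//oai:record", NS):
        marc = rec.find("oai:metadata/marc:record", NS)
        if marc is None:  # e.g. a deleted record without metadata
            continue
        rows.append({
            "id": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": harmonize_title(subfield(marc, "245", "a")),
            "publication_year": harmonize_year(subfield(marc, "260", "c")),
        })
    return rows


if __name__ == "__main__":
    for row in harmonize_records(SAMPLE_RESPONSE):
        print(row)
```

Each field gets its own small function, so that the cleaning rules for one field can be documented, tested and revised independently of the others.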