Why reinvent the wheel for FAIR?


A message to FAIR-ists

The current discussions around FAIR (Findable, Accessible, Interoperable, Reusable) datasets are causing a lot of confusion. It strikes me that discussions about FAIR in different forums often start inventing solutions for digital preservation from scratch, describing the problem without calling it a digital preservation issue. Digital preservation has existed for more than 20 years, and for many of these problems there are already solutions. There is no need to start wondering afresh how to keep FAIR data findable, accessible, interoperable and reusable over the years: this is daily work for digital preservationists. The explicit exclusion of digital preservation in the EU report “Turning FAIR into reality” was a clear statement, but it is not helpful in solving the range of problems. FAIR should be connected with OAIS and with digital preservation in general. We would all benefit from a better interaction between FAIR and digital preservation.

The start of a dataset

When a researcher creates a dataset, the next step is to add the relevant information to make it FAIR-ready: metadata, persistent identifiers, relevant software, etc. The dataset in itself is not FAIR; it is the infrastructure around a dataset, such as a website, search keys, persistent identifiers and software, which can make it FAIR. As long as a researcher is the only custodian of the dataset, he or she is responsible for making and keeping it FAIR. A researcher can take care of the accessibility of the dataset, but he or she might change workplace or die, and then someone else needs to maintain the FAIR principles. That is when a repository comes into view, ideally right after the dataset has been made FAIR-ready.

The trusted repository

The moment the dataset is handed over to a repository, it becomes the responsibility of the repository to assess the FAIRness of the dataset and to decide whether it will be able to keep the dataset FAIR over the longer term. The researcher becomes the Producer and the dataset, in OAIS terminology, becomes a Submission Information Package. From a preservationist's point of view, it does not matter whether the repository aims to keep the dataset FAIR for one year or for ten years: digital preservation (or curation, or stewardship) is the solution for keeping the data FAIR in the long run. Digital preservation starts when the dataset is deposited in a trustworthy repository with a preservation policy.

The added value: authenticity for the long term

This repository will not only keep the dataset FAIR, but will do two more crucial things with it: it will keep the dataset authentic, and it will do so over the years. Authenticity and time are crucial aspects, fully covered by OAIS, but not covered by the FAIR principles. Nor are they covered in the recently published white paper on TRUST, which proposes to assess trusted repositories based on Transparency, Responsibility, User Community, Sustainability and Technology. But we do not need new criteria to assess repositories; we already have a framework for trustworthy repositories. Based on OAIS we have the CoreTrustSeal, the nestor Seal and the ISO 16363 standard, together also known as the European Framework (although the three partners seem to be no longer collaborating, in the preservation world this framework is still a starting point). Section 3.1 of OAIS, on the responsibilities of an OAIS archive, fully covers the five topics of TRUST, with the important addition of authenticity for the long term. It is high time that the worlds of FAIR and digital preservation embraced each other, to avoid costly reinventions of wheels.

So FAIR-ists, you can reuse the vast amount of knowledge and experience of the preservationists!

© 2020 Barbara Sierman

b-s-i-e-r-m-a-n-@-d-i-g-i-t-a-l-p-r-e-s-e-r-v-a-t-i-o-n-.-n-l