Digital Preservation Seeds

by Barbara Sierman

Machines as Designated Community

What if the Designated Community is a machine?

In response to the growing uptake of the FAIR principles and the “inconsistent interpretations” (as phrased in the article) of these principles, the original authors of the FAIR principles wrote the article “FAIR Principles: Interpretations and Implementation Considerations”. In it, every principle gets an “Interpretation” and a “Consideration” paragraph, further explaining which challenges are currently identified for the implementation of each principle.

But first the authors stress that the FAIR principles (Findable, Accessible, Interoperable and Reusable) are primarily about machine-actionability, and only secondarily about human interactions with the data. Data should be FAIR so that machines can find them, access them, interoperate with them and reuse them. “The machine knows what we mean”.

In my opinion this focus on machine-actionability is often overlooked in the discussions I hear around me in the digital heritage domain (GLAM, in particular libraries and archives). And this focus on machine-actionability will have consequences for the long-term preservation of the data. The Designated Community is not only human, but will also be a machine. What are the consequences of that concept, and is our current standard OAIS (ISO 14721) robust enough to cope with it?

When the FAIR principles were first published, there was some confusion in the world of digital preservation about whether the R of Reuse also included the preservation of the data. This article puts an end to that discussion. It is not about preservation of the data. Reuse is about the ability “to enable machines and humans to assess if the discovered resource is appropriate for reuse, given a specific task”, and about giving machines and humans “operational instructions” to do this. Of course, one could stretch this ability over a longer term, but that is not included in the FAIR principles.

Instead, it is the role of trustworthy digital repositories to preserve the data for the long term. Recently the TRUST principles were developed, describing what one can expect from these repositories, organised around five themes: Transparency, Responsibility, User focus, Sustainability and Technology.

The “Interpretations and Implementation Considerations” are helpful in figuring out what is expected from these repositories to keep the data FAIR for the long term. My conclusion is that this will require a very active Preservation Watch function. The FAIR principles can only be implemented by using standards, vocabularies, searchable resources, access protocols and so on. To keep up with the developments in all these areas, the Preservation Watch function of the repository needs to be a very active one. If not, data might become undiscoverable, inaccessible, no longer interoperable and no longer usable. Are libraries and archives prepared for this when they accept datasets to preserve for the long term, even datasets that were produced with their own materials as a basis?

The Preservation Watch function is needed to identify changes in domain-specific descriptors, which may require adding new metadata (F1); to monitor where the “searchable resource” is located, so that these new metadata can be uploaded there to be registered or indexed (F4); and to monitor changes over the years in access protocols and in domain-specific authentication and authorization procedures (A1). It is needed to keep the metadata FAIR in case the data are no longer available, in order to facilitate citations (A2). It is needed to monitor changes in the standards that are used for “knowledge representation” (I1), like RDF and vocabularies in specific domains, and to monitor the validity of the references to other (meta)data (I3). (In the article the example is given of references to Wikidata, but how sustainable is this resource?) And it is needed to offer enough information so that both machines and humans can decide whether the resource is appropriate and has a clear and accessible data usage license (if possible, for the data and the metadata separately).
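To make this a bit more concrete, here is a minimal sketch of how a repository might keep such monitoring tasks as a simple watch list and see which ones are due for review. It is an illustration only: the names `WatchItem`, `due_items` and `WATCH_LIST`, the review intervals and the dates are all assumptions of mine, and nothing like this is prescribed by the FAIR article or by OAIS.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class WatchItem:
    """One monitoring task of a (hypothetical) Preservation Watch list."""
    principle: str          # FAIR principle label as used in the text above
    what_to_monitor: str    # short description of the task
    last_checked: date      # when the repository last reviewed this item
    interval_days: int = 365  # assumed review cycle, not taken from any standard

    def is_due(self, today: date) -> bool:
        # An item is due once the review interval has fully elapsed.
        return today - self.last_checked >= timedelta(days=self.interval_days)


def due_items(items, today):
    """Return the items that need attention on the given day, in list order."""
    return [item for item in items if item.is_due(today)]


# Illustrative entries only; the dates and intervals are made up.
WATCH_LIST = [
    WatchItem("F1", "domain-specific descriptors and metadata", date(2020, 1, 15)),
    WatchItem("A1", "access protocols, authentication and authorization", date(2019, 3, 1)),
    WatchItem("A2", "metadata kept available when the data are gone", date(2020, 6, 1)),
    WatchItem("I1", "knowledge representation standards and vocabularies", date(2018, 11, 20)),
    WatchItem("I3", "validity of references to other (meta)data", date(2019, 12, 1), interval_days=90),
]

if __name__ == "__main__":
    for item in due_items(WATCH_LIST, date(2020, 9, 1)):
        print(item.principle, "-", item.what_to_monitor)
```

A shorter interval for references to external resources (I3) reflects the guess that links tend to break sooner than standards change; a real repository would have to set these cycles from its own experience.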

Preservation Watch as a concept is often mentioned in digital preservation and was recently added to the new draft of the ISO OAIS standard. The concept itself is clear, but the implementation depends on the repository. The implementation of the FAIR principles by researchers will also have implications for repositories in the humanities and for digital heritage organisations. To keep the data FAIR for the long term, it will not be enough to focus on file formats or software; we will also need to identify the consequences of having machines as part of our Designated Community.

© 2020 Barbara Sierman
