Research Data Management (RDM) and Digital Preservation (DP) communities: “do they really collaborate or are they only co-existing”? This question opens an interesting article by Michelle Lindlar, Pia Rudnik, Sarah Jones and Laurence Horton, presented at iDCC 2020 and also available on Zenodo (as are the slides).
The question is a very valid one. I have raised the topic before in relation to FAIR data and suggested investigating “whether FAIR Data Objects will lead to sustainable Archival Information Packages?”. In their article the authors take a fundamental approach and investigate the models, terminology and concepts used in the DP and RDM domains, trying to identify areas that overlap or complement each other.
According to the authors, instead of making a sharp distinction between RDM and DP activities, it might be more beneficial to gain a better understanding of each other's models, terminology and concepts. This would lead to more synergy and collaboration, and ultimately to better preservation of research data.
Six concepts are examined: the well-known DCC life cycle model, the Object Levels of Preservation, Data Management Plans, FAIR, OAIS, and finally PREMIS. Each concept is explained in a way that makes it understandable for readers from either the DP or the RDM community and shows its importance in its original field. A mapping of key terminology is added at the end of the article.
Based on this mapping of DP and RDM concepts and terminology, the authors identified three areas where collaboration between the domains can be improved.
- DP could pay more attention to the role of Data Management Plans, which “have not had a significant impact on the DP community”. If they meet certain quality levels (BS), these plans provide context information for the digital object, which is essential for digital preservation.
- RDM could pay more attention to the concept of Designated Community. In this case, however, it is not only a question of “who is most likely to use the data and how”; machines should be added to the picture as well, with the question of how they are most likely to use the data (BS).
- And preservation concepts should be applied to the FAIR principles. To realize this, I think the DP community should infiltrate the RDM world more, but I’m not sure whether the original authors of the FAIR principles have this in scope. In their recent explanation of the FAIR principles, they state “Any interpretation or implementation of the FAIR principles may in essence be chosen as long as they lead to machine-actionable results.” So here it is more the other way around: DP should organize the FAIRness of the digital data (which we have tried to address in the TRUST Principles). Keeping those “machine-actionable results” usable over time will be a particular challenge for DP. After all, the R in FAIR is not about DP but about access: “while the focus of R1 is to enable machines and humans to assess if the discovered resource is appropriate for reuse, given a specific task” (see the explanation).
In the article the authors advise the RDM community to adopt the more granular view of a digital object that the DP community has, based on Thibodeau’s Object Levels of Preservation: Physical, Logical and Conceptual. (In the next version, perhaps this use case could get a more extensive explanation? I was not sure I understood it completely.)
Overall, I think the article shows us that there are plenty of opportunities for collaboration between RDM and DP. Am I convinced now that FAIR data objects can lead to sustainable AIPs? Well, perhaps, if the RDM community accepts the Levels of Preservation and the DP community benefits from high-quality Data Management Plans. The FAIR principles themselves are currently not enough, focused as they are on the data objects and not on the longer term. But perhaps we can change that by adopting the TRUST principles.