A Dutch edition of the Keepers registry … but for web archiving!

Last week the Dutch Digital Heritage Network launched a new product: the national registry of web archives in the Netherlands. This collaborative work gives an overview of the websites that are harvested and preserved in the Netherlands by a variety of organisations. The KB, as national library, collects websites based on its mandate (we see websites as “publications”), but many other Dutch organisations harvest websites too, from the Netherlands Institute for Sound and Vision to the National Archive and many more. Together we want to save the Dutch web, and we want to inform each other about what each of us is doing.

The new registry is a kind of “Keepers Registry” for Dutch web archiving. Everyone can see which websites an institution is harvesting and since when, how often each website is crawled, with which software and for what reason (for example because of a legal mandate or from a collection point of view). For legal reasons most of the websites can only be viewed on site, but there are exceptions, and the access regime is part of the information given. Where possible there is also a link to the current live website.

One of the reasons to start this initiative was to avoid duplication of effort. It might still be the case that two organisations harvest the same website, but from now on this will be intentional, for example because they have a different perspective (legal mandate versus collection building) or harvest it in a different way. We already know of smaller organisations that will not invest in harvesting websites that could enrich their collection, because they are happy to know from the registry that another Dutch organisation takes long-term responsibility for them.
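To make the idea concrete, here is a minimal sketch of how a registry entry with the fields mentioned above could be modelled, and how sites harvested by more than one institution could be listed. The class name, field names and sample values are all hypothetical illustrations; they do not reflect the actual data model or content of the registry.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistryEntry:
    """One website as harvested by one institution (hypothetical field names)."""
    institution: str
    website: str
    harvested_since: int          # year of the first harvest
    crawl_frequency: str          # e.g. "yearly"
    software: str                 # crawler used for the harvest
    reason: str                   # e.g. "legal mandate" or "collection building"
    access: str                   # access regime, e.g. "on site only"
    live_url: Optional[str] = None


def find_overlaps(entries):
    """Return {website: sorted institutions} for sites held by more than one institution."""
    holders = defaultdict(set)
    for entry in entries:
        holders[entry.website].add(entry.institution)
    return {site: sorted(names) for site, names in holders.items() if len(names) > 1}


# Placeholder data, not taken from the real registry.
entries = [
    RegistryEntry("Institution A", "example.nl", 2010, "yearly", "Crawler X",
                  "legal mandate", "on site only", "https://example.nl"),
    RegistryEntry("Institution B", "example.nl", 2015, "monthly", "Crawler X",
                  "collection building", "on site only"),
    RegistryEntry("Institution B", "example.org", 2016, "yearly", "Crawler X",
                  "collection building", "online"),
]

overlaps = find_overlaps(entries)
print(overlaps)
```

With a list like this, an overlap is not an error in itself; the `reason` field is what shows whether two institutions harvest the same site on purpose.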

We are currently contacting the Internet Archive to discuss whether the Dutch websites in their collection could be incorporated in this registry as well.

Open Preservation Foundation’s new Strategy

There are several membership organizations active in digital preservation. Knowing their position in the preservation landscape will help preservationists decide which of them best fits their needs and which to join. The Open Preservation Foundation (OPF) recently launched its new Strategy (2018-2021), which sets out the plans for the coming years. The vision of OPF, “Open sustainable digital preservation”, is accompanied by a new mission, thanks to the influence of the new director Martin Wrigley, which states:

Enabling shared solutions for effective and efficient digital preservation; the Open Preservation Foundation leads a collaborative effort to create, maintain and develop the reference set of sustainable, open source digital preservation tools and supporting resources.

This set of tools (including software and standards) enables organisations to evaluate, validate, document, mitigate risk, and process digital content to be preserved in line with desired policies and community best practice.

One of the core values of OPF is the focus on serving the [currently 26] members with the tools they need and on fostering their effective and efficient preservation activities. The OPF members were involved in shaping this strategy during their annual meeting in Tallinn in spring 2018. But as two other values are “openness” and “collaboration”, a larger group of preservationists will benefit from the OPF activities.

At the heart of the planned activities is the OPF Reference Toolset. In general there is a wide range of tools available for various preservation tasks (see COPTR), of varying maturity and robustness. OPF wants to improve this situation so that members can be supported in choosing the right tool for their purpose. This will be done by creating an OPF Reference Toolset, the development of which will be influenced by the OPF members. The OPF Reference Toolset will be more than just a set of useful tools: “The reference toolset includes software, standard test data sets (or “test corpus”), other standards and best practice (including policies), and may rely on external components that have a robust support mechanism.”

As knowledge exchange and collaboration are still part of the action plan for the coming years, the larger preservation community can be part of these developments. But as nothing is free, an increase in members will certainly help to achieve the goals sooner. More details about the planned activities and a more extensive explanation of the OPF Reference Toolset can be found in the Strategy.

Is it useful to know who is using which preservation system?

My organization is, like many others, looking for a replacement for its current digital preservation system. So I’m curious what is on the market and what other national libraries are using. The websites of commercial vendors like Preservica, Ex Libris and Libnova sometimes offer information about their customers. The websites of the library organizations themselves inform us about their infrastructure. Last year a group of Portuguese researchers (Rosa, Carlos André (2018): OPEN SOURCE SOFTWARE FOR DIGITAL PRESERVATION REPOSITORIES: A SURVEY) investigated the currently available open source software for digital preservation repositories. Some of these open source communities have a list of implementations. Combined with the suppliers’ websites, this could give a nice overview of what is available and who has implemented which preservation system. Create a list in Google Docs, use Gephi to make a graph, and you have a nice overview. I started this exercise but was a bit reluctant to continue.

Firstly, I thought, there is a risk with such a list: terrorists and hackers might use this information to destroy important cultural heritage resources, so perhaps it is best not to centralize it (likewise, nobody should publicly mention the location of their preservation copies any more, as we did in the past when we were proud of what we had achieved).

But, secondly, even if we had information about who is using which system, we would still have an incomplete picture, because we do not know whether we share the same concepts, despite our shared OAIS language. I realized this when I saw a Dutch survey report.

The Digital Heritage Network in the Netherlands started a survey (sorry folks, only in Dutch) to get an overview of the digital preservation systems in use in the Netherlands, not only out of curiosity, but also to investigate the need for developing generic services and to promote more collaboration between organizations. The researchers Joost van der Nat and Marcel Ras plan to create a map of digital preservation services in the Netherlands, and this survey provides the first ingredients. 50 organizations were selected for this survey, and 44 of them responded. 27 of them said they had a digital preservation solution in place, although the impression is that not every respondent meant the same thing by having a “digital archive”, so it is safer to say that 50% have a digital preservation solution (this was based on the answers they gave to other questions). A third of these 27 organizations (9) developed the digital archive themselves, but amongst the respondents were early adopters that started years ago, when there was hardly any system on the market. The other respondents implemented Preservica (2) and Archivematica (1), or a solution created by a third-party provider like Data Matters (1) or Picturae (3). In the category “others”, the systems mentioned were Islandora, arQive, DSpace, De Ree and Adlib Filemaker (not all of which are long-term preservation systems in the OAIS sense). A new iteration of this survey will show a different overview, as there are now, for example, more implementations of Preservica and Archivematica in the Netherlands.

Most of the respondents were familiar with the OAIS functional entities. 10 organisations had all 6 entities implemented (Preservation Planning is absent in most organizations), but 6 respondents out of 27 did not know which OAIS functionalities were present in their system, although they said they had a preservation system implemented, and despite the explanation given in the survey!

And here I realized that although people were familiar with OAIS concepts, the answers in the survey showed that they did not share the same definition of a digital archive. Although every question was accompanied by an explanation from the survey creators, respondents still gave answers that were, to me, beside the point. I also realized that an overview of who is using which digital preservation system might not help me either. It is the way a system is implemented, and the organization around the digital archive, that matters. But these things cannot be shared in lists.

So perhaps the old-fashioned way of picking up the phone and meeting people is still the best way to gain knowledge. However… for that you need a “phonebook” to know who to contact. So a list might be handy after all.

Software preservation in a series of webinars

Last week the 5th episode in a series of webinars on software preservation was launched. The series is organized by the Software Preservation Network in the USA and the Digital Preservation Coalition in the UK.

Although the preservation of software has been a topic in digital preservation for years, several developments in the last few years have made it more pressing.

Fixity practices in the preservation community

The National Digital Stewardship Alliance conducted a survey in 2017 about fixity (best) practices in the preservation community and recently published the results. The survey was intended to answer two questions: (1) what common practices exist for fixity checking, and (2) what challenges do institutions face when implementing a fixity check routine?
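For readers less familiar with the term: a fixity check typically means recomputing a checksum of a stored file and comparing it with a previously recorded value. The sketch below shows one common way to do this with SHA-256 from the Python standard library. The function names and the sample file are hypothetical, and this is only an illustration of the general technique, not a practice reported by the survey.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, expected_hex):
    """Return True if the file's current checksum matches the recorded one."""
    return sha256_of(path) == expected_hex.lower()


# Demonstration on a temporary file (placeholder content).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "object.bin"
    sample.write_bytes(b"preserved content")

    recorded = sha256_of(sample)            # checksum recorded at ingest
    unchanged = fixity_ok(sample, recorded)

    sample.write_bytes(b"preserved c0ntent")  # simulate silent corruption
    changed_detected = not fixity_ok(sample, recorded)

print(unchanged, changed_detected)
```

How often such a check runs, which algorithm is used and where the recorded checksums are kept are exactly the kind of choices on which institutions can differ.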

IDCC 2018 in Barcelona

The FAIR principles in their broadest sense were at the heart of the conference of the Digital Curation Centre, this time held in Barcelona. The FAIR principles, about data being Findable, Accessible, Interoperable and Re-usable, have seen a massive take-up in the research community, but translating these principles into practice is another matter. This was the topic of the conference: “from principles to practice to global join up”.

There were a lot of interesting presentations, but in this blog post I will focus on the reproducibility of the data, as I see this as an important link to digital preservation.

Preservation as a present

Twenty years of preservation have brought us valuable insights, useful tools and a large quantity of digital material that is now being taken care of.

For the general public, used to their tablets and phones where everything is stored for them somewhere in the cloud and new updates are almost always compatible with older versions, the issue of preservation is invisible. This is very convenient for them, but not for us trying to get political attention and sustainable funding for our invisible activities.

Most people, however, value their digital stuff. This “digital capital” should be part of our story when we ask funders for budgets to preserve digital materials. Preservation should not be a problem but a commodity: something that helps you to take care of your stuff in a way you were not aware of. Like water that comes out of the tap: reliable, clean and always available (at least in part of the world). Only a few will know about the organisation behind this clean water. Although often taken for granted, running water is in fact a present, resulting from a wide range of carefully planned actions. The preservation community could mirror this water model.

ENUMERATE 2017 and digital preservation

A new version of the ENUMERATE survey results has just been published, with a separate report on the Dutch results. The ENUMERATE survey monitors the digitization activities in memory institutions in Europe, whereby memory institutions are defined as “institutions having collections that need to be preserved for future generations”. It is always risky to interpret survey results without the raw data. My knowledge of the context of the participating organisations will also colour the results, as will the knowledge of the persons who supplied the survey answers. But some interesting outcomes in relation to digital preservation are worth pondering. Around 1000 cultural heritage institutions in Europe (libraries, archives, museums etc.) replied to the 37 survey questions. Many institutions did not answer all questions, making interpretation even more difficult.

Certification changes: basic becomes core in the CoreTrustSeal

There is a new organization in the certification world for digital repositories: CoreTrustSeal. The new organization replaces the original Data Seal of Approval community (consisting of a General Assembly, peer reviewers of the applicants and a Board). The merger between the DSA and the World Data System in 2016 led to a name change (DSA-WDS Core Trustworthy Data Repositories Requirements), to a slightly changed set of 16 requirements, in which the influence of the WDS is visible in some cases, and now finally to a new organization. In the Netherlands we translated these requirements into Dutch, as various organizations in our country are currently preparing themselves for this seal.