20 Years of Digital Preservation


During the preparations for iPRES 2016 the Programme Committee discussed the fact that exactly 20 years ago the report Preserving Digital Information: Report of the Task Force on Archiving of Digital Information was published. This landmark report by the Commission on Preservation and Access and the Research Libraries Group appeared in May 1996. It takes a broad view of digital preservation and is often regarded as one of the first comprehensive reports on the topic.

It was interesting to read it again. I wondered what the view on preservation was 20 years ago and how it relates to the topics presented at iPRES 2016.

iPRES 2015 in Chapel Hill

Audit, CD-ROMs, emulation, ingest, OAIS and web: in alphabetical order, those were the most discussed topics at the annual iPRES 2015 conference, which took place last week in Chapel Hill, North Carolina. This is my personal impression, because of course many more topics came up in the talks, posters and workshops. It is, after all, an annual reunion at which everyone tries to present their results and future plans.

The Sweetshop iPRES 2014

Quietly an important publication was put on the web: in December, just before Christmas, the proceedings of iPRES 2014 were published, almost 400 pages long. A sweetshop for the preservationist!

The proceedings offer a complete overview of the conference, with not only all short and long papers and posters but also the names of all the people who contributed, summaries of the panels and tutorials, and the text of the closing remarks by Andrew Treloar. Everything is put together in one big PDF, which makes it difficult to refer to one paper specifically but, on the other hand, offers the reader the chance for serendipity. Next time I would like to see some pictures of the beautiful venue (the State Library of Victoria) and the audience, but a very good job was done here!

Victoria State Library

Now for the content of the papers. Each contribution is obliged to add keywords (although they are not always present). In total around 250 keywords were added, but there is hardly any system in them and they often only partially reflect the content. The contributions are much more interesting than the keywords suggest. Maybe next time authors can be steered a little more in choosing adequate keywords?

Reading these papers, you can see some trends, self-reflection being one of them. Although many papers show confidence in what we have achieved, several of us are wondering: are we doing the right things, and are previous assumptions still valid?

One example is the ranking of file formats on preservation qualities, described for practical use by Pennock (p. 141) and from a scientific point of view by Ryan (p. 179) and Graf (p. 160). The challenge lies in bringing these two approaches together, as there is a danger that the two worlds will stay apart (a constant struggle in European projects).

Self-certification is an upcoming topic, given the growing attention for audit and certification, and is dealt with in a paper by Elstroem (p. 271). Others introduced the concept of maturity levels, either in policies (Sierman p. 259) or by analysing the maturity levels of NSLA libraries (Slade p. 284).

Another question is whether our standards are still in line with current developments, as discussed by Zierau/McGovern (p. 209). See my previous blog post on this topic. Our acceptance of imperfect tools was also questioned during a panel discussion (p. 293), and there was a proposal for better PDF checking (Duff p. 39).

Collaboration in preservation is another trend, especially between organisations in one country, like Scotland (Mead p. 232) and Ireland (Webb p. 244).

But very practical problems were also discussed: legacy systems (MacDonald p. 279), e-book workflows (Derrot p. 239), DRM (Steinke p. 228), archiving the scholarly web (Treloar/vdSompel p. 194), emulation and costs (Cochrane p. 51, Grindley p. 29) and finally the proposal for a technical registry (McKinney p. 44).

As in a well-stocked sweetshop: there is something for everyone. Go, download and read!

Dasish Workshop on Audit and Certification

Last week I (partially) attended the Dasish Workshop on Audit and Certification in The Hague, where I gave a presentation about the history and current status of the ISO 16363 standard, Audit and Certification of Trustworthy Digital Repositories, and of ISO 16919 (still not approved), which regulates the accreditation of auditors. I published my slides with some references to literature.

Apart from the usual, I also added one slide about “risks”. The growing focus on audit and certification is encouraging and shows that we all want to meet high standards. But we should also realise that digital preservation is an evolving field. Although we agree on a set of starting points, day-to-day work often shows the gap between theory and practice. Auditors, funding bodies and organisations should realise that. The auditing practice itself will also have to follow a growth path.

You can find a summary of the Dasish Workshop in the blog post by Kirsty Lee (who attended the whole 2 days!): http://libraryblogs.is.ed.ac.uk/bitsandpieces/

Reflections on iPRES 2014

Perseverance Hotel, Brunswick Street, Melbourne

Recently, after 22 hours of flying, I attended the iPRES 2014 conference in Melbourne, which was an awesome experience. How often does one have the chance to discuss aspects of the digital preservation profession without having to explain the obvious? Meeting 200 colleagues, gathered in the beautiful State Library of Victoria, was an excellent opportunity to exchange ideas. And because there is no top ten of buzzwords and no record of what was discussed during the breaks, the lunches and the dinners, I will summarize some highlights that inspired me.

First of all: everyone is worried about the fact that we constantly need to defend ourselves. Even in large organisations with mandates to preserve digital material, higher management repeatedly needs to be persuaded to think about the consequences of this mandate. What is wrong? I personally wonder whether we use the right language or, as I said in the panel on Friday, whether we lack the skills to frame digital preservation properly. How do you describe the abstract concept of long-term preservation to the uninitiated? Our message is often too complicated, speaks of problems and high costs without sketching the clear benefits, and does not appeal to the imagination. This is a threat to our profession, as many colleagues agreed.

The second topic was about involving other disciplines in digital preservation, such as industry and science. Mixing ideas from different disciplines might lead to new ideas and innovations. It is important to convince these disciplines of the importance of digital preservation for their business, their research or simply their sustainability. Our knowledge can help them to get up to speed, while they can help us to take the next step forward. For this we need to develop a language understood by all parties. (I’ll ignore the fact that the week started with a discussion about whether it is digital preservation, digital curation, data science or whatever term you could use; in my view this discussion distracts us from the main problem.)

But we all expected great benefits from more collaboration amongst the current preservationists, although some aspects need to be taken into consideration. Collaboration without a clear benefit for the participants is doomed to fail. Sometimes projects lead to good results, but these projects require an organisation and funding. Chances for these kinds of projects are small in the current economic climate. On a smaller scale, however, the magic of the conference atmosphere led to new initiatives between a few individuals to exchange information, do something together or really start working on an old idea.

The final buzzword at the conference was the DPTR: the Digital Preservation Technical Registry. This is an initiative to develop a file format registry, with a new data model and filled with information from existing registries. Currently it is a proposal under Horizon 2020, based on work done by the National Library and Archive New Zealand and the National Library of Australia. There were mixed feelings about this initiative. On the one hand, all agreed that the current registries fail for various reasons and that a good format registry is an absolute must. But what are the “lessons learned” from the previous, not so successful registries? Were these lessons incorporated in this new proposal? And was this new proposal not another example of technique first and use cases second?

It was the presentations, which will be published soon, that gave us new food for thought, but I’m convinced that the discussions during the breaks really gave us renewed energy to proceed with the challenge of digital preservation, although it took another 20 hours of flying before I was back home!