Crystal clear digital preservation: a management issue

“Raising awareness for digital preservation” was a frequently used phrase when I started in this field ten years ago (never regretted it, hurray!). We preservationists have made progress, but the story still does not explain itself. So I like reading how others persuade and convince people. Recently I found a book that really does the job. In crystal clear language, without beating about the bush and based on extensive, up-to-date (until 2014) literature, it explains digital preservation and touches upon almost every aspect of it. Edward M. Corrado and Heather Lea Moulaison have done a great job with their Digital Preservation for Libraries, Archives and Museums, Rowman and Littlefield, 2014. ISBN 978-0-8108-8712-1 (pbk.); ISBN 978-0-8108-8713-8 (ebook).

In fact, I should start this blog post with “Dear manager, I have found a book that tells you all you need to know about digital preservation. Spare some time and read the chapter that is dedicated to you (part II), the sooner the better”. [Preservationist, please forward this to your manager; they might even read the rest of the book!]

The book starts by explaining what digital preservation is not (such as “backup and recovery”, access, or “an afterthought”). This is followed almost immediately by the positively phrased starting point that guides the whole book:

“ensuring ongoing access to digital content over time requires careful reflection and planning. In terms of technology, digital preservation is possible today. It might be difficult and require extensive, institution-wide planning, but digital preservation is an achievable goal given the proper resources. In short, digital preservation is in many ways primarily a management issue”.

The common thread and central metaphor of the book is the authors’ “Digital Preservation Triad”. The triad is a new variation on Nancy McGovern’s three-legged stool and is symbolized by a Celtic knot, chosen to better convey how the activities are interrelated.

These activities are divided into:

  • Management-related activities,
  • Technological activities and
  • Content-centred activities.

Each set of activities is further explained in a dedicated chapter. The chapter about Management activities starts immediately by explaining the basics of the OAIS model, clearly showing that this is the essence of digital preservation. Knowledge of OAIS should be present at the management level of an organisation. Only then can management deal properly with aspects like human resources (skills and training) and sustainable digital preservation (costs etc.).

The Technology part is more concerned with metadata and file formats and the technical infrastructure or repository, which is closely related to mechanisms of trust (audit and certification).

The last part of the book discusses aspects related to the Content, like collection development.

The text draws on an extensive bibliography that includes many recently published conference papers, (EU) project results and reports. The authors are well informed about what is going on and do not restrict themselves to the US.

What I liked in this book is the very practical approach and the unvarnished description of digital preservation (‘not easy but doable’). The authors stress that preservationists should convince management over and over again “that digital preservation is important to the overall mission of the organization”, and not just “an experimental technology project”, and should “communicate the multiple ways in which digital preservation brings value to the organization.”

One of the barriers in this process, at least in my experience, is that people often try to connect their experience in analogue preservation with digital preservation. Sometimes this leads to monstrous analogies. This book does not try to map the two worlds onto each other, but clearly states:

“The digital item created and made accessible as part of a digital preservation system is fundamentally different from an analogue item. Period.”

Unavoidably some recent developments are missing, like the Cost model work that was done in the 4C project and the work on Preservation Planning and Policies in SCAPE.

But if you still need to convince your management, point them to this book – also available as an epub!

“Materials contain the seeds of their own destruction”. A preservation handbook.

I regularly have discussions about whether digital material and analogue material can be treated the same way, or whether the digital aspect requires special treatment, sometimes even resulting in different working processes, staffing and policies. All too often this discussion takes place between participants who represent either the digital or the analogue view. The polite way of trying to understand each other by finding analogies often leads to simplified views and unsatisfying outcomes, and nobody ends up any the wiser. So my interest was piqued when a new digital preservation handbook raised exactly this issue by stating “This book is based on the philosophy that there are preservation principles that apply to all kinds of materials, whether digital or not.” For a preservation handbook this is a realistic perspective, as organisations have both kinds of materials. The authors present this book as the first example of “the essential tools and principles of a preservation management programme in the 21st century – one that addresses the realities of diverse collections and materials and embraces the challenges of working with both analogue and digital collections.”

Having stated this, the authors address the different issues related to digital versus analogue. They refer to the fact that digitization in the past led to the destruction of the related physical objects, on the assumption that “the information” was saved in the new digital object, a debatable point of view nowadays (see Nicholson Baker’s Double Fold: Libraries and the Assault on Paper, 2001). They come up with a set of shared preservation principles for both analogue and digital material.

Four principles describe the context and aims of preservation, amongst which the needs of the user are mentioned (a point of view we also see in the OAIS model), as well as “Preservation is the responsibility of all, from the creators of objects to the users of objects”. A set of 8 general principles focuses on “collaboration”, “advocacy”, “active, managed care” and the preference for actions “that address large quantities of material over actions that focus on individual objects” [although this is highly dependent on the value of these objects, I would say]. The following principle describes the key to preservation: “Understanding the structure of material is the key to understanding what preservation actions to take, as materials contain the seeds of their own destruction (inherent vice)”.

This set of preservation principles and practices is the common thread for the rest of the book, which contains a wealth of information. I can recommend this book to both digital and analogue preservationists, as it will contribute to the mutual understanding that is so desperately needed! And don’t complain about the price (90 dollars): this book might be expensive, but a one-day course is more expensive, and almost everything else you want to know about digital preservation is freely available on the internet!

The preservation management handbook: a 21st-century guide for libraries, archives and museums. [Edited by] Ross Harvey and Martha R. Mahard. Rowman and Littlefield, 2014. ISBN 978-0-7591-2315-1 (also available as e-book)

Where is our Atlas of Digital Damages?

Sometimes we, digital preservation people, have a tough job explaining the surprises we come across. If similar situations happened in the analogue or physical world, one might doubt our powers of observation. Who could imagine that a poem in a book that sits on a shelf in a safe and monitored area, and is opened after a few years, suddenly has some new sentences in it, while no human being added them? Or that the title page has changed and that the special font, designed and chosen by a famous typographer, has turned into a run-of-the-mill font, thus downgrading its aesthetic appearance? Again, without human interference. There would be huge panic if this happened in our libraries and archives.

But these things happen in a digital environment.

Some nice and also scary examples are given in Euan Cochrane’s report Rendering Matters, http://archives.govt.nz/rendering-matters-report-results-research-digital-object-rendering (Archives New Zealand). This study describes the results of rendering a set of files in different environments and comparing the outcomes; one of these environments was the recreated original environment with the original software. Preservation plans that bet on the relatively “safe” approach of migrating to, or accessing the file in, a higher version of the same software or an open source alternative (MS Office vs Open Office) will often be confronted with altered information, as the report shows. Checks based on word counts do not appear to be reliable. And yes, a sentence was added to a poem, albeit one full of “rubbish” that certainly was not intended to be a poetic line.

These examples can support us when we try to convince those responsible for collections to think about what they want to preserve and what cannot be modified, added or lost without severe damage to the content. Evidence will convince, and we need much more of it. Showing these examples will make people more aware of what can happen to digital objects, even when no one touches them and they have been safely bit-preserved.

These examples offer a chance to ask some fundamental questions, such as: what makes up the original look and feel? What are the important elements in the digital object? What will be the effect of losing these elements? There lies a real risk for digital collections, and making it visible with examples will be more convincing than all the conference papers that we have written about the challenges of digital preservation.

After all, the collection care specialists in archives and libraries have their own guides with horrifying pictures of manuscripts damaged by mice and insects, ink and mould.

Let’s have our digital version of it! The Atlas of Digital Damages.

Analogue versus digital (1)

Although digital material has existed for several decades, and most people have some familiarity with computers, smartphones and tablets, this does not mean that everyone involved in the care of digital material is fully aware of its ins and outs. Quite a lot of organizations, like libraries and archives, have employees who have been treating analogue material for years. They know all about the risks of brittle and acidic paper, insects and mould. They can debate the use of gloves for handling rare books. They can measure increasing acidity with sophisticated equipment. And they can put the damaged manuscripts and books right in front of your eyes, pointing to the lost legibility or damaged colours.

And now they need to get a feeling for taking care of digital material.

There is a large digital preservation community that now has an idea of how to do this. Together we try to develop a shared vision on how to treat various digital data types, file formats etc. We have our standards and our common language, via OAIS, PREMIS, RAC/TRAC etc.

And there is an even larger community that expects digital files in a computer to be there forever. They cannot even imagine that bit rot can happen. They are surprised when you tell them that you need checksums to verify that a file is not damaged. They might never have seen a tape or a hard disk. They would not go so far as to call the computer a “magic box”, but in fact for them a computer is terra incognita.
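For readers who have never seen a checksum at work, here is a minimal sketch in Python of what such a check could look like. It is only an illustration, not how any particular repository does it: the function names `sha256_of` and `is_intact` are my own, and the choice of SHA-256 is an assumption (repositories use various algorithms).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so that large files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, recorded_digest: str) -> bool:
    """True if the file still matches the digest recorded earlier (e.g. at ingest)."""
    return sha256_of(path) == recorded_digest
```

The idea is that the digest is recorded once, when the file enters the repository, and recomputed at regular intervals. A single changed bit produces a different digest, so the check reveals damage even when nobody has touched the file.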

And we need to explain it to them, because it will become part of their job. Quite often we try to explain it in our “digital preservation language”, our own OAIS-Esperanto. But I don’t think that will work. I’m struggling with how to translate digital preservation concepts for “non-digital” oriented people. I was wondering whether the approach of analogies might help. If we are able to draw parallels between a risk in a digital environment and its consequences in the terminology of an analogue environment, might not the people who are used to analogue material understand the consequences more quickly, because they can translate it to the field of knowledge they are familiar with?

Take the following example. A library customer finds a book in the catalogue and asks for it at the counter. Someone takes the order and looks for the book in the storage facilities. But alas, no book: the place on the shelf is empty and there is no record that someone took it. Perhaps it was returned to the wrong shelf, perhaps it is lost or stolen.

The analogy with a digital repository is that someone asks for a digital object and receives a message that it cannot be retrieved. The cause might be that the descriptive information in Data Management (perhaps duplicated in a catalogue) no longer links to the AIP in the archival storage. Then something is seriously wrong. We try to avoid that by building a repository system that takes measures to safeguard the linkage between those two vital elements: the access to the object and the object itself. Such measures include adding a unique identifier and metadata, duplicating descriptive information in the AIP, regular checking etc. And we can explain this (costly) approach by describing the parallel situation in the non-digital world.
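The “regular checking” mentioned above could, in its simplest form, look like the following Python sketch. Everything in it is hypothetical: the function names, the record identifiers and the idea of representing the catalogue as a dictionary and the archival storage as a set of AIP identifiers are assumptions made purely for illustration.

```python
def find_broken_links(catalogue: dict[str, str], storage: set[str]) -> list[str]:
    """Return the catalogue record ids whose AIP identifier is not in storage."""
    return sorted(rec for rec, aip_id in catalogue.items() if aip_id not in storage)


def find_orphans(catalogue: dict[str, str], storage: set[str]) -> list[str]:
    """Return the AIP identifiers in storage that no catalogue record points to."""
    return sorted(storage - set(catalogue.values()))


# Hypothetical data: record id -> AIP identifier, and the AIPs actually stored.
catalogue = {"rec-1": "aip-001", "rec-2": "aip-002", "rec-3": "aip-003"}
storage = {"aip-001", "aip-003", "aip-004"}

print(find_broken_links(catalogue, storage))  # the "empty place on the shelf"
print(find_orphans(catalogue, storage))       # the "book on the wrong shelf"
```

The first check corresponds to the catalogue entry without a book on the shelf; the second to a book on the shelf that nobody can find through the catalogue.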

No client will take “no” for an answer and the staff certainly wants to avoid this situation, be it a digital or an analogue one.

I’ll think of more examples; your feedback is welcome!