The Beggar (Rijksmuseum)
A dedicated working group in the Network Digital Heritage has created and validated an extension to the 4C Cost Model for Digital Preservation. The extended model offers better insight into the costs of digital preservation by providing detailed information based on activities and processes. Linking this information to institutional and preservation policies gives organisations a clearer picture of their costs and a better steering mechanism.
The Network Digital Heritage was initiated by the Ministry of Education, Culture and Science, with a focus on exploring the Dutch digital cultural heritage further by making our digital collections more visible, better connected with each other and more sustainable. This working group is one of the activities in the Program Sustainable Access.
Inspired by the European Project 4C
The financial experts from BMC Research used the results of the European project 4C (Collaboration to Clarify the Costs of Curation). Among other valuable results, this project delivered a cost model and a tool, the CCEx module, which makes it possible to benchmark the costs of digital preservation on an international scale.
Gerrit Adriaensz. Berckheyde: Herengracht Amsterdam. Courtesy Rijksmuseum
Recently I attended a meeting at SPUI25, a discussion centre of the University of Amsterdam, where the theme of the evening was Digital Humanities, with a focus on research into the success of 17th century Amsterdam. That period was characterised by great wealth and a flourishing artistic life, with famous painters (Rembrandt van Rijn), famous publishers, important jewellery makers, cartographers, etc. Researchers nowadays try to relate these successes to each other, hoping to find the magic formula that led to them, and to learn something from it. Databases, dictionaries, websites and ontologies of linked data are being created. Lots of students are using these resources to finish their theses, and the creative industry is using these collections to build new games and make a living out of it. This sounds very promising, and the nice thing is that digitized material quite often lies at the basis of these results: books, newspapers, maps, archival records. This material was digitized and collected by big, publicly funded organizations like the National Library, the National Archive and municipal archives, the Rijksmuseum and owners of specialized collections. Researchers of this kind are part of the target audience these public organizations had in mind when they started digitizing their collections.
These digitized collections are becoming a source for research, and therefore we should call them “research data”. Quite often this phrase is reserved for data generated in large quantities by instruments, but excluding the data sets of the humanities and social sciences from “research data” is incorrect in my opinion. These data sets will also need to be preserved for the long term. Current and future researchers will want to check the findings of their fellow researchers, and for this purpose the digitized source material needs to remain available. To enable this, long-term preservation and curation of this digitized source material need to be in place. Digital preservation, however, requires budgets, and these budgets will only be available if the benefits of preservation are clear. Currently the relationship between the research done and the “research data” at the basis of it is hardly visible: sometimes there is a reference on a website or a footnote in an article. Some organisations keep statistics about the use of their collections and the number of downloads, but this information is often kept for internal and reporting purposes. Is this enough to convince the general public of the usefulness of digitization and preservation activities?
The Blue Ribbon Task Force already pointed to this risk and called it the “free-rider problem”: organisations are investing millions, and the ones who benefit from this work do not pay a penny or give enough credit to these organisations. With shrinking budgets this could become a problem for cultural heritage organisations. Here and there you see initiatives to better estimate the effect of digitization and preservation activities (see the work done by Neil Beagrie on value and impact), but these are isolated initiatives. Therefore I like the view of the 4C project (Collaboration to Clarify the Costs of Curation), which explicitly aims to investigate not only the cost factors but also the benefits for organisations with a preservation mandate. This way of thinking will force organisations to find better ways to get credit for their efforts and to advertise them, so that the general public, which is often paying for these activities, will see where the money goes. Cultural heritage organisations can no longer afford to do their good work in silence. They need to find ways to advertise their contributions to society in bold characters.
Lately there has been much debate about the fact that over the years the digital preservation community has managed to create a collection of more than a dozen cost models, making the confusion for everyone starting out in digital preservation even bigger. Maybe this is part of the way things go: everyone sees their own situation as something special with special needs. The solution? Tailoring an existing model or developing a new one. We can expect help from the recently started European project 4C, “The Collaboration to Clarify the Costs of Curation”. In their introduction they state that “4C reminds us that the point of this investment [in digital preservation] is to realise a benefit”. Less emphasis on the complexity of digital preservation, and more on the benefits.
Some people think that talking about digital preservation in terms of complexity and costs sounds more negative than thinking in terms of opportunities (or challenges) and benefits. But in both cases you will need the same hard figures about the costs you incur as an organisation and the benefits that arise from them. The latter are not easy to establish, but the work of Neil Beagrie and his team shows that it is possible to measure the benefits.
If we had better figures on the benefits of preserving digital material, we would be in a better position to estimate what it will cost us if digital material is not preserved, that is, the cost of letting digital objects die, be it intentionally or not. How much damage is done to society if crucial information is not preserved? Recently the concern was raised that some interesting websites, containing the research results of a project that lasted for several years, might not be harvested and preserved in a digital archive. The consequence would be a tremendous loss for the community in the related research discipline. This is clearly an incentive for preservation!
I remember that when the Planets project was proposed, it was argued that the obsolescence of digital information in Europe, if no action were taken to preserve it, could cost the community an astonishing 3 billion euro a year. I could not find a source for this assumption, only a reference to some articles. One of them described the amount of data that was created worldwide. The other described the costs for an organization that lacks proper tools to manage data (getting access, searching, not finding, etc.). It could be that the Planets assumption was derived from this information and used as an illustration to make the case for digital preservation (the number of stories in the Atlas of Digital Damages does not prove this assumption).
But in essence, it is these kinds of figures (and their related evidence) that we also need to have at hand: not only demonstrating the costs of digital preservation, but also demonstrating what it would cost society if we did not preserve things.