How do you depict digital preservation? I gave it a try with this .png file. It is free for everyone to use under Creative Commons CC-BY-NC-ND! Just in time for the International Digital Preservation Day on 30 November 2017!
After Christmas I tried to reduce my digital pile of recent articles, conference papers, presentations etc. on digital preservation. Interesting initiatives (“a pan European AIP” in the e-Ark project: wow!) could not prevent me from ending up slightly in despair after a few days of reading: so many small initiatives, but shouldn’t we march together in a shared direction to get the most out of them? Where is our vision for this road? David Rosenthal’s blog post offered a potential medicine for my mood.
He referred to the article by Richard Whitt, “‘Through A Glass, Darkly’: Technical, Policy, and Financial Actions to Avert the Coming Digital Dark Ages,” 33 Santa Clara High Tech. L.J. 117 (2017). http://digitalcommons.law.scu.edu/chtlj/vol33/iss2/1
This is the text of a presentation I held at the combined 4C/DPC conference ‘Investing in Opportunity: Policy Practice and Planning for a Sustainable Digital Future’ in London 17-18 November 2014 at the Wellcome Trust.
A few months ago I was in Copenhagen and, as I like shopping in a foreign country and taking something home to remind me of my trip, I bought a golden ring. Well, not gold: it was gold plated, and after a few months of wearing it was a silver ring. Was that a disappointment? The design was still nice and the ring still fitted. But it no longer matched my other rings. And I was not expecting this to happen so soon. And if, instead of silver, the next layer had been brass or nickel or copper, I really would have been disappointed and felt betrayed. So it was gold, turned to silver, and yes, that did matter.
No one would appreciate a trustworthy digital archive turning from gold to silver…
What are we talking about?
The title of this presentation, “Is there a gold standard and does this matter?”, was suggested by William [Kilbride] when he invited me to this panel. But what we are talking about here of course is the ISO 16363 standard on Audit and Certification of trustworthy digital repositories, officially an ISO standard since 2012. Recently the accompanying standard for accreditation of auditors was officially published by ISO and now certification against this standard can start.
The ISO standard is the highest level in the European framework of certification, which starts with the basic level of the Data Seal of Approval, has a middle layer of the nestor/DIN standard, and ends with the highest level of this ISO standard. We currently don’t speak about layers like bronze, silver and gold, but if we did, the ISO standard would be the gold level in this model, the gold standard if you wish.
This audit standard has a history of more than 10 years and to really value its importance, we should know a bit of its history. Already in the OAIS model, a reference is made to a systematic approach of checking the conformity of organisations to this model, a standard for accreditation of archives. The first draft of TRAC criteria is based on this OAIS model – our shared view how digital preservation should be done. Ten years ago a draft version of TRAC was tested by performing audits on different repositories, among them the KB, and based on this feedback the Checklist of TRAC was published in 2007. 5 years later, with TRAC as a starting point, the ISO standard was published, which I will call here the TDR-standard. In summary, this standard is based on preservation insights of a wide variety of organisations and people, all knowledgeable in digital preservation and with different background. We sometimes might forget who is behind this standard and I tried to create an overview of the people and organisations that were involved in TRAC 2002, 2005, 2007, ISO 16363 and ISO 16919. A wide variety of organisations, and of people, some already retired, some still going strong and others working daily with digital preservation. The standard was created really with input from the preservation community.
Is a gold standard also a perfect standard?
That need not be the case. Although a lot of experience is woven into the standard, it is not a static document carved in stone. For the TDR standard, I think it is too early to judge: no audits have taken place against this standard yet.
Let me give you an example of another “gold” standard we all share: the OAIS model ISO 14721, out there for more than 10 years. The OAIS standard has proven its value, on various areas but especially as the Esperanto of the digital preservation community. However, standards operate in a changing world and that has an influence on the value of that standard. If a standard is not adapted to developments in the changing world – and digital preservation is an evolving area – it will lose its value.
Some people think the OAIS model is outdated and does not keep up with the developments in preserving digital material. Since 2002, when the OAIS was first published, the preservation insights have changed. As an example: emulation is now an accepted approach, whereas 10 years ago it was far less accepted than migration. Quite often, in practice, the well-known functional model of the OAIS is extended with a function “pre-process”. Sometimes the OAIS standard is interpreted more as a straitjacket than as a conceptual model. This leads now and then to strong opinions on Twitter, like suggestions to burn OAIS, never to mention it in a presentation etc. Other people contribute in a more constructive way by blogging about specific shortcomings of OAIS or by further developing the model, as we did in the Planets project, where Paul Wheatley and I added the Preservation Watch concept, which was extended in the SCAPE project. Others created a framework to apply OAIS to distributed digital preservation. These contributions will help to keep a standard alive and we, as a preservation community, should try to incorporate new insights into the standard.
That is why ISO has a process for reviewing its standards. Every 5 years, theoretically at least, changes can be made to a standard. The first opportunity for the OAIS model (and coincidentally for the TDR standard as well) is 2017.
The ISO process is not quite clear, but there is enough reason for the preservation community to investigate how to be involved in this process and keep the standards up to date. I’ll talk with the Dutch Standard Organisation NEN soon and ask them what is the official procedure and how we can collaboratively act. I’ll report on this at our KB research blog.
Back to the TDR standard.
The TDR standard was developed to audit trustworthy digital repositories. A big difference with the OAIS model, where the individual institute decides whether it is compliant or not, is that here a group of auditors makes this judgement. Currently anyone can say that their repository conforms to OAIS, but this will no longer be possible for the TDR standard. There an external body will decide whether your repository is a Trustworthy Digital Repository according to the “gold” standard, based on the information you gave the auditors and, no less important, on the knowledge and experience of the auditors. And this is a crucial factor in the process: who are these auditors? Sometimes the more cynical people tell me that an audit process starts with giving the auditors a good dinner and having a chat, and then you’ll get the certificate. Needless to say, this is not a good auditing process, and certainly not one that is covered by the rules and regulations described in ISO 16919. This standard describes the qualities an auditor needs to have in general, and it is specifically extended with qualities related to digital preservation. Despite this, how do we as a community get to the point that we trust the auditors and that the certificate really reflects a trustworthy repository? Two elements contribute to this: transparency and the time-limited validity of the certificate.
One of the measures that were taken is “transparency”: all documentation, except for really confidential material, will be publicly available after the certificate is given. There are already a few examples out there based on TRAC audit by CRL, LOCKSS, Portico, Hathi Trust, Chronopolis and Scholars Portal. This transparency will help to foster discussion in the community about the certification.
The other element is that a certificate, in contrast to the repository itself, is not for the “long term”. Instead it lasts 5 years or so. After that the repository will need to start a new audit process to renew the certificate, no doubt under changed conditions.
Will a certified repository be a successful preservation environment?
Chances are yes, but no guarantee can be given. As a community we agreed on an approach to digital preservation by accepting the OAIS model and the derived audit standards, described and hopefully regularly discussed and updated in this golden approach, so to speak. By the way, the monetary gold standard is not in use any more, but has been replaced by other mechanisms. Whether our gold standards will lead to success, time will tell.
I blogged before about the enormous amount of time it takes before a draft standard becomes an approved ISO standard, and this week what we were waiting for finally happened: ISO officially published the ISO 16919 standard. Its full name: Requirements for bodies providing audit and certification of candidate trustworthy digital repositories. This mouthful of words all comes down to one element: consistency. An audit done by auditors from one part of the world should be comparable with an audit done by auditors in another part of the world. ISO has tackled this problem by having a standard that regulates the accreditation of auditors: ISO 17021 Standard requirements for A&C general management systems. The PTAB group adapted this standard in order to make it complementary to the ISO 16363-2012 standard Audit and Certification of Trustworthy Digital Repositories. What additions were needed? In short: specific digital preservation knowledge is introduced as a requirement in this standard. This is done by an explicit reference to the OAIS standard ISO 14721-2012 and to ISO 16363-2012. A list of competencies is added, describing the qualifications an auditor should possess to participate in the audit process. This list also focuses on digital preservation aspects, adding to general requirements for auditors like confidentiality, impartiality, responsibility etc. Experience in digital preservation is expected and, where knowledge is lacking, training might fill the gaps. The next step will be the appointment of qualified auditors. Here the national standard bodies play an important role, as they monitor this process. So we are not there yet, but an important milestone has been reached. The European framework of audit and certification sees an external audit according to ISO 16363 as the highest level of certification. An organisation can start with certification according to the Data Seal of Approval, followed by the Nestor/DIN 31644 standard.
This will leave some time to train qualified auditors and get experience with the concept of certification in the evolving world of digital preservation.
I never realized that the procedure of getting to an ISO standard could take several years, but this is true for two standards related to audit and certification of trustworthy digital repositories. Although we have the ISO 16363 standard on Audit and Certification since 2012, official audits cannot take place against this standard until the related standard Requirements for bodies providing Audit and Certification (ISO 16919) is approved, regulating the appointment of auditors. This standard, similar to the ISO 16363 compiled by the PTAB group in which I participate, was already finished a few years ago, but the ISO review procedure, especially when revisions need to be made, takes long. The latest prediction is that this summer (2014) the ISO 16919 will be approved, after which national standardization bodies can train the future (official) auditors. How many organizations will then apply for an official certification against the ISO standard is not yet clear, but if you’re planning to do so, it might be worthwhile to have a look at the recent report of the European 4C project Quality and trustworthiness as economic determinants in digital curation.
The 4C project (Collaboration to Clarify the Cost of Curation) is looking at the costs and benefits of digital curation. Trustworthiness is one of the 15 “economic determinants” they distinguish. As quality is seen as a precondition for trustworthiness, the 4C project focusses in this report on the costs and benefits of “standards based quality assurance” and looks at the 5 current standards related to audit and certification: DSA, Drambora, DIN 31644 of the German nestor group, TRAC and TDR. The first part of the report gives an overview of the current status of these standards. Woven into this overview are some interesting thoughts about audit and certification. It all starts with the Open Archival Information System (OAIS) Reference Model. The report suggests that the OAIS model is there to help organisations to create processes and workflows (page 18), but I think this does not do justice to the OAIS model. If one really reads the OAIS standard from cover to cover (and shouldn’t we all do that regularly?) one will recognize that the OAIS model expects a repository to do more than designing workflows and processes. Instead, a repository needs to develop a vision on how to do digital preservation, and the OAIS model gives directions. But the OAIS model is not a book of recipes and we are all trying to find the best way to translate OAIS into practice. It is this lack of evidence about which approach will offer the best preserved digital objects that made the authors of the report wonder whether an audit that takes place now might lead to a risky outcome (either too much confidence in the repository or too little). They use the phrase “dispositional trust”: “It is the trustor’s belief that it will have a certain goal B in the future and, whenever it will have such a goal and certain conditions obtain, the trustee will perform A and thereby will ensure B.” (p. 22).
We expect that our actions will lead to a good result in the future, but this is uncertain as we don’t have an agreed common approach with evidence that this approach will be successful. This is a good point to keep in mind, I think, as is the fact that many more standards are applicable to digital preservation than only those mentioned above: security standards, record management standards and standards related to the creation of the digital object, to name just a few.
Based on publicly available audit reports (mainly TRAC and DSA, and test audits on TDR) the report describes the main benefits of audits for organisations as
These benefits are rather vague but one could argue that these vague notions might lead to more tangible benefits in the future like more (paying) depositors, more funding, etc. By the way, one of the benefits recognized in the test audits was the process of peer review in itself and the ability for the repository management to discuss the daily practices with knowledgeable people.
The authors also tried to get more information about costs related to audit and certification, but had to admit in the end that currently there is hardly any information about the actual costs of an audit and/or of getting certified (why they mention on page 23 financial figures of 2 specific audits without any context is unclear to me). They base themselves mainly on information that was collected during the test audits that the APARSEN project performed and on the taxonomy of costs that was created. For costs we need to wait for more audits and for repositories that are willing to publish all their costs in relation to this exercise.
Reading between the lines, one could easily conclude that it is not recommended to perform audits yet. But especially now the DP community is working hard to discover the best way to protect digital material, it is important for any repository to protect their investments and to avoid that current funding organizations (often tax payers) will back off because of costly mistakes. The APARSEN trial audits were performed by experts in the field and the audited organizations (and these experts) found the discussions and recommendations valuable. As standards are evolving and best practices and tools are developed, a regular audit by experts in the field can certainly safeguard organizations to minimize the risk for the material. These expert auditors need to be aware of the current state of digital preservation, the uncertainties, the risks, the lack of tools and the best practices that are there. The audit results will help the community to understand the issues encountered by the audited organizations, as audit results will be published.
As I noticed while reading a lot of preservation policies for SCAPE, many organisations want to get certified and put this aim in their policies. Publishers want to have their data and publications in trustworthy, certified repositories. But all stakeholders (funders, auditors, repository management) should realise that the outcomes of an audit should be seen in the light of the current state of digital preservation: that of pioneering.
For everyone who is interested in an article in Dutch, which explains the main topics of the OAIS model, please read my recently published article
Het OAIS-model, een leidraad voor duurzame toegankelijkheid.
This article was published in the Handboek Informatiewetenschap, december 2012, part IV B 690-1 pp. 1-27. It will also appear in the IWA database, at www.iwabase.nl. And later I’ll add it to this blog. Happy reading!
In an earlier post I already announced the new version of OAIS, but today I saw on the ISO site that it is officially published! What was a Recommended Standard for CCSDS in 2002 is now a Recommended Practice in 2012. An overview of the main additions is given in my post http://digitalpreservation.nl/seeds/standards/oais-2012-update/
Some people might have noticed that downloading the OAIS model from the CCSDS site now results in an improved version, OAIS 2012: http://public.ccsds.org/publications/archive/650x0m2.pdf Meanwhile this new standard is awaiting approval from ISO, which is expected shortly. So we now have an updated version of the most important standard in digital preservation. The new version contains a change bar in the margin indicating the textual differences from the previous version. Personally I found it more convenient to have an overview of the main differences and created this summary. But please have a look yourself if you want to know the details.
The main changes are the following:
Access Rights Information
Access Rights information is added as an element to the Preservation Description Information. Access Rights information is not only restricted to access by the Consumer, but encompasses the permissions for preservation operations, distribution and usage of Content Information, so it offers a broader scope on the rights of an archive to handle material. Most of these will be regulated in the Submission Agreement with the Producer.
Emulation and relation to software
Emulation as a preservation strategy was a bit underestimated in OAIS-2002, but developments in emulation shown in various projects seem to have led to a higher appreciation of emulation as a strategy. At various places the text shows that migration is no longer the only strategy. Chapter 5 discusses emulation as a strategy to preserve access services or the original look and feel, and explains how different varieties of emulation fit in the OAIS model.
Preservation Planning Functional entity
There is more interaction between the Administration Functional Entity and the Preservation Planning Functional Entity. The Preservation Planning Functional Entity will create preservation plans (and not only “migration plans” as mentioned in the previous version), based on its monitoring activity, and will send these to the Administration Functional Entity to be performed. But the Administration Functional Entity (or rather its function Establish Standards and Policies) will also receive periodic risk analyses created by Preservation Planning to act upon, which gives the Preservation Planning Functional Entity a more active role in monitoring not only the outside world but also the OAIS itself. Interaction also takes place when Preservation Planning sends recommendations on AIP updates and the Administration Functional Entity replies with preservation requirements (added to the already existing “migration goals and approved standards”). So for creating new migration packages, the input consists not only of the preservation requirements resulting from monitoring the Designated Community, but also of the preservation requirements from Administration. In general some loose ends seem to be tied up here.
Authenticity and Information Properties
The confusing use of the word “authentication” has now been changed and the term “authenticity” is defined, using the much used definition “The degree to which a person (or system) regards an object as what it is purported to be”, with an important addition: “Authenticity is judged on the basis of evidence.” Two paragraphs describe ways to create this evidence. One of the ways lies in the new concept of Transformational Information Properties. A Transformation was already defined as a digital migration where the bits of the Content Object and/or the PDI change, ultimately resulting in an AIP version. This is a risky operation as you might lose important information contained in the original AIP. By defining Transformational Information Properties, you can define beforehand which properties need to be kept after the transformation. The fact that these properties are still there in the new AIP version will contribute to the Authenticity of the object. The Information Property is related to the commonly known but not always clearly defined term “significant property”, but I think more discussion is needed to define better where the differences and similarities between the two concepts lie and how to translate this into daily practice.
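To make the idea of Transformational Information Properties a bit more concrete, here is a minimal sketch in Python. It is only my own illustration and not something the OAIS standard prescribes: the class `AIP`, the function `lost_properties` and the example property names are all hypothetical, and a real AIP is of course far richer than a dictionary of measured values.

```python
from dataclasses import dataclass, field


@dataclass
class AIP:
    """Hypothetical, heavily simplified stand-in for an Archival Information Package."""
    identifier: str
    # Property name -> value measured on the content, e.g. {"page_count": 12}
    properties: dict = field(default_factory=dict)


def lost_properties(source: AIP, candidate: AIP, required: set) -> set:
    """Return the required properties that did not survive the transformation unchanged."""
    return {
        name
        for name in required
        if name not in candidate.properties
        or source.properties.get(name) != candidate.properties[name]
    }


# Assumed example: a migration that changes the file format on purpose,
# but accidentally changes the colour space as well.
source = AIP("aip-1", {"page_count": 12, "colour_space": "sRGB", "file_format": "TIFF"})
candidate = AIP("aip-1-v2", {"page_count": 12, "colour_space": "Adobe RGB", "file_format": "JPEG2000"})

# Defined beforehand: the properties that must be kept after the transformation.
required = {"page_count", "colour_space"}

print(lost_properties(source, candidate, required))  # {'colour_space'}
```

An empty result could then be recorded as part of the evidence for the Authenticity of the new AIP version; a non-empty result is a signal to look at the transformation again before replacing the source AIP.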
Definition Information package
The term Information Package is redefined to: “A logical container composed of optional Content Information and optional associated Preservation Description Information. Associated with this Information Package is Packaging Information used to delimit and identify the Content Information and Package Description information used to facilitate searches for the Content Information.”
(original definition: The Content Information and associated Preservation Description Information which is needed to aid in the preservation of the Content Information. The Information Package has associated Packaging Information used to delimit and identify the Content Information and Preservation Description Information.)
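The difference between the two definitions can be sketched as a small data structure. This is my own reading and a strong simplification: the class name, the field types and the helper `has_content_and_pdi` are hypothetical, and I model the associated Packaging Information and Package Description simply as fields of the same object.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InformationPackage:
    """Simplified Information Package following the 2012 wording."""
    # Associated with the package: delimits and identifies the Content Information.
    packaging_information: str
    # Associated with the package: facilitates searches for the Content Information.
    package_description: str
    # Both parts are optional in the 2012 definition.
    content_information: Optional[bytes] = None
    preservation_description_information: Optional[dict] = None


def has_content_and_pdi(pkg: InformationPackage) -> bool:
    """True if the package carries both parts, as the original wording seems to assume."""
    return (
        pkg.content_information is not None
        and pkg.preservation_description_information is not None
    )


# A package that so far only carries Content Information fits the 2012 definition.
partial = InformationPackage(
    packaging_information="tar container",
    package_description="title and creator index",
    content_information=b"scanned pages",
)
print(has_content_and_pdi(partial))  # False
```

Under the new definition such a partial package is still an Information Package, which is the practical effect of the word “optional”.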
A new definition is introduced “Other Representation Information” and described as “Representation Information which cannot easily be classified as Semantic or Structural. For example software, algorithms, encryption, written instructions and many other things may be needed to understand the Content Data Object, all of which therefore would be, by definition, Representation Information, yet would not obviously be either Structure or Semantics. Information defining how the Structure and the Semantic Information relate to each other, or software needed to process a database file would also be regarded as Other Representation Information.” The more variety and complexity in digital material to preserve, and the growing understanding on what we need to describe to keep the material accessible and understandable, the more we will need to adapt the standard.
Chapter : Preservation Perspectives
This chapter is adapted with the above mentioned changes and refines the definitions for an AIP version (An AIP whose Content Information or Preservation Description Information has undergone a Transformation on a source AIP and is a candidate to replace the source AIP. An AIP version is considered to be the result of a Digital Migration.) versus an AIP edition (An AIP whose Content Information or Preservation Description Information has been upgraded or improved with the intent not to preserve information, but to increase or improve it. An AIP edition is not considered to be the result of a Migration.)
Minor additions are, for example, the mention of “temporary storage” as a place for information packages that are in the midst of a process.
This updated version of the OAIS standard was published almost 10 years after the first publication. The worldwide research done in digital preservation and the insights gained from that research suggest that the 5 year review period of ISO will be needed to keep the OAIS model in pace with the developments.
[i] Reference Model for an Open Archival Information System (OAIS). Magenta Book. Issue 2. June 2012.