The National Digital Stewardship Alliance (NDSA) conducted a survey in 2017 about fixity (best) practices in the preservation community and recently published the results. The survey was intended to answer two questions: (1) what common practices exist for fixity checking, and (2) what challenges do institutions face when implementing a fixity check routine?
The NDSA has produced two documents in the past that encouraged organizations to think about fixity checking. One is the Levels of Digital Preservation, in which fixity checks are included in each of the four levels, with each level requiring more fixity-related activity. The other is the 2014 document What is Fixity, and When Should I be Checking It?, an overview of fixity-related topics that takes a neutral position without stating preferences among the various options. That overview very much informed the phrasing of the survey questions, although the survey was not designed to check whether the suggested practices had actually been implemented.
After 20 years of practice, the digital preservation community is very much in need of shared (best) practices. Especially such a basic starting point as fixity checking could do with a clear, agreed approach.
The respondents of the survey (not only NDSA members but institutions from all over the world; in total 89 respondents completed the survey) showed they are very aware of fixity checking when receiving materials, when transferring materials, and again at regular intervals. These periodic checks are done on the whole collection, not just on a sample set. In general the fixity checking is software based: none of the 72 respondents who answered this question indicated that they used only hardware for fixity checking. This might be helpful in discussions with IT when they tell you that the hardware will take care of fixity checking.
The majority uses the MD5 algorithm, although I remember that years ago this was already considered less reliable than SHA-256, which came in second place. In most cases the system administrators are responsible for the fixity checking process, with preservation managers and digital archivists in second place.
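To illustrate the difference between the two algorithms, both are available in Python's standard hashlib module. A minimal sketch of chunked file hashing (the helper name and chunk size are my own choices, not from the survey):

```python
import hashlib

def file_fixity(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Compute a hex digest for a file, reading in chunks so large files fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# The two algorithms produce digests of different lengths:
data = b"example content"
print(len(hashlib.md5(data).hexdigest()))     # MD5: 32 hex characters (128 bits)
print(len(hashlib.sha256(data).hexdigest()))  # SHA-256: 64 hex characters (256 bits)
```

Either digest detects accidental corruption; the case for SHA-256 is that MD5 is cryptographically broken, so it cannot rule out deliberate tampering.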
A frequent point of discussion is where to record the fixity information. Q19 in the survey asked exactly that, offering a choice between (1) in databases and logs, (2) alongside the content, or (3) in the files themselves (e.g. stored in the file header of an AV file). Most respondents chose more than one option, with 40% recording it in a database or logs and 29% as part of the metadata record.
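For the "alongside content" option, one common pattern is a sidecar manifest stored in the same directory as the files, in the style of sha256sum output or a BagIt manifest. A minimal sketch, assuming a flat directory and a hypothetical manifest file name:

```python
import hashlib
from pathlib import Path

def write_manifest(directory: Path, manifest_name: str = "manifest-sha256.txt") -> Path:
    """Record a SHA-256 digest for every file in the directory, one per line."""
    lines = []
    for path in sorted(directory.iterdir()):
        if path.is_file() and path.name != manifest_name:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            lines.append(f"{digest}  {path.name}")
    manifest = directory / manifest_name
    manifest.write_text("\n".join(lines) + "\n")
    return manifest

def verify_manifest(manifest: Path) -> list[str]:
    """Return the names of files whose current digest no longer matches the manifest."""
    failures = []
    for line in manifest.read_text().splitlines():
        digest, name = line.split("  ", 1)
        current = hashlib.sha256((manifest.parent / name).read_bytes()).hexdigest()
        if current != digest:
            failures.append(name)
    return failures
```

Running verify_manifest at regular intervals and logging the failures is one simple way to implement the periodic whole-collection check the survey describes.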
The survey shows that various challenges still prevent a best practice from being implemented. Quite often respondents receive no fixity information from depositors, despite requiring it in the deposit agreement; most of them then create their own fixity information after receiving the content. Lack of time and resources also influences how frequently checks can be planned.
Luckily there is a Next Steps section in the document, with a suggestion to gather more information about how often fixity checks fail and what is done in those cases. The NDSA might also start working on a “maturity model with good/better/best practices”. It is time the preservation community agreed on its basic starting points, so I would welcome such a model!