Porter Olsen from the Maryland Institute for Technology in the Humanities (MITH) hosted a full-day webinar at Stanford University on Friday, August 29, 2014, to introduce archivists from Stanford University Libraries, the Hoover Institution, and UC Berkeley’s Bancroft Library to BitCurator, an open-source, all-in-one suite of digital forensics tools. The BitCurator project is a joint venture between MITH and the School of Information and Library Science (SILS) at the University of North Carolina at Chapel Hill. Funding is provided by the Andrew W. Mellon Foundation. Many of the archivists in attendance were already using commercial software such as FTK Imager to perform digital forensics work at their institutions, but were interested in moving to a more cost-efficient open-source solution. The BitCurator project seeks to fill this niche by providing a sustainable, open-source platform that evolves with the needs of libraries and archives as they acquire increasingly large and complex digital collections.
Libraries and archives with large digital collections need methods to securely migrate data from a variety of media (3.5-inch floppy diskettes, hard drives, email archives, etc.) into a preservation repository. Once transferred, this data also needs to be checked for file integrity and scanned for information that may need to be redacted (credit card numbers, Social Security numbers, home addresses, etc.). Olsen demonstrated two of the digital forensics tools in the BitCurator suite that address these types of problems: Guymager and Bulk Extractor.
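The integrity check described above is commonly done by comparing checksums of each file before and after transfer. The following is a minimal Python sketch of that idea, not part of BitCurator itself; the function names `sha256_of` and `verify_transfer` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(source_dir, dest_dir):
    """Return relative paths whose copy is missing or differs from the source.

    Hypothetical helper: walks source_dir and compares each file's
    checksum with the file at the same relative path in dest_dir.
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = dest_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(rel.as_posix())
    return problems
```

An empty list from `verify_transfer` means every source file has a byte-identical copy in the destination; any path it returns would need to be transferred again or investigated.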
Guymager can be used to create disk images of storage media and save them in one of three formats: Linux dd raw image (disk dump utility), Expert Witness Format (.E01), and Advanced Forensic Format (.AFF). A disk image is a bit-by-bit copy of the storage medium, which means it copies all sectors of a disk, even parts that might read as “available” empty space. Olsen demonstrated how Guymager’s graphical interface could be used to safely mount and image a USB drive by right-clicking the drive in a list of attached drives and choosing “Acquire Image.” There was also an option to perform the same operations using a command line interface.
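To make the idea of a bit-by-bit copy concrete, here is a minimal Python sketch of what a raw (dd-style) image amounts to: every byte of the source is read in order and written to an image file, and a checksum is computed along the way. This is an illustration only, not how Guymager is implemented; the function name `image_raw` is hypothetical, and reading a real device (for example a path such as `/dev/sdb`) would require suitable permissions and read-only handling of the source.

```python
import hashlib


def image_raw(source_path, image_path, chunk_size=1024 * 1024):
    """Copy every byte of source_path to image_path.

    Returns (bytes_copied, sha256_hex) so the image can be verified later.
    Nothing is skipped: empty or unallocated regions are copied as-is.
    """
    digest = hashlib.sha256()
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()
```

Because the copy works on the raw byte stream rather than on files, regions the file system marks as free are preserved in the image, which is why deleted material can sometimes be recovered from it.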
Olsen also ran Bulk Extractor, which scans files according to user-specified parameters and highlights information that may be sensitive. For example, a string that starts with the letters “UID” followed by a nine-digit number could be a student identification number. Since false positives may happen when searching for numbers of a certain length, Bulk Extractor provides fields where users can enter additional context to help narrow the search.
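The kind of pattern search described above can be sketched with a regular expression. The Python example below is an illustration of the general technique, not Bulk Extractor's own scanner or configuration syntax; the exact pattern (the letters “UID” immediately followed by exactly nine digits) and the names `UID_PATTERN` and `find_candidates` are assumptions. Returning the surrounding bytes with each hit gives a reviewer the context needed to dismiss false positives.

```python
import re

# Assumed pattern: "UID" followed by exactly nine digits (no tenth digit).
UID_PATTERN = re.compile(rb"UID(\d{9})(?!\d)")


def find_candidates(data, context=16):
    """Return (offset, matched_text, surrounding_bytes) for each candidate.

    data is raw bytes, e.g. the contents of a file or a disk image.
    """
    hits = []
    for match in UID_PATTERN.finditer(data):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(data))
        hits.append(
            (match.start(), match.group().decode("ascii"), data[start:end])
        )
    return hits
```

Requiring the “UID” prefix and rejecting a tenth digit are two simple ways to cut down on matches against unrelated numbers such as phone numbers.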
The BitCurator project will transition in September 2014 from a venture funded by the Andrew W. Mellon Foundation to one sustained by the BitCurator Consortium. This group of experts includes members of Stanford University Libraries, and will be instrumental in supporting and growing BitCurator as an essential component of digital forensics work performed in libraries and archives.