Preservation and Conversion Strategies at the Bentley Historical Library

The Bentley Historical Library is committed to the long-term preservation of and access to its digital collections. Because the library must contend with thousands of potential file formats, Digital Curation Services has adopted a three-tier approach to facilitate the preservation and conversion of digital content: preservation of sustainable formats (Tier 1), conversion of at-risk formats (Tier 2), and bit-level preservation of all other formats (Tier 3).

This document provides further information on the Bentley Historical Library's accepted preservation formats and conversion strategies.

Tier 1: Preservation of Sustainable Formats

The library has identified a number of sustainable file formats that are widely used and/or non-proprietary, many of which have been recognized as international standards by bodies such as the International Organization for Standardization (ISO), Ecma International, and the Organization for the Advancement of Structured Information Standards (OASIS). The longevity of these formats has also been acknowledged by peer institutions and experts in the digital curation community, including the Library of Congress's National Digital Information Infrastructure and Preservation Program.

Digital materials stored in these file formats should remain usable to researchers and administrative units at the University of Michigan for the foreseeable future. The Bentley Historical Library will therefore preserve content that arrives in these sustainable formats in its original version, as received at the time of accession. Digital Curation Services will monitor community best practices and technological advances in case a migration to alternative preservation formats should prove necessary.

For specifications and more in-depth information on file formats, search the PRONOM Technical Registry.

Sustainable Preservation Formats by Media Type

Audio Files
  WAV: Waveform Audio File Format
  AIFF: Audio Interchange File Format
  MP3: Moving Picture Experts Group (MPEG) Audio Layer III File
  FLAC: Free Lossless Audio Codec File
  OGG: Ogg Vorbis Audio File
  MIDI: Musical Instrument Digital Interface File (including SMF and XMF wrappers)

Office Documents and Text-Based Files
  DOCX: MS Word Open XML Document
  XLSX: MS Excel Open XML Document
  PPTX: MS PowerPoint Open XML Presentation
  PDF/A: Portable Document Format (Archival)
  PDF: Portable Document Format
  TXT: Plain Text File
  RTF: Rich Text Format File
  XML: Extensible Markup Language Data File
  CSV: Comma Separated Values File
  TSV: Tab Separated Values File

Database Files
  CSV: Comma Separated Values File
  SIARD: Software Independent Archiving of Relational Databases (open XML format)
  MySQL SQL: Structured Query Language File (MySQL is an open source relational database management system)

Email Files
  MBOX: Mailbox File

Raster Image Files
  TIFF: Tagged Image File Format
  JPEG/JFIF: Joint Photographic Experts Group JPEG Interchange Format File (lossy compression)
  JPEG 2000: Joint Photographic Experts Group 2000 File (lossless compression)
  GIF: Graphics Interchange Format
  PNG: Portable Network Graphics

Vector Image Files
  SVG: Scalable Vector Graphics File

Video Files
  MPEG-1/2: Moving Picture Experts Group
  AVI: Audio Video Interleave File (uncompressed)
  MOV: QuickTime Movie (uncompressed)
  MP4: MPEG-4 Video File (with H.264 encoding)
  MJ2: Motion JPEG 2000
  MXF: Material Exchange Format File (uncompressed)
  DV: Digital Video File (non-proprietary)
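
As an illustration only, the Tier 1 list above can be treated as a simple lookup. The following Python sketch assumes that formats are recognized by file extension, which is a rough proxy: an extension cannot distinguish PDF from PDF/A or a compressed AVI from an uncompressed one, and signature-based identification would be more reliable. The extensions chosen and the names TIER1_EXTENSIONS and tier1_media_type are hypothetical and do not describe the library's actual tooling.

```python
from pathlib import Path
from typing import Optional

# Assumed, commonly used extensions for the Tier 1 formats listed above.
# CSV is listed under two media types in the policy; it is filed under
# text-based files here purely to keep the lookup unambiguous.
TIER1_EXTENSIONS = {
    ".wav": "Audio", ".aif": "Audio", ".aiff": "Audio", ".mp3": "Audio",
    ".flac": "Audio", ".ogg": "Audio", ".mid": "Audio",
    ".docx": "Office/Text", ".xlsx": "Office/Text", ".pptx": "Office/Text",
    ".pdf": "Office/Text", ".txt": "Office/Text", ".rtf": "Office/Text",
    ".xml": "Office/Text", ".csv": "Office/Text", ".tsv": "Office/Text",
    ".siard": "Database", ".sql": "Database",
    ".mbox": "Email",
    ".tif": "Raster Image", ".tiff": "Raster Image", ".jpg": "Raster Image",
    ".jpeg": "Raster Image", ".jp2": "Raster Image", ".gif": "Raster Image",
    ".png": "Raster Image",
    ".svg": "Vector Image",
    ".mpg": "Video", ".mpeg": "Video", ".avi": "Video", ".mov": "Video",
    ".mp4": "Video", ".mj2": "Video", ".mxf": "Video", ".dv": "Video",
}


def tier1_media_type(filename: str) -> Optional[str]:
    """Return the media type if the extension is on the Tier 1 list, else None."""
    return TIER1_EXTENSIONS.get(Path(filename).suffix.lower())
```

A file for which this function returns None would fall under Tier 2 or Tier 3 rather than being preserved as received.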

Tier 2: Conversion of At-Risk Formats

The digital curation community has long acknowledged the disadvantages posed by proprietary formats (which can be opened only with specific software) and by content encoded with "lossy" compression (i.e., compression that reduces the quality of the data to conserve space). The Bentley Historical Library will therefore convert the most common at-risk formats to preservation-quality sustainable formats. The original version of the content will be maintained alongside the preservation copy to ensure the authenticity of the Bentley Library's digital collections. These conversion strategies reflect the policies and practices of peer institutions as well as the National Digital Information Infrastructure and Preservation Program.

Visit the Library of Congress Sustainability of Digital Formats site for more information on preservation issues and descriptions of preferred formats.

At-Risk Formats and Preservation Targets by Media Type

Audio Files (preservation target: WAV format, preferably Broadcast WAVE)
  WMA: Windows Media Audio File
  RA: Real Audio File
  SND: Apple Sound File
  AU: Sun Audio File

Office Documents and Text-Based Files (preservation target: MS Office Open XML (OOXML) format)
  DOC: MS Word 1997-2003 Document
  PPT: MS PowerPoint 1997-2003 Presentation
  XLS: MS Excel 1997-2003 Spreadsheet

Database Files (preservation target: SIARD open XML format)
  ACCDB or MDB: MS Access Database Files
  MS SQL Server Database Files
  Oracle Database Files

Email Files (preservation target: MBOX format)
  EML: Email Message File
  PST: Outlook Personal Information Store File
  Eudora Mail and approx. 40 other formats

Raster Image Files (preservation target: TIFF format)
  BMP: Windows Bitmap
  PSD: Adobe Photoshop Document
  RAW: Raw Image Data File
  FPX: FlashPix Bitmap
  PCD: Kodak Photo CD Image
  PCT: Apple Picture File
  TGA: Targa Graphic

Vector Image Files (preservation target: SVG format)
  AI: Adobe Illustrator
  WMF: Windows Metafile

Vector Image Files (preservation target: PDF/A format)
  PS: PostScript
  EPS: Encapsulated PostScript

Video Files (preservation target: MPEG-4 with H.264 encoding)
  SWF: Shockwave Flash
  FLV: Flash Video
  WMV: Windows Media Video
  RV (or RM): Real Video
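
The pairing of at-risk formats with preservation targets above can be read as a lookup table. The Python sketch below is a minimal illustration under stated assumptions: formats are recognized by file extension, the specific OOXML targets (DOCX, PPTX, XLSX) are inferred from the Tier 1 list, and the names CONVERSION_TARGETS and preservation_target are hypothetical. It only reports a target and performs no conversion. Entries without a single characteristic extension (RAW image data, server database files, Eudora and other mail formats) are omitted.

```python
from pathlib import Path
from typing import Optional

# Assumed extensions for the at-risk formats named in the Tier 2 list,
# mapped to the preservation target given for each group.
CONVERSION_TARGETS = {
    ".wma": "WAV", ".ra": "WAV", ".snd": "WAV", ".au": "WAV",
    ".doc": "DOCX", ".ppt": "PPTX", ".xls": "XLSX",
    ".accdb": "SIARD", ".mdb": "SIARD",
    ".eml": "MBOX", ".pst": "MBOX",
    ".bmp": "TIFF", ".psd": "TIFF", ".fpx": "TIFF", ".pcd": "TIFF",
    ".pct": "TIFF", ".tga": "TIFF",
    ".ai": "SVG", ".wmf": "SVG",
    ".ps": "PDF/A", ".eps": "PDF/A",
    ".swf": "MP4 (H.264)", ".flv": "MP4 (H.264)", ".wmv": "MP4 (H.264)",
    ".rv": "MP4 (H.264)", ".rm": "MP4 (H.264)",
}


def preservation_target(filename: str) -> Optional[str]:
    """Return the preservation target for an at-risk file, or None if the
    extension is not on the Tier 2 list."""
    return CONVERSION_TARGETS.get(Path(filename).suffix.lower())
```

Because the policy keeps the original alongside the preservation copy, any workflow built on such a lookup would write the converted file next to the original rather than replacing it.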

Tier 3: Bit-Level Preservation

Because it is infeasible to create conversion plans for the tens of thousands of formats in existence, the Bentley Historical Library will provide bit-level preservation for digital holdings in other formats (i.e., ones not specifically identified in this document). The use of integrity checks and regular replacement of storage media (conducted by trusted partners in the University of Michigan Library Information Technology division and Information and Technology Services) will preserve the raw data stored in these files (i.e., the "stream" of 0s and 1s) in its original state. The library concedes that hardware or software obsolescence may reduce the functionality of these files or render them inaccessible. At the same time, the faithful preservation of the bitstreams will allow the library to take advantage of future developments in emulation technology.
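
An integrity check of the kind mentioned above is commonly implemented by recording a checksum when a file is accessioned and recomputing it later. The following Python sketch shows the idea. The choice of SHA-256 and the function names sha256_of and is_intact are assumptions made for illustration; this document does not specify which algorithm or tools are used.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks so that
    large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, recorded_digest: str) -> bool:
    """Return True if the file's current digest matches the recorded one."""
    return sha256_of(path) == recorded_digest
```

If is_intact returns False, the bitstream has changed since the digest was recorded, and the file would need to be restored from another copy.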

Please contact Digital Curation Services with questions or comments regarding the Bentley Historical Library's digital preservation and conversion strategies.