The following is a guest post by Carla Miller, Administrative Specialist for the Office of Strategic Initiatives.
On March 23, 2012, the Still Image and Audio Visual Working Groups of the Federal Agencies Digitization Guidelines Initiative (FADGI) held a joint meeting hosted at the National Archives and Records Administration's (NARA) College Park campus. This is part one of a two-part post: it covers the Still Image Working Group meeting, and part two will cover the Audio Visual Working Group meeting.
The Still Image Working Group, led by Steve Puglia of the Library of Congress, discussed ongoing work (meeting slides are here), including the work of two sub-groups: File Format and Embedded Metadata. Currently, the File Format Sub-group is assessing five raster image formats the members consider acceptable for master image files for the cultural heritage community: TIFF, JPEG 2000, JPEG, PNG, and PDF. For many years, the trend has been to use uncompressed TIFF files, but the files are large and thus place a heavy burden on networks and storage. These factors have motivated a new look at file formats and at options for compression. The group is looking at more than just sustainability factors for these formats: it is also weighing costs (storage, network, ongoing, tools/start-up, access, and preservation) and implementation considerations. Stay tuned! Meanwhile, the Embedded Metadata Sub-group has adopted a set of guidelines from the Smithsonian as a recommendation.
The preceding should not suggest that image quality is not important! Although lossless compression will be best for certain categories of materials, Steve discussed ongoing research at LC on evaluation methods and the effects of lossy compression on raster images. Determining the effects of compression is a tricky business, though, because one wants to strike a balance between objective and subjective analyses. With this in mind, the approaches being considered fall into three categories: visual/subjective evaluation, metric/objective evaluation (e.g. mean squared error [MSE], peak signal-to-noise ratio [PSNR], structural similarity index [SSIM], etc.), and task accuracy (e.g. optical character recognition [OCR]).
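For readers who have not met the objective metrics named above, here is a minimal sketch of the two simplest ones, MSE and PSNR, using their standard textbook definitions. It is not the code used in the LC research; the function names and the sample pixel values are made up for illustration, and images are assumed to be 8-bit grayscale flattened into plain lists.

```python
import math


def mse(reference, test):
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(reference) != len(test):
        raise ValueError("images must have the same number of pixels")
    if not reference:
        raise ValueError("images must not be empty")
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def psnr(reference, test, max_value=255):
    """Peak signal-to-noise ratio in decibels (infinite for identical images)."""
    error = mse(reference, test)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


# Hypothetical pixel values, invented for this example only.
original = [52, 55, 61, 66, 70, 61, 64, 73]
compressed = [54, 55, 60, 66, 69, 62, 64, 71]

print("MSE:", mse(original, compressed))
print("PSNR (dB):", round(psnr(original, compressed), 2))
```

A lower MSE and a higher PSNR both mean the compressed image is numerically closer to the original. Neither number says how visible the difference is to a person, which is why the subjective and task-based categories sit alongside the metrics.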
Dr. Lei He at LC has been conducting an interesting analysis that looks at the correlation between objective measurements of the changes effected by image compression and the subjective reaction to those changes as noted by human observers. This is a work in progress, but the idea is that objective measures can be used to anticipate the likely subjective reactions. So far, the results are comparable to similar work done by other organizations. Preliminary findings indicate that at low to moderate lossy compression, the brightness and color values of a majority of pixels are altered to a very minor degree and the changes are well within a subjectively acceptable range. Also, the effect of compression varies for different collection types: cartoon drawings show the most effect, followed by fine prints and black & white negatives; color photos show the least effect.
Members of the Still Image Working Group are also undertaking a CIE Color Accuracy Study. The targets and samples being imaged for the study have been sent to and imaged by a second group of labs in the United States, including Harvard University, Stanford University, the Art Institute of Chicago and the National Gallery of Art in Washington, DC. The next step is to analyze the data collected from all seven North American labs, and a second phase of imaging by European labs has started. An update on the study findings will be presented at the IS&T Archiving Conference in Copenhagen, Denmark, this summer.
One challenge for institutions with collections of historic photographic negatives is deciding on what scanning resolution is appropriate for digitization. Steve talked about how the Library of Congress and other institutions are analyzing the spatial frequency response (SFR) of negatives from different collections using an SFR analysis described by Don Williams. Sample negatives from a collection are scanned at higher-than-expected resolution (verified with a target), and selected features in the digital images are analyzed to determine appropriate resolution for capturing all the scene detail.
The Library and other institutions are also applying similar techniques to monitor production scanning of negatives, with an SFR target and M-Scan/ImCheck software. This is accomplished by scanning the target on a daily basis (for larger targets, scanning all four corners and the center) and then plotting the results over time to monitor change and variability.
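The idea of tracking a daily target reading over time can be sketched in a few lines. This is an illustration only, not how M-Scan/ImCheck works: the readings are invented numbers standing in for whatever single SFR figure a lab chooses to record each day, and the three-standard-deviation band is an assumed, conventional control-chart choice rather than anything prescribed in the meeting.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (low, high) limits: baseline mean plus or minus k standard deviations."""
    center = mean(baseline)
    spread = stdev(baseline)
    return center - k * spread, center + k * spread


def out_of_range(readings, limits):
    """List (day, value) pairs whose value falls outside the limits (days start at 1)."""
    low, high = limits
    return [
        (day, value)
        for day, value in enumerate(readings, start=1)
        if not low <= value <= high
    ]


# Hypothetical daily readings from a period when the scanner was known to be good.
baseline = [0.50, 0.52, 0.49, 0.51, 0.50, 0.48, 0.51]
limits = control_limits(baseline)

# Hypothetical readings from later production days.
new_readings = [0.50, 0.49, 0.41, 0.51]
flagged = out_of_range(new_readings, limits)

print("limits:", limits)
print("days to investigate:", flagged)
```

With these made-up numbers, only the third day falls outside the band. For larger targets, the same check could be run separately for each corner and the center.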
Of interest to many in the community is the announcement that a new working group has been formed within ISO. WG26 of ISO TC42 plans to develop standards relating to digitization tools such as targets.