Mike Wash is an engineer, technologist, inventor and visionary who holds 18 patents (search on the U.S. Patent Office site for “Wash; Michael L”), designed and implemented many of the standard automatic functions in modern digital and film cameras and, in an incredible feat of engineering, helped create a new data infrastructure for the Government Printing Office. Today he is the chief information officer of the National Archives and Records Administration, and he has ambitious plans to whip NARA into an efficient 21st-century federal institution.
Wash, a lifelong photographer, graduated from Purdue University in 1977 with a bachelor’s degree in electrical engineering and went directly to work for Eastman Kodak, where he had worked as a student since 1973.
Initially he helped improve the integrated circuits used in exposure control and other automated camera functions. By the 1980s, when Kodak realized that the demand for film (its largest source of income) would fade as digital photography evolved, the company explored other potentially profitable consumer products. Wash researched removable digital storage devices for cameras, specifically high-capacity magnetic storage devices, the ancestors of modern camera memory cards.
Wash then worked on Kodak’s last consumer film format, the Advantix system. “I was tasked with creating an information exchange model and design that would allow metadata to be captured on conventional photographic film,” said Wash, referring to technical metadata such as date, time and lighting.
While metadata capture is straightforward for a digital camera, Wash’s lab was working with a unique hybrid of digital data and emulsion on film. He said, “Our solution was to apply a virtually transparent magnetic coating on the back of the film to capture the digital data, while the emulsion on the front side of the film captured the optical data. It allowed the photographic system to be unimpeded by the digital technology. It was really very cool.”
His last project at Kodak, and the one that may have a pivotal relationship to his transition to government data systems, was creating a large-scale distributed photo-processing system. The result was Kodak Perfect Touch processing, which was scalable enough to meet a variety of demands, from walk-up kiosks (such as the ones in pharmacies) to retail one-hour photo stores to high-volume wholesale operations.
Wash left Kodak and worked for a while for Gerber Scientific Products, a sign manufacturing giant, helping the company transition to digital inkjet systems. During his time there, a recruiter invited him to help the Government Printing Office transform from traditional print-centric operations to digital publication operations. After several conversations with the Public Printer, Wash was hired in 2004 as GPO’s chief technical officer.
At that time, GPO’s existing digital publications system was ten years old and had limited functions. “You put publications on a server, then online, and you provide some rudimentary search capabilities to get access to documents that you hope are correct,” said Wash. “They needed a system based on the mission of GPO…capable of accepting input from the federal government, assuring the user that it was authentic information, preserving that data in perpetuity and providing permanent public access.” Wash’s team spent five years researching and developing this new system and in 2009 released the Federal Digital System, referred to as FDsys.
Authentication is crucial to FDsys’s credibility, so FDsys initiates a metadata “chain of custody” at the point of content submission — where the data came from, who the author is and so on — so the provenance is always available. “And when the content is rendered for access, we could apply a digital signature to it,” said Wash.
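To make the idea concrete, here is a minimal sketch of a metadata chain of custody followed by a signature applied at access time. It is an illustration only, not FDsys’s actual implementation: the function names, field names and key are hypothetical, and the HMAC is a stand-in for the public-key digital signature a production system would use.

```python
import hashlib
import hmac
import json

# Hypothetical demo key. A real system would sign with a private key
# (public-key signature) rather than a shared secret.
SIGNING_KEY = b"demo-key-not-for-production"


def add_custody_event(chain, actor, action, content):
    """Append a custody event linked to the previous event by its hash."""
    prev_hash = chain[-1]["event_hash"] if chain else ""
    event = {
        "actor": actor,
        "action": action,
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "prev_hash": prev_hash,
    }
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    event["event_hash"] = hashlib.sha256(payload).hexdigest()
    chain.append(event)
    return event


def verify_chain(chain):
    """Return True if every event is intact and linked to its predecessor."""
    prev_hash = ""
    for event in chain:
        body = {k: v for k, v in event.items() if k != "event_hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if event["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256(payload).hexdigest() != event["event_hash"]:
            return False
        prev_hash = event["event_hash"]
    return True


def sign_rendition(content):
    """Produce a signature for content as it is rendered for access."""
    return hmac.new(SIGNING_KEY, content, hashlib.sha256).hexdigest()


def verify_rendition(content, signature):
    """Check that the content still matches the signature issued for it."""
    return hmac.compare_digest(sign_rendition(content), signature)
```

Each event records who handled the content and a hash of the previous event, so altering any earlier entry breaks the chain, while the signature lets a reader confirm that the file received is the file that was published.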
When Wash came to NARA in March 2011, some of the challenges he faced were comparable to those at GPO and some were different. Both agencies require an information-management system that preserves, authenticates and makes information permanently accessible. However, NARA’s information scope is larger, since GPO works solely with publications. And whereas Wash had to replace GPO’s old content-management system, NARA’s Electronic Records Archives (ERA), built in the mid-2000s, did not need replacing. He said, “The challenge at the National Archives is to support (NARA’s ERA), enhance it and allow it to continue to move forward to meet the agency and the government’s needs.”
Wash said that another challenge down the road is to enable federal government content management systems to communicate with each other. He pointed out that NARA’s ERA does communicate with FDsys but in a limited way: NARA has to submit information to FDsys to be hosted and made available.
One reason NARA uses FDsys to publish data is that FDsys is a workhorse that reliably responds to intense periods of user downloads. Wash cites the release of the Nixon testimony transcripts as an example. The 37 transcript files range in size from 5 to 10 MB each, or 250 MB as a single bundle. On the day of the transcripts’ release, FDsys easily handled about 14,000 individual file requests and over 6,000 bundled requests. In all, that amounted to over 1.6 TB of downloads.
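The total can be roughly checked from the figures quoted above. The sketch below is a back-of-the-envelope estimate, assuming decimal units (1 TB = 1,000,000 MB) and treating the rounded request counts as exact; the variable and function names are illustrative.

```python
# Figures as reported in the article; request counts are approximate.
INDIVIDUAL_REQUESTS = 14_000
BUNDLED_REQUESTS = 6_000
BUNDLE_SIZE_MB = 250
MB_PER_TB = 1_000_000  # assumption: decimal units


def total_download_tb(file_size_mb):
    """Estimate total download volume in TB for a given per-file size in MB."""
    individual_mb = INDIVIDUAL_REQUESTS * file_size_mb
    bundled_mb = BUNDLED_REQUESTS * BUNDLE_SIZE_MB
    return (individual_mb + bundled_mb) / MB_PER_TB


# Individual files range from 5 to 10 MB, so compute both ends of the range.
low_tb = total_download_tb(5)
high_tb = total_download_tb(10)
print(f"Estimated total: {low_tb:.2f} to {high_tb:.2f} TB")
```

The estimate lands between about 1.57 and 1.64 TB, with the bundled downloads alone accounting for 1.5 TB. Since there were more than 6,000 bundled requests, a total above 1.6 TB is consistent with the reported numbers.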
A few times during our conversation, Wash emphasized GPO and NARA’s missions of creating permanent public access to government content. It’s apparent that he has a solemn and sincere dedication to his responsibility.
Access applies not only to the front end — the browser through which users get at the data — but also to the back end and how efficiently institutions transfer, store and retrieve the data. Government digital collections can be massive and difficult to move. For example, the 2010 census, which was entirely electronic, is over 300 TB. Wash said, “In the federal government there are lots of agencies that constantly generate an awful lot of data.”
Wash is planning NARA’s place in the cloud. He points out that government institutions must comply with the Federal Cloud Computing Strategy and its “Cloud First” policy, which states, “Beginning immediately [December 2010], the Federal Government will shift to a ‘Cloud First’ policy…. When evaluating options for new IT deployments, OMB will require that agencies default to cloud-based solutions whenever a secure, reliable, cost-effective cloud option exists.”
A distributed federal government cloud structure will free institutions from the costly cycle of buying, maintaining and disposing of hardware. And resources can be scaled to serve a lot of information during periods of peak demand and dialed back when the demand subsides.
Wash said that the General Services Administration is working to identify qualified federal cloud providers. Cloud storage, then, will become part of each institution’s budget. “It becomes almost like a GSA buy,” said Wash. “Like you’re buying pencils, you’ll buy storage off of a GSA schedule from a certified cloud provider.”
Part of Wash’s cloud-based data-architecture strategy for tracking data is the implementation of management metadata, so it won’t matter where the hardware is as long as NARA’s content is always immediately accessible. “We won’t need data centers where you can go out and hug your server once a day and say, ‘That’s where my data is,’” said Wash. “You won’t know where the records are, you’ll just know that they are accessible.”
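One way to picture management metadata is as a catalog that maps a stable record identifier to wherever the bytes currently live. The sketch below is a hypothetical illustration, not a description of NARA’s design: the class names, fields and location strings are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RecordEntry:
    """Management metadata for one record (hypothetical fields)."""
    record_id: str
    sha256: str
    locations: list = field(default_factory=list)


class RecordCatalog:
    """Maps stable record IDs to their current storage locations."""

    def __init__(self):
        self._entries = {}

    def register(self, record_id, sha256, location):
        """Record that a copy of the record exists at the given location."""
        entry = self._entries.setdefault(record_id, RecordEntry(record_id, sha256))
        if location not in entry.locations:
            entry.locations.append(location)

    def relocate(self, record_id, old_location, new_location):
        """Update the catalog when a copy moves between providers."""
        entry = self._entries[record_id]
        entry.locations = [
            new_location if loc == old_location else loc
            for loc in entry.locations
        ]

    def resolve(self, record_id):
        """Return the current locations; callers never hard-code them."""
        return list(self._entries[record_id].locations)
```

A caller asks only for the record ID. If the record moves from one certified provider to another, the catalog entry changes and the request still succeeds, which is the sense in which users need not know where the records are.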