The following is a guest post by Barry Wheeler, Digital Projects Coordinator, Office of Strategic Initiatives.
Given the size and number of my personal digital files, my archiving problem may be a bit extreme, but I think a description of my archiving system may be helpful to many people who want to save and preserve their digital files. I am a photographer and I have tens of thousands of pictures, including both professional and personal work, to archive each year. Add in my routine papers, spreadsheets, and data files and I often have over 500 GB of digital files to preserve at the end of each year.
I do maintain backups – but in my system, backups cannot be considered part of my archive. Backups preserve current work. Over the years, as drives fill up or I change computers, old files are removed from the primary drive on my main computer. I want to preserve all these files that are no longer part of my backups. My backups are also compressed and maintained by proprietary software. I’d prefer to have my archive in common standard formats that will be readable in the future – or at least will be easy to convert to an accessible format. Therefore I have developed a system of yearly archiving. Again, this system is mainly for those with above-average amounts of digital material. If you have less, do not be discouraged! At the end of this post, I’ve provided a simpler version for those with less material.
I prepared for my archiving task well in advance. All my content files are stored in directories by broad general topic, then in subdirectories named by date and specific subject. (All of my data subdirectories begin with the date in YYYYMMDD format so they sort automatically most recent to oldest.) Once the files are organized, I follow these six steps in my end-of-the-year archive processing:
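For readers comfortable with a little scripting, the date-prefix convention can be illustrated with a short Python sketch. Because the YYYYMMDD prefix has a fixed width, sorting the names as plain text is the same as sorting them by date, in either direction. The directory names and the helper functions `parse_prefix` and `newest_first` are hypothetical and are not part of the workflow described in this post.

```python
from datetime import datetime


def parse_prefix(dirname):
    """Return the date encoded in the first 8 characters (YYYYMMDD).

    Raises ValueError if the name does not start with a valid date,
    which makes this a handy check for misnamed directories.
    """
    return datetime.strptime(dirname[:8], "%Y%m%d")


def newest_first(dirnames):
    """Sort directory names so the most recent date prefix comes first."""
    # A plain text sort on the fixed-width prefix matches a date sort.
    return sorted(dirnames, key=lambda name: name[:8], reverse=True)


# Hypothetical subdirectory names following the date-then-subject pattern.
names = ["20110704_fireworks", "20101225_family", "20110315_portraits"]
for name in newest_first(names):
    print(parse_prefix(name).date(), name)
```

Dropping `reverse=True` gives the oldest-first order instead.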
First, I purchase a new external hard disk drive each year. I name the new drive – physically with a tape labeler and electronically in my drive properties tool. Thus, this year’s drive is ARCHIVE11. I also name a top-level directory “archive11”.
Second, I copy all documents for the past year to the new drive into the “archive11” directory. This should take about 33% of the drive space.
Third, I set up a powered USB hub connected to my primary computer and connect each of my yearly archive drives to the hub – each drive should appear on my computer desktop as ARCHIVEXX. (By now, I have 6 USB drives! A forest of drives and a tangle of cables as Figure 1 above shows.)
Fourth, I check the available space on last year’s drive. If I used approximately 33% of the space last year, I should have enough space, so I create another top-level directory named “archive11”. Again, I copy all documents for the past year into this “archive11” directory. I now have two copies of my past year’s documents, each on a separate drive. As part of my archive plan, I also keep a copy of my very best images on a remote, “cloud” site, but that’s another blog. (Figure 2 shows my computer desktop, directory listing, and file info panel.)
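The copy-to-two-drives idea in the second and fourth steps could be scripted along the following lines. This is only a sketch under stated assumptions: it checks that every drive has enough free space before copying anything, then copies the year's files into an “archive11” directory on each drive. The function names `directory_size` and `copy_year` are made up for this illustration, and the paths in the usage note are placeholders that depend on where your system mounts its drives.

```python
import shutil
from pathlib import Path


def directory_size(root):
    """Total size in bytes of all regular files under root."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


def copy_year(source, drive_roots, archive_name):
    """Copy source into <drive>/<archive_name> on every drive.

    All drives are checked for free space first, so nothing is copied
    unless every copy is expected to fit.
    """
    needed = directory_size(source)
    for drive in drive_roots:
        free = shutil.disk_usage(drive).free
        if free < needed:
            raise OSError(f"{drive}: {free} bytes free, {needed} needed")

    copies = []
    for drive in drive_roots:
        target = Path(drive) / archive_name
        # copytree raises FileExistsError if the target already exists,
        # which guards against overwriting an earlier archive by mistake.
        shutil.copytree(source, target)
        copies.append(target)
    return copies
```

A call might look like `copy_year("/path/to/this-years-files", ["/Volumes/ARCHIVE11", "/Volumes/ARCHIVE10"], "archive11")`, where both the source path and the mount points are assumptions to be replaced with your own.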
Fifth, I use a disk utility to check (and repair if necessary) the integrity of each external drive. Then I check file permissions with repair on each external archive drive. I also randomly select and open a number of files from each drive and each archive directory.
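Opening a random selection of files by hand can be supplemented with an automated spot check. The sketch below uses a different technique from the disk utility described above: it picks a random sample of files from one archive copy and compares SHA-256 checksums against the same files on the second copy. It does not replace a disk check or actually opening the files in their applications. The function names and the default sample size of 20 are assumptions chosen for illustration.

```python
import hashlib
import random
from pathlib import Path


def sha256_of(path):
    """Checksum a file in 1 MiB chunks so large images are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def spot_check(copy_a, copy_b, sample_size=20, seed=None):
    """Compare a random sample of files from copy_a against copy_b.

    Returns the relative paths that are missing from copy_b or whose
    contents differ. An empty list means the sample matched.
    """
    copy_a, copy_b = Path(copy_a), Path(copy_b)
    files = sorted(p.relative_to(copy_a) for p in copy_a.rglob("*") if p.is_file())
    sample = random.Random(seed).sample(files, min(sample_size, len(files)))

    problems = []
    for rel in sample:
        other = copy_b / rel
        if not other.is_file() or sha256_of(copy_a / rel) != sha256_of(other):
            problems.append(rel)
    return problems
```

A matching sample only shows that the two copies agree with each other, so it is most useful right after copying, while the original files are still known to be good.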
Finally, I import both new archive directories into my cataloging database (again, perhaps a suitable subject for another blog) and then power down and store each drive with its power supply until next year. At this point I can delete all archived files I do not intend to work with immediately from my working drive. (Figure 3 shows my drives ready for storage.) My yearly archive processing is now done!
As promised, this process can be simplified and used by anyone who wants to archive their most important digital documents. The basic process a user can follow is:
- Gather all your important data files into one master directory.
- Arrange them by year – especially if you archive tax files.
- Make copies of the files. If the total directory size fits on a CD (i.e., less than 600 MB), then make separate copies on two archival gold CDs. (The life expectancy of an inexpensive or standard CD is uncertain so I think the archival gold CDs are worth the extra expense.)
- Continue copying each year, on two new gold CDs.
- Check all CDs yearly. If your document collection is too big for CDs, use external hard disk drives – and be even more vigilant in checking the drives!
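To decide between CDs and hard drives in the simplified process, the size of the master directory can be measured with a few lines of Python. This is a rough sketch: the 600 MB limit comes from the list above, but treating a megabyte as 1,000,000 bytes is an assumption, and the names `total_size` and `suggest_medium` are hypothetical.

```python
import os

# Assumption: 600 MB counted as 600 * 1,000,000 bytes.
CD_LIMIT_BYTES = 600 * 1000 * 1000


def total_size(root):
    """Add up the sizes of all files under the master directory."""
    total = 0
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            total += os.path.getsize(os.path.join(folder, name))
    return total


def suggest_medium(size_bytes, limit=CD_LIMIT_BYTES):
    """Suggest a storage medium following the rule of thumb in the list above."""
    if size_bytes < limit:
        return "two archival gold CDs"
    return "external hard disk drives"
```

For example, `suggest_medium(total_size("/path/to/master-directory"))` would report which option applies, with the path replaced by your own master directory.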
(See Barry Wheeler’s previous post for The Signal, on photo sharing sites.)