A principal goal of the Copyright Office is to digitize the content of the card catalog records. This work is already underway. The card catalog is considered the most up-to-date index to copyright records prior to 1978. It has been updated over time to reflect corrections and changes, sometimes with handwritten annotations and sometimes with new cards. No other set of Copyright Office records, once digitized and indexed, can provide the completeness and accuracy that can be realized from digitizing and indexing the card catalog. To the extent it remains in paper form, this nucleus of the non-digital Copyright Office records will remain off-line. Many of the works recorded in the card catalog are still under copyright protection. As presently envisioned, a searchable data record would be created from the cards, and each data record would have links to images of the respective cards, applications, record book pages, and documents.
Another index to copyright records is the Catalog of Copyright Entries (CCE), which was published periodically between 1891 and 1977. Scanning of the CCE volumes is also underway. If a high success rate could be achieved for optically recognizing, verifying, and parsing the character strings in the scanned images, followed by automatic indexing, the result could provide an alternative search capability and, to a degree, serve the preservation goal. It would also provide a better means of searching the CCE data in situations where that is recommended, such as a search for works pre-dating 1938. However, the CCEs do not contain all of the updates that have been made to the catalog cards, and they do not include assignments and transfers of copyrights, so they cannot provide the full record of ownership of a particular copyright. Nevertheless, a second goal is to digitize the contents of the CCEs and to explore ways to capitalize on the results of applying optical character recognition (OCR). Including links to the CCE page images in the respective data records is also being considered.
A third goal is to digitize the content of the bound Record Books of applications and requests for copyright registration and to link each image to the respective data record. Microfilm copies exist for all Record Books, and scanning the microfilm might facilitate digitization. However, the quality of the output may not be as good as that obtained by scanning the Record Book pages directly, and may not be acceptable.
A fourth goal is to link the existing PDF images of assignments and transfers of copyrights, made from the microfilm copies of the documents, to their respective data records. An assessment will be made of the quality of the PDF images and, if they are found acceptable, the PDF files will be used as is. If they are not acceptable, this goal will be expanded to determine what would be required, and what it would cost, to produce improved images from the microfilm.
Achieving these goals could result in a single searchable database of copyright records with links to images of currently non-digital records. The database could eventually cover the period from 1790 to the present.
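The envisioned structure, a searchable data record that links to images of the underlying paper records, might be sketched as follows. This is only an illustration and not the Copyright Office's actual design. The class names, the fields (`title`, `claimant`, `year`), the source labels, and the image paths are all hypothetical, and the simple word index stands in for whatever indexing method is eventually chosen.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ImageLink:
    """A link from a data record to one scanned image (hypothetical schema)."""
    source: str  # assumed labels: "card", "cce_page", "record_book_page", "document"
    uri: str     # placeholder location of the image file


@dataclass
class CopyrightRecord:
    """One searchable data record; the fields are illustrative assumptions."""
    record_id: str
    title: str
    claimant: str
    year: int
    images: List[ImageLink] = field(default_factory=list)


class RecordIndex:
    """A minimal in-memory word index over the title and claimant fields."""

    def __init__(self) -> None:
        self._records: Dict[str, CopyrightRecord] = {}
        self._by_word: Dict[str, Set[str]] = {}

    def add(self, record: CopyrightRecord) -> None:
        # Store the record and index every lowercased word it contains.
        self._records[record.record_id] = record
        text = f"{record.title} {record.claimant}".lower()
        for word in text.split():
            self._by_word.setdefault(word, set()).add(record.record_id)

    def search(self, query: str) -> List[CopyrightRecord]:
        # Return records containing every word of the query, ordered by id.
        words = query.lower().split()
        if not words:
            return []
        ids = set.intersection(*(self._by_word.get(w, set()) for w in words))
        return [self._records[i] for i in sorted(ids)]
```

In such a sketch, a search returns data records, and each returned record carries its own list of links to the card, CCE page, Record Book page, and document images, so the images themselves never need to be searched directly.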