The Latin American Materials Project (LAMP) at the Center for Research Libraries received funds in 1994 from The Andrew W. Mellon Foundation to explore aspects of digitization from microfilm. Working in cooperation with the Biblioteca Nacional do Rio de Janeiro, LAMP selected its collection of Brazilian government documents because of their scarcity, importance, and volume. Completed in December 2000, the project produced more than 673,000 digitized images of government publications, which are freely available over the web.
Read more in the detailed final report of the project.
Technical Information
The Brazilian documents were scanned from microfilm copies of the originals, which were filmed by the Biblioteca Nacional do Rio de Janeiro. The images were originally stored as GIF and TIFF images in the Center for Research Libraries’ Electronic Document Storage and Distribution Facility. TIFF images were produced at 300 dots per inch ("dpi") and GIF derivatives for presentation on the web were produced at a resolution of 100 dpi.
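The relationship between the 300 dpi masters and the 100 dpi web derivatives can be illustrated with a little arithmetic: a derivative that keeps the same physical page size has one third of the master's pixels in each dimension. The sketch below is only an illustration. The function name and the sample page dimensions are hypothetical, and it does not describe the software CRL actually used to produce the GIF files.

```python
def derivative_size(width_px, height_px, master_dpi=300, derivative_dpi=100):
    """Return (width, height) in pixels for a derivative image that keeps
    the same physical size as the master at a lower resolution.

    The defaults mirror the 300 dpi TIFF / 100 dpi GIF figures given above.
    """
    # Multiply before dividing so that exact ratios stay exact.
    new_width = round(width_px * derivative_dpi / master_dpi)
    new_height = round(height_px * derivative_dpi / master_dpi)
    return new_width, new_height


# Hypothetical master: an 8 x 10 inch page at 300 dpi is 2400 x 3000 pixels.
print(derivative_size(2400, 3000))  # (800, 1000)
```

Each dimension shrinks by a factor of three, so the derivative holds roughly one ninth as many pixels as the master, which is why the smaller files suited web presentation.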
In 2001 CRL transferred the project files, originally stored on a magneto-optical "jukebox" storage array, to large-capacity hard disks. In 2009 CRL migrated the project files into its Drupal-based system.
At the time of project implementation, Optical Character Recognition (OCR) software was not yet mature enough to produce reliable results for Portuguese-language material presented in various typefaces and font sizes. The density and optical resolution of the microfilm “originals” also precluded meaningful OCR at the time. The project team instead used manual page-level indexing to provide navigational assistance within each report. Selected tables of contents were keyed later in the project, with hyperlinks added to the appropriate page of each table.
In 2018 CRL reprocessed the project's TIFF images using state-of-the-art OCR software, generating full-text results for each item and page, and producing coordinated OCR output to highlight search terms found within the documents. The files were migrated to CRL's newest-generation Digital Delivery System (DDS), which presents the contents in a IIIF-compliant image management system with full-text search and download capabilities.
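For readers unfamiliar with IIIF, the sketch below shows how an image request is typically composed under the IIIF Image API URL pattern ({base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}). The base URL and identifier are placeholders rather than actual DDS addresses, the function name is hypothetical, and the "max" size keyword assumes a server implementing Image API 2.1 or later; the text above does not state which version the DDS supports.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose a IIIF Image API request URL.

    `base` and `identifier` are supplied by the caller; the values used
    below are placeholders, not real CRL endpoints.
    """
    # Identifiers must be percent-encoded, including any slashes they contain.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and page identifier, for illustration only.
print(iiif_image_url("https://iiif.example.org/image", "brazil/page 1"))
```

Because region, size, and rotation are part of the URL itself, a viewer can request a cropped or scaled view of a page without the server storing separate derivative files.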
Image Quality and Selection
The images in this database are uneven in quality. Most images are legible, but some are not. The quality of images can vary considerably from one page to the next. Poor image quality is due to the poor condition of the paper copy when it was filmed. The damaged paper copy resulted in degraded microfilm images, and that degradation carried over to the electronic medium. In a few isolated cases, such as the entire Piaui Provincial Presidential Reports, the film images were not scannable at the time of the project; the Piaui files were scanned separately in 2009 and added to CRL's DDS. Documents not available on film have been considered unavailable and have been omitted from these files. Blank pages included in the original page sequences, which were also copied during microfilming, have not been scanned. The corresponding page numbers for these pages were dropped from the original index.