The archives of the Afro American Newspaper in Baltimore, MD, contain over 1.5 million historical photos spanning 115 years of the city’s African American history. One of the largest Black history collections in the world, the Afro’s archives include thousands of photos that have never been seen by the public.
Why? Of the paper’s 1.5 million photos, only around 10,000 exist in digital form; the Afro, like many small archives, simply does not have the human resources to manually digitize its collections. As a result, photos with incredible value for scholars, educators, and community members alike are available only to the select few with the access, specialized skills, and time to travel to the physical archive and locate them.
Project Gado was founded in 2010 to address these challenges. The project seeks to create an open-source archival scanning robot that small organizations like the Afro can use to autonomously digitize their photographic holdings. The Gado 1, a proof-of-concept machine built using Python and Arduino, has successfully scanned over 1,000 photos to date.
At present, Project Gado is developing the Gado 2 (pictured below as an early prototype), a second-generation machine which will cut scanning time by a factor of four, occupy a footprint half the size of the Gado 1’s, and require no specialized skills to assemble and operate. The project is also developing a photographic licensing site (launching May 2012) which will allow archival partners to generate a lasting revenue stream from their digital collections, creating an incentive for more small archives to adopt the Gado technology.
This talk will provide an overview of Project Gado and the Gado 2, and will address specific challenges faced and lessons learned from using Python as the primary language for an open robotics project and a major archival digitization initiative.
Technical topics covered will include Python and Arduino interfacing for machine control, Python/TWAIN integration, use of PIL and OpenCV for post-processing, and MySQL integration for image management and metadata annotation. These topics will be presented primarily in the context of a case study, rather than a tutorial; the main goal will be to show how Project Gado used these Python technologies to solve problems, and to demonstrate how the technologies could be used to solve similar problems in other cases.
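As a rough illustration of the image management and metadata annotation topic, the sketch below records each scanned file and attaches free-form annotations to it. It is not the Gado codebase: the table names, column names, and helper functions are hypothetical, and Python's built-in sqlite3 module stands in for MySQL so that the example runs without a database server.

```python
import sqlite3

# Hypothetical schema; the table and column names are illustrative
# assumptions, not taken from the Gado project.
SCHEMA = """
CREATE TABLE IF NOT EXISTS images (
    id         INTEGER PRIMARY KEY,
    file_path  TEXT NOT NULL UNIQUE,
    scanned_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS annotations (
    image_id INTEGER NOT NULL REFERENCES images(id),
    field    TEXT NOT NULL,
    value    TEXT NOT NULL
);
"""


def open_store(path=":memory:"):
    """Open (or create) the metadata store and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_scan(conn, file_path, scanned_at):
    """Register one scanned image and return its row id."""
    # sqlite3 uses "?" placeholders; common MySQL drivers use "%s" instead.
    cur = conn.execute(
        "INSERT INTO images (file_path, scanned_at) VALUES (?, ?)",
        (file_path, scanned_at),
    )
    conn.commit()
    return cur.lastrowid


def annotate(conn, image_id, field, value):
    """Attach one field/value annotation (e.g. a caption) to an image."""
    conn.execute(
        "INSERT INTO annotations (image_id, field, value) VALUES (?, ?, ?)",
        (image_id, field, value),
    )
    conn.commit()


def find_by_annotation(conn, field, value):
    """Return the file paths of all images carrying the given annotation."""
    rows = conn.execute(
        "SELECT i.file_path FROM images AS i "
        "JOIN annotations AS a ON a.image_id = i.id "
        "WHERE a.field = ? AND a.value = ? ORDER BY i.id",
        (field, value),
    ).fetchall()
    return [row[0] for row in rows]
```

A caller might register a file with `record_scan(conn, "scans/0001.tif", "2012-01-15T10:00:00")`, tag it with `annotate(conn, image_id, "subject", "parade")`, and later retrieve it with `find_by_annotation(conn, "subject", "parade")`. Keeping annotations in a separate field/value table lets archivists add new kinds of metadata without changing the schema.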
The talk will conclude with a discussion of opportunities for interested developers to contribute to the Gado codebase, and for interested institutions to implement the Gado 2 in their own archives.