Crowdsourcing the transcription of digitized archival records

Increasingly, researchers who use primary source materials hope their research can be done remotely through the use of digitized copies of archival records. However, simply providing access to these digital surrogates is not always enough to optimize their utility for research. Though optical character recognition (OCR) technologies can, under ideal circumstances, provide reasonably good keyword access to modern typescript documents, the contents of digitized handwritten and non-standard archival records are generally not searchable without the creation of modern transcriptions, translations or tags. Transcribing digitized archival records is an incredibly time-consuming and resource-heavy activity that archivists often do not have time to undertake. Consequently, crowdsourcing the transcription of these types of materials can be a great way to share the work, introduce people to an institution’s archival holdings, provide experience working with primary sources, and add research value to those digitized records. This work, along with the human labour that goes into the digitization of primary source documents in the first place, forms the foundation of many digital scholarship research projects.

Given the current global restrictions on physical access to archival holdings due to the COVID-19 pandemic, this seems like a good moment to highlight the many projects that provide non-archivists with the opportunity to engage with primary source records like letters, photographs, government documents, maps, diaries, audio recordings and more. Whether you’re a history or literature buff seeking a quarantine project, an educator looking for online, experiential ways to engage your students with primary sources, or a home-schooling parent hoping to mix up your daily curriculum, there is a project out there for you!

United States

There are quite a few archives and libraries in the United States that manage their own crowdsourced projects for the public. While there are too many to list, here are several particularly interesting projects.

The Library of Congress’ By The People webpage details their current transcription campaigns, including Suffrage: Women Fight for the Vote, Letters to Lincoln, and Walt Whitman at 200.

The Newberry Library in Chicago has transcription projects relating to letters and diaries of Midwest families, American Indians, and US western expansion.

The New York Public Library has three projects of note currently available. For map lovers with NYC knowledge, there is the Building Inspector project, which encourages participants to identify buildings and other details on historic maps. Foodies and food historians may want to participate in the What’s On The Menu project to transcribe the NYPL’s collection of restaurant menus. Also of note is the Community Oral History Project, which invites participants to transcribe oral history recordings that document the lives of NYC citizens, their communities and neighbourhoods.

For fans of military history, the early papers of the US War Department, dating from 1784 to 1800 and once thought lost to time, are available online, and a transcription project is underway.

The Smithsonian Institution has a transcription center relating to its various branches. Current projects include astronaut Sally Ride’s handwritten speech notes and the diary of an American teenager during World War II, Doris Sidney Blake.

The US National Archives describes its crowdsourced projects as citizen archivist missions and organizes them by level of complexity, for beginners and more experienced transcribers. Current options include Alan Turing’s Treatise on the Enigma and a 1967 criminal docket in the matter of U.S. vs. Cassius Clay Jr. aka Muhammad Ali.

On a smaller institutional scale, there are several other US projects worth highlighting. The Old Weather project, in which participants transcribe weather observations from mid-19th-century ships’ logs, might appeal to sailing fans. The DIY History project, from the University of Iowa Libraries' Special Collections, University Archives and Iowa Women's Archives, allows participants to keyword search for documents to transcribe, including their collection of science fiction fanzines. Hamilton College’s American Prison Writing Archive allows supporters to transcribe handwritten essays submitted by inmates to Prison Legal News, making searchable the writing that documents the experiences of American prisoners.


Canada

Canadian archives and libraries, plagued by perpetual resource and funding issues, have fewer crowdsourced projects than our counterparts in the United States. This is a good reminder that even projects that utilize volunteer labour require staff oversight and management, which is not always easy for small institutions to facilitate.

However, Library and Archives Canada (LAC) has gotten into the game with their Co-Lab tool, which allows participants to work on a variety of transcription, translation and tagging challenges while working with LAC’s digitized records. Items like diaries, politicians’ letters, photographs, index cards, First World War records and government documents relating to Indigenous peoples are some of the materials available to work on. Not surprisingly, their transcription project for the 1918-1919 Spanish Flu Pandemic records is now 100% complete.

The Nova Scotia Archives is another government archives with a transcription pilot project. Its current featured project is Commissioner of Public Records - refugee Negroes.

Edited to add:

The Community Archives of Belleville and Hastings County also currently hosts a crowdsourced project that invites participants to transcribe the 1871 Hastings County assessment roll. The University of Guelph Library has its own transcription project that encourages participants to transcribe the diaries of rural Ontarians dating from 1800 to 1960 on their Rural Diary Archive website.

United Kingdom

Jeremy Bentham fans should check out University College London’s Transcribe Bentham project, which aims to increase access to original and unstudied Bentham manuscripts.

The British Library has a separate website for its crowdsourced projects, Libcrowds. Current highlighted projects include transcribing digitized catalogue cards and transcribing the contents of historical theatre playbills.


Last but certainly not least is Zooniverse, a well-known platform for what they call “people-powered research”. Zooniverse provides a space for institutions that may not have the resources to host and promote a crowdsourced project on their own. There is always a wide variety of projects available on different subject matter. Here is just a selection of current transcription projects:

Anti-slavery manuscripts from the Boston Public Library

A multi-institutional project to translate and transcribe the Cairo Geniza

A project to track the lives and records of Australian prisoners

A multi-institutional project to transcribe the military records of African-American Civil War soldiers

A Universidade de Coimbra project to track plant species and scientists from 19th-century correspondence [correspondence in Portuguese]

