Data

All community (user-contributed) transcriptions of the U.S. Census will be released under the CC0 1.0 Public Domain Dedication, which permits both commercial and non-commercial use. We will make the data available for download on this page as transcription progresses. If you can, we encourage you to contribute to help offset the cost of developing, maintaining, and hosting the project; however, there is no requirement to do so. If you redistribute the data elsewhere or incorporate it into products, we respectfully request that you provide a link to opengendata.org or a short attribution statement such as "U.S. Census data courtesy of opengendata.org." If you use the data for research that results in publications, please cite the JCDL 2018 paper.

Census images (without transcripts) for 1790 through 1930 are available for free online at archive.org through a collaboration between the Allen County Public Library Genealogy Center in Fort Wayne, Indiana and the Internet Archive. Images for 1940 are available from the National Archives at archives.gov. We have already downloaded the 1940 images from the National Archives, so if you would like a bulk copy instead of downloading them yourself, contact us to make arrangements. The cost is $1,500, including the external hard drives. Orders may take a few days to process after we confirm the order and receive payment (USA destinations only).

For researchers building and improving automatic handwriting recognition and transcription software: we plan to make a dataset available here for free download that you can use for training, testing, and comparing your algorithms.