Big Public Databases
1. Kin Lane’s Federal Dataset Tool
http://federal-agency-dataset-adoption.publicprivatesector.org/index.html
Many of the following listings refer to US Federal Government datasets. These are some of the biggest public datasets available. Unfortunately, much of this data is messy and was published with little regard for how it would be consumed and used.
This project, from Kin Lane, is both an index of a huge number of Federal Government datasets and a way to access versions of these datasets on GitHub as other users clean and repurpose them, with descriptions of the alterations made. This tool requires a GitHub account.
Keywords: big data, public data, Open Data Policy-Managing Information, GitHub
Audience: social scientists, economists, advocacy groups
2. IPUMS project (Minnesota Population Center)
https://www.ipums.org/
The Integrated Public Use Microdata Series (IPUMS) is an enormous database of individual-level microdata. The data in IPUMS USA is drawn from the United States census, the American Community Survey and the Current Population Survey. IPUMS International includes census data from 73 countries, harmonized to allow comparisons across different times and places.
Combined, these datasets comprise a truly massive resource of census-type information, carefully prepared for ready comparison and interrogation. The website has a built-in SQL selector, a useful FAQ covering the basic questions about the data and its use, and a forum covering more technical questions.
Keywords: IPUMS, Minnesota Population Center, big data, microdata, census
Audience: population survey researchers, marketers, social scientists, political scientists