The following notes are from a talk by Myron Gutmann of the ICPSR at the Charleston Conference. Gutmann talked about the long history of open data in the social sciences.

Open data wasn't invented yesterday: The Roper Center for public access to polling data opened in 1946. ICPSR was founded in 1962. Research data sharing really took off in the '60s with census data, major surveys, etc. In the social sciences, a large number of students & faculty build careers on public data. This was made possible, in part, by the post-war and 1960s increase in funding for the social sciences.

"Putting People on the Map" report from National Academies of Science. How do we make sure social data is shared, the public is protected, and public investment reaps value?

The social sciences benefit from the DDI (Data Documentation Initiative) metadata standard, which defines an XML format for open data sharing within the social sciences.
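To make the idea of XML study metadata concrete, here is a minimal sketch in Python. It is not from the talk. The record is a made-up, heavily simplified fragment that borrows a few element names from DDI Codebook (`codeBook`, `stdyDscr`, `citation`, `titlStmt`, `titl`, `IDNo`) and leaves out namespaces and most required content. The study title, the ID, and the helper `study_summary` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified record: element names follow DDI Codebook,
# but namespaces and most of the real structure are omitted.
SAMPLE = """<codeBook>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl>Example Household Survey</titl>
        <IDNo>EXAMPLE-0001</IDNo>
      </titlStmt>
    </citation>
  </stdyDscr>
</codeBook>"""


def study_summary(xml_text):
    """Return the study title and identifier from a simplified record."""
    root = ET.fromstring(xml_text)
    # Path is relative to the root <codeBook> element.
    stmt = root.find("stdyDscr/citation/titlStmt")
    return {"title": stmt.findtext("titl"), "id": stmt.findtext("IDNo")}


summary = study_summary(SAMPLE)
print(summary)
```

Because the description travels with the data in a shared, machine-readable form, any archive or researcher can read the same fields without knowing how the original project stored them.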

What do you mean by data? Traditionally: closed-end survey results. Increasingly, though, are the questions themselves data? Actually this is a more "normal" way to approach the literature for a social scientist. Qualitative data? Video & audio?

Minding our Knitting (or what social scientists can't lose focus on):

  • Curation matters

  • The scientific authority of ICPSR is critical, and is established by reports that mention ICPSR as key infrastructure

  • We must protect the human subjects of research, who are the basis of studies

What are the imperatives/challenges?

  • Financial. Open data doesn't mean free data. Who owns the data? Costs of curation and preservation. Different funding models for data (e.g. membership, subscription, sponsorship)

  • Protecting Confidentiality. All of the most interesting data has the potential for use in identifying individuals.

  • Maintaining Provenance and Authority

"Preservation-related transformations" must preserve provenance.

Open data is a reality in social science.