Research Data Infrastructure
Are the Social Sciences on the Main Street or a Side Street?
Chuck Humphrey
University of Alberta

Outline
What does infrastructure mean today?
Five examples of events or initiatives that have happened or started since IASSIST 2010.
A quick look at how research infrastructure has been described.
Using the five earlier examples to rethink what research data infrastructure is.
End with some comments about the importance of cooperation and collaboration in developing infrastructure.

Infrastructure
Research data infrastructure has become associated with e-Science and cyberinfrastructure.

One event & four initiatives
One event
  SPARC Digital Repositories Meeting, Nov. 2010
Four initiatives
  OECD Global Science Forum on Data and Social Science Infrastructure
  Canadian Association of Research Libraries Research Data Management Infrastructure
  Canadian Research Data Strategy Working Group Data Summit
  Canadian International Polar Year (IPY) Data Assembly Centre Network

Digital repositories
The program included themes on open data and global repository networks.

Digital repositories
What I have learned:
Many repository librarians are interested in including research data in their local repositories and they certainly do not seem to be intimidated by research data.
On the other hand, there does not seem to be a wide understanding of what is involved in managing research data. They do, however, understand digital collections.
Their discussions about global repository networks tend to focus on facilitating federated searching rather than cross-repository collaboration.

Global science forum
An OECD Global Science Forum on Data and Research Infrastructure for the Social Sciences.

Global science forum
What I have learned:
An on-going dialogue is critical if we are to discover new ways of educating key stakeholders about the need to share research data and about the obstacles preventing this sharing.
We need to engage a wide range of stakeholders to find solutions that will facilitate data sharing.
Some countries are well ahead of others in making progress on sharing data across national borders.
There is no sense of developing a commonly shared data collection; rather, the focus is on access to discrete microdata across nations.

Research libraries
The Canadian Association of Research Libraries established a steering committee to oversee a proposal to the Canada Foundation for Innovation for a Collaborative Research Data Management Infrastructure.

Research libraries
What I have learned:
It is possible to expand the mandates of long-standing “memory” institutions to incorporate research data in both their digital collections and their preservation activities.
These are intentional, long-term commitments by institutions to collections of research data.
The challenge is in finding a balance among a number of competing interests across stakeholders, while ensuring that this collaboration is somewhat supportive of all.
This means that organizational structure is as critical as technical structure.

Research data summit
The Canadian Research Data Strategy Working Group, sponsored by Canada’s national science library (CISTI), is hosting a data summit.

Research data summit
What I have learned:
The moment for research data in Canada is “now!”
When the word was spread about the data summit, sponsors came, with money in hand, asking to contribute to this event.
There seems to be a heightened sense of urgency among senior public officials around data at the moment.
When it comes to data management, the roles and responsibilities of policy makers and senior managers across organizational sectors have a lot in common.

Data assembly centre network
The Canadian International Polar Year Data Assembly Centre Network came out of a request for proposals by the Government of Canada in June 2010.

Data assembly centre network
What I’ve learned:
Collaboration across large institutions and sectors can work but it requires building a culture of trust and a sense of an urgent, common mission.
A community cloud is a possible platform for preservation activities.
Micro-services can be developed in the cloud that support functions and activities around a large, shared research data collection.
Many stakeholders do not understand preservation but think almost exclusively about access. Preservation often becomes explained in terms of immediate, mid-term and long-term access.

Research infrastructure
There is no consensus [definition for] “research infrastructure” … There is general recognition, however, that [it] extends beyond large centralised facilities (such as telescopes or research vessels) to include physically distributed resources for research, such as computing networks and large collections of data or physical objects (p3).
Large Research Infrastructures
OECD Global Science Forum
December 2008

Research infrastructure
Research Infrastructures are facilities, resources or services of a unique nature that have been identified by European research communities to conduct top-level activities in all fields.
Strategy Report on Research Infrastructures Roadmap 2010
European Strategy Forum on Research Infrastructures
March 2011

Cyberinfrastructure
Cyberinfrastructure integrates hardware for computing, data and networks, digitally enabled sensors, observatories and experimental facilities, and an interoperable suite of software and middleware services and tools (p6).
NSF’s Cyberinfrastructure Vision for 21st Century Discovery
July 2006

Cyberinfrastructure
[C]yberinfrastructure is the set of organizational practices, technical infrastructure and social norms that collectively provide for the smooth operation of scientific work at a distance (p6).
Understanding Infrastructure: Dynamics, Tensions, and Design
P. Edwards, S. Jackson, G. Bowker and C. Knobel
January 2007

Cyberinfrastructure
[Slide content not recovered]

Research data infrastructure
The concepts of collections and services in support of collections are missing from these earlier definitions of research infrastructure.
It could be argued that the distribution along the technical-social dimension defines services.
From my perspective, the distribution along the collection-services dimension defines the social support for collections.

Research data infrastructure
[Figure: diagram labelled Global, Local, Social, Technical, Services, Collections]

Data infrastructure model
[Figure: model diagram with the labels Social, Technical, Service, Collection, Local, Global]

Global science forum
[Figure: data infrastructure model diagram, labels as above]

Research libraries
[Figure: data infrastructure model diagram, labels as above]

Digital repositories
[Figure: data infrastructure model diagram, labels as above]

Research data summit
[Figure: data infrastructure model diagram, labels as above]

Data assembly centre network
[Figure: data infrastructure model diagram, labels as above]

Cooperation & collaboration
This infrastructure landscape is complex. But all one needs to do is look at an example of a successful “big science” project to realize that complex does not mean impossible.
New technology continues to provide ways of breaking complexity into smaller, more manageable pieces.
The difficult task is achieving the organizational or social change that is a vital part of infrastructure.
Developing and maintaining a cooperative and collaborative spirit among stakeholders is an important ingredient in achieving social change.

Where are the social sciences? Main Street or a Side Street?
Main Street!
Many exciting initiatives are happening internationally around social science research data infrastructure and the stories behind many of these are being presented here at IASSIST 2011.
Enjoy the rest of IASSIST 2011!