As the profound benefits of research data sharing have been recognised by institutions, funders, and individual researchers, we have seen a number of high-profile funding agencies implementing open data policies to complement their existing open access mandates. Adopters of such policies include the European Commission’s Horizon 2020 programme, the Wellcome Trust, and the Bill and Melinda Gates Foundation.
These mandates, and the resultant need to meet their requirements, have seen institutional libraries and research offices rapidly develop processes to assist researchers with compliance. This has taken a number of forms, including support for data management planning and best practice advice on licensing, copyright, and sharing platforms.
Within the outreach programme that we have undertaken, we see several topics surface regularly (many of which are reflected in our recent white paper: Practical Challenges for Researchers in Data Sharing).
As you would imagine, there is strong support within institutions for the sharing of research data. However, there are numerous questions that have arisen around this endeavour.
Here are some of the key questions emerging from these conversations:
- What are research data?
In many STEM and social sciences disciplines, this is a relatively well-understood concept. In the humanities, however, there is considerably less consensus on what constitutes research data. Reaching a consensus will require collaboration between research communities to agree and establish standards.
- Which research data can be shared?
Some data cannot be shared, whether because of restrictions on data ownership (e.g. corporate-sponsored research), privacy (personally identifiable data), or potential intellectual property infringement. The ethos of “As open as possible, as closed as necessary” has been taken on board by some institutions.
- Where can/should research data be shared?
Repository options are numerous and growing - these often include subject-specific repositories (e.g. GenBank and FlyBase), institutional data repositories, and general purpose repositories (e.g. figshare and Zenodo). Because of the range of choices, tools now exist to help researchers choose the best repository for their research data. Two examples are the re3data repository registry and the recommended repository list (which is maintained by the Springer Nature Research Data team).
- How can institutions support researchers to develop the skills to meet the FAIR principles?
Meeting the FAIR principles is likely to be an ongoing expectation. Ultimately, this could lead institutions to provide best practice support for postgraduate research students as part of their academic training. Many institutions are currently seeking to provide personalised support and answer researchers’ questions relating to individual datasets.
- How can institutions and researchers ensure compliance with funder policies?
In a world where failure to meet funder mandates may negatively influence future funding requests, how can institutions track their compliance and direct support services to the necessary areas? There are many ways this could be approached, and one common theme is the desire to create data catalogues where metadata records can be stored within an institutional repository. Identifying these metadata, however, can be difficult and time-consuming.
These challenges are not insurmountable, and with collaboration between research communities, funders, publishers, and data infrastructure providers, we will be able to maximise the value of publicly funded research, minimise duplication, and ultimately support improvement in reproducibility of research. This is a goal we can all agree is worth pursuing.
I’ll be at UKSG in Glasgow next week - if you’re there, it would be great to discuss any issues around research data sharing that your institution is facing.