Desirable Characteristics of Data Repositories for Federally Funded Research

The US government’s National Science and Technology Council Subcommittee on Open Science (SOS) recently published guidance on desirable characteristics of data repositories to help federal agencies provide better instruction to the researchers they fund. The guidance aims to be consistent with the FAIR principles, and will likely have global influence.


In 2013, the White House Office of Science and Technology Policy (OSTP) issued the Memorandum on Increasing Access to the Results of Federally Funded Scientific Research, directing Federal agencies with over $100 million in annual R&D expenditure to require the researchers they fund to prepare Data Management Plans describing how the digital research data they produce will be managed and shared.

The SOS recently published a guidance report titled Desirable Characteristics of Data Repositories for Federally Funded Research (announced via the OSTP Blog) to "identify a consistent set of desirable characteristics for data repositories that all agencies could incorporate into the instructions they provide to the research community for selecting data repositories". It seeks characteristics that are "consistent with the principles of making data FAIR and promoting equitable access to research products, and that integrate necessary protections of privacy and security, including human subjects’ protections".

The guidance does not adopt any existing certification criteria (to avoid the costs and complexity associated with certification, and to allow for differences between agencies and research communities), but draws upon aspects of existing certification criteria to compile its desirable characteristics.

The desirable characteristics are as follows (though see the original report for more details on each):

Desirable Characteristics for All Repositories:

  • Organizational Infrastructure
    • Free and Easy Access
    • Clear Use Guidance
    • Risk Management
    • Retention Policy
    • Long-term Organizational Sustainability
  • Digital Object Management
    • Unique Persistent Identifiers
    • Metadata
    • Curation and Quality Assurance
    • Broad and Measured Reuse
    • Common Format
    • Provenance
  • Technology
    • Authentication
    • Long-term Technical Sustainability
    • Security and Integrity

Additional Considerations for Repositories Storing Human Data:

  • Fidelity to Consent
  • Security
  • Limited Use Compliant
  • Download Control
  • Request Review
  • Plan for Breach
  • Accountability

While the guidance was produced for and by the United States, the country's position as the largest producer of research, particularly in these times of international research collaboration, means the influence of this guidance will be global.

This guidance on desirable characteristics of data repositories is particularly timely, as the Research Data Alliance currently has a Working Group focussed on producing “a list of common attributes that describe a research data repository and to provide examples of the current approaches that different data repositories are taking to express and expose these attributes”. The guidance also comes not long after Springer Nature modernised its approach to repository guidance, moving from maintaining a list of repositories that we have verified to recommending that our authors use community repository registries such as re3data.org or FAIRsharing.org to find appropriate repositories.

Tristan Matthews

Research Data Specialist, Springer Nature