Six things publishers can do to support open data and reproducible research
This post is based on a short talk I gave at the SpotOn London 2018 conference, and describes six different areas in which scholarly publishers can make practical changes to improve the reliability and reproducibility of published research.
Open data and open research are means to an end: more reliable and more valuable research. As scholarly publishers, we help to maintain the integrity of the published record. But evidence is mounting that much published research, costing billions of dollars to fund, is not reliable – not reproducible.
Some of the causes of irreproducible research relate to how research is conducted. This might include how researchers are supervised or trained in the lab. It might be due to poor record keeping, data archiving or experimental design. Academic and research assessment culture may also be creating pressure to publish, and to publish certain results – the remarkable, the “positive” results – when, in fact, most advances in research are the result of steady, iterative steps rather than giant leaps.
Other causes of poorly reproducible research can relate to what is published, how it is published, and in what formats - things publishers can influence. We see, sadly, in published, peer-reviewed research:
Incomplete descriptions of methods 
Unavailability of research data upon which conclusions are based 
Publication bias 
If we don’t get the full story of research, decisions - including government policy decisions - are not based on complete evidence. We can’t have reliable, reproducible research without transparency. And this - transparency - is what journals and publishers should support, and demand.
So, beyond providing open access to publications, how can publishers increase transparency, to support reproducibility?
Firstly, publishers can help to raise awareness of issues and support behavioural change
As well as drawing attention to issues through editorials, surveys and outreach, publishers can make a difference by introducing stronger, more consistent transparency policies for their journals. Following an initiative by Springer Nature in 2016, multiple large publishers have, in the last three years, introduced clearer, more consistent policies on sharing research data across thousands of journals.
Secondly, we can improve the objectivity of the peer-review process
By requiring consistent reporting of key information – data availability, methodological details and statistical reporting – and adherence to reporting guidelines, research can be better scrutinised by editors and peer reviewers. The implementation of editorial checklists at Nature journals has, for example, been shown to improve the reporting of in vivo research.
Thirdly, publishers can diversify and improve scholarly infrastructure
This includes journals, and content types, that better reflect the wide variety of research outputs. Journals and articles for describing data, protocols and software all support transparency, as do journals – such as the BMC Series, Scientific Reports, PLOS ONE and PeerJ – that publish steady, incremental research, provided it is technically sound, regardless of the perceived impact or importance of the results. There is, indeed, no shortage of places to publish “negative” results.
Fourthly, we can make it easier for researchers to do the right thing
It needs to be easier for researchers to do the right thing – the reproducible thing – when submitting and publishing their work. This means, for example, connecting journal submission and peer-review systems with databases and repositories for research data. When sharing data in a repository is as easy as uploading supplementary files, authors and peer reviewers can engage more readily with supporting data. Publishers can also innovate by creating new products, services and technology that make it easier to share and curate all research outputs, such as Springer Nature’s Research Data Support service.
Fifthly, publishers can provide more incentives
For researchers to be more transparent, and to be credited for doing so, the availability and quality of research – and how it is reused – need to be better measured. We now have the means, in journals and books, to enable researchers to cite not just their papers but also datasets and software, so the availability and reuse of these materials can be measured and counted. Some funding agencies are also moving to track all research products, not just publications. Publishers can also support the adoption of new measures of transparency, such as digital badges on papers. Initial experiments with digital badges for transparency – in data, code and materials – have shown that they encourage data sharing by authors.
Sixthly, we can be open ourselves
Practically, this means taking pragmatic, progressive steps forward, such as content sharing and opening up more metadata from our publications and platforms. The Nature journals recently made the data availability statements in their articles freely available in front of the paywall. And many large publishers now make their reference lists openly available as open citations. Furthermore, we can share our own resources and knowledge openly: policy texts, recommended repository lists, data from open research surveys, and the knowledge we’ve gained about implementing open data policies – the benefits as well as the costs. If we can’t make evidence-based decisions about implementing open research, our progress will be limited.
Finally, publishers can promote transparency through collaboration. Impactful policy and infrastructural changes – such as data citation, unique researcher identifiers (ORCIDs), reporting standards and policy harmonisation – can only be tackled by multiple publishers collaborating with one another, and with other organisations that support the conduct and communication of research.
1. Freedman, L. P., Cockburn, I. M. & Simcoe, T. S. The economics of reproducibility in preclinical research. PLoS Biol. 13, e1002165 (2015).
2. Glasziou, P., Meats, E., Heneghan, C. & Shepperd, S. What is missing from descriptions of treatment in trials and reviews? BMJ 336, 1472–1474 (2008).
3. Ioannidis, J. P. A. et al. Repeatability of published microarray gene expression analyses. Nat. Genet. 41, 149–155 (2009).
4. McGauran, N. et al. Reporting bias in medical research - a narrative review. Trials 11, 37 (2010).
5. Hrynaszkiewicz, I. & Grant, R. Embedding research data management support in the scholarly publishing workflow. (2018). doi:10.31219/osf.io/brzwm
6. Macleod, M. R. & The NPQIP Collaborative group. Findings of a retrospective, controlled cohort study of the impact of a change in Nature journals’ editorial policy for life sciences research on the completeness of reporting study design and execution. bioRxiv (2017). doi:10.1101/187245
7. Let referees see the data. Sci. Data 3, 160033 (2016).
8. Cousijn, H. et al. A data citation roadmap for scientific publishers. Sci. Data 5, 180259 (2018).
9. Piwowar, H. Altmetrics: Value all research products. Nature 493, 159 (2013).
10. Kidwell, M. C. et al. Badges to Acknowledge Open Practices: A Simple, Low-Cost, Effective Method for Increasing Transparency. PLoS Biol. 14, e1002456 (2016).
11. Hrynaszkiewicz, I. & Swaminathan, S. Nature Research journals improve accessibility of data availability statements. at http://blogs.nature.com/ofschemesandmemes/2018/09/18/nature-research-journals-improve-accessibility-of-data-availability-statements
12. Scientific Data recommended repositories. at https://figshare.com/articles/Scientific_Data_recommended_repositories_June_2015/1434640
13. Stuart, D. et al. Whitepaper: Practical challenges for researchers in data sharing. (2018). at https://figshare.com/articles/Whitepaper_Practical_challenges_for_researchers_in_data_sharing/5975011
14. Grant, R. & Hrynaszkiewicz, I. The impact on authors and editors of introducing Data Availability Statements at Nature journals. bioRxiv (2018). doi:10.1101/264929