On the cusp of a paradigm shift: FAIR data and open science on the horizon

Anna Holderbaum
Feb 11, 2020

The “Better Science through Better Data 2019” (#scidata19) conference that took place on the 6th of November at the Wellcome Collection in London centred around two key topics, namely digital footprints and open science in practice. Recurring themes were advocacy for the FAIR principles, data sharing, and open science.

Data management and the FAIR principles 

Major research funders (such as the Wellcome Trust, the European Commission, Cancer Research UK, and the Irish Health Research Board) presented their current strategies, which require the implementation of data management plans promoting FAIR, i.e. findable, accessible, interoperable, and reusable, data [1]. The FAIR principles are still relatively unknown in the wider research community: 52% of respondents to the State of Open Data Survey 2019 [2] were completely unfamiliar with FAIR. This is likely to change rapidly, as not only funders but also many publishers, such as Springer Nature and PLOS, now encourage data sharing and FAIR data practices. As Dr Varsha Khodiyar, data curation manager at Springer Nature, emphasised while moderating the conference: “What is FAIR is not necessarily open, and what is open not necessarily FAIR.” Although not all data can be open, e.g. due to legal restrictions, all data can be FAIR.

FAIR data management support

For individual researchers or principal investigators who already have to juggle multiple tasks and commitments, FAIR data management may appear daunting. Help and services for FAIR data management have not yet been truly embedded in academia, likely due to a lack of adequate funding and therefore infrastructure. The fact that successful open and FAIR research data practice is labour-intensive and requires explicit funding, e.g. for technical support, training, data management, and data publishing, is commonly overlooked. For example, the steps needed to adapt datasets and documentation from clinical trials to an accessible format can cost over £3000 per trial.[3]

A prime example of embedding research data management support within a university is the network of data stewards coordinated by Dr Marta Teperek at the TU Delft. Every faculty has a dedicated data steward equipped with disciplinary expertise to assist researchers with their data questions, e.g. relating to costs, compliance with funders’ policies, data management, tools, and training.[4] The idea of providing scientists with research data support at the university level was initially established by the University of Cambridge in 2016 through the launch of a network of data champions.[5] While data stewards hold a professional role and are paid for their services (on a project basis or as permanent faculty staff), data champions are typically volunteers. Volunteering as a data champion or setting up a community of data champions can provide an opportunity for skills and career development.

At #scidata19, Connie Clare, a PhD student at the University of Nottingham, described in her lightning talk how her engagement as a data champion has been a rewarding experience that led to further opportunities.[6] She is one of the contributors to the recently published “Engaging Researchers with Data Management: The Cookbook”, which is openly available, features 24 case studies, and is a definite must-read for anyone interested in research data management.[7]

Data sharing - a part of open science

Research has become increasingly data-driven, with progress in scientific knowledge closely connected to data accessibility. The paradigm within the scientific community is rapidly shifting towards the consensus that data itself, even without the traditional results and conclusion sections of publications, is a valuable output. Therefore, not only publications should be openly accessible, but also the factual evidence: the research data. Scientifically valuable datasets can be shared by submitting them to journals that specifically publish research data, e.g. Scientific Data and BMC Research Notes. Springer Nature also has a dedicated team for research data support and offers a service to organise and easily share data.

Open science throughout the research lifecycle

Open science has affected, and will continue to affect, all areas of science from funding policies to publishing in order to maximise the value and impact of research outputs. How can researchers prepare for this new way of doing science? A resource that resonated with me was the “Rainbow of open science practices” [8] that Yasemin Turkyilmaz van der Velden, data steward at the TU Delft, mentioned in her talk on reproducible research [9]. It is part of the 101 Innovations project and features freely available tools that can be used to foster open science practices throughout the research lifecycle. Thomas Knapen, assistant professor in Experimental and Applied Physiology at the University of Amsterdam, pointed out in his keynote speech on open science that the emphasis so far has been on open data and open publishing, i.e. the beginning and the end of the lifecycle, with all the steps in between leaving room for non-reproducibility. He called for open methods, including more shareable analyses and the use of version control that fully documents all changes, utilising tools such as Git or Jupyter notebooks.

Although studying FAIR data and open science practices and their means of implementation requires an investment of both time and effort, it is undoubtedly a worthwhile one that will not only increase the reproducibility, quality, and number of your research outputs, and the engagement with them, but also save your future self a lot of hassle. FAIR data and open science are coming, and not a moment too soon.

Anna Holderbaum is a Marie Curie Early Stage Researcher at Queen's University Belfast. She is a winner of the Better Science Through Better Data writing competition. Read Anna's winning entry here.



References

[1]      M.D. Wilkinson, M. Dumontier, IJ.J. Aalbersberg, G. Appleton, M. Axton, A. Baak, N. Blomberg, J.-W. Boiten, L.B. da Silva Santos, P.E. Bourne, et al., The FAIR Guiding Principles for scientific data management and stewardship, Sci. Data. 3 (2016).

[2]      Digital Science, B. Fane, P. Ayris, M. Hahnel, I. Hrynaszkiewicz, G. Baynes, et al., The State of Open Data Report 2019, Figshare. (2019).

[3]      C.T. Smith, S. Nevitt, D. Appelbe, R. Appleton, P. Dixon, J. Harrison, A. Marson, P. Williamson, E. Tremain, Resource implications of preparing individual participant data from a clinical trial to share with external researchers, Trials. 18 (2017) 319.

[4]      M. Teperek, M.J. Cruz, E. Verbakel, J.K. Böhmer, A. Dunning, Data Stewardship–addressing disciplinary data management needs, (2018).

[5]      R. Higman, M. Teperek, D. Kingsley, Creating a community of data champions, BioRxiv. (2017) 104661.

[6]      C. Clare, “Open Science” opens doors: How #Scidata18 helped me unlock career opportunities, 2019. DOI 10.5281/zenodo.3527201.

[7]      C. Clare, M. Cruz, E. Papadopoulou, J. Savage, M. Teperek, Y. Wang, I. Witkowska, J. Yeomans, Engaging Researchers with Data Management: The Cookbook (epub), (2019).

[8]      B. Kramer, J. Bosman, Rainbow of open science practices, 2018. DOI 10.5281/zenodo.1147025.

[9]      Y. Turkyilmaz van der Velden, Reproducible Research - Why and How?, 2019. DOI 10.5281/zenodo.3530485.


Anna Holderbaum

PhD student, Queen's University Belfast
