Better Research Through Better Data: Q&A with Michael F. Huerta, PhD, NIH
We put your questions about research data to Michael F. Huerta, PhD – Director of the Office of Strategic Initiatives and Associate Director of the National Library of Medicine, NIH.
These questions were asked during Better Research Through Better Data Live. Catch up on the recording here. The presentation slides are available here.
Q: Should I share my unpublished research data?
Dr Huerta: Research data that are well managed and well documented can still be very valuable to science, even if they are never reported in the literature. For example, suppose a study tests whether an intervention leads to a particular outcome, and it turns out that the intervention did not affect the outcome. That result might be very useful to someone who is thinking about conducting a similar study. Since journals are typically less interested in publishing a paper describing a lack of effect, and since scientists might not want to spend their time writing such a paper, sharing the data with those interested in the topic could be an excellent way to advance the science.
Q: What is the incentive for using shared data? Can a project solely based on shared data be funded?
Dr Huerta: There are many possible benefits of using shared data. For example, if data are collected in a standard way, you could combine data from another lab with data from your own lab. The larger combined dataset gives the analysis more statistical power than your lab's data alone would provide.
Yes, a project based on shared data can be funded and I know that such projects have been explicitly solicited by NIH Institutes and Centers in the past.
Q: Are we allowed to use other researchers' data?
Dr Huerta: The terms and conditions of use are set by the person (or entity) that generated the data. They are usually stated wherever the data reside. The person reusing the data does not set the terms.
Q: Will my research data remain protected and accepted for use in my own future publications?
Dr Huerta: That would depend upon the terms and conditions of use that you would place on the data.
Q: Do you have any plans to fund research that can facilitate secondary analysis of public omics data by creating meta repositories (agglomerating all public omics data, providing improved metadata, providing more usable omics formats, etc.)?
Dr Huerta: For funding from NIH, it is always a good idea to talk to a program officer to see whether there is interest in any particular project. The basic way to do this is to first identify an NIH Institute or Center whose mission is relevant to your research, then look through the extramural research programs listed on its website, identify a program that seems best aligned to your interest, and contact the program officer responsible for that program.
Q: Are data only for use by health professionals, or also by patients and other independent non-medical persons?
Dr Huerta: This depends on many factors. If the data come from people, such factors include the nature of the consent those people gave for reuse and whether the reuse complies with privacy and confidentiality laws and regulations. On the other hand, there are some initiatives that “crowdsource” analysis of data, inviting the general public to participate; this is called citizen science. You can learn more about this at citizenscience.gov.
Q: How will the quality of shared data be maintained?
Dr Huerta: This is a major challenge and will vary greatly. Some research generates and manages data in highly standardized ways. The quality of such data would be more apparent than that of data generated and managed in more idiosyncratic ways.
Please note that some questions have been edited for clarity.