Discovering and reusing data: Putting research into practice
How do researchers discover, make sense of and reuse data? After I conducted a global survey investigating this question, it seemed fitting to make the resulting data available for others to reuse.
For the past three years, my research has focused on exploring how researchers discover, evaluate and (re)use data which they don’t create themselves. Sharing my own data, and describing them in a data paper, provided a chance for me to put what I was learning through my own research into practice.
The data in this descriptor paper were collected via a survey designed to capture information about respondents’ data needs; the sources and search strategies they use to locate data; and how they evaluate the data they find. Nearly 1700 respondents from across disciplines, professional roles and geographic locations completed the survey, creating a rich dataset with both quantitative and qualitative responses.
When deciding how best to share the data from this survey, I incorporated findings from my own research. Researchers bring together contextual information from many sources - articles, README files, metadata, codebooks, personal conversations - when making sense of data. Data are also not always easy to find or cite correctly. A data descriptor paper offered me a way both to provide another layer of contextual information and to improve the discoverability of the survey data.
The data descriptor also provided a place for me to identify potential research questions which the data could be used to address. Other researchers could use these data to analyze data discovery and reuse practices according to respondents’ career stages, countries of employment or disciplinary domains. The data could also be used to identify potential correlations between data needs and the sources used to locate data. Repository managers, data stewards and designers developing data search systems could all make use of the data to better tailor their services.
As has been suggested in other work, data creators may have difficulty identifying completely novel uses for their own data. Extensively documenting the many decisions I made throughout the data administration, collection, preparation and analysis phases is one way to potentially foster novel data uses.
Researchers rely on more than just thorough documentation when reusing data, though; they also turn to their own social and professional networks to find, understand and access data. The documentation in this data descriptor aims to provide a starting point for such conversations, hopefully stimulating and supporting data reuse.