As a physician-scientist who has worked in different roles on clinical trials, I had been discussing the culture of data sharing in clinical trials and medicine with my brother, Will, over family holidays and get-togethers for several years. Will was sure that the prevailing approach to sharing clinical trial data (selectively, and frequently on the promise of co-authorship) throttled the impact of clinical trial investigators’ hard work. Initially counterintuitive to me, these ideas were ultimately persuasive. So, when a prominent medical journal published an editorial expressing concern about what some were calling “research parasites,” who analyze clinical trial data but do not generate data of their own, I took note. I also noted the reaction of talented data analysts I had come to know, who certainly did not see themselves as parasites.
One of those brilliant data analysts responded by founding the Research Parasite Awards, a tongue-in-cheek award to recognize those who do something unusual and interesting with data generated by others.
Inspired by this idea and wanting to also recognize the contribution of scientists who enable others’ research through data sharing, I co-founded the Research Symbiont Awards. We announced the Symbionts in the New England Journal of Medicine in 2017.
Two Research Symbiont Awards are given annually. We award the Early Career Clinical Research Symbiont Award to someone who has impressed the committee at an early stage of that person’s career. The award for sustained contributions to data sharing is the General Symbiosis Award. The awards’ full criteria and the committee members who generously donate their time to select the awardees can be found on the Research Symbiont Awards website.
I am grateful to the sponsors of the 2018 and 2019 Research Symbiont Awards. In both 2018 and 2019, Springer Nature provided support for the awards. In addition, in 2019, the Dragon Master Foundation, a pediatric brain cancer foundation, provided support.
In this blog post, I introduce some of the winners of the Research Symbiont Awards, giving them a chance to reflect on their work, their philosophy of data sharing, and the award.
Benjamin Mako Hill, Winner of the 2019 General Symbiosis Award
I am an Assistant Professor of Communication at the University of Washington and, for the year, a Fellow at the Center for Advanced Study in the Behavioral Sciences at Stanford. My research involves studying online communities.
The award was given to recognize work that went into releasing a large dataset drawn from the Scratch online community. Scratch is a website created by a team at MIT that has been used by tens of millions of children to write computer programs using a graphical programming language, and to publish those programs and interact around them in fun, social, and creative ways.
The dataset I prepared and released along with Andrés Monroy-Hernández includes comprehensive public data from the first five years of the Scratch online community including tables with data about users, projects, code, social interactions, and much more. To my knowledge, it is the largest and most comprehensive research dataset on computing and learning anywhere.
The award also recognized work that I’ve done with the Community Data Science Collective, a research group founded with Aaron Shaw at Northwestern University, to release research datasets and to promote the creation of replication datasets in social scientific communication research.
This award is valuable because systematic data sharing in my part of the social sciences and computing is still quite rare. Recognition for the work and time that goes into sharing data provides a signal of value to my colleagues and those who evaluate my work. More importantly, it sends a strong signal to others in communication and social computing that sharing data is at least as valuable to our field as the production of another empirical paper. I also hope that the award means that more people will hear about the datasets I've produced, especially the Scratch dataset.
I also deeply appreciate the recognition from a committee of bioscientists. Work on the systematic sharing of biological data has provided me with both inspiration and infrastructure for all the things I've done that this award is recognizing. When Dr. Monroy-Hernández and I contacted the editors of Scientific Data about publishing a descriptor of the Scratch data, they had never published a paper using behavioral or social data. So although the editors had to work with us to figure out many details, I have benefited as a type of “research parasite” by drawing on the work of folks in the biological sciences who have made it possible to publish data descriptors in reputable journals and deposit data into archival repositories.
Sharing data is often thankless and largely unrecognized work. In addition to expressing my thanks for the recognition, I also need to recognize and thank Aaron Shaw, Andrés Monroy-Hernández, Sayamindu Dasgupta and all my students and collaborators in the Community Data Science Collective for doing all the work that this award recognizes. The award is for all of them!
Fabio Zanini, Winner of the 2018 Early Career Clinical Research Symbiont Award
I am a postdoctoral researcher at Stanford University in the department of Bioengineering and I study the impact of human viruses on health and disease.
In 2018 I won the Early Career Clinical Research Symbiont Award for a data resource on the human immunodeficiency virus (HIV), which I created as a graduate student with Richard Neher at the Max Planck Institute for Developmental Biology (he’s now at the University of Basel) and Jan Albert at the Karolinska Institute. In that study, we collected almost 100 blood samples from HIV-infected patients and analyzed the genomic changes affecting the virus during many years of chronic infection. Because samples from this type of medical research are difficult to obtain and process, it is very important to share the resulting data with the research community to foster statistical analyses that may lead to new discoveries about this devastating disease. In addition to writing articles in scientific journals, I devoted half a year of my graduate training to building HIVEVO, a website that can be browsed by other scientists and doctors but also harnessed effectively by computer programs.
Unfortunately, it is very uncommon for scientists in biology and medicine to allocate this much time to providing a resource for future scientists without a clear payout (an additional article). The Research Symbiont Award committee recognizes that time spent on data sharing is worth exactly as much as writing a scientific publication, and I fully share this view. I meet every day with brilliant and passionate colleagues, and the Research Symbiont Award is often a topic of animated discussion for its visionary attitude and its paradigm-shifting ethic. In the computer software community, open resources have a long tradition and are at the core of many technologies powering the internet, but that tradition needs to be translated into biomedicine to democratize access to research, diagnosis, and personalized treatment, and the Research Symbiont Award is spot on in pushing in the right direction.
I was so honored to receive the award, and I am indebted to many mentors and colleagues for their support and contagious passion about science, medicine, and open data. In particular, I was fortunate to work with Richard Neher, Jan Albert, Lina Thebo, Johanna Brodin, and Andre Noll in Tuebingen, Germany and Stockholm, Sweden: those people really embody the rare combination of outstanding and understanding and would deserve a separate award just for that!