For Open Data, think twice before applying Non-Commercial conditions

Though researchers are increasingly sharing the data underlying their research, the openness of these data is often still unintentionally impaired by the unnecessary application of a ‘non-commercial’ (NC) condition, such as the Creative Commons CC-BY-NC licence.

At first glance, an NC condition may appear to promote openness by ensuring that companies, which tend to guard data as intellectual property, cannot take advantage of the data. However, preventing data from being used for commercial purposes runs directly against the ethos of open science.

Commercial entities refine research discoveries and deliver them to society

Good research involves building upon the data and theories of other researchers, but the goal of research is not simply more research; it is to improve the state of our world. One major pathway towards societal improvement is the dissemination of knowledge, technology, medicine and solutions as products. Researchers play the crucial role of laying the foundations for the most cutting-edge technologies and advancements.

However, researchers' outputs need to be developed into usable products, and this is often undertaken by commercial organisations. Moreover, this development demands considerable time, expertise and resources, and it carries risk; none of this comes cheap (and certainly not for free). Commercial entities are often best placed to accept these challenges.

Your data remain open even if they are used for commercial purposes

Placing non-commercial conditions upon data can render them unusable even for non-commercial purposes (see below for examples). This can be the complete opposite of what the sharer intended, so it is worth reiterating:

If you share a dataset openly and a commercial entity uses it, your original version will remain available and open under the licence you gave it. This is true whether the commercial entity shares their output data and results or not, and whether they make money or not. In other words, restrictions can only be applied to the derivatives of the original dataset, never to the original dataset itself.

No definition of ‘openness’ includes non-commercial

As part of the recent push towards Open Science, various definitions have been proposed and refined. NC conditions are not included in any of these, as can be seen below.

  • The Open Knowledge Foundation’s Open Definition: “Open data and content can be freely used, modified, and shared by anyone for any purpose”
  • A clause in the Open Source Initiative’s Open Source Definition: “6. No Discrimination Against Fields of Endeavor: The license must not restrict anyone from making use of the program in a specific field of endeavor.”
  • Creative Commons marks their most permissive licences as “Approved for Free Cultural Works” (meeting the Freedom Defined definition of a “Free Cultural Work”).
    The three licences in this category are CC0 (not strictly a licence, but a waiver of rights), CC-BY and CC-BY-SA (though SA also causes unexpected problems). The CC-BY-NC, CC-BY-NC-SA and CC-BY-NC-ND licences are not categorised as Free Cultural Works.

Adding an NC clause is probably more debilitating than you realise

What comprises “commercial” usage is not clearly defined. For example, the Creative Commons licences have no clear definition of this, and so commercial use could be interpreted differently within different jurisdictions.

The Open Data Institute provides a surprisingly long list of cases in which use of data could be classified as commercial (so if the data carried an NC condition, they would be unusable for these purposes). Here are a few of the points most pertinent to researchers:

  1. academic research partially supported by private funding, even where undertaken without the intention to commercially exploit any discoveries
  2. a free-to-access not-for-profit application/service built using the data which uses advertising revenue or sponsorship to help cover operational costs
  3. a research organisation using the data in a product that may be later commercialised
  4. use of a dataset on a personal blog whose free hosting is covered by advertising revenue, such as WordPress.com
  5. use of a dataset by an organisation (e.g., university) that draws revenues from commercial services, including training
  6. use of a dataset by a charity or other non-research organisation that has a commercial arm

A further example comes from my personal experience reviewing datasets for authors. The researchers had gathered identifying data about participants, along with consent to share the data only for "research/education/clinical" (essentially non-commercial) purposes. They had also received permission to share the data openly, both from the participants and from their Institutional Review Board. However, the only way to ensure that participants' data would not be used for commercial purposes was to place the data in a controlled-access repository and require users to sign a Data Use Agreement (DUA) prior to accessing the data. The researchers' intent had been to make the data as available as possible for research, educational and clinical purposes, but the NC restriction actually hampered access for those very purposes.

Non-commercial conditions do not protect participants

If a researcher has gathered data containing identifying information about participants, the data should be de-identified and/or shared via a controlled-access repository with a DUA. Applying a non-commercial condition does not sufficiently protect participants’ identities. In fact, non-commercial conditions are only designed to protect the rights of the rights holder.

Similarly, care must be taken not to confuse the use of non-commercial conditions with appropriate acknowledgement of contribution. For example, if knowledge was gathered from indigenous groups, non-commercial conditions do nothing to ensure that such groups receive the benefits of research based on that knowledge. For such cases, the CARE Principles for Indigenous Data Governance aim to provide guidance.

When it’s okay to apply non-commercial conditions

In general, do your best not to apply non-commercial conditions. Make your data as open as possible, free of any unnecessary restrictions (NC, SA, ND). Nevertheless, partnerships with commercial/private entities may come with the requirement that any data produced be shared under a licence that includes a non-commercial condition in order to prevent other companies taking advantage of the data. Under such circumstances, it is better to share under a less open licence than not to share at all.

Hopefully, you now have a better grasp of the unexpected, counter-productive implications of non-commercial conditions and will choose not to apply them to the datasets you share.