Open licensing of scientific material
This memo gives recommendations for selecting an open source license for scientific material, including software, data, and documents. Collected by Leo Lahti. Further contributions are welcome; please feel free to improve the page directly.
Step-by-step instructions for open licensing
1. Select a standard license (suggestions below)
2. Check license compatibility if you utilize external source code
3. Mention the license in your software by adding a line such as "License: FreeBSD" to your documentation and source files.
4. Include the full license as a text file or provide a link to licensing terms.
5. Add your personal information (i.e. name, email, affiliation) in the license.
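Steps 3 and 4 can be checked automatically. The following is a minimal sketch in Python, assuming that the full license text is stored in a file named LICENSE at the project root and that each source file mentions the license near its top. The function names, the file name, and the list of file extensions are illustrative assumptions, not part of any standard tool.

```python
from pathlib import Path

# Assumed conventions for this sketch; adjust them to your project.
LICENSE_LINE = "License: FreeBSD"
LICENSE_FILE = "LICENSE"
SOURCE_SUFFIXES = {".py", ".R", ".c", ".h"}


def mentions_license(text, license_line=LICENSE_LINE, max_lines=20):
    """Return True if the license line occurs within the first lines of the text."""
    head = text.splitlines()[:max_lines]
    return any(license_line in line for line in head)


def check_project(root):
    """List problems: a missing license file, or source files without the license line."""
    root = Path(root)
    problems = []
    if not (root / LICENSE_FILE).is_file():
        problems.append(f"missing {LICENSE_FILE}")
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            text = path.read_text(encoding="utf-8", errors="replace")
            if not mentions_license(text):
                problems.append(f"no license line: {path.relative_to(root)}")
    return problems
```

Calling `check_project(".")` from the project root returns an empty list when the license file is present and every matching source file mentions the license.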
Recommended open licenses for academic publishing
Software and source code
The modified BSD licenses (in particular FreeBSD) and the MIT license set minimal restrictions on the end user, and have therefore been recommended for academic purposes, for instance at the ICML/MLOSS workshop 2010 (V. Stodden and G. Bradski). Note that none of the standard licenses below excludes commercial use. I suggest the following preference order for selecting an open license:
- FreeBSD (or other modified BSD licenses). Allows essentially free reuse of the code, and relicensing by others, provided that the original open licensing statement is distributed with the code. Preferred over MIT since it makes an explicit statement concerning binary versions of the code. The 3-clause modified BSD license additionally contains a notice prohibiting the use of the name of the copyright holder in promotion, which makes it slightly more restrictive than FreeBSD. See also other BSD licenses that are slightly more restrictive than FreeBSD.
- MIT license. Essentially similar to FreeBSD but slightly less explicit regarding binary versions of the code.
- GPL (>=2) is more restrictive than the FreeBSD or MIT licenses. One reason to license software under GPL arises when your code contains parts of GPL-licensed code: the complete source code utilizing portions of GPL-licensed code needs to be released under GPL, since this is a viral license. For compatibility, GPL (>=2) is often preferred over GPLv2 to allow the end user to select between GPLv2 and any later version. GPL should not be the default choice for academic publishing, since its requirements of viral distribution are somewhat incompatible with the general scientific standard of unrestricted reuse.
- LGPL requires that modified versions of your code are also published under LGPL, but not the whole software that utilizes the code. It is therefore less restrictive than GPL, but more restrictive than BSD or MIT, which only require preservation of the license notice in the code. See also reasons not to use LGPL.
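The preference order above can be written down as a small decision rule. The sketch below is a simplification under stated assumptions: the license labels are informal strings chosen for illustration, any label starting with "GPL" is treated as GPL-licensed code, and only the cases discussed in this memo are covered. It is a rough aid for reading the list, not a complete compatibility check.

```python
# Preference order suggested in this memo, most preferred first.
PREFERENCE_ORDER = ["FreeBSD", "MIT", "GPL", "LGPL"]


def contains_gpl(licenses):
    """Return True if any label looks like a GPL license (LGPL does not count)."""
    return any(name.strip().upper().startswith("GPL") for name in licenses)


def suggest_license(included_code_licenses=()):
    """Suggest a license, given the licenses of external code included in the project."""
    if contains_gpl(included_code_licenses):
        # Code that contains GPL-licensed parts has to be released under GPL;
        # the GPL version must match what the included code allows.
        return "GPL"
    # Otherwise fall back to the most preferred license in this memo.
    return PREFERENCE_ORDER[0]
```

For example, `suggest_license(["MIT", "GPLv2"])` yields "GPL", whereas a project that only includes MIT or LGPL code gets the default suggestion "FreeBSD".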
There are many other open licenses, but they may set restrictions that are not appropriate for academic purposes, or may have compatibility problems with other licenses. For instance, Apache 2.0 is similar in spirit to FreeBSD and MIT, but it is not compatible with GPLv2 (only with GPLv3), which may prevent reuse. Many people think that GPLv3 sets too extensive restrictions, including requirements on hardware.
Public domain or a missing license does not imply open source and can prevent the reuse of your work, since the concepts are legally ill-defined. A non-commercial clause (not for commercial purposes) is sometimes added to a license; note, however, that none of the standard open licenses sets such restrictions. Commercial use of academic research results is not restricted in general, so why should source code be different? For further information, see the comparison of free software licenses.
Data
- Open Data Manual
- Open Data Commons Attribution License (ODC-BY)
- Public Domain Dedication and License (PDDL)
Publishing your work
Explicit open licensing policies at research institutions would help to enforce the core scientific standards of transparency and unrestricted reuse of scientific material, now that an increasing proportion of research details is embedded in data and code.
Why license my work?
Minimally restrictive licenses can help to promote the core scientific standards of publicity, transparency, reproducibility, and unrestricted use of research results. Motivations for open licensing include:
- Guarantee your own rights to your work
- Encourage the reuse of your work in a legally sustainable manner with minimal effort; missing licensing statements can prevent reuse
- Enforce core scientific standards of transparency and reproducibility (see papers by V. Stodden)
- Valued by funding organizations, other scientists, fellow geeks, and laymen.
- It is simple
- Publish your computer code: it is good enough (Nature News)
Links and References
- Pohdintoja avoimen datan lisensoinnista Suomessa (Reflections on the licensing of open data in Finland; in Finnish)
- Hietanen, Herkko: The Pursuit of Efficient Copyright Licensing — How Some Rights Reserved Attempts to Solve the Problems of All Rights Reserved. Doctoral dissertation, 2008.
- Oksanen, Ville: Five Essays on Copyright in the Digital Era. Doctoral dissertation, 2008.
- Stodden, Victoria: Research on open standards for computational science