Everyone should go read Brooks Moses on Free-software licenses: requirements vs. requests. His post has made me re-think the license we use for our group simulation code. I’ve never liked the GPL because it essentially guarantees that friends in the corporate world won’t be able to use our code in their products; the simplicity of the BSD-style license has always appealed to me. As many people who adopt the BSD-style license have done, I threw in this attribution clause:
Acknowledgement of the program authors must be made in any publication of scientific results based in part on use of the program. An acceptable form of acknowledgement is citation of the article in which the program was described (Matthew A. Meineke, Charles F. Vardeman II, Teng Lin, Christopher J. Fennell and J. Daniel Gezelter, “OOPSE: An Object-Oriented Parallel Simulation Engine for Molecular Dynamics,” J. Comput. Chem. 26, pp. 252-271 (2005))
I know how often people forget to attribute code to the original author. Brooks points out that a mandatory clause like this places a big barrier in the way of adopting small bits of code (subroutines, individual Fortran modules, etc.) into other packages. Pretty soon, users of big packages are citing hundreds of papers in fields far removed from their actual use of the code.
His suggestion is a “Requests” section in the license that asks for citation rather than demanding it, which removes the forcefulness of the attribution clause. I like the idea. A lot.
Personally, I tend to like the LGPL — you can use the software, but if you make changes, you have to make them available.
But you (and Brooks) raise the point that I hope Science Commons is addressing: what licenses make sense in scientific research?
You raise the point about code reuse, although I’d guess that small snippets don’t really matter. Why? Case in point: the FSF doesn’t require copyright forms for patches of fewer than 10 lines of code. Or at least they didn’t a few years ago.
I think a similar question comes with data. Maybe I have an archive of data that I’d like to open. But the license should be something like: you’re free to use this however you want, free to distribute it as much as you want, but please cite this paper if you publish something involving this data.
Furthermore, reuse of data implies that the data not be changed. Changing code is fine, but what does it mean to have data under the BSD or GPL license? Can someone make arbitrary changes to my coordinate files?
First, thanks for the mention, Dan! I’m glad my post provoked some thought.
Geoff, I think we have very different ideas of “small snippets” in mind. I’m referring to things like the Fortran self-adjusting vector implementation that Dan mentioned in a post a few months ago, which happens to only be distributed as part of a larger molecular-dynamics application. That one isn’t a particularly good example because it’s also an obvious candidate for releasing as a standalone library; sometimes there are things that aren’t nearly as independent, but can still be reused with a little work.
As for data, I think that it’s probably not the best idea to release data under the GPL, for rather similar reasons to why the FSF has a non-GPL license for documentation — a license for source code isn’t necessarily a good fit. Certainly a data license should address whether the data can be changed, and in what ways, and how those changes need to be documented if the changed version is distributed.