I was recently asked to define what Open Science means. It would have been relatively easy to fall back on a litany of “Open Source, Open Data, Open Access, Open Notebook”, but these are just shorthand for four fundamental goals:
- Transparency in experimental methodology, observation, and collection of data.
- Public availability and reusability of scientific data.
- Public accessibility and transparency of scientific communication.
- Using web-based tools to facilitate scientific collaboration.
The idea I’ve been most involved with is the first one, since granting access to source code is really equivalent to publishing your methodology when the kind of science you do involves numerical experiments. I’m an extremist on this point, because without access to the source for the programs we use, we rely on faith in the coding abilities of other people to carry out our numerical experiments. In some extreme cases (i.e. when simulation codes or parameter files are proprietary or are hidden by their owners), numerical experimentation isn’t even science. A “secret” experimental design doesn’t give skeptics the ability to repeat (and hopefully verify) your experiment, and the same is true with numerical experiments. Science has to be “verifiable in practice” as well as “verifiable in principle”.
In general, we’re moving towards an era of greater transparency in all of these topics (methodology, data, communication, and collaboration). The problems we face in gaining widespread support for Open Science are really about incentives and sustainability. How can we design or modify the scientific reward systems to make these four activities the natural state of affairs for scientists? Right now, there are some clear disincentives to participating in these activities. Scientists are people, and we’re motivated by most of the same things as normal people:
- Money, for ourselves, for our groups, and to support our science.
- Reputation, which is usually (but not necessarily) measured by citations, h-indices, download counts, placement of students, etc.
- Sufficient time, space, and resources to think and do our research (which is, in many ways, the most powerful motivator).
Right now, the incentive network that scientists work under seems to favor “closed” science. Scientific productivity is measured by the number of papers in traditional journals with high impact factors, and the importance of a scientist’s work is measured by citation count. Both of these measures help determine funding and promotions at most institutions, and doing open science is either neutral or damaging by these measures. Time spent cleaning up code for release, or setting up a microscopy image database, or writing a blog is time spent away from writing a proposal or paper. The “open” parts of doing science just aren’t part of the incentive structure.
Michael Faraday’s advice to his junior colleague, “Work. Finish. Publish.”, needs to be revised. It shouldn’t be enough to publish a paper anymore. If we want open science to flourish, we should raise our expectations to: “Work. Finish. Publish. Release.” That is, your research shouldn’t be considered complete until the data and metadata are put up on the web for other people to use, until the code is documented and released, and until the comments start coming in to your blog post announcing the paper. If our general expectations of what it means to complete a project are raised to this level, the scientific community will start doing these activities as a matter of course.
If you meet a scientist who tells you that they did a fantastic experiment and have wonderful data, you naturally ask them to email you a reprint. Any working scientist would be perplexed if the response was: “Oh, I’m not going to be writing this work up for publication.” It would be absolute nonsense in the culture of science to not publish a report in a journal on the work you have done. And yet, no one seems surprised when scientists are too busy or too secretive to release their data to the community. We should be just as perplexed by this. Instead of complaining about the reward and incentive systems, we should be setting the standard higher: “What do you mean that you haven’t got around to putting your data on the web? You aren’t done yet!” Or: “How can I possibly review this paper if I can’t see the code they were using? There’s no way for me to tell if they did the calculation right.” We’re going to have to raise the expectations on completing a scientific project if we want to change the culture of science.
Thank you for a detailed explanation.
Open-source software (such as Linux, Apache, or Python) is popular not so much because it is open/free but mostly because it is better than closed-source. Similarly, open science will prove itself when it is shown to be more powerful than closed/secretive (pseudo)science. To promote it, we should focus on success stories of scientists getting further in their research by documenting and sharing the details of computational experiments.
I agree wholeheartedly with this 🙂 I also recommend the following article; although it’s meant for computational linguists I think the points it makes are relevant for all scientists:
Empiricism is Not a Matter of Faith (Pedersen), Computational Linguistics, Volume 34, Number 3, pp. 465-470, September 2008. [Journal Citation Reports Index Factor 2007: 2.367]
Pedersen’s article is wonderful. Thanks for posting that link, Kevin!
Great piece. One addition: although it is true that institutional incentives to collaborate are absent, the success of the Internet Engineering Task Force (IETF) and its RFCs is food for thought. Scientists tend to overlook it and focus only on the openness of data, final results, and methodology. What about open process? The question is, can the sciences improve if we reuse aspects of what I call the Internet Model (Free Software + IETF)? I’m convinced the answer is yes. Are there institutional rewards? I don’t think there were any for the open source/data/methodology part either. But the situation changed as academics and volunteers of all kinds promoted, used, and developed the model in areas other than software. If we do the same for the volunteer-based open-process model that gave us the Internet and open-source-style collaboration, we are justified in expecting significant leaps in the productivity of the sciences that adopt this full Internet Model (open source + open process). Open source/data is half of it; volunteer open process is the half we’re missing to make the leap. best, toni.
Thanks for the post! I totally agree that transparency is essential. There will always be some aspects of science that will remain closed (e.g., real-time data sharing will be difficult for the reasons you stated above). But to add the perspective of a wet-lab scientist on the transparency of techniques: 99% of what we do at the bench on a daily basis is not a trade secret. In fact, in 15 years at the bench, I did a lot of experiments, and not one of them was novel. Most of the time the novelty comes in how those techniques are applied.
So the fact that protocols and techniques are not openly shared is insane. Everyone wins with increased access. Moreover, the idea that a traditional “publication” is the only point at which information can be shared is also unfortunate. There are many grad students, postdocs, and professors who are excellent scientists with impeccable technique who will experience long droughts in between papers. There should be a way for others to learn from them and for them to gain recognition in their field during these downtimes.
In fact, I believe in this so strongly that I just launched a website called BenchFly.com that addresses this exact issue by allowing anyone with an internet connection to upload a video of their protocols, techniques, tips, and tricks: basically anything that will help another scientist out. The same way we learn in the lab, just now on the internet… We can’t overhaul the entire scientific process overnight, but we’ve got to start somewhere.
Thanks for a thought-provoking post. I’m interested in the idea of transparency in research, and am attempting to record the progress of my PhD study on my blog. However, I find it difficult to justify the time needed to write a considered post on the process. I’m also uncertain how much of my work my institution will be comfortable with me sharing in this way. I think it’s going to be difficult for researchers to dedicate time to this “additional” work, and in order to move forward, institutions must formally recognise it as part of the research process.
Great article!
Regarding your second point, “Public availability and reusability of scientific data”, you may be interested in the Open Knowledge Definition (OKD), which provides criteria for openness in content/data:
http://opendefinition.org/
For anyone interested in open data in science, there is a Working Group on this at the Open Knowledge Foundation:
http://wiki.okfn.org/wg/science
A couple of you made some good points about open-process. And I think that the opening of academia itself (the free availability of course content, lesson plans, syllabi, etc) will help break down the institutionalized nature of academic science, and speed up evolution of ideas a bit more.
But I also think that a lot of closed science is a result of ego/pride/desire for acclaim on the part of some scientists – those who are more concerned about building a legacy for themselves.
Ultimately, I think there are some aspects of the evolutionary process that can’t be shared: the ability to synthesize multiple ideas and be inspired by a consequent vision, the drive of intuition, etc.
You can deconstruct these things mechanically, perhaps, and devise theories about their process. But they are intrinsically a “property” of the originator, just as the disciples of a great leader deal only with permutations of the original philosophy.
Thanks for this wonderful article.
I’ve been thinking about these issues lately because of my role as a technical editor for a new journal called “Mathematical Programming Computation” (MPC). (Mathematical Programming already has series A and B; the C series is for “computational” papers.) Papers submitted for publication in MPC have to be submitted together with their source code, and I’m one of the group that reviews these codes.
One very common issue that we face is that an author will call routines from a closed source commercial library (such as ILOG’s CPLEX) within their code. This creates all kinds of problems for the review process and later replication of the results, as the version of the library originally used by the author must be available to the reviewer, and then after a few years it’s extremely likely that that particular version of the library will no longer be available to anyone.
Thanks for this nice article.
What I am missing here is a comment concerning the free availability of scientific papers. I think this is a very fundamental requirement for scientific work as well. Otherwise, scientific results remain unavailable, for a very long time after publication, to institutions, universities, and individuals who do not have the resources for often embarrassingly high-priced subscriptions and journal papers. This is despite the fact that production costs have dropped drastically with the change from paper to electronic journals. The editorial and publishing tasks should be performed by institutions that are neither commercially driven nor politically biased. A good way might be to run the entire (commercially oriented!) scientific publishing industry into the ground and put publicly funded, politically unbiased universities in charge. A collaboration between librarians, university presses, and IT services might work well to organize the peer review, compose the journals, and host them (for free) on their servers.
Currently, the scientific industry is suffering from a typical lock-in situation: funding is based on publications in traditional journals with high impact factors. Moving to a new, freely available, open journal is just scientific suicide. A big change will have to happen to break this situation. Now is a good opportunity, given the whole climate-gate affair, which has prompted much discussion about what has gone wrong in this industry.
A remark to Brian Borchers:
In the Debian GNU/Linux system (and probably other FOSS systems), a piece of software is only regarded as free if the software itself AND the libraries it depends on have been issued under a FOSS licence. Otherwise, it will not be included in the ‘main’ repository, but in the contrib or non-free repo, without further support.
So, concerning MPC: software that is free, but depends on non-free libraries should be rejected for publication.
I am currently taking a Digital Civilizations class at Brigham Young University. We have been talking about the importance of open source information. This is really interesting to look at it from a science research point of view. I agree that it is important for the research that is being done to be where others can review it, critique it, and learn from it. Thanks for the great post!
I would suggest we generalize a little from the Open Source Software experience.
Open source software has followed a very interesting dynamic. To begin with, there seemed to be no economic incentive to produce it. But we see now that there is plenty of economic incentive to use it. In fact, Yahoo, Google, and Facebook have exploded as companies because they use open source for their sites. That leaves a much smaller set of nasty fees to pay to greedy software companies.
So now, working on open source is a sign of maturity. Lots of companies expect you to know how these open source engines are working. So if you actually worked on an open source project you have desirable experience. It may reach the point where this becomes more important than obtaining an academic degree.
The Open science movement will do well to focus on major science initiatives that young people can relate to. Then focus on bringing in the talent to get things done.
I am curious about the phenomenon of a significant population of Egyptian graduates with no work. Can we not involve some of them in an open science project? Maybe they would swap their time and energy for a postgraduate education. We just have to figure out how to move the $ requirements out of the way. Open source software does that. Why not open science?
Excellent post!
A related topic is the availability of sequences, including those of completely-sequenced genomes. I recently wrote a post on that in my blog: treevolution