The journal Nature just published a very interesting article:
“Computational science: …Error”
“…why scientific programming does not compute.”
by Zeeya Merali
In this article, Zeeya Merali discusses how scientific research has come to depend strongly on software development, but may not have caught up with the good software engineering practices required for developing high-quality software.
- “A quarter of a century ago, most of the computing work done by scientists was relatively straightforward. But as computers and programming tools have grown more complex, scientists have hit a “steep learning curve”, says James Hack, director of the US National Center for Computational Sciences at Oak Ridge National Laboratory in Tennessee. “The level of effort and skills needed to keep up aren’t in the wheelhouse of the average scientist.”
Unfortunately, essential practices of software engineering are not taught or practiced enough by scientists:
- “As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software, say computer scientists. At best, poorly written programs cause researchers such as Harry to waste valuable time and energy. But the coding problems can sometimes cause substantial harm, and have forced some scientists to retract papers.”
The article clearly identifies openness as an important element for fixing this situation:
- “As recognition of these issues has grown, software experts and scientists have started exploring ways to improve the codes used in science. Some efforts teach researchers important programming skills, whereas others encourage collaboration between scientists and software engineers, and teach researchers to be more open about their code.”
The lack of training in software engineering becomes evident once you start looking into the matter:
- “In 2008, he and his colleagues conducted an online survey of almost 2,000 researchers, from students to senior academics, who were working with computers in a range of sciences. What he found was worse than he had anticipated (see ‘Scientists and their software’). “There are terrifying statistics showing that almost all of what scientists know about coding is self-taught,” says Wilson. “They just don’t know how bad they are.”
There are many documented cases of how low-quality software leads researchers and their communities into errors, and makes them lose time, resources, and credibility:
- “As a result, codes may be riddled with tiny errors that do not cause the program to break down, but may drastically change the scientific results that it spits out. One such error tripped up a structural-biology group led by Geoffrey Chang of the Scripps Research Institute in La Jolla, California. In 2006, the team realized that a computer program supplied by another lab had flipped a minus sign, which in turn reversed two columns of input data, causing protein crystal structures that the group had derived to be inverted. Chang says that the other lab provided the code with the best intentions, and “you just trust the code to do the right job”. His group was forced to retract five papers published in Science, the Journal of Molecular Biology and Proceedings of the National Academy of Sciences, and now triple checks everything, he says.”
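Errors of this kind are silent, but a known-answer test can expose them. As a minimal sketch, assuming a hypothetical pipeline that outputs 3D coordinates, the check below compares the handedness of four output points against a reference whose handedness is known. The function names and sample coordinates are illustrative assumptions, not the software used by Chang's group.

```python
def signed_volume(a, b, c, d):
    """Return six times the signed volume of the tetrahedron (a, b, c, d).

    The sign flips when the four points are mirrored, so it encodes handedness.
    """
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    # Scalar triple product u . (v x w)
    return (
        u[0] * (v[1] * w[2] - v[2] * w[1])
        - u[1] * (v[0] * w[2] - v[2] * w[0])
        + u[2] * (v[0] * w[1] - v[1] * w[0])
    )


def same_handedness(points, reference):
    """Return True if both point sets have the same (non-zero) handedness."""
    return signed_volume(*points) * signed_volume(*reference) > 0


# Hypothetical reference structure with known handedness.
REFERENCE = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]

# Simulate the bug: a flipped sign on one coordinate mirrors the structure.
mirrored = [(x, y, -z) for (x, y, z) in REFERENCE]

assert same_handedness(REFERENCE, REFERENCE)
assert not same_handedness(mirrored, REFERENCE)
```

A test like this costs a few lines, yet it fails loudly the moment a sign convention changes anywhere upstream.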
The adoption of
- Collaboration Platforms
- Reproducibility in scientific publications
are also identified as important elements that can help correct this dramatic situation.
It is well known that closed-source code, home-grown in labs over generations of graduate students, becomes unmaintainable:
- “Problems created by bad documentation are further amplified when successful codes are modified by others to fit new purposes. The result is the bane of many a graduate student or postdoc’s life: the ‘monster code’. Sometimes decades old, these codes are notoriously messy and become progressively more nightmarish to handle, say computer scientists.”
- “The mangled coding of these monsters can sometimes make it difficult to check for errors. One example is a piece of code written to analyse the products of high-energy collisions at the Large Hadron Collider particle accelerator at CERN, Europe’s particle-physics laboratory near Geneva, Switzerland. The code had been developed over more than a decade by 600 people, “some of whom are excellent programmers and others who do not really know how to code very well”, says David Rousseau, software coordinator for the ATLAS experiment at CERN. Wilson and his students tried to test the program, but they could not get very far: the code would not even run on their machines.”
Open Source Software to the Rescue:
By opening and sharing the code, communities can distribute the burden of maintenance costs, and improve their quality control practices by having many eyes looking for defects in the code. The Visualization Toolkit (VTK) is presented as a successful example of how to manage complexity in scientific software:
- Some software developers have found ways to combat the growth of monster code. One example is the Visualization Toolkit, an open-source, freely available software system for three-dimensional computer graphics. People can modify the software as they wish, and it is rerun each night on every computing platform that supports it, with the results published on the web. The process ensures that the software will work the same way on different systems.
Merali immediately pinpoints the problem:
- That kind of openness has yet to infiltrate the scientific research world, where many leading science journals, including Nature, Science and Proceedings of the National Academy of Sciences, do not insist that authors make their code available. Rather, they require that authors provide enough information for results to be reproduced.
But in today’s software-driven research environment, the truth is that
- It is impossible to reproduce research without having access to the software used to run the experiments, and
- It is impossible to fit, in the pages of a standard paper, the instructions required to re-implement any piece of software (assuming that another group was willing to spend months reimplementing that software).
Therefore, the papers that describe work relying on computational methods can only give the appearance of reproducibility; however, they fail to deliver on that promise to the readers.
The fact that scientific journals and conference publications have abandoned the healthy practice of requiring REPRODUCIBILITY, which used to be the hallmark of scientific work, has led us to this quagmire. In today’s scientific work, the following elements are indispensable for enabling reproducibility by external groups:
- Open Source Software
- Open Data
- Full disclosure of parameters
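As a minimal sketch of what full disclosure could look like in practice, the snippet below builds a small manifest that records the parameters of a run together with a checksum of the input data, so that another group can confirm they start from the same inputs. The function name `build_manifest`, the field names, and the sample values are hypothetical, chosen only for illustration.

```python
import hashlib
import json


def build_manifest(data: bytes, parameters: dict, code_version: str) -> str:
    """Return a JSON manifest describing one computational run."""
    manifest = {
        "data_sha256": hashlib.sha256(data).hexdigest(),
        "parameters": parameters,
        "code_version": code_version,
    }
    # sort_keys makes the output deterministic, so manifests can be diffed.
    return json.dumps(manifest, indent=2, sort_keys=True)


# Hypothetical example values.
example = build_manifest(
    data=b"1.0,2.0,3.0\n",
    parameters={"iterations": 100, "tolerance": 1e-6},
    code_version="v1.2.0",
)
print(example)
```

Publishing such a manifest alongside the code and data gives a replicating group a concrete checklist of what must match.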
The scientific paper must abandon its current “marketing” speech and return to its origins, rooted in the principle of
Enabling Independent Groups to replicate the Work
There have been courageous attempts to solve this crisis:
- “In November 2009, a group of scientists, lawyers, journal editors, and funding representatives gathered for the Yale Law School Data and Code Sharing Roundtable in New Haven, Connecticut, where they recommended that scientists go further by providing links to the source-code and the data used to generate results when publishing. Although a step in the right direction, such requirements don’t always solve the problem. Since 1996, The Journal of Money, Credit and Banking has required researchers to upload their codes and data to an archive. But a 2006 study revealed that of 150 papers submitted to the journal over the preceding decade that fell under this requirement, results could be independently replicated with the materials provided for fewer than 15.”
This particular example illustrates that access to the code must be a requirement for a paper to make it through the review process, and that the process must include all the material (data, parameters, scripts) required to replicate the work. In practice, of course, the only way to make sure that all the elements are provided is TO ACTUALLY REPLICATE THE WORK as part of the review process.
This is exactly what the Insight Journal has been promoting and DOING for more than five years.
There are, of course, skeptics of this approach:
- Proponents of openness argue that researchers seeking to replicate published results need access to the original software, but others say that more transparency may not help much. Martin Rees, president of the Royal Society in London, says it would be too much to ask reviewers to check code line by line. And in his own field of astrophysics, results can really be trusted only in cases in which a number of different groups have written independent codes to perform the same task and found similar results. Still, he acknowledges that “how to trust unique codes remains an issue”.
Mr. Rees is obviously confusing “code review” with “peer review”. Reviewers should only be required to RUN the code, and verify its output.
Government, Funding Agencies, and Professional Societies may have to step up and remind scientists how science is supposed to work:
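Verifying output does not require reading the source. As a rough sketch, assuming the authors publish the numeric values their code should produce, a reviewer could re-run the code and compare each value within a tolerance. The names `verify_results`, `reported`, and `reproduced`, as well as the numbers, are hypothetical; this is not the Insight Journal's actual review tooling.

```python
import math


def verify_results(reported: dict, reproduced: dict, rel_tol: float = 1e-6) -> list:
    """Return the names of results that are missing or differ beyond rel_tol."""
    failures = []
    for name, expected in reported.items():
        actual = reproduced.get(name)
        if actual is None or not math.isclose(actual, expected, rel_tol=rel_tol):
            failures.append(name)
    return failures


# Hypothetical values: what the paper reports vs. what the reviewer's run produced.
reported = {"mean_error": 0.0123, "volume_mm3": 1520.4}
reproduced = {"mean_error": 0.0123, "volume_mm3": 1498.7}

print(verify_results(reported, reproduced))
```

The comparison uses a tolerance rather than exact equality because floating-point results can differ slightly between platforms and compilers.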
- In 2009, the UK Engineering and Physical Sciences Research Council put out a call for help for scientists trying to create usable software, which led to the formation of the Software Sustainability Institute (SSI) at the University of Edinburgh.
- The SSI unites trained software developers with scientists to help them add new lines to existing codes, allowing them to tackle extra tasks without the programs turning into monsters. They also try to share their products across disciplines, says Neil Chue Hong, the SSI’s director.
- For instance, they recently helped build a code to query clinical records and help monitor the spread of disease. They are now sharing the structure of that code with researchers who are trying to use police records to identify crime hot spots. “It stops researchers wasting time reinventing the wheel for each new application,” says Chue Hong.
Education and training are important pieces of the solution:
- In the long term, though, Barnes says that there needs to be a change in the way that science students are trained. He cites Wilson’s online Software Carpentry course as a good model for how this can be done, to equip students with coding skills. Wilson developed the week-long course to introduce science graduate students to tools that have been software-industry standards for 30 years — such as ‘version control’, which allows multiple programmers to make changes to the same code, while keeping track of all changes.
Sadly, you would be surprised to see how many groups in research and clinical environments don’t even use revision control systems.
The uncontrolled obsession with publishing, just for the sake of publishing, is again pointed out as part of the problem:
- “There needs to be a real shift in mindset away from worrying about how to get published in Nature and towards thinking about how to reward work that will be useful to the wider community.”
says David Gavaghan, a computational biologist at the University of Oxford, UK.
You be my Master, Oh great Mr. Gavaghan !!
The original Zeeya Merali article is available online.
An excellent complement is Nature’s article
“Publish your Computer Code: It’s good enough…”
by Nick Barnes
Time for all of us to ask ourselves:
- Can I repeat the experiments that I published last year?
- Could another group repeat the experiments I published last year?
- Did that paper really do anything to move the field forward?
- Could shareable, well-written and well-tested code have done better for the community?
2 comments to VTK: an example on how to fix the crisis of scientific software
Couldn’t agree more with the conclusions of the article: there is a real need for scientific research to embrace open source, open data and open science. There are some grassroots efforts, such as the Blue Obelisk, which I am involved in (largely computational chemistry). Getting this work funded is difficult, and it is often a thankless task that does not help further your career. I think this needs to change so that there is motivation for more researchers to produce usable software. I have encountered many groups who do not know what version control is, and have multiple forks of the same code. I think this is changing, and am very pleased to be at Kitware in a position where I can be part of that change.
It’s also worth reading Jim Graham’s thoughtful post on whether “exact” reproducibility is actually useful or not, and what we might strive for instead.