Why Open Source Will Rule Scientific Computing (Part 4)

February 25, 2010

Index To the Series
Intro: Why Open Source Will Rule Scientific Computing
Reason #1: Open Science
Reason #2: The Search for Authenticity
Reason #3: Agile, High Quality, Collaborative Software Process
Reason #4: Scalability
Reason #5: Business Model


In the first post of this six-article series, I offered five reasons why open source will rule scientific computing. In this post, I discuss reason #3: Agile, High Quality, Collaborative Software Process. (Click through to see parts one, two and three of this discussion.)

To be honest, there are many people at Kitware who can speak much more authoritatively than I can about software process. If you really want the details, I suggest that you visit Bill Hoffman’s blog posting or go directly to his Google Tech Talk video. What I want to do in this posting is examine the dynamics of the open-source software process and make the case that it is better for scientific computing.

As many of you know, the Kitware software process is built around four core tools: CMake (cross-platform build), CPack (cross-platform packaging and deployment), CTest (testing client) and CDash (testing server). These tools are the culmination of more than a decade of organically creating a low-overhead, effective, and integrated software process. Initially we created them because, like all good programmers, we have a lazy streak and like to meet an excessive workload with automation. For example, in the early years of VTK we got very tired of tracking down bugs every few months as we released software; as most of us have experienced, it can be very hard to track down a bug that was introduced months before it was discovered. Very early on, with partners in the open source community (notably GE Global Research, as a result of a Six Sigma quality project), we began creating what is today’s software process. We focused initially on automated testing, with results reported to a centralized web page, or dashboard. Over the years this has expanded to include all facets of software development, including communication, version control, documentation, bug tracking, deployment, and more sophisticated testing.

I hope you noticed that I snuck the word “organically” into the previous paragraph, since it is key to the point I am trying to make. Many of those who contributed to this software process have roots in the computational sciences, or have computer science training and a strong desire to deploy useful tools to the scientific computing community. This is a world in which collaboration is as natural as breathing: computer scientists work with domain experts to exchange ideas and implement powerful computational tools. It is also a world characterized by rapid technological change. Hence the roots of this process were openness, to foster collaboration, and agility, to foster responsiveness to advances in technology. The software process that grew organically from the creation of CMake, VTK, ITK and many other open source tools is thus a microcosm of the larger scientific computing world. And I believe that we are not alone: the animating spirit of collaboration and agility is also a hallmark of most open source projects, which is why open source processes are superior for scientific computing.

You may be wondering how “quality” fits into the mantra of “agility, collaboration and quality”. Part of the answer is that as developers we want to create code that the community can depend on and that we can be proud of. Part of it too, especially in the early years of the open source movement, was to counter the belief that open source software was somehow inferior or amateurish. Now we know that as long as we as a community abide by our very rigorous software process, we will create outstanding software systems. However, I believe the major reason that quality is so important in the open source world is that collaboration and innovation/agility require a firm foundation on which to grow. Only with a disciplined process can robust growth be assured.

Now I believe Kitware’s secret weapon is its powerful, low-overhead and quality-inducing software process. While there are a lot of individuals and organizations producing outstanding algorithms and systems, we too often find that external code is not cross-platform, breaks easily, and is inflexible and unstable in response to new data, parameter settings, and computing platforms. At Kitware we do not necessarily claim that we are better programmers and hence avoid these problems (though we are certainly among the best :-)); rather, we have a better software process that helps us identify and fix these problems faster. As a result, our toolkits and applications are known for their stability, robustness, and flexibility, which is why thousands of users and customers build their own applications on toolkits such as VTK and ITK.

While I enjoy extolling the virtues of software process and could easily go on for several more pages, it’s important to get back to the point, namely that an agile, collaborative, high-quality software process is critical to scientific computing. Technology is moving so rapidly that users and developers need to be able to respond as a community to new developments, refactor code, and fix software issues to keep up with relentless technological change; and while doing so, they need confidence that the technology they are developing is of high quality. Waiting for a proprietary code base to respond to change and issues is not tenable for most organizations.

Continue on to the next blog posting #4: Scalability.

2 comments to Why Open Source Will Rule Scientific Computing (Part 4)

  1. I really like the idea of open source and admire its potential. However, when it comes to scientific computing I often encounter a lack of information and explanation (in terms of comments in the source code) of the algorithm. I am not a super guru who can figure out how things work by reverse-engineering them. I wish there were a way for the author of a particular function to clearly state what he has done (such as a Bugzilla or forum-like interface, along with an e-mail list). Other than that I am just a big fan of open source. Anyway, thank you for this good posting!

    Jeonggyu Lee
