Analyzing and Improving Software Sustainability with Our Software Sustainability Matrix
Software sustainability – the expectation that the software used today will be available into the future [1,2] – is essential to the success of future innovation, supporting a variety of activities including science, investment, and commercialization:
- The scientific method hinges on reproducibility. Since modern science relies heavily on computational methods, it is critical that the associated software remain available in order to reproduce and reevaluate results.
- Program officers, funding organizations, supporting communities, and investors expect that any software that they support will have a long and productive lifetime. This requires building healthy ecosystems that support long-term software vitality.
- Companies rely on software for critical enterprise functions, increasingly embedding millions of lines of code into new products and services, much of which is not under the control of the enterprise. Technical debt or even technical bankruptcy may result when systems are not kept up to date to meet evolving security, compliance, functional, or usability requirements.
Here at Kitware, we have long taken software sustainability very seriously (since our business model depends on it), and its importance has only grown with the increasing awareness of sustainability issues. Our customers and collaborators are increasingly concerned about the long-term viability of software systems and have asked for assistance in addressing their concerns. We have fielded requests from commercial organizations building products on various software platforms, nonprofit research organizations establishing new scientific initiatives, and funding agencies developing programs to address software sustainability or to transition research software into commercialization [8,9,10]. We suspect that these inquiries are the happy result of our track record: systems like VTK, ITK, CMake, and ParaView have each been around for two decades or more, with ongoing, vital communities and significant development activity, and they continue to benefit thousands of research and commercial efforts. In addition, our global software process based on CMake, CDash, CTest, and related tools (integrated into popular systems such as Linux [15,16] and Microsoft Visual Studio) clearly demonstrates our ability to develop, manage, and deploy large-scale software systems for the long term. Hence the resulting calls for help.
In this blog post, we set the stage for a discussion of software sustainability. As this is an introductory post, we provide a high-level overview of our approach to sustainability, with additional detail provided in future posts.
Software Sustainability Matrix
“How can we evaluate a software system for its long-term sustainability?” “How can we make existing software more sustainable?” “How can we create and develop sustainable software systems?”
As we become increasingly reliant on software systems, many organizations ask these questions – alone or in combination – as they reflect on the risk they bear in relying on a particular software system.
To address these questions, and based on our long experience with this issue, we have created the software sustainability matrix (SSM) (see table). The SSM consists of four values by which we score a software system: Impact, Risks, Community, and Technology. In turn, the four values (I, R, C, T) are determined by evaluating metrics associated with each value. Some of the metrics are based on objective measures, some on subjective ones, and some on a combination of both. To evaluate a particular piece of software, we score it according to each metric, roll up the metrics to assign the four values, and then combine the values to assign a final score. Possible ways in which the metrics and values are combined will be addressed in a future blog post.
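To make the score, roll-up, and combine steps concrete, here is a minimal sketch in Python. It is an illustration only, not our actual scoring procedure: the metric names, the 0 to 5 scale, the use of a plain average for the roll-up, and the weighted mean for the final score are all assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import fmean

# The four SSM values named in the post.
VALUES = ("Impact", "Risks", "Community", "Technology")


@dataclass(frozen=True)
class MetricScore:
    value: str    # which of the four SSM values this metric belongs to
    name: str     # hypothetical metric name, for illustration only
    score: float  # assumed 0-5 scale where higher is better
                  # (for Risks, a higher score means lower risk)


def roll_up(metrics):
    """Average the metric scores under each value (an assumed roll-up rule)."""
    grouped = {v: [] for v in VALUES}
    for m in metrics:
        grouped[m.value].append(m.score)
    missing = [v for v, scores in grouped.items() if not scores]
    if missing:
        raise ValueError(f"no metrics scored for: {', '.join(missing)}")
    return {v: fmean(scores) for v, scores in grouped.items()}


def final_score(values, weights=None):
    """Weighted mean of the four values; equal weights unless given (assumption)."""
    weights = weights or {v: 1.0 for v in VALUES}
    total = sum(weights[v] for v in VALUES)
    return sum(values[v] * weights[v] for v in VALUES) / total


# Hypothetical scores for an imaginary project.
metrics = [
    MetricScore("Impact", "size of user base", 4),
    MetricScore("Impact", "importance of problem solved", 5),
    MetricScore("Risks", "licensing clarity", 3),
    MetricScore("Community", "active contributors", 2),
    MetricScore("Community", "funding sources", 4),
    MetricScore("Technology", "testing practices", 4),
]

values = roll_up(metrics)
print(values)
print(final_score(values))
print(final_score(values, {"Impact": 2, "Risks": 1, "Community": 1, "Technology": 1}))
```

Passing explicit weights, as in the last line, is one way to express that some values matter more than others for a given evaluation; the weights shown are arbitrary.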
Admittedly, this is an ad hoc and empirical approach. However, we have found that the matrix provides a framework with which to analyze software and identify sustainability weaknesses, and that it gives a qualitative sense of the sustainability state of a software system. Analysis driven by the matrix metrics also suggests actionable paths forward to improve sustainability. Note, however, that while it is possible to improve the chances for long-term sustainability, worst-case scenarios preclude any guarantee of sustainability. (For those of you really concerned about this, make sure that your software is hosted on GitHub and included in the Arctic Code Vault in Svalbard, Norway.) Also, note that this is a work in progress – some of the metrics we’ve found important to long-term sustainability may not be important to everyone, and there may be other qualities that we’ve not considered. Our hope is that with time, after conversations with customers, collaborators, and other knowledgeable software experts, the matrix and associated scoring system will evolve and become both more formal and more useful. (Yes, please, we want to hear from you.)
So what are the sustainability values and metrics that we measure? The table contains the current list and a brief description of each metric. In general, the four values address the following concerns:
- Impact – does the software currently impact, or have the potential to impact, a large number of people, organizations, and/or society at large? Or does the software solve an important problem for a small niche audience?
- Risks – what risks may cause the software to “disappear,” cause development to cease, and/or eliminate the user base?
- Community – is there a vital community that can nurture the software by providing ongoing usage, development, support, contributions, and/or financial resources?
- Technology – does the utilization of the software “bring joy” to users and developers? Does the system use modern software practices? Can it withstand competitive products, and rapidly adapt to emerging technology?
Clearly, some of these values are more important than others. If the perceived value of software is high (e.g., a powerful operating system), it is likely to be more sustainable than a low-value software system (e.g., software to simulate a pet rock). Many of these metrics are also correlated; for example, a large and active community typically correlates with a high level of impact, and likely with the use of modern software technologies as well.
The SSM as presented here is motivated by hands-on experience with many software systems, including proprietary and open source systems. Since Kitware is a company providing R&D services, we tend to view this issue from the perspectives of open access (free access to research content) and open science (free access to publications, data, and methods) [19,20,21,22]. Clearly, sustainable software is strongly related to these initiatives as software is intimately connected to the practice of science (e.g. computational methods) and publication (e.g. interactive documents).
This blog post introduced the values and metrics that compose the software sustainability matrix. We have found the SSM to be a useful device to assess and improve the sustainability of software; as a vehicle to encourage communication about the state of software; and as a means to develop an action plan to improve the longevity of a software system.
In a series of follow-up blog posts, we will address each of the sustainability values with its associated metrics, and provide the rationale for the scoring procedure. We will conclude this blog series by defining an approach for arriving at a final sustainability score.
Don’t miss our follow-up blog posts about software sustainability. To stay informed on this topic and Kitware’s other areas of focus, be sure to subscribe to our blog.