Kitware Source Feature Article: July 2009

Why and How Apache Qpid Converted to CMake

Apache Qpid is an open source enterprise messaging system based on AMQP. I’ve been working on a Microsoft-funded project to convert the Apache Qpid C++ build system from the infamous GNU autotools (automake, autoconf, libtool) to CMake. In this article I’ll discuss why Apache Qpid’s C++ project decided to switch to CMake and how we completed the conversion. Hopefully there are some tips and techniques other projects can use in their conversions, and possibly tips or tricks you can share with us to help us improve our process.

Background
For those who aren’t familiar with how autotools work, there are two basic things a developer needs to write:

  1. configure.ac: An M4-based script that gathers configuration options and examines the build system for availability and location of various capabilities, tools, files and libraries.
  2. Makefile.am: A description of the build inputs and outputs. Essentially a higher-level Makefile.

These two basic areas (configuration and build items) are combined in CMake’s CMakeLists.txt file, but in autotools they are separate, though closely related, items.
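As a rough illustration of how the two concerns sit together in one file, a minimal CMakeLists.txt might look like the sketch below. It is not taken from the Qpid build: the project name, option, header check, result variable and source file names are all hypothetical, and the minimum version is simply an assumption.

```cmake
# Minimal sketch: configuration checks and build items in one file.
# All names (demo, BUILD_EXAMPLES, DEMO_HAS_UNISTD_H, src/*.cpp) are hypothetical.
cmake_minimum_required(VERSION 3.5)   # assumed; adjust to the CMake release in use
project(demo CXX)

# Configuration: roughly the territory of configure.ac.
include(CheckIncludeFileCXX)
option(BUILD_EXAMPLES "Build the example programs" OFF)
check_include_file_cxx(unistd.h DEMO_HAS_UNISTD_H)

# Build items: roughly the territory of Makefile.am.
add_library(demo src/demo.cpp)
if (BUILD_EXAMPLES)
  add_executable(demo_example src/example.cpp)
  target_link_libraries(demo_example demo)
endif (BUILD_EXAMPLES)
```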

To build in the autotools scheme one must follow these three steps:

  1. Bootstrap the configure script using autoconf. This is done by the development team (or release engineer) and involves processing the configure.ac script into a shell script, generally named configure.
  2. Configure the build for the target environment. This is done by the person building the source code, whether in development or at a user site. Configuration is carried out by executing the configure script. The script often offers command-line options for including/excluding optional parts of the product, setting the installation root, setting needed compile options, etc. The configure script also examines the build environment for the presence or absence of features and capabilities that the product can use. This step produces a config.h file that the build process uses, as well as a set of Makefiles generated from the Makefile.in templates that automake derives from the Makefile.am files.
  3. Build, generally using the make utility.

Why Qpid Switched to CMake
The autotools work well, even if writing the configure scripts is a black art. People who regularly download open source programs are accustomed to the process of downloading a .tar.gz file, unpacking the source and then running configure and make. So why would Qpid want to switch? Two reasons, primarily:

  1. Windows. The autotools don’t work natively on Windows since there’s no UNIX-like shell, and no POSIX-style make utility. Getting these capabilities involves installing MinGW. Many Windows developers, sysadmins, etc. simply won’t do that.
  2. Maintaining multiple build inputs (such as Makefile.am and Visual Studio projects) is an unnecessary time sink. At least one of them is always out of step. Keeping Visual Studio projects and autotools-based Makefile.am files updated is very error-prone. Even the subset of developers that have ready access to both can get it wrong.
  3. (Ok, this one is a bonus) Once you’ve spent enough nights trying to debug autoconf scripting, you’ll do anything to get away from autotools.

We looked at a number of alternatives and settled on CMake. CMake is growing in popularity (KDE recently switched, and there’s an effort to make Boost build with CMake as well). CMake works from its own specification of the product’s source inputs and outputs, similar to automake’s Makefile.am, but has the following advantages:

  • It also performs the “configure” step in the autotools process.
  • It can generate native build files (Visual Studio projects, Makefiles, etc.) for numerous build systems.
  • It can be used to execute the test suite in addition to the build.
  • It can optionally facilitate packaging as well as building, which is important for keeping the packaging information consistent and correct throughout the development process.
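To make the testing and packaging points a little more concrete, here is a small sketch of how they might appear in a CMakeLists.txt. It is an illustration under assumed names, not Qpid’s configuration: the targets, source files, package name and version are all hypothetical.

```cmake
# Hypothetical fragment; target, file and package names are invented.
add_library(demo src/demo.cpp)

# Testing: register a test program so that ctest can run it.
enable_testing()
add_executable(demo_unit_test test/unit_test.cpp)
target_link_libraries(demo_unit_test demo)
add_test(NAME demo_unit_test COMMAND demo_unit_test)

# Packaging: install rules describe what goes into a package,
# and including CPack adds packaging support to the build.
install(TARGETS demo
        ARCHIVE DESTINATION lib
        LIBRARY DESTINATION lib
        RUNTIME DESTINATION bin)
set(CPACK_PACKAGE_NAME "demo")
set(CPACK_PACKAGE_VERSION "0.1.0")
include(CPack)
```

The build system to generate for is chosen when cmake is run, using its -G option, rather than in the CMakeLists.txt file itself.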

In the CMake world, the autotools “bootstrap” (step 1, above) is not needed. This is because CMake does not produce a neutral shell script for the configure step; instead, CMake itself must be installed on each build system. This seems a bit onerous at first, but I think it is better for two main reasons:

  1. The configuration step in CMake lets the user view a nice graphical layout of all the options the developers offer to configure and build optional areas of the product. As the configuration happens, the display is updated to show what was learned about the environment and its capabilities. Only when all settings look correct does the user generate the build files and proceed to build the software. It takes the guesswork out of knowing if you’ve specified the correct configure options, or even knowing what options you have to pick from.
  2. It will probably both motivate and facilitate more projects to offer pre-built binary install packages, such as RPMs and Windows installers, to help users get going more quickly. One of CMake’s components, CPack, helps to ease this process as well.

How Qpid Switched to CMake
We started the conversion in February 2009. As of this writing, the builds have been running well for a while; the test executions are not quite done. So, it took about 3 months to get the builds running on both Linux and Windows. We’re working on the testing aspects now. We have not really addressed the installation steps yet. There were only two aspects of the Qpid build conversion that weren’t completely straightforward:

  1. The build processes XML versions of the AMQP specification and the Qpid Management Framework specification to generate a lot of the code. The names of the generated files are not known a priori. The generator scripts produce a list of the generated files in addition to the files themselves. This list of files obviously needs to be plugged into the appropriate places when generating the makefiles.
  2. There are a number of optional features (such as SSL support and clustering) which can be built into Qpid. In addition to letting the user explicitly enable or disable features, the autoconf scheme checked for the requisite capabilities and, where the user didn’t specify what to build (or not to build), enabled as many features as the build system could support.

To start, one person on the team (Cliff Jansen of Interop Systems) ran the existing automake files through the KDE conversion steps to get a base set of CMakeLists.txt files and did some initial prototyping for the code generation step. The original autotools build ran the code generator at make time if the source XML specifications were available at configure time (in a release kit, the generated sources are already there, and the specs are not in the kit). The Makefile.am file then included the generated lists of sources to generate the Makefile from which the product was built. One big question we came across was where to place the code generation step in the CMake scheme. We considered two options:

  • Execute the code generation in the generated Makefile (or Visual Studio project). This had the advantage of being able to leverage the build system’s dependency evaluation and regenerate the code as needed. However, once generated, the Makefile (or Visual Studio project) would need to be recreated by CMake (recall that the code generation produces a list of source files that must be in the Makefile). We couldn’t get this sequence to be as seamless as we had hoped.
  • Execute the code generation in the CMake configuration step. This puts the dependency evaluation in the CMakeLists.txt file. Here the code regeneration had to be done by hand since we wouldn’t have the build system’s dependency evaluation available. However, once the code was generated, the list of generated source files was readily available for inclusion in the Makefile (and Visual Studio project) and the build could proceed smoothly.
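For comparison, the first option would typically be written with CMake’s add_custom_command, roughly as sketched below. This is not code from the Qpid tree: the generator script, spec file, output names and target name are placeholders, and RUBY_EXECUTABLE is assumed to have been located already. The sketch also shows the catch described above, namely that the list of outputs has to be known when CMake runs.

```cmake
# Hypothetical sketch of build-time code generation (the option not chosen).
# gen_outputs is a hand-written placeholder list; in Qpid's case the real
# names are only known after the generator has run.
set(gen_outputs
    ${CMAKE_CURRENT_BINARY_DIR}/gen/Generated.h
    ${CMAKE_CURRENT_BINARY_DIR}/gen/Generated.cpp)

add_custom_command(
    OUTPUT ${gen_outputs}
    COMMAND ${RUBY_EXECUTABLE} ${CMAKE_CURRENT_SOURCE_DIR}/generate.rb
            ${CMAKE_CURRENT_SOURCE_DIR}/spec.xml
    DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/generate.rb
            ${CMAKE_CURRENT_SOURCE_DIR}/spec.xml
    COMMENT "Generating sources from spec.xml")

# The build tool reruns the generator when spec.xml or generate.rb changes,
# but the file names in gen_outputs must already be listed at CMake time.
add_library(generated_code ${gen_outputs})
```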

We elected the second approach for ease of use. The CMake code for generating the AMQP specification-based code looks like this (note this code is covered by the Apache license):

# rubygen subdir is excluded from stable
# distributions. If the main AMQP spec is present,
# then check if ruby and python are present, and if
# any sources have changed, forcing a re-gen of
# source code.
set(AMQP_SPEC_DIR ${qpidc_SOURCE_DIR}/../specs)
set(AMQP_SPEC
    ${AMQP_SPEC_DIR}/amqp.0-10-qpid-errata.xml)
if (EXISTS ${AMQP_SPEC})
   include(FindRuby)
   include(FindPythonInterp)
   if (NOT RUBY_EXECUTABLE)
      message(FATAL_ERROR "Can't locate ruby, "
              "needed to generate source files.")
   endif (NOT RUBY_EXECUTABLE)
   if (NOT PYTHON_EXECUTABLE)
      message(FATAL_ERROR "Can't locate python, "
              "needed to generate source files.")
   endif (NOT PYTHON_EXECUTABLE)

   set(specs ${AMQP_SPEC}
       ${qpidc_SOURCE_DIR}/xml/cluster.xml)
   set(regen_amqp OFF)
   set(rgen_dir ${qpidc_SOURCE_DIR}/rubygen)
   file(GLOB_RECURSE rgen_progs ${rgen_dir}/*.rb)
   # If any of the specs, or any of the sources used to
   # generate code, change then regenerate the sources.
   foreach (spec_file ${specs} ${rgen_progs})
      if (${spec_file} IS_NEWER_THAN
          ${CMAKE_CURRENT_SOURCE_DIR}/rubygen.cmake)
         set(regen_amqp ON)
      endif (${spec_file} IS_NEWER_THAN
             ${CMAKE_CURRENT_SOURCE_DIR}/rubygen.cmake)
   endforeach (spec_file)
   if (regen_amqp)
      message(STATUS
              "Regenerating AMQP protocol sources")
      execute_process(
         COMMAND
            ${RUBY_EXECUTABLE} -I ${rgen_dir}
            ${rgen_dir}/generate gen ${specs} all
            ${CMAKE_CURRENT_SOURCE_DIR}/rubygen.cmake
         WORKING_DIRECTORY
            ${CMAKE_CURRENT_SOURCE_DIR})
   else (regen_amqp)
      message(STATUS "No need to generate AMQP "
              "protocol sources")
   endif (regen_amqp)
else (EXISTS ${AMQP_SPEC})
   message(STATUS "No AMQP spec... won't "
           "generate sources")
endif (EXISTS ${AMQP_SPEC})

# Pull in the names of the generated files,
# i.e. ${rgen_framing_srcs}
include (rubygen.cmake)
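
The generated rubygen.cmake file is not reproduced in this article. As a purely illustrative guess at its shape, such a file would only need to set list variables naming the generated sources, which the rest of the build can then hand to a target. Apart from rgen_framing_srcs, which the comment above mentions, the file paths and the target name below are invented.

```cmake
# Hypothetical contents of a generated rubygen.cmake (paths invented):
set(rgen_framing_srcs
    gen/example/FirstGenerated.cpp
    gen/example/SecondGenerated.cpp)

# Hypothetical use in CMakeLists.txt after include(rubygen.cmake);
# example_common is an invented target name:
add_library(example_common ${rgen_framing_srcs})
```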

With the code generation issue resolved, I was able to get the rest of the project building on both Linux and Windows without much trouble. The CMake mailing list was very helpful when questions came up.

The remaining not-real-clear-for-a-newbie area was how to best handle building optional features. Where the original autoconf script tried to build as much as possible without the user specifying, I put in simpler CMake language to allow the user to select options, try the configure, and adjust settings if a feature (such as SSL libraries) was not available. This took away a convenient feature for building as much as possible without user intervention, though with CMake’s ability to very easily adjust the settings and re-run the configuration step, I didn’t think this was much of a loss.

Shortly after I got the first set of CMakeLists.txt files checked into the Qpid Subversion repository, other team members started iterating on the initial CMake-based build. Andrew Stitcher from Red Hat quickly zeroed in on the removed capability to build as much as possible without user intervention. He developed a creative approach to setting the CMake defaults in the cache based on some initial system checks. For example, this is the code that sets up the SSL-enabling default based on whether or not the required capability is available on the build system (note this code is covered by the Apache license):

# Optional SSL/TLS support.
# Requires Netscape Portable Runtime.
include(FindPkgConfig)

# According to some cmake docs this is not a
# reliable way to detect pkg-configed libraries,
# but it's no worse than what we did under autotools
pkg_check_modules(NSS nss)

set (ssl_default ${ssl_force})
if (NOT CMAKE_SYSTEM_NAME STREQUAL Windows)
   if (NSS_FOUND)
      set (ssl_default ON)
   endif (NSS_FOUND)
endif (NOT CMAKE_SYSTEM_NAME STREQUAL Windows)

option(BUILD_SSL "Build with support for SSL"
       ${ssl_default})
if (BUILD_SSL)

   if (NOT NSS_FOUND)
      message(FATAL_ERROR "nss/nspr not found, "
              "required for ssl support")
   endif (NOT NSS_FOUND)

   foreach(f ${NSS_CFLAGS})
      set (NSS_COMPILE_FLAGS
           "${NSS_COMPILE_FLAGS} ${f}")
   endforeach(f)

   foreach(f ${NSS_LDFLAGS})
      set (NSS_LINK_FLAGS "${NSS_LINK_FLAGS} ${f}")
   endforeach(f)

   # ... continue to set up the sources
   # and targets to build.
endif (BUILD_SSL)

With that, the Apache Qpid build is going strong with CMake.

During the process I developed a pattern for naming CMake variables that play a part in user configuration and, later, in the code. There are two basic prefixes for cache variables:

  • BUILD_* variables control optional features that the user can build. For example, the SSL section shown above uses BUILD_SSL. Using a common prefix, especially one that collates near the front of the alphabet, puts options that users change most often right at the top of the list, and together.
  • QPID_HAS_* variables note variances about the build system that affect code but not users. For example, whether or not a particular header file or system call is available. These are passed through to compile time using the CMake configure_file statement.
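
A sketch of how a QPID_HAS_* setting might travel from the configuration step into the code is shown below. It follows the naming pattern just described but is not copied from the Qpid build: the header name, the variable QPID_HAS_EXAMPLE_H and the config.h.cmake template are assumptions made for illustration.

```cmake
# Hypothetical sketch; QPID_HAS_EXAMPLE_H and the file names are invented.
include(CheckIncludeFileCXX)
check_include_file_cxx(example.h QPID_HAS_EXAMPLE_H)

# The template config.h.cmake would contain a line such as:
#   #cmakedefine QPID_HAS_EXAMPLE_H
# configure_file turns that into "#define QPID_HAS_EXAMPLE_H" when the
# check succeeded, or into a commented-out #undef when it did not.
configure_file(${CMAKE_CURRENT_SOURCE_DIR}/config.h.cmake
               ${CMAKE_CURRENT_BINARY_DIR}/config.h)

# Let the sources find the generated header.
include_directories(${CMAKE_CURRENT_BINARY_DIR})
```

Because the check result lands in the cache under the QPID_HAS_ prefix, it stays grouped with the other environment findings and away from the BUILD_ options that users are expected to change.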

As you can see from Figure 1, the settings that users would most often want to change are at the top of the list. This is a whole different experience from remembering help text (or guessing!) and typing in long command lines with the desired options.

Figure 1: Apache Qpid Configuration on Windows

Future efforts in the CMake area of the project will complete the transition of the test suite to CMake/CTest, which will have the side effect of making it much easier to script the regression test on Windows. The last area to be addressed will be how downstream packagers make use of the new CMake/CPack system for building RPMs, Windows installers, etc. The recently released Apache Qpid version 0.5 is the last one based on autotools and hand-maintained Visual Studio projects. The next version will be completely CMake-based. I believe this will help to improve the consistency of release results across supported platforms from build and test through to packaging.

References

  • http://qpid.apache.org
  • Advanced Message Queuing Protocol (www.amqp.org)
  • http://qpid.apache.org/license.html

Steve Huston is a leading UNIX/Linux and Windows programming expert specializing in C++ network programming. Steve is President of Riverace Corporation and is a regular contributor to the ACE and Apache Qpid open source projects. He is co-author of C++ Network Programming (2 volumes) and The ACE Programmer’s Guide. You can read Steve’s blog at http://stevehuston.wordpress.com, follow him on Twitter at http://twitter.com/stevehuston, or send email to shuston@riverace.com.