import CMake; C++20 Modules

January 19, 2023

Update

As of October 2023, the experimental flags mentioned in this article are no longer part of CMake's module support. Please see this article for an update on how to use C++20 named modules in CMake. If you want to understand the technical and historical background of this effort, please continue reading.

Introduction

Work is underway to implement support for C++20 modules in CMake! Since the C++ standards committee started talking about adding modules to the C++ language, the CMake team at Kitware has been thinking about how they will be supported. Fortunately, CMake has supported Fortran modules since 2005. In 2015, Fortran module support was added to a Kitware fork of the ninja build tool. In 2019, with news of modules being added to C++, that fork was merged back into upstream ninja, bringing dynamic dependencies with it. This blog describes the process that was taken and the current state of named C++20 modules in CMake. Header modules are not covered in this blog.

Quick Introduction to C++ 20 Modules

What are modules, and why are they difficult for build systems to handle? C++ modules are intended as a replacement for sharing declarations through the preprocessor and #include. For a quick example, a common way to define a class in C++ is to use two files: a .h file that contains the class declaration, and a .cxx file that contains the non-inline member function definitions. A third file, main.cxx, uses the class.

// foo.h
class foo {
public:
    foo();
    ~foo();
    void helloworld();
};

// foo.cxx
#include "foo.h"
#include <iostream>
foo::foo() = default;
foo::~foo() = default;
void foo::helloworld() { std::cout << "hello world\n"; }

// main.cxx
#include "foo.h"

int main()
{
  foo f;
  f.helloworld();
  return 0;
}

To compile this, you could just invoke the compiler like this:

$ c++ main.cxx foo.cxx

(In the real world, you would of course use CMake to build this code.)

With modules, the same code could look like this:

// foo.cxx
// Global module fragment where #includes can happen
module;
#include <iostream>

// The first thing after the global module fragment must be a module declaration
export module foo;

export class foo {
public:
  foo();
  ~foo();
  void helloworld();
};

foo::foo() = default;
foo::~foo() = default;
void foo::helloworld() { std::cout << "hello world\n"; }

// main.cxx
import foo;

int main()
{
  foo f;
  f.helloworld();
  return 0;
}

If I invoke the same command on this code (with -fmodules-ts added to enable modules in g++), it fails to compile!

$ c++ -fmodules-ts main.cxx foo.cxx
In module imported at main.cxx:1:1:
foo: error: failed to read compiled module: No such file or directory
foo: note: compiled module file is 'gcm.cache/foo.gcm'
foo: note: imports must be built before being imported
foo: fatal error: returning to the gate for a mechanical issue

The failure occurs because foo.cxx must be compiled before main.cxx. In fact, if I run that command twice in a row, the second run will succeed! This is because, as a side effect of compiling foo.cxx, the compiler produces the Built Module Interface (BMI); for g++ this is a .gcm file. The first run leaves the BMI behind in gcm.cache, so the second run finds it. This is something totally new for C++: the order of compilation now matters!

In this simple case, changing the compile line to c++ -fmodules-ts foo.cxx main.cxx would indeed work. However, in a real C++ project with many classes and modules, it would be very difficult to figure out and maintain the correct compile order as the code evolves. This means the build system needs a way to figure out which files provide BMI files when compiled and which files consume them.

In order to do this the C++ files will have to be parsed BEFORE they are compiled! CMake will now need a way to parse and extract a source file’s “provides” and “requires” information for modules during the build so that the order of compilation can be determined. In addition, build tools like ninja will have to know how to dynamically load dependency information in order to deduce the correct build order of files.
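The "provides" and "requires" idea can be sketched in plain C++. This is a toy illustration only, not how CMake or any compiler actually scans sources: it assumes module declarations and imports each sit on their own line, and it ignores the preprocessor, comments, partitions, and header units. The names ModuleInfo, scan_source, and compile_order are hypothetical.

```cpp
#include <functional>
#include <map>
#include <regex>
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record of what one source file provides and requires.
struct ModuleInfo {
  std::string provides;               // empty if the file exports no module
  std::vector<std::string> required;  // names of imported modules
};

// Naive line-based scan; a real scanner must run the preprocessor first.
ModuleInfo scan_source(const std::string& text) {
  static const std::regex provides_re(R"(^\s*export\s+module\s+([\w.]+)\s*;)");
  static const std::regex requires_re(R"(^\s*(?:export\s+)?import\s+([\w.]+)\s*;)");
  ModuleInfo info;
  std::istringstream in(text);
  std::string line;
  std::smatch m;
  while (std::getline(in, line)) {
    if (std::regex_search(line, m, provides_re)) {
      info.provides = m[1].str();
    } else if (std::regex_search(line, m, requires_re)) {
      info.required.push_back(m[1].str());
    }
  }
  return info;
}

// Depth-first topological sort: every provider precedes its importers.
std::vector<std::string> compile_order(
    const std::map<std::string, ModuleInfo>& sources) {
  std::map<std::string, std::string> provider;  // module name -> source file
  for (const auto& entry : sources) {
    if (!entry.second.provides.empty()) {
      provider[entry.second.provides] = entry.first;
    }
  }
  std::vector<std::string> order;
  std::set<std::string> done, in_progress;
  std::function<void(const std::string&)> visit = [&](const std::string& file) {
    if (done.count(file)) return;
    if (!in_progress.insert(file).second) {
      throw std::runtime_error("import cycle involving " + file);
    }
    for (const auto& mod : sources.at(file).required) {
      auto it = provider.find(mod);
      if (it != provider.end()) visit(it->second);  // unknown modules are skipped
    }
    in_progress.erase(file);
    done.insert(file);
    order.push_back(file);
  };
  for (const auto& entry : sources) visit(entry.first);
  return order;
}
```

Given the contents of foo.cxx and main.cxx from the example, compile_order would place foo.cxx before main.cxx, which is the ordering the failing command line lacked. In a real build this information is only known at build time, which is why ninja needs dynamic dependencies.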

Fortran Modules

Fortunately for everyone involved, CMake has had over 16 years of experience with modules, albeit Fortran ones rather than C++ ones. In 2005, the initial support for Fortran modules was added to CMake:

commit 19f977bad7261d9e8f8d6c5d2764c079d35cc014
Author: Brad King <brad.king@kitware.com>
Date:   Wed Jan 26 15:33:38 2005 -0500
    ENH: Added Fortran dependency scanner implementation.

In 2015, the Trilinos project funded an effort to add Fortran module support to the ninja build tool through the dyndep feature, which lived in a ninja fork maintained by Kitware for four long years. See the documentation for dyndep for more information. With the announcement of C++20 modules, the ninja team was convinced it was worth the effort to merge this work upstream from the Kitware fork, and in May of 2019 dyndep support was merged into ninja.

For Fortran modules to work, CMake added a simple Fortran parser based on makedepf90 to its code base. For the same approach to work with C++, CMake would need access to a C++ parser! Given the complexity of C++, adding a C++ parser to CMake is not something anyone wants. Instead, CMake needs help from compiler vendors, and a standard way for compilers to give this information to CMake during the build.

SG15 request for help with C++ parsing

This need prompted the Kitware CMake team to approach the SG15 tooling study group with a paper that describes how CMake supports Fortran modules and how the same approach would apply to C++:

https://mathstuf.fedorapeople.org/fortran-modules/fortran-modules.html

[Figure: a simple example of how Fortran modules work]

[Figure: the full CMake process for building one Fortran library]

[Figure: the process with multiple targets]

C++20 Modules

After the paper explaining the CMake process for Fortran modules was presented to SG15, the next step was to define a standard file format in which compiler vendors could describe C++20 module dependency information. This came in the paper p1689 (currently at revision r5):

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1689r5.html

The paper describes a JSON format that compilers produce to tell a build system like CMake which modules are required and provided by a scanned translation unit (TU).
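To give a feel for the format, here is a rough C++ sketch that prints a p1689-style document for one scanned TU. The field names (version, revision, rules, primary-output, provides, requires, logical-name, is-interface) are written as understood from the paper and should be treated as an assumption to check against it. The ScanResult struct and to_p1689_json function are hypothetical, and the sketch performs no string escaping.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory form of one scanned translation unit.
struct ScanResult {
  std::string primary_output;         // object file the TU compiles to
  std::vector<std::string> provides;  // modules this TU exports
  std::vector<std::string> required;  // modules this TU imports
};

// Emit a p1689-style JSON document with a single rule (no string escaping).
std::string to_p1689_json(const ScanResult& r) {
  std::ostringstream out;
  out << "{\n  \"version\": 1,\n  \"revision\": 0,\n  \"rules\": [\n    {\n";
  out << "      \"primary-output\": \"" << r.primary_output << "\",\n";
  out << "      \"provides\": [";
  for (std::size_t i = 0; i < r.provides.size(); ++i) {
    out << (i ? ", " : "") << "{ \"logical-name\": \"" << r.provides[i]
        << "\", \"is-interface\": true }";
  }
  out << "],\n      \"requires\": [";
  for (std::size_t i = 0; i < r.required.size(); ++i) {
    out << (i ? ", " : "") << "{ \"logical-name\": \"" << r.required[i]
        << "\" }";
  }
  out << "]\n    }\n  ]\n}\n";
  return out.str();
}
```

For the example in this blog, the rule for foo.cxx would list foo under provides with nothing under requires, while the rule for main.cxx would list foo under requires. A build system can read these rules to learn that foo.cxx must be compiled first.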

Kicking the tires

Currently, all three major compilers (MSVC, GCC, and Clang) have either implemented p1689r5 or have a fork that adds support. To try the feature in CMake, you will first need to build or install a compiler that supports p1689r5. You will also, of course, need a version of CMake new enough to include experimental C++20 module support: CMake 3.25 or newer.

Note that the value of CMAKE_EXPERIMENTAL_CXX_MODULE_CMAKE_API changes between CMake versions. Please see the Help/dev/experimental.rst file for the version of CMake you are using to find the correct value.

# CMake 3.25
set(CMAKE_EXPERIMENTAL_CXX_MODULE_CMAKE_API "3c375311-a3c9-4396-a187-3227ef642046")
# CMake 3.26 
set(CMAKE_EXPERIMENTAL_CXX_MODULE_CMAKE_API "2182bf5c-ef0d-489a-91da-49dbc3090d2a")

MSVC

Download and install Visual Studio 2022 17.4 (compiler version 19.34) or newer. I recommend getting the most recent version available at the time you are reading this. Since this support has shipped in a stable compiler, the code to invoke the MSVC scanner is included in the CMake Modules directory as part of CMake. It can be found here:

https://gitlab.kitware.com/cmake/cmake/-/blob/f1034acb02/Modules/Compiler/MSVC-CXX.cmake

# This means the only thing you will need is the experimental key and this flag
set(CMAKE_EXPERIMENTAL_CXX_MODULE_DYNDEP 1)

GCC

For gcc, there is a fork that adds support for p1689 that can be found here:

https://github.com/mathstuf/gcc/tree/p1689r5

Since this has not yet been accepted into upstream gcc, the scandep command line interface has not yet been included in CMake. You will have to provide it through the CMAKE_EXPERIMENTAL_CXX_SCANDEP_SOURCE variable in order to build modules with CMake and gcc. You will need a file like the following:

string(CONCAT CMAKE_EXPERIMENTAL_CXX_SCANDEP_SOURCE
  "<CMAKE_CXX_COMPILER> <DEFINES> <INCLUDES> <FLAGS> <SOURCE>"
  " -MT <DYNDEP_FILE> -MD -MF <DEP_FILE>"
  " ${flags_to_scan_deps} -fdep-file=<DYNDEP_FILE> -fdep-output=<OBJECT>"
  )

set(CMAKE_EXPERIMENTAL_CXX_MODULE_MAP_FORMAT "gcc")
set(CMAKE_EXPERIMENTAL_CXX_MODULE_MAP_FLAG
  "${compiler_flags_for_module_map} -fmodule-mapper=<MODULE_MAP_FILE>")

A recent version of this can be found in the CI code for CMake on GitLab here:

https://gitlab.kitware.com/cmake/cmake/-/blob/f1034acb02/.gitlab/ci/cxx_modules_rules_gcc.cmake

Building GCC 

Once you have the source for GCC, you can build it. There are a few extra hoops for M1 Macs, described below.

Linux

For Linux platforms and Intel Macs, just download and build gcc from the fork. I would recommend installing into a directory that is writable by your user, such as ~/gcc-install.

m1 mac gcc

For m1 macs, you will need to get a patched gcc that supports arm64. That can be found here:

https://github.com/iains/gcc-darwin-arm64

You then need to apply the patch from the gcc fork implementing p1689 https://github.com/mathstuf/gcc/tree/p1689r5 

You can do that by downloading the patch for the commit with the changes:

https://github.com/mathstuf/gcc/commit/3075e510e3d29583f8886b95aff044c0474c84a5.patch

Then just apply the patch to the gcc for arm64 source tree using the patch command:

cd gcc-darwin-arm64
patch -p1 < 3075e510e3d29583f8886b95aff044c0474c84a5.patch

There are some good build instructions for the mac that can be found here:

https://solarianprogrammer.com/2019/10/12/compiling-gcc-macos/

Download the prerequisites, configure a build tree, and then run make and make install like this:

./contrib/download_prerequisites
mkdir build
cd build
../configure --prefix=$HOME/Work/modules/gcc-inst \
             --enable-checking=release \
             --enable-languages=c,c++,fortran \
             --disable-multilib --disable-werror \
             --with-sysroot=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
make -j20
make install

Note that I have added --disable-werror. This is because sprintf is deprecated in the Mac system header files, which causes a few warnings that would break the build if -Werror were on.

CLANG

For Clang, version 16 or newer contains the p1689 implementation required by CMake:

https://github.com/llvm/llvm-project/tree/release/16.x

Once you have the source code, you can follow build instructions found here:

https://clang.llvm.org/get_started.html

These basically involve creating a build directory and running CMake to configure the tree:

# Note: the ninja build tool is faster than the make tool referenced in the docs.
cmake -DLLVM_ENABLE_PROJECTS=clang -DCMAKE_BUILD_TYPE=Release -G Ninja ../llvm

With clang 16 or newer, CMake has the scanning rules built in, and you only need the following code to activate them:

set(CMAKE_EXPERIMENTAL_CXX_MODULE_CMAKE_API "2182bf5c-ef0d-489a-91da-49dbc3090d2a")
set(CMAKE_EXPERIMENTAL_CXX_MODULE_DYNDEP 1)
# Default to C++ extensions being off. Clang's modules support has trouble
# with extensions right now.
set(CMAKE_CXX_EXTENSIONS OFF)

Hello world C++ 20 Modules in CMake

Once you have a compiler (MSVC, gcc, or Clang) with support for p1689, you are ready to try building a CMake project with modules and the ninja build tool. You will need CMake 3.25 or newer. I would recommend building the latest nightly release to make sure you have the most recent work.

Turning on the feature

First, turn on the feature in CMake by setting the following three variables. See https://github.com/Kitware/CMake/blob/f1034acb02/Help/dev/experimental.rst for information about the experimental API.

# make sure c++ 20 is set
set(CMAKE_CXX_STANDARD 20)
# turn on the dynamic depends for ninja
set(CMAKE_EXPERIMENTAL_CXX_MODULE_DYNDEP 1)
# turn on the experimental API
set(CMAKE_EXPERIMENTAL_CXX_MODULE_CMAKE_API "2182bf5c-ef0d-489a-91da-49dbc3090d2a")

For gcc, you will need to also include the CMake code fragments described above that set CMAKE_EXPERIMENTAL_CXX_SCANDEP_SOURCE. I recommend something like this:

if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
  include(gcc_modules.cmake)
endif()

Testing in CMake

CMake has a set of unit tests that cover C++20 modules as implemented in CMake. These are run as part of the standard CI tests in CMake. The test suite is available as `RunCMake.CXXModules` (pass `-R CXXModules` to `ctest` to run it).

The files can be found here:

https://gitlab.kitware.com/cmake/cmake/-/tree/f1034acb02/Tests/RunCMake/CXXModules

Some examples can be found here:

https://gitlab.kitware.com/cmake/cmake/-/tree/f1034acb02/Tests/RunCMake/CXXModules/examples

To configure and build CMake with a compiler that supports p1689, do the following:

Set the CC and CXX environment variables to the C and C++ compilers from the build or install of the compiler supporting p1689.

export CC=/path/to/clang
export CXX=/path/to/clang++

Then run cmake with the following command line (the example here uses clang, with the CMake GitLab CI rules file for clang):

cmake -DCMake_TEST_MODULE_COMPILATION=named,shared,partitions,internal_partitions,export_bmi,install_bmi \
    -DCMake_TEST_MODULE_COMPILATION_RULES=/path/to/cmake/source/tree/.gitlab/ci/cxx_modules_rules_clang.cmake \
    -DCMAKE_EXPERIMENTAL_CXX_MODULE_CMAKE_API="2182bf5c-ef0d-489a-91da-49dbc3090d2a" \
    -GNinja \
    -S /path/to/cmake/source/tree \
    -B /path/to/build/tree

Then run ninja in the build tree to build. Once the build completes, you can run the tests like this:

./bin/ctest -R CXXModules

This should result in all the tests passing.

CMake code

Once you have built a compiler that supports p1689 and a version of CMake that supports C++20 modules, and have run the tests to make sure things are set up correctly, you are ready to use modules in your own project. To use modules in CMake, you need to use target_sources with a FILE_SET: https://cmake.org/cmake/help/latest/command/target_sources.html

Going back to the very first example in this blog with main.cxx and foo.cxx you would do the following:

cmake_minimum_required(VERSION 3.26)
project(std_module_example CXX)

# Turn on the experimental API and the dynamic depends for ninja
set(CMAKE_EXPERIMENTAL_CXX_MODULE_CMAKE_API "2182bf5c-ef0d-489a-91da-49dbc3090d2a")
set(CMAKE_EXPERIMENTAL_CXX_MODULE_DYNDEP 1)

# Default to C++ extensions being off. Clang's modules support has trouble
# with extensions right now and it is not required for any other compiler
set(CMAKE_CXX_EXTENSIONS OFF)

if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
  include(gcc_modules.cmake)
endif()
set(CMAKE_CXX_STANDARD 20)

add_library(foo)
target_sources(foo
  PUBLIC
    FILE_SET cxx_modules TYPE CXX_MODULES FILES
    foo.cxx
)
add_executable(hello main.cxx)
target_link_libraries(hello PRIVATE foo)

Here, foo.cxx is marked as a file that will produce a BMI file because it is listed in a file set of type CXX_MODULES.

Summary

C++ 20 Module support has come a long way and has been a group effort with CMake developers, compiler developers and the standards committee working together to make this possible. It is still new code, but once the compilers are released with the support, CMake should be able to have official support without the need for setting the API key. Please give it a try and report any issues or make suggestions for improvements either on the CMake discourse group https://discourse.cmake.org/ or the CMake issue tracker https://gitlab.kitware.com/cmake/cmake/-/issues.

Update

Please see this article for an update on how to use C++ 20 named modules in CMake.

Funding

This work has been funded by Bloomberg Engineering.

