Stay in Sync with Upstream via Git Subtrees

June 1, 2015

Staying in sync with upstream

Working with upstream is a critical open source software practice that ensures long-term success.

It is important to stay in sync with upstream to:

  • get upstream bug fixes.
  • take advantage of new upstream features.
  • avoid refactoring local changes.
  • do the right thing and support the upstream from which you benefit.

Since software is constantly evolving and improving, an efficient process to both pull changes from upstream and push changes back to it is key to high-quality software.

This post discusses how ITK uses Git subtrees to help both pull changes from and push changes to upstream third-party libraries.

There are other methods to track third-party libraries, like Git submodules and the CMake ExternalProject module. The different approaches have their own strengths and weaknesses. The Git subtree approach is used here so that:

  • Everything needed is bundled in the ITK repository.
  • No extra version control commands are needed with checkouts.
  • We can maintain local modifications that should not be pushed upstream.
  • All CMake information is consistently passed to the third-party libraries.

Pulling changes

An UpdateFromUpstream.sh script, kept in the third-party module's directory, is executed to merge in the latest changes from the subtree libraries. The subtree libraries are currently GDCM, KWSys, and MetaIO.

This UpdateFromUpstream.sh script contains project-specific information, such as the Git repository URL. A subset of directories or files and a redaction command can be configured to limit the content imported.
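To make the idea of a per-project configuration more concrete, here is a minimal Python sketch. It is not the actual UpdateFromUpstream.sh script, which is a shell script; the field names, the `ExampleLib` project, its placeholder URL, and the `select_paths` helper are all assumptions made for illustration. The redaction command is not modeled.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class UpstreamConfig:
    """Project-specific settings. All field names are hypothetical."""
    name: str      # label for the third-party library
    repo: str      # upstream Git repository URL
    ref: str       # upstream branch or tag to snapshot
    subtree: str   # directory inside the main repository
    paths: list = field(default_factory=list)  # glob patterns; empty means "import everything"


def select_paths(config, upstream_files):
    """Return the upstream files that a snapshot would import."""
    if not config.paths:
        return list(upstream_files)
    return [f for f in upstream_files
            if any(fnmatch(f, pattern) for pattern in config.paths)]


# Hypothetical library and placeholder URL, for illustration only.
example = UpstreamConfig(
    name="ExampleLib",
    repo="https://example.invalid/examplelib.git",
    ref="master",
    subtree="Modules/ThirdParty/ExampleLib/src",
    paths=["CMakeLists.txt", "src/*", "LICENSE"],
)

files = ["CMakeLists.txt", "LICENSE", "src/core.c", "docs/guide.txt", "tests/t1.c"]
selected = select_paths(example, files)
print(selected)
```

With this configuration, the documentation and test files are left out of the snapshot, which keeps the imported content limited to what the build needs.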

All of the commands to automate the process are kept in a common UpdateThirdPartyFromUpstream.sh script written by Brad King and Brian Helba. Subtree snapshots of the upstream repository are kept on their own branch, separate from the main repository branches. This snapshot branch is then merged into the ITK master branch in a nested directory. The tip of the snapshot branch is located with an identifier in the commit message; a named branch does not need to be exposed. The script currently supports snapshots of upstream Git repositories, although it could be extended to support other snapshot revision types.
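The trick of locating the snapshot tip through the commit message, rather than a named branch, can be sketched in a few lines of Python. This is only an illustration of the idea: the marker string `[snapshot:ExampleLib]`, the commit hashes, and the messages are invented, and the real script's identifier format is not shown in this post. The history is also simplified to a flat, newest-first list, whereas a real history is a graph in which the snapshot commits are reached through merge commits.

```python
def find_snapshot_tip(commits, marker):
    """Return the hash of the newest commit whose message contains marker.

    commits: iterable of (sha, message) pairs, ordered newest first.
    Returns None when no snapshot commit is found.
    """
    for sha, message in commits:
        if marker in message:
            return sha
    return None


# Hypothetical marker and history, newest commit first.
marker = "[snapshot:ExampleLib]"
history = [
    ("c3", "Merge latest ExampleLib snapshot into master"),
    ("b2", "ExampleLib 2015-05-20 [snapshot:ExampleLib]"),
    ("a1", "ENH: Add new filter"),
    ("90", "ExampleLib 2015-01-10 [snapshot:ExampleLib]"),
]

tip = find_snapshot_tip(history, marker)
print(tip)
```

Because the newest matching commit is found by searching the history, the next snapshot can be committed on top of it without anyone having to publish or maintain a separate branch name.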

Figure: Git subtree merges

Pushing changes

When modifications are made to third-party libraries, a Git post-commit hook will present a command to create a patch file from the subtree commit content. This patch file can be emailed or applied to a repository. When possible, the developer is also directed to the URL that documents upstream's contribution process.
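As a rough illustration of what such a hook might do, the Python sketch below checks whether a commit touched a subtree directory and, if so, builds a command to suggest. ITK's actual hook is not reproduced here: the function name, the directory, and the commit hash are hypothetical, and using `git format-patch -1 --relative=<dir>` to produce a patch with paths relative to the subtree is an assumption about one reasonable way to do it.

```python
import shlex


def suggest_patch_command(sha, changed_files, subtree_dirs):
    """Return a command string for the first subtree the commit touched, else None."""
    for subtree in subtree_dirs:
        directory = subtree.rstrip("/")
        prefix = directory + "/"
        if any(path.startswith(prefix) for path in changed_files):
            # Assumed command: one patch for this commit, with paths
            # rewritten relative to the subtree directory.
            args = ["git", "format-patch", "-1", "--relative=" + directory, sha]
            return " ".join(shlex.quote(arg) for arg in args)
    return None


# Hypothetical subtree location, commit hash, and changed files.
subtrees = ["Modules/ThirdParty/ExampleLib/src"]
changed = [
    "Modules/ThirdParty/ExampleLib/src/core.c",
    "Modules/Core/Common/include/itkObject.h",
]

command = suggest_patch_command("abc1234", changed, subtrees)
print(command)
```

A hook built along these lines would stay silent for commits that do not touch third-party code, so it only speaks up when there is something worth sending upstream.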

Happy contributing!

1 comment to Stay in Sync with Upstream via Git Subtrees

  1. You really need a better intro about what exactly Upstream is. Is it a CMake process, a CMake project, a CMake feature or something else completely? And here’s a ridiculous statement: “Since software is constantly evolving and improving, an efficient process to both pull and push changes upstream is key to high quality software.” “…evolving and improving…”, really? Please. It all evolves into a CF that eventually is replaced or just dies by the wayside twitching like a wounded animal after becoming horribly complex and therefore unusable – rarely improving. Software development: where folks are highly paid for getting nothing productive done and where folks get paid nothing who are really, really productive at producing something nobody needs or wants.
