Things to Consider When Featuring Branching with Continuous Integration

The Story

So you’re using Continuous Integration and Mercurial as your VCS. CI is great, you’re loving the feedback and feeling good after a green build. You’ve just got your project started and a couple of features are underway, but then you see a red build. “It’s nothing”, they’ll say, “just wanted to publish my changes, halfway through this big refactoring”. “One broken build is expected”, they’ll say, “just this one”.

Before long either developers stop publishing their changesets daily or you end up with a Continuous Misery instead of Continuous Integration, because the only color you see is red. About that time you (re)discover that branching is so easy in Mercurial, as easy as pushing or pulling changes from other repositories.

“I’ll just create a branch for these guys so they can easily publish their code to others, right?” you think, and you create branches for every feature. The developers are happy and your CI build is always green, but soon you discover that your application is broken, despite the green lights on the build. “Oh wait, my CI jobs are all configured on the ‘default’ branch…”

The Reasoning

If the above story sounds remotely familiar to you, then you’re not alone. Feature branches and Continuous Integration don’t play well together. CI purists think that feature branches are an abomination, but if you’re not a purist then you just want a working application and a clean mainline.

We value being pragmatic over being a purist. Sure, you could use feature toggles or branch by abstraction to emulate feature branches, but just running ‘hg branch "feature-x"’ is so much faster and simpler.

We’re raising this topic once more as a call to the community: how do you handle such situations? This time, however, we’d also like your opinion on an idea that came to us.

Since our last visit with the “feature branches vs Continuous Integration” debate a few things have changed. Namely, a few projects have emerged that automatically help you clone jobs for feature branches in Jenkins:

  1. Jenkins autojobs supports Mercurial, Subversion and Git, but takes only one job to clone for a feature branch.
  2. Jenkins Build Per Branch supports only Git, but takes a set of jobs (identified by a prefix) to clone for a feature branch.

With the help of those projects you can start using CI for simpler projects, but what if your Maven project consists of multiple parts located in multiple repositories and you’d still like to use feature branches in them? And what if you don’t want to clog up your Jenkins views with an ever-changing array of jobs?

A particular problem arises: how do you guarantee that your build tool doesn’t package your application from branch1 with dependencies from branch2 or branch3 or…? The issue can be illustrated as follows:

[Figure: fig_artifact_origin_problem]
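
To make the problem concrete, here is a minimal sketch of the kind of check you would want before packaging. It assumes you can somehow record which branch each artifact was built from. The (name, branch) pairs and the function name are hypothetical and are not taken from Maven, Jenkins or our plugin.

```python
from collections import defaultdict


def find_mixed_origins(artifacts):
    """Group artifact names by the branch they were built from.

    `artifacts` is a list of (artifact_name, branch) pairs. This structure
    is a hypothetical stand-in for whatever build metadata you record.
    Returns a dict of branch -> sorted artifact names when more than one
    branch is involved, and an empty dict when everything shares one branch.
    """
    by_branch = defaultdict(list)
    for name, branch in artifacts:
        by_branch[branch].append(name)
    if len(by_branch) <= 1:
        return {}
    return {branch: sorted(names) for branch, names in by_branch.items()}


if __name__ == "__main__":
    mixed = find_mixed_origins(
        [("app", "branch1"), ("core", "branch2"), ("util", "branch1")]
    )
    if mixed:
        print("Artifacts come from several branches:", mixed)
```

A non-empty result means the package would combine code from more than one branch, which is exactly the situation in the figure.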

The Solution

I took this issue to heart and wrote a whole bachelor’s thesis out of it. In my thesis, I gave a brief introduction to both CI and feature branches, listing many alternatives to feature branches along the way. As mentioned earlier, going with feature branches may be the most practical way to develop in parallel and keep the mainline clean.

The full thesis can be retrieved here. For people comfortable with CI and feature branches, reading chapters 3 and 5 should give just enough background to understand our proposed solution to these problems, but we’ll go over it briefly in this blog post as well.

We created a prototype Jenkins plugin called Feature Branch Notifier that works as a combination of a patched Mercurial plugin, a new trigger plugin and a pre-build step, enabling us to detect updated branches with polling and change the branch the build will run on. Its configuration is similar to the default Mercurial plugin, with only one additional field called “Match branch names with” under advanced options, which can be used to filter the branch names that new builds will be launched on.
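
As a rough illustration of what filtering branch names could look like, here is a small sketch. It assumes the filter is a regular expression that must match the whole branch name. That is only an assumption made for this example, since this post does not spell out the matching rules of the “Match branch names with” field, and the function name is hypothetical.

```python
import re


def branches_to_build(branch_names, pattern):
    """Return the branch names that fully match `pattern`.

    Assumption for illustration only: the filter is treated as a regular
    expression applied to the whole branch name. The plugin's real
    matching rules may differ.
    """
    compiled = re.compile(pattern)
    return [name for name in branch_names if compiled.fullmatch(name)]


if __name__ == "__main__":
    names = ["default", "feature-x", "feature-y", "hotfix-1"]
    print(branches_to_build(names, r"feature-.*"))
```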

To enable the plugin you have to choose a new Source Code Management option called Mercurial (feature branch aware) and specify the branch as a special environment variable called $BRANCH.

[Figure: FBN_custom_mercurial]

Then you’ll need to select the new Feature branch aware Poll SCM option under Build Triggers and copy your trigger configuration from the usual Poll SCM trigger.

[Figure: FBN_custom_trigger]

Lastly you need to check the Check and mark builds with feature branches checkbox, which enables a pre-build step that sets the value of the $BRANCH variable.

[Figure: FBN_pre_build]

Put together, it works like this: the new trigger plugin searches for updates across all your branches and fires new builds of the job if it finds any. Builds not running on the ‘default’ branch get a special tag. The pre-build step looks for the tag in the build and sets the $BRANCH value accordingly, so your builds run on different branches. This is indicated by the branch name added to the end of the build in the Build History view.

[Figure: FBN_build_history]
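
The flow described above (poll all branches, fire tagged builds, resolve $BRANCH in a pre-build step) can be sketched roughly as follows. This is a simplified model and not the plugin’s actual code: the dictionaries of branch heads, the build records and all function names are hypothetical.

```python
def detect_updated_branches(previous_heads, current_heads):
    """Return the branches whose head changeset differs from the last poll.

    Both arguments map branch name -> head changeset id. They are
    hypothetical data; a real trigger would get them by polling Mercurial.
    """
    return sorted(
        branch
        for branch, head in current_heads.items()
        if previous_heads.get(branch) != head
    )


def schedule_builds(updated_branches):
    """Describe one build per updated branch; non-default ones get a tag."""
    builds = []
    for branch in updated_branches:
        tag = None if branch == "default" else branch
        builds.append({"tag": tag})
    return builds


def resolve_branch(build):
    """Pre-build step: derive the value of $BRANCH from the build's tag."""
    return build["tag"] or "default"


if __name__ == "__main__":
    previous = {"default": "a1", "feature-x": "b1"}
    current = {"default": "a1", "feature-x": "b2", "feature-y": "c1"}
    for build in schedule_builds(detect_updated_branches(previous, current)):
        print("BRANCH =", resolve_branch(build))
```

An untagged build falls back to ‘default’, which mirrors the idea that only builds on other branches carry the special tag.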

To quickly launch new builds on a given feature branch, a new menu item is available on tagged builds to schedule a new build with the same branch tag.

[Figure: FBN_menu_item]

The Integration Risk

By now you may be thinking that all this sounds fine, but what about integration risks? Builds running on different branches may prove that the code in that particular branch passes tests, but it gives no guarantees about being compatible with code in another branch.

These risks can be mitigated by borrowing the gatekeeper and branch updater notions from Bamboo. In principle they are different sides of the same coin. Gatekeeper merges the feature branch into mainline before each build and branch updater does the opposite, by merging the mainline into the feature branch. The principles of a gatekeeper can be illustrated as follows:

[Figure: fig_gatekeeper]

Both the gatekeeper and branch updater push the merge only after a successful build, meaning all of the plan’s jobs have successfully finished, which gives us the guarantee that the build is not broken if the merge is pushed to the shared code repository.

In Jenkins, shell scripts executed as build steps could mimic this functionality: one script does the merge as the first build step and another pushes the merge as the last build step, with building and testing in the steps between. With the “Feature Branch Notifier” plugin, an additional check on the current value of $BRANCH should be added so that the gatekeeper or branch updater runs only when the build is not on the default branch, i.e. when the $BRANCH variable is not ‘default’.
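
As a hedged sketch of those two scripts, the helper below only builds the lists of Mercurial commands for the first and the last build step. The function name and the mode names are hypothetical. Committing the merge locally before the build and pushing it only afterwards is our reading of the description above, not something copied from Bamboo.

```python
def merge_plan(mode, branch, mainline="default"):
    """Return (pre_build, post_build) lists of hg commands for one build.

    mode is "gatekeeper" (merge the feature branch into the mainline) or
    "updater" (merge the mainline into the feature branch). Both lists are
    empty when the build already runs on the mainline.
    """
    if branch == mainline:
        return [], []
    if mode == "gatekeeper":
        target, source = mainline, branch
    elif mode == "updater":
        target, source = branch, mainline
    else:
        raise ValueError(f"unknown mode: {mode}")
    pre_build = [
        ["hg", "update", target],
        ["hg", "merge", source],
        # The merge is committed locally only; nothing is shared yet.
        ["hg", "commit", "-m", f"Merge {source} into {target}"],
    ]
    # Run this step only after every build and test step has succeeded.
    post_build = [["hg", "push"]]
    return pre_build, post_build


if __name__ == "__main__":
    pre, post = merge_plan("gatekeeper", "feature-x")
    for command in pre + post:
        print(" ".join(command))
```

A real script would also have to handle the case where there is nothing to merge and the case where the merge produces conflicts, both of which this sketch ignores.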

The Conclusion

With the help of our plugin, you can now easily start launching builds on different branches without clogging up your views with different jobs. Though for many projects using components located in different repositories this might not suffice, as you still have the threat of combining your main application with a dependency from a random branch.

Our solution to that problem is to use our prototype plugin in combination with some other plugins to create so-called environment-locked multijobs. These configurations are complex and a blog post only has so much room, but they are described in detail in chapter 5 of my thesis.

I hope some of you will find this solution at least interesting, if not helpful, and as always any feedback on the plugin or the solution in general is welcome. Chapter 6 of my thesis also lists some needed improvements for this plugin and gives some ideas for alternative solutions.

In any case, using Continuous Integration shouldn’t exclude the use of branches in your project. It would be perfectly reasonable to move a couple of features to a separate branch and also develop on the mainline, when everybody is aware of the potential integration risks and active measures have been taken to get notified of incompatibilities as soon as possible.


  • Sarah

    …or you could just use Atlassian Bamboo, which does feature branching natively for Git, Hg, and SVN. Even does automatic merges with each build, if you want.

  • Brad Appleton

    There are two varieties of feature-branching that I think you need to distinguish between here. First, there are *Isolated* Feature-branches, which is what Martin Fowler describes on his Bliki, then there are *Integrated* Feature-branches, where code gets integrated regularly (i.e. daily) from the feature-branches to the mainline.

    In the case of integrated feature branches, the point of the feature-branch isn’t to keep the mainline clean/pristine of “partially completed features” but rather is an integration “pipelining” technique that allows CI to happen several times per day on the Feature-branches, but product-wide integration (across features) to mainline typically happens no more than daily.

    This is done presumably because mainline would be too volatile/unstable (or suffer too much “commit contention”) with that much integration going on (i.e. at least as often as it takes to do a full build, say for example if CI happens every 15 minutes and the build takes >=15 minutes). Having integrated feature-branches in this case gives more stability for each feature-crew (possibly allowing them to do less than a full from-scratch build every time) while still making sure that the whole shmeer still gets integrated and built (from scratch) and tested at least daily.

    Most of the CI build-server tools that support “branches” PLUS auto-integrating (propagating) them to the mainline are really in support of the case of *integrated* feature-branches (not the isolated ones).

    In the case of isolated feature-branches, the presumption is you must not integrate across in-progress features because it is deemed unacceptable to deliver partially complete features. The real question is, is it better (and more likely) that you’ll need to “subtract” or “backout” a partially completed feature than it is to have delayed integration that approaches the slippery slope of having to do “big bang” at the end?

    In an ideal world, you could maintain both the isolated feature branches AND a mainline that they all merge to regularly, get the necessary feedback on quality and “integration debt” on both the feature-branch builds and the mainline build, and make changes on the feature-branches to eliminate/minimize future merge conflicts without having to intermingle features.

    The problem is that this seems like a lot of extra integration and build work. However, with good, efficient tooling/automation and the right kind of good-quality feedback (including static-analysis that would aid you in refactoring to minimize merge conflicts), it might not be so bad. This is what the newer features of “gatekeeper” and “branch updaters” start helping you approximate (but it’s still not quite there yet).

    If you had it fully automated and fine-tuned, the result of being able to efficiently build the feature-branches, then auto merge/propagate them to an up-to-date mainline and be able to do automated “what if” static analysis on the structural-only changes to merge back into each feature branch would get you pretty darn close to a new kind of feature-branch (I’ll call it an “autonomic feature branch”) where you can have your full product+team-wide continuous integration “cake” and eat your isolated feature-branches too while still avoiding big-bang integration-debt.

    Or you could try your luck and “push back” against the powers that be who have decreed against incremental feature-integration to main. (There is an old saying, which Fowler’s “FeatureBranch” article calls to mind, that goes something like “If you [integrate and] build it, it will ship!”)

  • Tõnis Pool

    True enough, but the whole point of the post and the thesis was to help the Jenkins community to have some of the same benefits. Bamboo has great features, which I reference and try to port to Jenkins.

  • Tõnis Pool

    Thanks for the input into the discussion! You’ve really fine tuned the meaning and purpose of a feature branch. The main view taken in my thesis was that feature branches are used to keep the mainline clean which hopefully decreases the amount of (mainline) broken builds (because riskier development is in a branch).

    But your view of *Integrated* feature branches to avoid commit contention on the mainline is an interesting one. I guess it all depends on the team size and/or the push frequency.

    There are numerous different strategies for using branches, and indeed “gatekeeper” approaches are still taking their baby steps, but at least there is some discussion and progress. With the help of our plugin, hopefully some people will start experimenting with feature branches on Jenkins as well, because until now, I think, it has mostly been a Bamboo playground.

    Would be interesting to get more feedback from actual projects that have somehow adopted the gatekeeper approach, to get a clearer picture of the problems different projects might run into, i.e. is the “integration” build (the one with the gatekeeper) continuously failing or does it provide really helpful feedback?

  • Oliver White

    You just couldn’t resist, huh? Automatic merges with each build. WTF Sarah, I mean seriously. How awesome is that! ;-)

  • Brad Appleton

    Tonis writes: “The main view taken in my thesis was that [isolated] feature branches are used to keep the mainline clean which hopefully decreases the amount of (mainline) broken builds”

    That’s actually not quite what [isolated] feature-branches were originally created for (back in the days of waterfall and big-bang integration). Their purpose is closer to what Fowler indicates in his Bliki, and it has little to do with decreasing broken builds, and is only indirectly related to keeping the mainline “clean”:

    The problem is (was) that for whatever reason (be it contractual, economic, or organizational, etc.) the project was not permitted to deliver any partially complete features (remember, these are *BIG* (pre-agile) features complete with up front formal specification and detailed contract negotiation). Since it wasn’t always a “sure bet” that 100% of the feature would be done in time for the next release, if they integrated partially completed features to the mainline and couldn’t finish it in time for release, they would be required to somehow subtract (dis-integrate or disable) all of the functionality of the feature.

    So their options were either to integrate frequently within each feature, but wait to integrate a feature to “main” until it was 100% functionally complete; or else frequently integrate across all features, but be ready and able to subtract a feature at the midnight hour if it didn’t finish in-time. Doing late-integration of features was actually deemed a much lesser risk than trying to “subtract” them near the end of the release.

    It’s not that they were afraid cross-feature integration would break the build more frequently (they’d rather find and solve those integration problems sooner rather than later anyway). It’s that they were afraid that the likelihood they might have to “subtract” a feature was very real and thought it would be easier and less risky to integrate the feature late than to try and subtract it even later.

    Now, even in “the age of agility”, many organizations are still in the situation where at the level of marketing and customer-relations, the sizing of features and the agreement with the customer of what all goes into a feature is still often regarded as a “binary” contract (either you deliver all of the feature at release-time, or else you must deliver none of it). So the practice of isolated feature branches still survives in the agile age.

    However, if a project being agile also does smaller features, continuous integration and continuous refactoring+testing, then a lot of the presumed risk/likelihood of partially completed feature-work is possibly a lot lower:

    - If the features really are significantly smaller, then the likelihood of not finishing them in time should be a lot less, and if working in a feature-centric manner, then you should have a better idea when you start the feature what the risk is of not finishing it on time.

    - If continuous refactoring and testing is happening, perhaps it may be a lot easier (less risky) to “subtract” a feature than it is to integrate it late (then again, maybe not).

    - If there really is continuous collaboration with the customer, then maybe the assumption of “no partial-feature delivery” can (and should) be challenged and discussed with the product-owner, so that there is no need to keep features isolated until the last minute.

    Of course, not every agile project is fully agile in every one of those aspects, and some amount of command-and-control and up-front specification and contract negotiation is still going on, and the project is just trying to cope with those as best as it can and be as agile as it can otherwise be.

    But the above is the original purpose and intent behind why isolated feature branches were created and used.

    (I’ll address your other comment in a separate reply.)

  • Brad Appleton

    Regarding the following from Tõnis’s response to me: “Your view of *Integrated* feature branches to avoid commit contention on the mainline is an interesting one. I guess it all depends on the team size and/or the push frequency.”

    Hmmn – I didn’t say [integrated] feature branches were primarily to avoid commit-contention. I merely gave that as an example of one of the (many possible) symptoms of high volatility and instability that can come from trying to do CI for a large project across many teams. I start to see such “commit-contention” when “push frequency” starts to exceed 4-5 per hour (or when pushes arrive at least twice within the time it takes to do the CI build – even with a CI-tool that supports a “quiet period”).

    But when you say “I guess it all depends on team-size” I worry there is some confusion. When it comes to the use of *integrated* feature-branches, it really only should apply when you are talking about a project that spans multiple teams (or else a very large team that probably needs to be reorganized into multiple smaller (sub)teams).

    Then, assuming that work is happening on several features at once at any given time (which many lean-agile gurus might tell you not to do in the first place, focusing instead on one feature at a time until it’s “done” before starting the next one), you start running into some integration issues that are quite likely more challenging or more pronounced than if you use teams that are organized along component boundaries instead of feature-boundaries.

    If teams are organized along component-boundaries, then integration across teams is more about “assembling and testing” rather than “merging and testing”. That makes cross-team integration easier (to handle at component interface boundaries) but it isn’t centered around deliverable units of customer-value like the lean/agile folks prefer.

    When you have multiple teams that all need to build and integrate across the entire code-base (not just some subset of components) then trying to have all teams continuously integrate to the same single mainline can start causing a lot more systemic issues:

    The first and most obvious of these is potential build-breakage, with mainline being at higher-risk of being broken frequently and for longer periods of time.

    But the more nefarious problem (and possibly less obvious) is that there is only so much volatility each team can deal with productively, and while dealing with changes and instability frequently within their own team may be feasible to handle, adding onto that an equally high frequency of changes coming from other teams that they can’t communicate with as frequently and as closely as their own team-members makes it even harder.

    This can still be managed by doing CI on a single mainline, but only up to a certain point! Once that threshold is exceeded, you end up needing to do a little “divide and conquer” both in terms of branching (each “team” typically gets its own separate “codeline”, which for a feature-team starts looking like a “feature-branch”) and in terms of build-frequency (team-level builds happen continuously, but product-wide or system-wide builds happen less frequently, although still at least daily if possible; system-build time plays a factor too, of course).

    So in the case where you had to divide and conquer it is typically done along team boundaries, giving each team its own “integration branch” (or codeline), which must be pipelined back into the less frequently integrated mainline. And if your teams are primarily “feature teams” then this becomes a case of *integrated* feature-branches. And the typical answer is a two-stage (or multi-stage) integration “pipeline” (which often evolves into the “deployment pipeline” of continuous delivery).

    But the reasons for the two kinds of feature-branches are very different.

    - One [isolated feature branching] is a customer/management constraint of not being allowed to deliver partially completed features (and fearing the risk and impact of late dis-integration more than that of late feature-integration).

    - The other [integrated feature branching] is a scaling issue due to having so many changes integrating so frequently that it causes too much/frequent instability that impacts too many other teams, and so we add another level of indirection by adding another layer of integration (but this time at only slightly decreased frequency).

    It is the *integrated* feature branches that are the case where so-called gatekeepers and other “buffering” or gating mechanisms are frequently used. And it’s not so much because the branches are “per-feature” as it is because the branches are “per-team” and with more than just code-interface dependencies between teams.

  • AndyB

    Unfortunately, it only works with Mercurial. If it worked with SVN (or any path-based SCM) or git, then we’d be talking.

    For a lot of systems, the difference in branches is just the path, eg:
    MyProject/branches/BranchA/makefile
    MyProject/branches/BranchB/makefile
    MyProject/trunk/makefile

    all become MyProject/$BRANCH/makefile … so it shouldn’t be too hard to make it work on SVN, TFS, and others like this in a generic way.

  • $247183

    I still don’t fully see how some features can be reduced to such small increments that they can always be shipped. I guess I need to revise my view on stuff, or those few features that are too big would require Feature Toggles.

    I’m also not sure how maintenance works when you have only one mainline. What if I need to release a bug fix on the old release before releasing new features, or even on multiple releases back?

  • dantheperson

    Most places I have seen that have one development mainline would still use a branch for any sort of emergency, we-need-to-get-this-patch-out-now release. Fix the bug on the mainline, then merge just that change to a branch from the old release.

  • itti

    For some reason the source code to this plugin seems to have vanished. Is there any plan to continue it further and make it more stable? Even if not could you at least tell us how the “Match branch names with” is supposed to work? Is it a regex?