
Continuous Integration and Feature Branches

For a couple of weeks now, I’ve been looking into Continuous Integration with multiple branches in Jenkins/Hudson. I’ve searched the intertubes and talked to friends, colleagues and other nerds out there.

My greatest discovery was that although CI is about committing, pushing and getting feedback often (i.e. the CI cluster provides you with feedback that your workstation could never give you in the same amount of time), the true purist CI actually has one more requirement — that the team needs to work on the same baseline.

I wish there was some kind of tooling for not-so-purist CI and I’ll try to explain why…

First, the non-technical view of the problem

I work on products that are distributed as downloads. After a person downloads, it is up to them to update to a newer version if they want to. So I can’t push out quick fixes that get deployed to clients automatically — the software will inform the user when a new version is available, but there are no guarantees that they will upgrade.

This means we need to take extra steps in our quality assurance (QA). In contrast, if your product is offered as a service then most of the time you can push out changes and they are visible to all clients out there. No such luck for me. This limitation puts extra attention on QA.

We have created two main branches, DEV and STABLE. DEV is the mainline that we share. The CI cluster provides extra quick feedback on this branch. It runs a test-suite of our software, testing it on a handful of JEE containers (e.g. the Tomcat, JBoss and WebSphere series). If the tests succeed, then the code is pushed to the STABLE branch. The CI cluster then runs the test-suite on 50 container versions.

This approach takes away the pressure on devs to have a machine or OS that supports so many test environments. They get quick feedback if their push has broken any container that they did not test locally (e.g. “Did my Resin changes break anything on GlassFish?”).

Okay, I seem to have stuff figured out, so what’s the problem then? Well, if developers want to live in their own branch for a while and still get the benefits of automated testing on a large variety of environments, then they won’t be able to do that. They are not CI engineers.

Technical view of the problem

Jenkins/Hudson uses a notion of jobs. A job is something that is usually tied to a VCS URL and then gets built (run) when there is a change in the VCS repository. The build can do anything, from executing shell scripts to sending out tweets of the status of the build.

The build can produce all kinds of results. In our case, we use Maven and Maven artefacts are the results. The artefacts are stored in one repository for the DEV branch, and in another for the STABLE branch. Keeping them separate lets multiple DEV branch jobs re-use prebuilt snapshots without conflicting with STABLE snapshots.

There are two sets of jobs: the DEV jobs and the STABLE jobs. There are two because the VCS URLs differ and the Maven repositories differ, and now we have a problem. The two sets of jobs need to stay in sync. So if we change a job in DEV (e.g. add a new step), we’d want to change the matching job in STABLE and vice versa.
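A minimal sketch of how such drift could be detected, assuming each job's configuration is available as XML text. The element names, URLs and script names below are hypothetical, not the actual Jenkins/Hudson schema. The idea is to mask the values that are expected to differ (VCS URL, Maven repository) and diff whatever remains.

```python
import difflib
from typing import List, Mapping


def normalise(config_xml: str, branch_specific: Mapping[str, str]) -> List[str]:
    """Swap branch-specific values for placeholders, then split into stripped lines."""
    for value, placeholder in branch_specific.items():
        config_xml = config_xml.replace(value, placeholder)
    return [line.strip() for line in config_xml.strip().splitlines()]


def job_drift(dev_xml: str, dev_values: Mapping[str, str],
              stable_xml: str, stable_values: Mapping[str, str]) -> List[str]:
    """Return unified-diff lines for differences that are NOT branch-specific."""
    return list(difflib.unified_diff(
        normalise(dev_xml, dev_values),
        normalise(stable_xml, stable_values),
        fromfile="DEV", tofile="STABLE", lineterm="",
    ))


# Hypothetical, heavily simplified job configurations.
DEV_JOB = """
<project>
  <scm><url>https://vcs.example.com/product/dev</url></scm>
  <mavenRepo>https://repo.example.com/dev</mavenRepo>
  <step>run-container-tests.sh</step>
  <step>publish-report.sh</step>
</project>
"""

STABLE_JOB = """
<project>
  <scm><url>https://vcs.example.com/product/stable</url></scm>
  <mavenRepo>https://repo.example.com/stable</mavenRepo>
  <step>run-container-tests.sh</step>
</project>
"""

# Values that are allowed to differ between the two sets of jobs.
DEV_VALUES = {
    "https://vcs.example.com/product/dev": "@VCS_URL@",
    "https://repo.example.com/dev": "@MAVEN_REPO@",
}
STABLE_VALUES = {
    "https://vcs.example.com/product/stable": "@VCS_URL@",
    "https://repo.example.com/stable": "@MAVEN_REPO@",
}

drift = job_drift(DEV_JOB, DEV_VALUES, STABLE_JOB, STABLE_VALUES)
for line in drift:
    print(line)
```

Here the only reported difference is the extra step in the DEV job; the URLs are ignored because they are supposed to differ.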

If devs want to live in their own branch, they need to sync any jobs that they duplicate. And now the job management is just getting out of hand:

  • They start the duplication of jobs
  • They try to keep them in sync
  • They start debugging the jobs (e.g. “Why is there a port race between SAP NetWeaver and OC4J9?”)
  • They delete the jobs after integration

Instead of developing the feature, the dev is now becoming a CI engineer. Shouldn’t he just get fast feedback from a large variety of environments without any hassle?

Solutions

The many posts that I’ve read through suggest writing the tooling for branch management. Shell scripts and Ant tasks that duplicate jobs on the filesystem level are popular. Scripts using the Jenkins/Hudson API are also a good choice.
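As a rough illustration of what such tooling boils down to, here is a minimal sketch that derives a feature-branch job configuration from the DEV one by swapping the branch-specific values. The XML layout, URLs and function name are assumptions made for the example; real job configurations are larger and may need extra handling.

```python
import xml.etree.ElementTree as ET
from typing import Mapping


def retarget_job_config(config_xml: str, replacements: Mapping[str, str]) -> str:
    """Return a copy of a job config with branch-specific values swapped."""
    root = ET.fromstring(config_xml)
    for element in root.iter():
        if element.text is None:
            continue
        # Rewrite element text only; tags and attributes are left untouched.
        for old, new in replacements.items():
            element.text = element.text.replace(old, new)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, simplified DEV job configuration used as the template.
TEMPLATE = """<project>
  <scm><url>https://vcs.example.com/product/dev</url></scm>
  <mavenRepo>https://repo.example.com/dev</mavenRepo>
  <step>run-container-tests.sh</step>
</project>"""

feature_config = retarget_job_config(TEMPLATE, {
    "https://vcs.example.com/product/dev": "https://vcs.example.com/product/feature-x",
    "https://repo.example.com/dev": "https://repo.example.com/feature-x",
})
print(feature_config)
```

The resulting XML could then be written to the jobs directory or handed to the CI server through its remote API; that step is left out here. Re-running the script after every change to the DEV job is one way to keep the copy in sync, but it is still tooling that somebody has to own.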

But what if I don’t want to write another set of tooling? This is something that should be provided by the CI software I’m using. I’m sure I’m not the only one out there using branching and wanting CI support for it.

The CI purists will say that I’m doing it wrong and that I should either drop the feature branches or live with the problems. But does it have to be that way?

I’d really like to see a Jenkins plugin that will understand multiple branches, multiple Maven repositories and just deal with the problem. If CI necessarily implies a single branch, maybe we can change the name of these servers to Continuous Integration and Build Automation servers and we could throw out the implication?

We’ll have to wait and see…

Materials used

  • http://profiles.google.com/james.strachan James Strachan

    FWIW we use a little bit of Scala code to auto-generate most of our CI build configurations; so it’s really easy to add new projects, branches, mvn profiles and so forth using a simple little DSL describing the build configurations…

    https://github.com/fusesource/hudsongen

  • http://twitter.com/toomasr Toomas Römer

    Okay, so there is some shared tooling. This is good. After I’ve got all the replies and stuff I’ll check it out. Would not want to repeat others’ work. Of course I’m a little bit scared of Scala, but it’s a great excuse to learn it.

  • Timo Meinen

    For me, it seems that the problem is that you separated your build into one Maven part and one CI-Server part. Try to configure all the rules that describe your build within Maven (or any other build tool you like).

    Then it’s as simple as pressing the “Copy Job” button in your CI server and adjusting the VCS root at the copy to point to your branch.

    In our project, everything is wrapped by Maven. At worst you can run your shell scripts via exec-maven-plugin or maven-antrun-plugin. In the end, you get a build which is independent of the CI server infrastructure.

    If for some reason, you have to depend on the CI server or server configuration, create Maven profiles and control them by environment variables.

    In any case, the only configuration needed in the CI Server is the VCS root.

  • http://twitter.com/toomasr Toomas Römer

    Okay, one field less to manage in the duplicated CI jobs, but managing those is still a PITA because you need to write the tooling?

  • Travis Laborde

    I face the same problem. We do “branch per feature” and even “branch per bug.” But Hudson has to be “pointed” at some “known” location in Mercurial to monitor for changes. We are experimenting with having Hudson prompt for that location. We’ve built our MSBuild scripts to “not care” what the path is, so it seems like it’s going to work well. So any dev who wants their branch to run through the CI process can just run the job and paste in their branch URL and be done.

    The remaining problem is really of deployment. We do mostly websites, so we’d want to deploy to branchname.whatever.com. Setting that up is where we are currently stuck.

  • http://twitter.com/reinra Rein

    Which VCS are you using? Just wondering because of MSBuild.

    I guess you’re on the Microsoft stack (IIS), but with Apache you can generate a new virtual host file, tell Apache to reload the configuration and voila, you will have yourbranch.domain.com. YMMV.

  • http://profiles.google.com/adam.zochowski Adam Żochowski

    or just set up Apache with:

    VirtualDocumentRoot /www/hosts/%0/docs

    where %0 tells Apache to automatically handle new domains. If someone hits the server with test-1.example.com, then you’d better have /www/hosts/test-1.example.com/docs. This way, there is no need to touch Apache ever. Just ensure your folders match your DNS entries.

  • Timo Meinen

    What do you mean by tooling? Have you ever tried TeamCity? They separated VCS configuration (branches etc.) from build configuration. So you can always change the branch for every build. It’s very simple and it frees you from copying build configurations.

  • http://twitter.com/toomasr Toomas Römer

    Have not tried TeamCity, will check out their stuff. By tooling I meant that pressing copy and managing the VCS root is manual, and to automate that you need tooling.

  • Travis Laborde

    using mercurial. and yes, IIS :)

  • Phil Martin

    We use TestTrack for issue management, CruiseControl.Net for our build automation and test running.

    For each branch-by-feature/bug, we have an issue in TestTrack (we call it a code change).

    We then have one project defined in CruiseControl.Net, that runs every 5 minutes. It queries TestTrack asking for what code changes are marked “ready to test”. It then generates a cc.net project for each of those, updating the CC.Net configuration file. 

    It’s not perfect but for us it’s an excellent balance of continuous integration, but also branch-by-feature. It lets us work collaboratively in the same branch when we want to, but also have totally separate branches when we need to as well.

    – Phil

  • Tom Howard

    There may be a suitable option for you that I’ve just stumbled across. It goes by the old “Feature Branching” name, but it would be better to call it “Continuously Integrated Feature Branching”. The basics (from the little I understand) is that you have an integration branch that all your feature branches are merged to, which is what you run your tests against.

    I don’t understand it 100% (e.g., why it works with DVCS, but not Subversion or similar), but it’s interesting enough to warrant further investigation.

    Here are some links:

    http://continuousdelivery.com/2011/07/on-dvcs-continuous-integration-and-feature-branches/

    https://plus.google.com/109096274754593704906/posts/R4qkeyRadLR

    http://jamesmckay.net/2011/07/why-does-martin-fowler-not-understand-feature-branches/

    http://jamesmckay.net/2011/07/feature-branches-versus-continuous-integration/

    http://www.slideshare.net/wakaleo/jenkins-from-continuous-integration-to-continuous-delivery

  • http://www.zeroturnaround.com/ Toomas Römer

    I happened to see a presentation at Devoxx (it was either by David Farley or John Smart) that if you do feature branches and you want to do continuous integration then you should have a separate branch where you actually integrate. The main idea was that for continuous integration you need an integration point and with feature branches most of the time you postpone the integration so the solution is to have a special integration branch. I believed that story and once I need to tackle the problem again I will create that special branch.

  • Nathan Grunzweig

    Hello Toomas,

    i’m encountering the exact problems you describe. in the company where i work we develop a very large project that is abundant in legacy code. 150+ devs on misc areas of code.

    because of the size and complexity of the product we have several branches that have nothing to do with features.

    each team works on their own branch because everyone working on trunk is impossible – it will continuously break or become unstable.

    because of good separation done recently before we switched to the branch model, the teams don’t touch the same areas of code frequently, and it’s possible to merge all of the different branches into trunk without getting too much of a headache.

    the problem is that we need ci coverage per branch and only teams where one dev volunteers to take it upon themselves to see to it – have a working ci that is synced with the main trunk ci. it’s costly and problematic, as you described.

    i’ve been given a task to “solve this”, because i’m that volunteering dev at my team.
    currently i’m doing research, and your page and research which i see in many places over the net is very useful to me. thank you for publishing.

    i’m considering writing a jenkins plugin, if no alternative presents itself.

    any comments would be welcome,
    NG

  • http://www.zeroturnaround.com/ Toomas Römer

    Hi Nathan,

    I visited Devoxx this year and went to a couple of presentations ( http://www.devoxx.com/display/DV11/John+Smart and http://www.devoxx.com/display/DV11/Continuous+Delivery ) that had either good content or great discussions afterwards.

    A couple of takeaways:

    * If you do multiple branches and don’t want to merge into trunk for some reason, then have an integration branch just for the integrations. Then you get the CI benefits and also keep trunk clean. Or do it the other way around: have trunk for CI and another branch for something more stable.

    * Managing stuff from Jenkins is difficult (this was my realization not from speakers). I saw their configurations, pages and pages of endless checkboxes and other input fields which all were very important :). This means that once you start duplicating jobs and managing them then it will be a nightmare. Even greater than I said in the post.

    At ZeroTurnaround we use branches very seldom. Whenever somebody wants, let’s say, to test with some app server that is configured on our test servers but not on his machine, he makes a copy of that job and manages it until he no longer needs it.

    Also the DEV and STABLE stuff is working for us. We constantly break stuff in DEV but at the same time we fix those! It is a great way to get developer discussions going also (or sometimes yelling).

    T

  • Alessandro Giannone

    Hey Toomas,

    We’re currently using Atlassian’s Bamboo for CI and it supports automated detection of feature branches. Unfortunately it’s not OSS and does have a cost associated, but it is very cool. It also has some handy features related to AWS (convenient for us as we run on AWS).

    Ale