ZeroTurnaround Blog

How ZeroTurnaround Releases Twice a Week with 99.99% Uptime

At ZeroTurnaround we don’t just make products. We run a business that relies on essential services such as zeroturnaround.com, my.jrebel.com and buy.jrebel.com, plus multiple internal apps and backend services, all supported by our internal IT organization. This adds up to 32 applications running on 17 servers in 10 separate deployment groups, counting both production and staging environments.

Our Infrastructure

Our infrastructure includes Apache servers running PHP apps, and Tomcat & Jetty servers running Java apps and Python services. We do builds and unit testing with Jenkins, provision our servers with Chef and Ansible, and use Nagios for monitoring.

Our 24/7 Application Uptime Challenge

Our three key apps (zeroturnaround.com, buy.zeroturnaround.com and my.jrebel.com) need to be available 24/7. Our billing and other supporting apps have to be operational during business hours, which span 7 time zones. At the same time, we have a constant stream of updates from the marketing, sales operations and product teams that need to be deployed promptly to be valuable to the business.

How We Do It

We decided to build our release process around Jenkins and LiveRebel. It looks like this:

  • Jenkins builds the new application version and runs unit tests
  • If all tests pass, Jenkins triggers LiveRebel to deploy the new version to staging
  • Jenkins then triggers automated integration tests on the new version on staging
  • Once automated tests succeed, manual testing follows
  • Once all tests pass, the new version is released into production using LiveRebel through a manual trigger

This process lets us release twice a week at 10am Estonian time, updating 2-5 services each time.
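The gated steps above can be sketched as a list of stages that stops at the first failure. This is only an illustration of the flow, not our actual Jenkins or LiveRebel configuration: `run_pipeline`, `example_stages` and the stage checks are hypothetical stand-ins, and a real setup would call the Jenkins and LiveRebel tooling where the lambdas are.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Each stage is a (name, check) pair; check returns True on success.
Stage = Tuple[str, Callable[[], bool]]


@dataclass
class PipelineResult:
    released: bool
    completed: List[str] = field(default_factory=list)
    failed_stage: str = ""


def run_pipeline(stages: List[Stage]) -> PipelineResult:
    """Run stages in order and stop at the first one that fails."""
    result = PipelineResult(released=False)
    for name, check in stages:
        if not check():
            result.failed_stage = name
            return result
        result.completed.append(name)
    result.released = True
    return result


def example_stages(manual_approval: bool, integration_ok: bool = True) -> List[Stage]:
    # Hypothetical stand-ins for the real build, deploy and test steps.
    return [
        ("build and unit tests", lambda: True),
        ("deploy to staging", lambda: True),
        ("integration tests on staging", lambda: integration_ok),
        ("manual testing", lambda: True),
        ("manual release trigger", lambda: manual_approval),
    ]
```

Calling `run_pipeline(example_stages(manual_approval=True))` returns a result with `released` set to `True`. If any earlier check fails, later stages never run and `failed_stage` names the gate that stopped the release.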

Thanks to LiveRebel, we need no special release preparation and no maintenance windows. We’ve had 6 failed releases over the last 6 months, and LiveRebel successfully rolled back the failures, preventing any adverse user impact.
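To make the idea of a rolling update with automatic rollback concrete, here is a minimal sketch. It is not how LiveRebel is implemented: the `rolling_update` function, the in-memory `servers` mapping and the `health_check` callback are all assumptions made for illustration.

```python
from typing import Callable, Dict


def rolling_update(
    servers: Dict[str, str],
    new_version: str,
    health_check: Callable[[str, str], bool],
) -> bool:
    """Update servers one at a time; restore every touched server on failure.

    `servers` maps a server name to its deployed version and is modified in place.
    Returns True if all servers end up on `new_version`, False after a rollback.
    """
    previous: Dict[str, str] = {}
    for name in list(servers):
        previous[name] = servers[name]
        servers[name] = new_version
        if not health_check(name, new_version):
            # Roll back everything updated so far, including the failing server.
            for touched, old_version in previous.items():
                servers[touched] = old_version
            return False
    return True
```

Because only one server changes at a time, the rest of the group keeps serving traffic, and a failed health check leaves every server on the version it started with.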

Here are the things our operations team likes most about LiveRebel:

  • Console & Overview: LiveRebel provides the operations team with a simple view into which versions are deployed across which environments.
  • Automation: Releases are a non-event: all you need to do is press the “Release” button and choose a rollout strategy.
  • Configuration management: Applications include configuration templates whose properties are substituted during the release. Every LiveRebel server group has an associated Git repository that contains the configuration property files.
  • Rollback: Failed releases are rolled back automatically. Seriously.
  • Rolling updates: No downtime, no maintenance windows, no fuss.
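The configuration management point can be illustrated with a small sketch of template substitution per server group. The `${...}` placeholder syntax, the property names and the host names below are hypothetical and chosen for the example; they are not LiveRebel's actual template format.

```python
from string import Template


def load_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def render_config(template_text: str, properties: dict) -> str:
    # substitute() raises KeyError if the template names a missing property,
    # so an incomplete property file fails loudly instead of shipping.
    return Template(template_text).substitute(properties)


# Hypothetical template shipped inside the application archive.
TEMPLATE = "db.url=jdbc:postgresql://${db_host}:${db_port}/app\n"

# Hypothetical property files, one per server group.
STAGING = "# staging group\ndb_host=staging-db.internal\ndb_port=5432\n"
PRODUCTION = "db_host=prod-db.internal\ndb_port=5432\n"

staging_config = render_config(TEMPLATE, load_properties(STAGING))
production_config = render_config(TEMPLATE, load_properties(PRODUCTION))
```

The same application archive then produces a different rendered file for each server group, with the differences living only in the property files.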

Find out more

We recorded a 45-minute webinar that walks through the ZeroTurnaround infrastructure and shows how we use LiveRebel to release twice a week and still achieve 99.99% uptime. Watch it to find out how we do it.

  • Jim

    Why do you roll out during the business day in Europe? We have servers for each geographic region, and roll out to each when it is overnight there. This means fewer users likely to be affected by a deployment.

  • Jevgeni Kabanov

    Why increase infrastructure and rollout complexity when our tooling provides us with zero downtime and failure recovery? Why have the team be up overnight when we can press the button any time during the day?

  • Ari

    Hi, thanks for the post. What tools do you like to use for integration or end-to-end automatic testing?

  • Joeri Hendrickx

    How do you handle database schema updates without any downtime?

  • foo

    LiveRebel handles the JVM/processes, so that is kinda out of scope for the product. There are many strategies for handling (relational/RDBMS) DB-changes.

    For instance, I find that the two (or three) phase expand/contract pattern, together with a simple self-crafted versioning scheme (such as ActiveRecord, DbDeploy, Liquibase, etc.), works fine in about 95% of cases. Sometimes, of course, you just have to take a DB offline to ensure consistency.

  • Arhus

    Are there any sorts of API changes that you cannot release into production using LiveRebel?