Surprise! Most Production Update processes are done manually…
In Part 4 of our preliminary coverage of our ongoing Developer Productivity Survey, we take a look at respondents’ Deployment Pipeline. This time, we put a special focus on the degree to which these deployment-to-production processes are automated or manual.
Now, a lot of developers are probably sitting here wondering why they should continue reading this unbelievable coverage of Deployment Pipeline automation – isn’t that for the Ops folks to worry about?
Well yes, but there is a good reason why developers should also care about how their Ops colleagues are running their production update processes. As deploying apps to production becomes more automated and transactional, developers will be able to see their code go into production faster, ideally in a setting where code can be deployed automatically because fixing or reverting a bad release is easy as heck.
But wouldn’t you know it, the majority of those surveyed told us that their processes for deploying applications to the production environment are generally NOT AUTOMATED to any significant degree. Take a look:
(Note: Responses have been normalized to include only selections of Manual or Automated. For these 7 segments, anywhere from 13-25% of original responses were either Missing from the process or Unknown.)
Over 70% of respondents thankfully automate their integration builds, but after that things are looking pretty manual-ish around the rest of the pipeline. Approximately 25% of respondents have implemented automatic production application and database updates, but 3/4 of businesses surveyed still do this manually. If we take a look at automatic rollbacks in case of failures, just over 12% of ops teams have this automated.
I wanted to get some feedback on this directly from the horse’s mouth, so to speak, so I’ve asked our fearless leader Jevgeni Kabanov (ZeroTurnaround Founder and CEO) to provide his own take on this. Check out his responses if you have a minute.
Extended Interview with ZT Chief Executive Geek, Jevgeni Kabanov
ZT: Before we start, I thought I should warn our readers that you have little tolerance for non-automated, non-instant, non-transparent, non-transactional, risky deployment activities. They shouldn’t take any of what you’re going to say personally, right?
JK: Absolutely. I downright loathe inefficient production deployment processes that get implemented at 3 AM by over-worked staff – this is why we introduced LiveRebel in the first place. But I’m here to help, not criticize :-)
ZT: Cool, so let’s start with where Operations teams actually DO tend to automate most – Integration Builds. Does this comparatively high level of automation make sense? What kind of tools are ops teams using to automate integration builds?
JK: Well, it’s actually amazing that the numbers are so high and I need to tip my hat to the guys who pushed through the idea of automated builds. Just a few years ago, the numbers wouldn’t have been anywhere near as high. Automating the builds (and hopefully some of the tests) makes the development process considerably more predictable and helps release better software on time and on budget.
ZT: What would you consider a well-automated staging deployment process? Could you explain this a little for our readers?
JK: Really, every release candidate that passes build & automated tests should go right into the staging environment, no questions asked. If it doesn’t, then you’re doing it wrong! :) The reason being that by getting continuous feedback you can detect and correct failures early and in small increments, rather than in big batches that can break multiple things at once.
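To make the "no questions asked" rule concrete, here is a minimal sketch of such a gate in Python. It is our own illustration, not something from the interview or from any particular tool: `ReleaseCandidate`, `promote_to_staging` and the `deploy` callback are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ReleaseCandidate:
    """Hypothetical record of one build coming out of the CI server."""
    version: str
    build_ok: bool
    tests_ok: bool


def promote_to_staging(
    candidates: List[ReleaseCandidate],
    deploy: Callable[[ReleaseCandidate], None],
) -> List[str]:
    """Deploy every candidate that passed build and tests; return their versions."""
    promoted = []
    for rc in candidates:
        if rc.build_ok and rc.tests_ok:
            deploy(rc)  # assumption: deploy() pushes to staging, no manual approval
            promoted.append(rc.version)
    return promoted
```

The only condition for promotion is the outcome of the build and the automated tests, so every green candidate reaches staging without anyone having to remember to push it.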
ZT: What are the most important aspects of a successful rollout that Ops teams face during production updates?
JK: Where do I begin! One of the big issues is that the systems we use are so complex and are almost always in use. In real life even Formula 1 cars stop to change tires, but the Ops teams have to make changes to a running application, so it’s no wonder that things can break. And when they do, everything may not necessarily be recoverable, which is why most companies do offline updates. Doing a manual online rollout is faster and less painful to the customer experience, but it is less predictable and less safe. Going fully offline at 3 AM to update your application is safe, but hurtful to your operations teams and affects the global customer experience. Someone is online somewhere, right?
I’m a huge proponent of automation in the production update process (why we created LiveRebel in the first place, wink, wink). I would have everything completely automated, all aspects of the deployment pipeline. But why do we want everything to be automated?
Because it’s completely predictable. Manual processes tend to introduce a natural amount of human variation, which is why automation is so nice – it’s by definition a repeatable process. Good production update automation should also be transactional and reversible, so that the system is always in a consistent state and can be brought back to a checkpoint from before the breaking change was applied. And if you add the online component into the mix, it becomes almost inevitable that you have updates running continuously during the day, so that the whole staff is available if something goes wrong and fixes are needed.
One of our LiveRebel customers delivers over 2000 updates a month in such an automated fashion, and we couldn’t be happier. Of course, most others choose to batch the commits together and update a few times a week, but what exactly is keeping operations teams from trying to build the kind of acceptance tests that would give them the confidence needed to push changes to production right away?
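For readers who want to picture what "transactional and reversible" means in practice, here is a rough Python sketch. It is our own illustration and not how LiveRebel or any other product is implemented. It assumes every update step ships with a matching undo action, and that a step which fails leaves nothing behind that needs cleaning up; `Step` and `run_update` are hypothetical names.

```python
from typing import Callable, List, Tuple

# Hypothetical shape of one update step: (name, apply, undo).
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_update(steps: List[Step]) -> bool:
    """Apply steps in order; on any failure undo the applied ones in reverse.

    Returns True if every step succeeded, False if the update was rolled back.
    """
    applied: List[Step] = []
    for step in steps:
        _name, apply, _undo = step
        try:
            apply()
        except Exception:
            # Assumption: the failed step made no partial changes of its own,
            # so undoing the earlier steps restores the starting checkpoint.
            for _, _, undo_prev in reversed(applied):
                undo_prev()
            return False
        applied.append(step)
    return True
```

Either all steps are applied or none of them are, which is the property that keeps the system in a consistent state even when an update goes wrong halfway through.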
ZT: Production Deployment Decision is the least automated process, according to the survey. Why is that?
JK: For some people, this is a bit controversial, and gets into the Continuous Delivery vs. Continuous Deployment space. Continuous Deployment adherents believe that the entire deployment pipeline, including the decision to roll out new versions, should be completely automated to the very end. Continuous Delivery supporters are more in favor of this being a business decision, made by a project owner. Indeed, it could be beneficial to have some manual checkpoints along an otherwise completely automated process, but again, I support full automation whenever possible.
I guess the biggest question is what has the potential to affect users more? For example, in the case of a hot patch or partial rollout, I would favor automation in case something breaks. Then the rollback would be automated as well, and the errors would have very little effect on users.
ZT: That kind of brings me to the next point – Production Update Rollbacks. In order to mitigate risk during deployments, it seems like teams should have a pretty failsafe method of instantly, or at least automatically, rolling back in case of errors or bugs, right? Why do nearly 90% of companies not have this automated?
JK: I would definitely agree that a “panic button” for initiating an automated, instant rollback is vital to Continuous Deployment/Delivery fans, even mission critical for many customers. But the limitations for automating this are simple – if you haven’t implemented automation for application/DB/environment updates, then the rollback cannot be automated.
Why not? A typical rollback is most likely done by a uni-directional script, and in most organizations recovery is completely manual, since the procedures for unbreaking something are quite complex. So if only 1/4 of those surveyed have automated application and DB updates, then only a subsegment of these folks will be able to implement automated rollbacks.
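The dependency Jevgeni describes can be sketched in a few lines of Python. This is our own hedged illustration, not a description of any real tool: `Deployer` and its `activate` hook are hypothetical. The point is that the "panic button" reuses exactly the same automated path as a normal update, which is why it cannot exist until that path does.

```python
from typing import Callable, List


class Deployer:
    """Tracks deployed versions so a rollback can reuse the same automated path."""

    def __init__(self, activate: Callable[[str], None]) -> None:
        self._activate = activate      # hypothetical hook that switches the live version
        self._history: List[str] = []  # versions in the order they went live

    def deploy(self, version: str) -> None:
        self._activate(version)
        self._history.append(version)

    def rollback(self) -> str:
        """Re-activate the previously deployed version and return it."""
        if len(self._history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self._history.pop()            # discard the failing version
        previous = self._history[-1]
        self._activate(previous)
        return previous
```

If `activate` were a manual procedure, `rollback` would be one too. A real rollback also has to reverse database and environment changes, which this sketch leaves out.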
ZT: Good stuff, Jevgeni. Would you care to make any final points?
JK: TL;DR… As a final note, I cannot stress enough the importance of automation to guarantee predictability and transparency as major tenets of the production update process.
And I hope you don’t mind, but I’m telling everyone about our latest release of LiveRebel 2.0 – a tool that tries to solve the pains that manual and offline update processes give to Ops teams. We built LiveRebel in order to automate the production update process so that you can implement Continuous Delivery/Deployment, mitigate risk, run 100% transparent & reversible application updates and also to serve your customers better. I’m available to talk more about this, so drop me a line to firstname.lastname@example.org – thanks!