
DevOps Productivity Report Preview – How Much Production Failures Cost You

About a month ago, we launched our IT Productivity survey (take it if you haven’t already done so!) to gain insight into the day-to-day lives of IT folks, their processes and the tools of choice that help them improve productivity. Thus far, we have some interesting findings and I’d like to share one with you. But you’d better sit down for this, because once you actually figure out how much money you lose EVERY TIME you have app failures in production, you’ll get weak in the knees!

How often are recoveries needed due to failures in production?

The first thing to ask is “How often?” I posed this question to get a sense of how frequently teams need to recover apps from failures in production. After all, failures impact the business as a whole, potentially taking down customer-facing sites and business-critical systems, and hurting brand and revenue.

We received a range of responses to this question. The mean number of recoveries per month is 2.15, with a standard deviation of 3.73 and a median of 1. A median well below the mean, together with a standard deviation larger than the mean, suggests that most teams see few critical failures, while a select few in the sample have frequent failures and pull the average up.
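To see how a handful of frequent-failure teams can pull the mean above the median, here is a minimal Python sketch. The sample values are made up purely for illustration and are NOT the survey responses; only the general shape (median of 1, mean around 2) is meant to resemble the figures above.

```python
from statistics import mean, median, pstdev

# Hypothetical recoveries-per-month answers from ten teams.
# These numbers are invented for illustration, not taken from the survey.
recoveries_per_month = [0, 0, 1, 1, 1, 1, 1, 2, 3, 12]

avg = mean(recoveries_per_month)       # pulled up by the one team with 12
mid = median(recoveries_per_month)     # the "typical" team
spread = pstdev(recoveries_per_month)  # population standard deviation

# Drop the single outlier to see how much it moves the mean.
without_outlier = [r for r in recoveries_per_month if r < 10]
avg_without = mean(without_outlier)

print(f"mean={avg:.2f} median={mid} stdev={spread:.2f}")
print(f"mean without the outlier={avg_without:.2f}")
```

In this made-up sample, one team reporting 12 recoveries is enough to roughly double the mean while the median stays at 1, which is the same pattern the survey numbers hint at.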

The cost of failures and slow recoveries

Around 52% of production failures take more than 30 minutes to recover from, and half of those take 60 minutes or longer! In other words, in over half of the failures reported in the sample, the app, site or service is down for more than 30 minutes each time it fails in production. So if the site is an online store that averages “X” dollars per hour, or an ERP app that facilitates “Y” amount of business, you can easily calculate the cost of failure for your business: multiply the hourly figure by the hours spent down. And this does not include the cost of war rooms, overtime or lost employee morale.
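The back-of-the-envelope calculation above can be sketched in a few lines of Python. The function name and the revenue figure are hypothetical, and pairing the survey's average of 2.15 recoveries per month with a flat 30-minute recovery is an assumption made only to have something to plug in; swap in your own numbers.

```python
def monthly_downtime_cost(failures_per_month, minutes_per_recovery, revenue_per_hour):
    """Rough revenue lost to downtime per month.

    Ignores war rooms, overtime and morale, just like the estimate in the text.
    """
    hours_down = failures_per_month * minutes_per_recovery / 60
    return hours_down * revenue_per_hour


# Assumed inputs: 2.15 recoveries/month (the survey average), 30 minutes each,
# and a hypothetical store that earns 10,000 dollars per hour.
cost = monthly_downtime_cost(
    failures_per_month=2.15,
    minutes_per_recovery=30,
    revenue_per_hour=10_000,
)
print(f"Estimated monthly cost of downtime: ${cost:,.0f}")
```

With those assumed inputs the estimate comes to roughly 10,750 dollars a month, and it doubles if recoveries take 60 minutes instead of 30.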


Recovery processes are important, but what about testing?

Based on the sample, only 36% of recovery processes are tested! Fire drills and dry runs are critical to make sure that your recovery processes and scripts deliver results when your production environment really needs them. Or else, use LiveRebel :).

So where do you stand?


Looking for more?

Look out for our “DevOps Productivity Report” coming on April 9. We’ll showcase many more insights just like this for your reading pleasure. In the meantime, share your pain and experiences with bad or inefficient processes in your development-to-deployment pipeline in our survey. It takes about 10 minutes to finish, but I bet you can do it in 6!
