It worked in dev, it’s an ops problem now

Releasing updates should be the most boring part of the product release cycle (yes, you read that correctly - boring!).

TL;DR

The more mature your product and the larger your team, the more boring your deployments should be:

  1. Always work on the most important thing
  2. Add automation when it makes sense
  3. Define a “what’s good enough?” checklist for your product releases
  4. Only pay for school once

Do You Dread Deployments?

If your deployments induce a sense of dread, the chances are you’re doing something wrong. Successful deployments shouldn’t be about creating big bang spectacles or pushing things to the limit. They’re about small, carefully considered, predictable pushes. How boring your releases are is directly related to the question “What’s good enough?”

In software development, the concept of “good enough” varies based on two main factors: team size and product maturity. Depending on these two aspects, the level of automation and acceptable level of risk will fluctuate.

I’ll explain more about these factors shortly - but first I’m going to get my crystal ball…

Predicting the Future

You might not be able to predict winning lottery numbers. But if you consider the following questions before releasing changes to any product, you should be able to predict the future of your deployments:

How do I know if something is broken?

If you can’t answer this question now, you will certainly be able to after something breaks 😄.

Even the most basic checks and monitoring can save you a lot of stress and time. If you’re not sure where to start, consider the following:

What are the critical paths through the application?

If a user can’t perform these actions, your application may as well not exist. Checking these paths before you deploy will help you predict failures and refine your definition of “good enough”.
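To make this concrete, here’s a minimal smoke-test sketch that checks a handful of critical paths over HTTP before a deploy. The staging hostname, URLs, and expected status codes are all placeholders - swap in whatever your users actually rely on.

```python
# Minimal pre-deploy smoke test for critical paths.
# The URLs below are hypothetical placeholders - use your real critical paths.
import sys
import urllib.request

CRITICAL_PATHS = {
    "https://staging.example.com/login": 200,
    "https://staging.example.com/checkout": 200,
    "https://staging.example.com/api/health": 200,
}

def check(url: str, expected_status: int) -> bool:
    """Return True if the URL responds with the expected status code."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == expected_status
    except Exception as error:
        print(f"FAIL {url}: {error}")
        return False

if __name__ == "__main__":
    results = {url: check(url, status) for url, status in CRITICAL_PATHS.items()}
    for url, ok in results.items():
        print(f"{'OK  ' if ok else 'FAIL'} {url}")
    # A non-zero exit code lets CI block the deploy when a critical path is broken.
    sys.exit(0 if all(results.values()) else 1)
```

Run the same script after the deploy as well - the comparison tells you immediately whether the release changed anything your users care about.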

What might cause something to break?

Again, you might need a few failures to answer this question. But, ask yourself:

What are the most common causes of failure?

Data integrity is a common cause of failure. If you have a database and constantly find yourself manually adding data in order to develop or test features, consider automating the seeding of your database.

Automating the seeding of your databases allows you to develop and test against production-like environments. The closer your development environment is to production, the more likely you are to catch bugs before they reach your users, allowing you to predict and prevent failures.
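As a sketch of what that automation might look like, here’s a small seed script using Python’s standard-library sqlite3 module. The table, columns, and sample rows are made up for illustration - the same idea applies to whichever database you actually run.

```python
# A minimal, idempotent database seeding sketch (sqlite3 used for illustration).
# Table names and sample rows are invented for the example.
import sqlite3

SEED_USERS = [
    ("alice@example.com", "Alice"),
    ("bob@example.com", "Bob"),
]

def seed(db_path: str = "dev.db") -> None:
    """Create the schema (if needed) and load production-like sample data."""
    connection = sqlite3.connect(db_path)
    with connection:
        connection.execute(
            "CREATE TABLE IF NOT EXISTS users (email TEXT UNIQUE, name TEXT)"
        )
        # INSERT OR IGNORE keeps the script idempotent, so it is safe to
        # re-run every time you spin up a development or test environment.
        connection.executemany(
            "INSERT OR IGNORE INTO users (email, name) VALUES (?, ?)", SEED_USERS
        )
    connection.close()

if __name__ == "__main__":
    seed()
    print("Database seeded.")
```

Because the script is idempotent, it can run on every environment start-up without any manual steps.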

If something breaks, what do I do about it?

There are 2 parts to this:

  1. How do I fix it?
  2. How do I prevent it from happening again?

Only “pay for school” once.

Incidents and outages are uncomfortable to deal with, but they are a great opportunity to learn and improve.

When you learn from a failure, document it and figure out how to prevent it from happening again (add it to your “good enough” checklist). Also, store that knowledge somewhere your team can find it, so the fix is easier next time.
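One way to make those lessons stick is to keep the “good enough” checklist in the repository and run it before every release. The sketch below assumes Python; the pytest call is real, but manage_migrations.py and smoke_test.py are hypothetical scripts - replace the entries with the checks your past failures have taught you to care about.

```python
# A sketch of a "what's good enough?" checklist run before every release.
# Each command below is a placeholder for a check your product actually needs.
import subprocess
import sys

CHECKLIST = [
    ("Unit tests pass", ["pytest", "-q"]),
    ("Migrations apply cleanly", ["python", "manage_migrations.py", "--check"]),  # hypothetical script
    ("Critical-path smoke test passes", ["python", "smoke_test.py"]),  # hypothetical script
]

def run_checklist() -> bool:
    """Run every check, report PASS/FAIL, and return True only if all pass."""
    all_passed = True
    for description, command in CHECKLIST:
        result = subprocess.run(command)
        passed = result.returncode == 0
        print(f"{'PASS' if passed else 'FAIL'}: {description}")
        all_passed = all_passed and passed
    return all_passed

if __name__ == "__main__":
    sys.exit(0 if run_checklist() else 1)
```

Every time an incident teaches you something new, it becomes one more line in CHECKLIST rather than a lesson you pay for twice.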

How Much Automation is Enough?

Crystal ball gazing aside, the level of automation and “What’s good enough?” depends on your team size and product maturity. A solo engineer working on a side project will have a higher risk tolerance and fewer quality control checks than a team of 10 working on a mature product with thousands of users.

Focus on the most important thing. That might be automation - not features.

Be careful not to over-engineer anything. Only add automation when it makes sense and if it moves you closer to boring deployments.

Welcome to the Boring Club

Once you have your “What’s good enough?” checklist, and you know how to predict, recover from, and learn from failures, you’re on your way to boring deployments. Make features and product improvements the exciting part of your releases, not the deployment.

Keep it boring! 😉

