I saw Eric Ries speak at the Google Kirkland office last week. Most of his talk was on Lean Startups, but he touched on the process of continuous deployment. I'm a huge fan of continuous deployment, but when I hear people talk about it, including Eric, the process they describe feels unapproachable for most startups, and I think that's a huge shame.
[Image: the Lean Startup OODA loop]
If you are developing a web service, you should be working toward continuous deployment right now. It can drastically increase your speed through this loop, and alter your idea of how quickly you can deliver changes to a live production site.
If you are an early stage web-based startup, and especially if you are pre-launch, you do not need an IMVU-like setup to succeed at this. Even trying to do what they do will just get in your way. What you need is an MVP for continuous deployment. Over time you can grow a system that is appropriate to your technology, service and customers.
Here are the steps I recommend:
- Add basic error notifications and monitoring
- Get to zero downtime deployments
- Create an automated test (continuous integration) server
- Add code to deploy when your tests run green
- Develop a culture around automated tests
Basic error notifications and monitoring
You can use existing tools like Hoptoad for error notifications and Pingdom for knowing that your site is alive and running. I'm also a huge fan of New Relic RPM for performance analysis. Their free pricing tier makes this an absolute no-brainer.
If you can't find existing tools that will work for you, then write your own. I've been in companies that did this, and it wasn't that hard. Start simple (error notification, for instance) and work up from there.
Zero downtime deployments
The ability to deploy code changes without taking your site down will completely transform your relationship to deployments.
For low- to medium-traffic Rails sites, this is unbelievably easy. We discovered that we could just do a "cap deploy" and nobody noticed the restart.
If you have database changes, make them backwards compatible: add columns, deploy code, and then a week later remove the unused columns. Never rename a column in place.
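As a sketch, a backwards-compatible "rename" becomes two Rails migrations run a week apart (the table and column names here are illustrative). In between, you deploy code that writes to both columns and reads only from the new one, so every running version of the app stays compatible with the live schema:

```ruby
# Step 1: add the new column alongside the old one. Deploy this first,
# together with code that writes both columns and reads :full_name.
class AddFullNameToUsers < ActiveRecord::Migration[7.0]
  def change
    add_column :users, :full_name, :string
  end
end

# Step 2, a week later, once nothing reads :name anymore: drop it.
class RemoveNameFromUsers < ActiveRecord::Migration[7.0]
  def change
    remove_column :users, :name, :string
  end
end
```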
If all you do is stop here, with basic error notifications, monitoring, and zero downtime deployments, you will still reap huge benefits.
Get an automated test server
Assuming your automated tests can be run via a script at the command line, your basic automated test server has only a few simple responsibilities:
- Check your source code control system to see if there is new code
- Pull that code into a folder
- Run your command line script to run your tests
- Notify your team if the tests fail
You don't have to use an existing tool. You can write your own script to poll your git repo (or whatever you use) for changes, and then pull them, run your tests, and send email on failure. It is just not that hard.
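A minimal poller can be a page of Ruby. This sketch assumes a git checkout and a command-line test runner; the repo path, branch name, and notification step are all placeholders for whatever you use. The decision logic is kept in a pure function so you can see (and test) exactly what it does:

```ruby
# Returns the SHA at the tip of the remote branch, or nil on failure.
# Assumes a git checkout at repo_dir with an "origin" remote.
def remote_sha(repo_dir, branch = "master")
  out = `git -C #{repo_dir} ls-remote origin #{branch}`
  out.split.first
end

# One polling cycle. If the remote SHA changed, run the tests (via the
# injectable `runner`, e.g. pull + `rake test`) and report green or red.
# Returns the new last-seen SHA plus a status symbol.
def poll_once(last_sha, current_sha, runner)
  return [last_sha, :no_change] if current_sha.nil? || current_sha == last_sha
  ok = runner.call
  [current_sha, ok ? :green : :red]
end
```

Wrap `poll_once` in a loop with a sleep, and send email (or whatever your team reads) whenever the status comes back `:red`, and you have a working test server.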
Shoot for the simplest possible tool. It helps a lot if you can really understand everything that is happening in your automated test environment.
Automate your deployments
This really just means having your test server do a deployment when your tests run green. You can either hack something together, or if you are using CruiseControl.rb you can use the cap_deployer plugin.
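If you hack it yourself, the core of deploy-on-green is a few lines. In this sketch the test and deploy commands are injectable shell strings (in real life something like `rake test` and `cap deploy`, but those names are just my assumption about your setup):

```ruby
# Run the test command; deploy only if it exits successfully.
# Returns a symbol describing what happened.
def deploy_if_green(test_cmd, deploy_cmd)
  return :tests_failed unless system(test_cmd)
  system(deploy_cmd) ? :deployed : :deploy_failed
end
```

Call this from the end of your test server's polling cycle and a green run turns straight into a live deployment.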
Develop a culture around automated tests
If you don't already have a testing culture, the key areas that I think will help the most are these:
Measure test coverage, and make sure it trends up
Run test coverage weekly, add the new number to the bottom of a historical list, and email that to your team. In general, you get what you measure, and this by itself is probably enough to move the needle in the right direction.
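The weekly email can be this simple. How you get the coverage percentage (SimpleCov, rcov, whatever your stack uses) and where the history lives are up to you; this sketch just keeps the numbers in a list and reports the trend:

```ruby
# Append this week's coverage number to the running history.
def record_coverage(history, pct)
  history + [pct]
end

# :up / :down / :flat based on the last two data points.
def trend(history)
  return :not_enough_data if history.size < 2
  case history.last <=> history[-2]
  when 1  then :up
  when -1 then :down
  else :flat
  end
end

# Plain-text body for the weekly team email.
def coverage_email(history)
  "Test coverage history (%):\n" +
    history.map { |p| "  #{p}" }.join("\n") +
    "\nTrend: #{trend(history)}"
end
```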
Write tests for any bug that is made live
A bug goes live, you find it, you write a test. If you have the time to do it, you should write the test first, verify that you can reproduce the bug, then fix and deploy. If you don't have time, then fix, deploy, write tests, and go from there.
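Here's what that looks like in practice, using a made-up bug: say a nil quantity in the cart blew up the order total in production. The test below is the one you'd write first (it fails against the buggy code), and the `|| 0` is the fix:

```ruby
require "minitest/autorun"

# Hypothetical example. The original code did i[:qty] * i[:price] and
# crashed on nil quantities; treating nil as zero is the fix.
def cart_total(items)
  items.sum { |i| (i[:qty] || 0) * i[:price] }
end

class CartTotalBugTest < Minitest::Test
  def test_nil_quantity_no_longer_breaks_the_total
    items = [{ qty: 2, price: 5 }, { qty: nil, price: 3 }]
    assert_equal 10, cart_total(items)
  end
end
```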
Write tests for code that matters most to your business
If you aren't going to have a testing culture and a big body of tests, then at least write tests for the stuff that would be outright embarrassing if it was screwed up.
Yep, there are risks
There are lots of ways this can go wrong. But when you are small, and particularly when you are pre-launch, those issues matter less than being able to move fast. You have to decide what tradeoffs make sense for you, but my experience has been that most pre-launch web services are way too worried about deploying a bug, when they should be worried about speed through the loop.
You should read these two great articles: