Last time around, we discussed how to create an AMI for every deployment: a crucial step in enabling you to leverage deployment immutability. This time around we'll learn what it takes to automate autoscaled, immutable deployments with zero downtime, sitting behind a load balancer and using Route 53 for DNS.
Given the immutable images we're now able to build thanks to the last article, we'll lay out an architecture on Amazon Web Services that's able to take full advantage of them.
Our journey starts at Route 53, Amazon's DNS service. You'll set up the initial DNS configuration by hand; I'll leave the `TXT` records up to you. The feature that interests me in Route 53 is the ability to create `ALIAS` record sets. These special record sets allow you to bind a domain name to an Elastic Load Balancer (ELB), which can then distribute traffic among your Elastic Compute Cloud (EC2) server instances.
The eye candy in `ALIAS` record sets is that you can use them at the naked, or apex, domain level (e.g. `ponyfoo.com`), and not just on sub-domains (e.g. `blog.ponyfoo.com`). Our scripts will set up these `ALIAS` records on our behalf whenever we want to deploy to a new environment, which is also a very enticing feature.
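To give you an idea of what that automation does under the hood, here's a minimal sketch of upserting an `ALIAS` record set with the AWS CLI. Every id and name below is a placeholder you'd swap for your own values; our scripts do the equivalent for us.

```bash
# Hypothetical example: the hosted zone ids, domain, and ELB DNS name are placeholders.
cat > alias-upsert.json <<'EOF'
{
  "Comment": "Point dev01.example.com at the environment's ELB",
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "dev01.example.com.",
      "Type": "A",
      "AliasTarget": {
        "HostedZoneId": "Z35SXDOTRQ7X7K",
        "DNSName": "dev01-elb-1234567890.us-east-1.elb.amazonaws.com",
        "EvaluateTargetHealth": false
      }
    }
  }]
}
EOF

aws route53 change-resource-record-sets \
  --hosted-zone-id Z1D633PJN98FT9 \
  --change-batch file://alias-upsert.json
```

Note that the `HostedZoneId` inside `AliasTarget` refers to the ELB's hosted zone, not your own; the `--hosted-zone-id` flag is the one that points at your domain's zone.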
When you finish reading this article, you'll be able to run the commands below and have your application listening at `dev01.example.com` once they complete, without the need for any manual steps.
```bash
NODE_ENV=dev01 npm run setup
NODE_ENV=dev01 npm run deploy
```
We don't use ELB merely because of its ability to route traffic from naked domains, but also because it lets us place many web servers behind a single domain, making our application more robust, better able to handle load, and highly available.
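If you're curious what standing one of these up looks like, here's a rough sketch using the AWS CLI. The load balancer name, ports, zones, and health check target are made-up values, and our scripts end up doing the equivalent on our behalf.

```bash
# Hypothetical example: names, ports, zones and paths are placeholders.
aws elb create-load-balancer \
  --load-balancer-name dev01-elb \
  --listeners "Protocol=HTTP,LoadBalancerPort=80,InstanceProtocol=HTTP,InstancePort=8080" \
  --availability-zones us-east-1a us-east-1b

# Tell the ELB how to decide whether an instance is healthy.
aws elb configure-health-check \
  --load-balancer-name dev01-elb \
  --health-check "Target=HTTP:8080/,Interval=30,Timeout=5,UnhealthyThreshold=2,HealthyThreshold=2"
```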
In order to better manage the instances behind our ELB, we'll use Auto Scaling Groups (ASG). These come at no extra cost: you just pay for the EC2 instances in them. The ASG will automatically detect "unhealthy" EC2 instances, meaning instances that the ELB has deemed unreachable after pinging them with `GET` requests. Unhealthy instances are automatically disconnected from the ELB, meaning they'll no longer receive traffic. However, we'll enable connection draining so that they gracefully respond to pending requests before shutting down.
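Here's a sketch of what wiring an ASG to that ELB might look like, again with placeholder names and ids. The `ELB` health check type is what makes the group honor the load balancer's health checks, while connection draining is an attribute set on the load balancer itself.

```bash
# Enable connection draining on the ELB, giving instances up to 30 seconds
# to finish in-flight requests before they're taken out of rotation.
aws elb modify-load-balancer-attributes \
  --load-balancer-name dev01-elb \
  --load-balancer-attributes '{"ConnectionDraining":{"Enabled":true,"Timeout":30}}'

# A launch configuration pointing at our baked AMI (the AMI and security group ids are placeholders).
aws autoscaling create-launch-configuration \
  --launch-configuration-name dev01-lc-v1 \
  --image-id ami-0123456789abcdef0 \
  --instance-type t2.micro \
  --security-groups sg-0123456789abcdef0

# An ASG attached to the ELB, using the ELB's health checks to decide instance health.
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name dev01-asg-v1 \
  --launch-configuration-name dev01-lc-v1 \
  --min-size 2 --max-size 2 --desired-capacity 2 \
  --load-balancer-names dev01-elb \
  --health-check-type ELB \
  --health-check-grace-period 300 \
  --availability-zones us-east-1a us-east-1b
```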
When a new deployment is requested, we'll create a new ASG and provision it with the new `carnivore` AMI, which you may recall from our last encounter. The new ASG will spin up as many EC2 instances as desired. We'll wait until EC2 reports that every one of those instances is properly initialized and healthy. We'll then wait until ELB reports that every one of those instances is reachable via `HTTP`. This ensures that no downtime occurs during our deployment. When every new instance is reachable on ELB, we'll remove the outdated EC2 instances from ELB first, and scale the outdated ASG down to 0. This allows connection draining to kick in on the outdated instances, after which the ASG terminates them. Once all of that is over, we delete the outdated ASG.
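To make that dance a little more concrete, here's a hedged sketch of the cutover in plain AWS CLI commands. It assumes a brand new group was just created with the fresh AMI while the outdated `dev01-asg-v1` group is still serving traffic; every name and id here is illustrative.

```bash
# Wait until the ELB reports every registered instance as InService.
# (Simplified: a real script would check the newly launched instances specifically.)
until [ "$(aws elb describe-instance-health \
      --load-balancer-name dev01-elb \
      --query "length(InstanceStates[?State!='InService'])" \
      --output text)" = "0" ]; do
  sleep 10
done

# Take the outdated instances out of the ELB rotation first (instance ids are placeholders).
aws elb deregister-instances-from-load-balancer \
  --load-balancer-name dev01-elb \
  --instances i-0aaaaaaaaaaaaaaaa i-0bbbbbbbbbbbbbbbb

# Scale the outdated ASG down to zero. Connection draining lets its instances
# finish pending requests before the ASG terminates them.
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name dev01-asg-v1 \
  --min-size 0 --max-size 0 --desired-capacity 0

# Once the old instances are gone, remove the outdated ASG and its launch configuration.
aws autoscaling delete-auto-scaling-group \
  --auto-scaling-group-name dev01-asg-v1
aws autoscaling delete-launch-configuration \
  --launch-configuration-name dev01-lc-v1
```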
This approach might not be blazing fast, but I'll take a speed bump over downtime any day.
Of course, none of this would be feasible if spinning up a new instance took 15 minutes while installing software that should've been baked into an image. This is why creating those images was crucial. In a slow process such as this one, baking images saves us much appreciated startup time.