Guide to Fault Tolerant and Load Balanced AWS Docker Deployment on ECS
Table of Contents:
- Overview
- Setting Up Our Docker Image
- Pushing Our Docker Image to AWS ECR and Setting up the ECS Cluster
- Creating the IAM Roles
- Configuring Security Groups for Our ECS Container Instances and Application Load Balancer
- Defining the Launch Configuration and Auto Scaling Group
- Setting up the Elastic Application Load Balancer
- Making Our ECS Task Definitions and Launching Our Services
- Testing Out the Fault Tolerance
- Final Thoughts
Overview
In this guide we're going to accomplish the following:
- Create/setup an Nginx Docker image from a React application for usage on AWS
- Push up our Docker Image to AWS ECR (EC2 Container Registry)
- Setup all needed AWS IAM (Identity and Access Management) Roles for our AWS Docker Service
- Setup all needed Security Groups to control traffic to our containers and instances
- Create an ECS Cluster to manage our AWS docker containers and EC2 instances
- Create an Application Load Balancer (elastic load balancer v2) to split traffic between our containers and server instances
- Create a common Launch Configuration for use with Auto Scaling Groups to launch our ECS container instances and keep them at a healthy level
- Launch our containers into our cluster by defining ECS Task Definitions and Services
- Give a meaty overview of the different AWS services and concepts involved
While I've tried my best to keep this as accessible as possible, in order to stay on topic I'm assuming that...
a) Docker is installed on your local machine and you are at least somewhat familiar with it. I'll still give you the commands and instructions.
Install it here: https://www.docker.com/products/docker
It's also probably good to have a Docker hub account as well, which can be found here: https://hub.docker.com/
b) you have an AWS account. If you don't have one, the first 12 months get you free usage in a LOT of areas. Well worth the value.
c) the AWS Command Line Interface is installed on your local machine. Instructions on how to set it up can be found here:
http://docs.aws.amazon.com/cli/latest/userguide/installing.html
Despite the efficiency of the command line, I'm also going to keep the guide extremely visual: lots of images of the AWS console, where to look for certain values, etc. Even though ultimately much of the day-to-day deployment of AWS Docker style infrastructures will be purely automated, visual reinforcement of these concepts and steps will help us LEARN and UNDERSTAND. Once you're able to tie a service/workflow to the UI, scripting with the CLI will be that much easier.
Let's Begin.
Setting Up Our Docker Image
There are 3 ways we can set the image up here, depending on whether you worked through my previous tutorial about setting up a React app with Docker, Sass and Yarn. I'll walk through all of them, but the options are:
a) start from where the aforementioned tutorial ended
b) pull down the repository from the tutorial
c) set up your own create-react-app project
1) Create a new folder code and cd into it
(Skip this if you're starting from the last tutorial.)
2) Build our React Project
Regardless of which approach you take, make sure you're in the code directory you created.
a) If you're starting from the last tutorial, then make sure you're in the code folder we created, run
$ docker-compose run web yarn build
and move on to the next step
b) If you'd like to use what we did in the previous tutorial:
Pull down the cra-storybook-sass-yarn
repository like so:
$ git clone git@github.com:jcolemorrison/cra-storybook-sass-yarn.git .
$ git checkout cra-storybook-sass-yarn
$ git checkout -b aws-docker-deploy
You'll need a Docker account on hub.docker.com so that you can pull images from the Docker Hub.
If you have that then just:
$ docker login
and put in your info. Then run:
$ docker-compose pull
in your code directory to pull down the relevant images
$ docker-compose run web yarn
to install dependencies
$ docker-compose run web yarn build
And you're good to go
c) If you'd prefer to just use the plain ol' create-react-app:
Run:
$ yarn global add create-react-app
(... or npm install -g create-react-app)
$ create-react-app .
inside of our code directory
This will install everything we need
$ yarn build
(... or npm run build) to create the optimized build
and you're now good to go
I'll be specific, but whenever I reference the app directory for those using approaches (a) and (b) from above, just remember that those using the straight create-react-app need to do it directly in the code directory.
3) Create a new file nginx.conf in your code/app directory (or just your code directory if you're using straight CRA). Input the following:
# Auto detect and set workers to number of CPU cores
worker_processes auto;
worker_rlimit_nofile 4096;

events {
  # Match T2 1 core with the worker limit AND the open file rlimit.
  # Change based on the instance type you use, and the traffic you expect.
  # Most public sites will hit 10k+ but you'll need to understand the traffic.
  # Default is 768 if you want to remove it later.
  worker_connections 4096;
}

http {
  include /etc/nginx/mime.types;
  default_type application/octet-stream;

  server_tokens off;

  # https://developer.mozilla.org/en-US/docs/HTTP/X-Frame-Options
  add_header X-Frame-Options SAMEORIGIN;
  add_header X-Content-Type-Options nosniff;
  # https://www.owasp.org/index.php/List_of_useful_HTTP_headers
  add_header X-XSS-Protection "1; mode=block";

  # Enabling the sendfile directive eliminates the step of copying the
  # data into the buffer and enables direct copying of data from one file
  # descriptor to another.
  sendfile on;
  # Optimize the amount of data sent at once with sendfile
  tcp_nopush on;

  # http://nginx.org/en/docs/hash.html
  types_hash_max_size 2048;

  # Of course. Gzip.
  gzip on;
  gzip_disable "msie6";

  # Import our default.conf
  include /etc/nginx/conf.d/default.conf;
}
This is our base Nginx configuration file. More specifics are in the comments, but we won't spend any specific time on this since there is a TON to cover.
4) Create a new file default.conf in your code/app directory (or just your code directory if you're using straight CRA). Input the following:
server {
  listen 80 default deferred;
  root /var/www/;

  keepalive_timeout 60;

  # lightweight health check for the load balancer
  location = /health-alb {
    access_log off;
    return 200 'A-OK!';
    add_header Content-Type text/plain;
  }

  # for root, go to index
  location / {
    try_files $uri /index.html;
  }

  # gzip static files
  location ~ ^/static/ {
    gzip_static on;
    expires max;
    add_header Cache-Control public;
    add_header Last-Modified "";
    add_header ETag "";
  }

  # Don't serve hidden files
  location ~ /\. {
    return 404;
    access_log off;
    log_not_found off;
  }

  # Try to load the favicon or fall back to status code 204.
  location = /favicon.ico {
    try_files /favicon.ico =204;
    access_log off;
    log_not_found off;
  }
}
5) Create a new file Dockerfile in your code/app directory (or just your code directory if you're using straight CRA). Input the following:
FROM nginx:1.11
# Install only what is needed
RUN apt-get update && apt-get install -y --no-install-recommends curl \
&& rm -rf /var/lib/apt/lists/*
# Remove default nginx
RUN rm /usr/share/nginx/html/*
# Copy all of our nginx configurations
COPY ./nginx.conf /etc/nginx/nginx.conf
COPY ./default.conf /etc/nginx/conf.d/default.conf
# Copy our optimized build into the web folder that we point to in default.conf
COPY ./build/ /var/www/
# Convenience, just in case we want to add more configuration later
COPY entrypoint.sh /
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
# Daemon off, otherwise Docker will exit when the main process finishes making child ones
CMD ["nginx", "-g", "daemon off;"]
Notes about what's happening in the Dockerfile are in the comments. Again, while I'd love to stop and explain each line, there are a plethora of resources on it and we still have MUCH to do.
Note that the code syntax highlighting makes much of the above code example look like comments, but it's not. Copy and use that entire code block.
6) Create a new file entrypoint.sh in your code/app directory (or just your code directory if you're using straight CRA). Input the following:
#!/usr/bin/env bash
# Exit the script as soon as something fails.
set -e
exec "$@"
This is simply for future usage if we desire to come back and modify and/or perform more operations when the Docker image build is complete. It's particularly useful for automated deploys and continuous integration testing. We won't be doing anything beyond this for this guide, though.
The only note to point out is that #!/usr/bin/env bash gives us a very reliable way of hooking into Bash.
The full code repository for this guide can be found at this Github Repository
Pushing our Docker Image to AWS ECR and Setting up the ECS Cluster
Next up is getting our Docker image up into AWS's EC2 Container Registry. While we could use Docker Hub, ECR comes with all of the usual benefits of pairing AWS services with other AWS services.
After that we'll also set up our ECS Cluster.
7) Head over to the AWS Console and Login
Make sure that North Virginia (N. Virginia) is the selected Region in your console. The current region can be found by looking at the top right of your screen
8) Click on EC2 Container Service
9) Inside, click on Repositories and then click on the big blue Create Repository button
10) After it's taken you inside, create a new repository called yourusername/aws-docker and click Next Step
11) We'll see a list of commands here that need to be run; go ahead and copy the first one to your clipboard
We'll run this to properly log in to ECR via Docker
12) Head back over to our terminal and use the command by typing $(aws ecr get-login --region us-east-1)
This executes the statement AWS gave us and logs us in with the Docker CLI. Now we can push images to ECR.
Remember, you must have the AWS CLI AND Docker installed on your local machine to do any of this.
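One hedged aside: newer versions of the AWS CLI (v2) removed the get-login subcommand. If the command above errors for you, the v2 equivalent is the following, with <yourawsaccountnumber> as a placeholder for your own account ID:
$ aws ecr get-login-password --region us-east-1 | \
    docker login --username AWS --password-stdin <yourawsaccountnumber>.dkr.ecr.us-east-1.amazonaws.com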
13) While in our terminal, let's build our image:
In your code directory run:
docker build -t yourusername/aws-docker ./app
to build your image
OR if you're just using straight CRA run:
docker build -t yourusername/aws-docker .
Where yourusername is your username (or whatever you'd like)
This is actually the second command AWS ECR wanted us to run.
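Before pushing, it's worth a quick local smoke test of the image and the /health-alb endpoint we defined in default.conf. A minimal sketch (the port 8080 and the container name aws-docker-test are arbitrary choices here):
# run the freshly built image in the background, mapping local port 8080 to nginx's 80
$ docker run --rm -d -p 8080:80 --name aws-docker-test yourusername/aws-docker
# the health check endpoint should return a 200 with 'A-OK!'
$ curl -i http://localhost:8080/health-alb
# clean up
$ docker stop aws-docker-test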
14) Head back over to the AWS page and grab the 3rd command
This tags the image we just built as latest and also tags it with our ECR account, which will allow docker push to send it up to ECR. Run it.
15) Finally, grab the last command off the page and paste and run that as well
This will actually push the docker image UP to our newly created ECR repository.
Click on Done on the success screen.
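For reference, the whole build/tag/push flow from steps 13 through 15 boils down to the following, where <yourawsaccountnumber> and yourusername are placeholders to swap for your own values:
$ docker build -t yourusername/aws-docker ./app
$ docker tag yourusername/aws-docker:latest <yourawsaccountnumber>.dkr.ecr.us-east-1.amazonaws.com/yourusername/aws-docker:latest
$ docker push <yourawsaccountnumber>.dkr.ecr.us-east-1.amazonaws.com/yourusername/aws-docker:latest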
Before we move on though...
... Let's quickly summarize ECS in a nutshell, so that we're not blindly operating in the dark.
First, let's start at the lowest level
A Task Definition is just a set of "instructions" that specifies how to create a docker container (or containers), how to launch it and with what parameters/resources from AWS. These "instructions" always consist of a Docker Image, but also a number of other things like required CPU, Memory, Environment Variables, etc.
But they're just that: instructions for how to make a docker container out of a docker image in the context of AWS ECS.
Task Definitions can also define more than one container. So we may have a "web task definition" that consists of a Node.js docker container that references and depends on a MySQL docker container.
A Service just manages a Task Definition in the context of our cluster. We give it a Task Definition and it finds the appropriate space for it within the servers we have launched into our cluster. Additionally, it can be paired with a load balancer to keep any one Task (an instance of a Task Definition) from getting overloaded.
Container Instances are NOT instances of docker images. They're just EC2 Instances that are specified to launch into our ECS Cluster. Instead of viewing our instances completely as separate servers, our cluster looks at the total CPU and Memory available and makes a lot of scaling decisions based on that.
Finally, our Cluster is just the "club" for all of the above. We give it a number of EC2 Instances. It views them as aggregate computing power; think of it like turning multiple instances into one big instance. We define Task Definitions out of our desired Docker Images. We create a Service, hand it our Task Definition and tell it that we want X number of "Tasks" from this Task Definition. It looks through all of our Container Instances, identifies which of them have the computing power to handle our Tasks, and launches them into it. Even though the members of the "club" did the work, we still say the "club" did it.
So non Cluster Style
"I have an nginx web server on a t2.micro. But it's only taking up 128 of the 1024 cpu and 300 of the memory"
"I have a Node API Server on a t2.micro. But it's only taking up 200 of the 1024 cpu and 600 of the memory"
"I have a MySQL on another t2.micro. And it's taking up 500 cpu and 1000 memory"
We have 3 instances running. Yes, there are ways to finagle EC2 to let us put them together and whatnot. But why not just use ECS? It's free. You only pay for EC2 instances, just like you do in the aforementioned scenario.
So Cluster Style
"I have 3 Task Definitions, an Nginx One, an API one and a MySQL one"
"My Cluster has 2000 cpu and 1990 memory available"
Now you can launch a service for each of the task definitions and have them optimally placed. Additionally, 2000 cpu and 1990 memory is approximately the equivalent of 2 x t2.micros, the caveat being that t2's operate in bursts, but we're just going with averages here.
If you'd like to learn more about AWS ECS, I have an incredibly thorough guide on the concepts and how they all work together:
The Hitchhiker's Guide to AWS ECS and Docker
.. Okay, back into the thick of it
16) Head over to Clusters and then click Create Cluster
17) On this page name your cluster aws-docker-cluster
Click on create an empty cluster
It should look like this:
and then click create.
After it's complete, click View Cluster to see it up!
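If you'd rather script this step, the same empty cluster can be created with one CLI call (a sketch, assuming your CLI is configured for us-east-1):
$ aws ecs create-cluster --cluster-name aws-docker-cluster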
18) Let's head back over to the main AWS page by clicking on the logo in the top left
Creating the IAM Roles
Before we move on, we need to setup all of our Identity and Access Roles that will allow our EC2 instances to interact with and join our ECS cluster and ALSO our ECS Services to manage our Task Definitions in context of a Load Balancer.
IAM is its own beast, so I'm not going to dive deep into it. In a nutshell though...
A policy is just a set of permissions to access AWS resources. It's nothing until attached to one of the following...
A user allows you, a person (or a third party service), to interact with AWS resources
A role allows your AWS services, software and instances, to interact with other AWS resources
A group is just a logical grouping of users. Instead of attaching a set of permissions (policy) to each user, you can instead attach a set of permissions to a group and then declare users as part of the group. They will then be restricted to that group's permissions.
Moving on...
19) Click on IAM on the main AWS page and then click on Roles on the left side navigation
20) Click Create New Role and name it awsDockerClusterInstanceRole in step 1
Like so:
Click Next Step in the bottom right when you've got the name in
21) Find the role Amazon EC2 Role for Simple Systems Manager and Select it
22) Select the role AmazonEC2RoleforSSM and then again click Next Step in the bottom right
Note that this role is actually for running batch commands on all of your ECS instances, if you need to, via the Run Command feature in EC2. Good for admin'ing.
23) Review your details and then click Create Role in the bottom right
This takes us back to the main roles screen. Unfortunately, AWS forgot that, in the create role wizard, users might want to attach more than one policy...
24) Click the role we just created - awsDockerClusterInstanceRole and it will take us to its summary screen.
Under the permissions tab, click Attach Policy
25) Search for the policy AmazonEC2ContainerServiceforEC2Role and select it
Click Attach Policy in the bottom right once selected
Excellent. Now one more role to go.
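As a hedged CLI sketch of what we just clicked through: the wizard's EC2 role type boils down to a trust policy that lets EC2 assume the role, plus the two managed policies. The console also creates an instance profile for you behind the scenes; via the CLI you create it yourself. The file name ec2-trust.json is arbitrary:
# trust policy allowing EC2 instances to assume the role
$ cat > ec2-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
$ aws iam create-role --role-name awsDockerClusterInstanceRole \
    --assume-role-policy-document file://ec2-trust.json
$ aws iam attach-role-policy --role-name awsDockerClusterInstanceRole \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM
$ aws iam attach-role-policy --role-name awsDockerClusterInstanceRole \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role
# the instance profile that EC2 actually references
$ aws iam create-instance-profile --instance-profile-name awsDockerClusterInstanceRole
$ aws iam add-role-to-instance-profile \
    --instance-profile-name awsDockerClusterInstanceRole \
    --role-name awsDockerClusterInstanceRole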
26) Click on Roles in the left side navigation and then click Create New Role again.
Since we just did this whole process, and since there's only one policy that needs to be attached this time, instead of posting a bunch of screen shots, I'm just going to give you the information needed to create this role:
In Step 1 - name it awsDockerClusterServiceRole
In Step 2 - find the Amazon EC2 Container Service Role NOT Amazon EC2 Role for EC2 Container Service; Select it;
(Step 3 is skipped)
In Step 4 - Select the AmazonEC2ContainerServiceRole (the checkbox on the left) and click Next Step;
In Step 5 - Review it and click Create Role.
It will take you back to the Roles Page and you'll have two roles created: awsDockerClusterInstanceRole and awsDockerClusterServiceRole
27) Head back over to the Main AWS Console Page by clicking on the logo in the top left
If you're interested in learning more about confusing IAM policies, I have a human readable guide about them here:
AWS IAM Policies in a Nutshell
Configuring Security Groups for our ECS Container Instances and Application Load Balancer
The next layer of security we'll hit is Security Groups. Again, this is its own beast so I'm not going to go deeply into it. In a very simplified and boiled down summary, all they do is permit certain traffic to make it to your instances and containers.
As an analogy, think of your instances like buildings. Security Groups are like a security team that only allows certain persons into the buildings. They assume that once someone is in, they're allowed to leave (known as STATEFUL: return traffic is allowed).
However, to complete this analogy, we need to add one more concept. Security Groups are actually more like Security Companies that buildings (our instances) can pick to guard them. If multiple buildings (instances) belong to the same Security Company (security group), they are allowed to access each other. Security Companies (security groups) can also grant clients of other Security Companies access to their clients.
The last sentence there is a bit odd. But as an example, suppose I have a set of instances Group A that use Security Group A, and also a set of instances Group B that use Security Group B. If Security Group A says that Security Group B is permitted access, then all instances in Group B can access instances in Group A.
Of course, this is just the layer of security closest to the instance... however, it's surprisingly effective.
The BEST way to set up a great security layer is by putting a great deal of upfront effort into a Virtual Private Cloud aka VPC, but that goes beyond being its own beast and is instead an entire PACK of beasts. I have a very thorough write up about VPCs here:
AWS VPC Core Concepts in an Analogy and Guide
Also, I'll be releasing a video series in the next couple months that covers VPC's in context of production application deploys in detail, but for this tutorial, we're going to just keep it simple.
We're going to be creating 2 security groups. One to only allow in HTTP traffic into our Application Load Balancer. Another to only allow traffic from our Load Balancer into our Container Instances.
28) Click on EC2
In hindsight, this probably didn't need its own step
29) Click on Security Groups in the left side navigation
30) Click Create Security Group and then use the following information:
Security Group Name: aws-docker-alb
Description: security group for aws docker load balancer
VPC: select the default VPC; it will have (default) next to it
for Security group rules:
Click Add Rule
Select HTTP in the Type dropdown
And everything else will be prefilled
Everything should look like this:
Click Create when completed
31) One more thing: hover over the Name cell for your aws-docker-alb and a pencil will show up. Click it and give it the name tag of aws-docker-alb-sg. Not necessary, but nice.
Also, copy to your clipboard the Group ID of this security group that we just created.
32) Click Create Security Group again and use the following information
Security Group Name: aws-docker-instances
Description: security group for aws docker cluster instances
VPC: select the default one
for Security group rules:
Click Add Rule
Leave Custom TCP Rule selected
Port Range: 0-65535
Source: Custom and then type in the ID of your security group you just created for your aws-docker-alb
It should look like this:
Where sg-49399735 is the Group ID of the aws-docker-alb security group we just created.
Click Create when completed
33) Similar to step 31, hover over the name cell for your aws-docker-instances group and give it a Name of aws-docker-instances-sg
You'll now have two security groups. One for the load balancer we'll create and one for all the instances that contain our docker containers.
Awesome!
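For the script-minded, both groups reduce to a few CLI calls. A sketch, with sg-aaaaaaaa and sg-bbbbbbbb standing in for the Group IDs that create-security-group returns:
# ALB group: allow HTTP in from anywhere
$ aws ec2 create-security-group --group-name aws-docker-alb \
    --description "security group for aws docker load balancer"
$ aws ec2 authorize-security-group-ingress --group-id sg-aaaaaaaa \
    --protocol tcp --port 80 --cidr 0.0.0.0/0
# instance group: allow all TCP, but only from the ALB's security group
$ aws ec2 create-security-group --group-name aws-docker-instances \
    --description "security group for aws docker cluster instances"
$ aws ec2 authorize-security-group-ingress --group-id sg-bbbbbbbb \
    --protocol tcp --port 0-65535 --source-group sg-aaaaaaaa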
Defining the Launch Configuration and Auto Scaling Group
From our earlier big overview of ECS Clusters, we talked about the idea of Container Instances. Despite the mind leading to "Oh an instance of an image is maybe what they mean", we know that it's actually just an EC2 Instance that belongs to an ECS Cluster.
Now, why don't we just spin up instances manually and throw them in? Well, if one of our manual instances fails, there's nothing that will put it back into service. Additionally, what if we want to add more? Take away some?
This is where we leverage Launch Configurations and Auto Scaling Groups.
Launch Configurations are just a set of parameters that define an EC2 Instance. We pick a base Amazon Machine Image, the type and power of an instance (i.e. t2, i2, m3, etc), storage types, vpc settings, etc. just like we do when creating a stand alone EC2 instance. However, instead of it spinning up an instance, it defines a set of instructions to create instances from.
Auto Scaling Groups use Launch Configurations to keep and scale a set of EC2 instances created from a provided Launch Configuration. For example, we may have a Launch Configuration that uses an Ubuntu image and an M3 general purpose instance type that launches into a particular network. If we tell our Auto Scaling Group to always have 3 of these available, then even if one fails, it will automatically spin up one with identical settings.
Additionally, Auto Scaling Groups can be used to scale up and down the number of instances based on defined parameters such as CPU reservations/utilization, particular date/time events and more. We won't dive into that in this guide since it would involve us diving into CloudWatch as well.
34) In the left side navigation of EC2 click on Launch Configurations
If you've never created a Launch Configuration before, it'll show the Welcome to Auto Scaling page. Just go ahead and click Create Auto Scaling Group. If you've made one before, just create a new launch configuration.
Once you click that, IF you've never created a Launch Configuration, it'll AGAIN show a little splash screen about Auto Scaling Groups. Just click Create Launch Configuration in the bottom right.
35) In this first step, Choose AMI, click "Community AMIs" in the side left tabs and search for the official ECS image ami-a58760b3
AWS keeps a list of ECS optimized Amazon Machine Images (AMIs) they personally attend to. The list of these is always updated and can be found here:
Oddly enough they have them under Community AMIs which isn't immediately obvious, especially since the marketplace has another official ECS AMI. Bottom line is that you'll always look for the AMI referenced in the above link.
After you searched for the AMI you should see a few options like below:
Click Select on the first option. If it doesn't look like the one in the picture above, just make sure to select the one with the ACTUAL AMI ID, not ones that support it.
36) On the next step of Choose Instance Type, select t2.micro and click on Next: Configure details
T2 types are amazing. Burstable performance with guaranteed baseline performance. Even though t2.micro tends to be what gets used in tutorials, don't underestimate the power of this instance family. Sure, if you need consistent performance one of the other types is great, but most web apps operate in surges of traffic and fit this type amazingly.
37) On the next step of Configure Details, input the following information:
Name: aws-docker-cluster-launch-config
IAM Role: awsDockerClusterInstanceRole (the one we created earlier)
Click on Advanced Details
User data: leave As Text selected and paste in the following:
#!/bin/bash
echo ECS_CLUSTER=aws-docker-cluster >> /etc/ecs/ecs.config
The above is essentially a launch script; when an instance launches, it'll run this script. What it does is specific to our ECS Optimized image: it puts our cluster's name into the ecs config file so that the instance knows to join that cluster as a Container Instance.
Everything should look like the following:
If you're interested in adding unified logging and SSM to your container instances. This is where you'd do it. I have a write up about how to do so here:
How to Setup Unified AWS ECS Logs in CloudWatch and SSM
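As a small aside, /etc/ecs/ecs.config accepts a number of other ECS agent settings beyond ECS_CLUSTER. A hedged sketch of a couple of optional ones you may run into (the values here are purely illustrative):
#!/bin/bash
echo ECS_CLUSTER=aws-docker-cluster >> /etc/ecs/ecs.config
# reserve some memory (in MiB) for the OS and the ECS agent itself
echo ECS_RESERVED_MEMORY=128 >> /etc/ecs/ecs.config
# how long the agent waits for a container to exit gracefully on task stop
echo ECS_CONTAINER_STOP_TIMEOUT=30s >> /etc/ecs/ecs.config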
click Next: Add Storage
38) Leave the storage as is and click Next: Configure Security Group
39) Click Select an existing security group and select our aws-docker-instances security group that we created.
Click Review after selecting the security group.
40) Review the Launch Configuration and make sure it's in line with all of the steps above.
It should look similar to the following:
Obviously some values will be different, since security groups, ebs volumes, etc. have unique IDs.
Click Create launch configuration in the bottom right when ready.
41) Once you click Create launch configuration it will pop up a dialog about using an existing or creating a keypair.
When creating EC2 Instances or Launch Configurations, AWS will prompt with the ability to use a Key Pair to access your instances directly via SSH. The benefits of this are obviously management, direct logs, troubleshooting etc. We're going to go ahead and generate one, but we won't be diving directly into any of our instances in this tutorial.
Select Create New Key Pair
Name it aws-docker-major-keys
Click Download Key Pair
Keep it secret. Keep it safe. After all it's a major key.
42) Once you've clicked the Download Key Pair the Create launch configuration button will enable. Click that to create it.
This will take us right into the Auto Scaling Group Creation screen
43) On the Configure Auto Scaling group details step (steps are shown at the very top), input the following information:
Group Name: aws-docker-cluster-scaling-group
Group Size: 2
Network: the default VPC
Subnets: select us-east-1a, us-east-1c and us-east-1d
Selecting the different subnets makes it so that our Auto Scaling Group will launch instances into different Availability Zones. Since our service will then be available in different "AZs", if one goes down, we still have another one up. Spreading our service across multiple AZs is a major key to fault tolerance.
Click Next: Configure scaling policies in the bottom right
44) On Configure Scaling Policies, leave Keep this group at its initial size selected
As mentioned earlier, we won't dive into scaling up and down. However, this option will make it so that our Auto Scaling group keeps our instance numbers at 2. If one of them fails for any reason, it will spin a new one right back up.
Click Next: Configure Notifications
45) Leave Configure Notifications as is and click Next: Configure Tags
46) For the first row, make the key Name and give it the value of AWS Docker Cluster Instance. Leave Tag New Instances checked.
The Name tag is actually a special tag that gets used by AWS in a lot of things. In this case, it specifies what our Container Instances will be tagged as when they're created.
Click Review in the bottom right of the screen
47) Confirm everything is in line with what we've done above
Click Create Auto Scaling Group
Once you've seen the success message, click Close
48) With your auto scaling group aws-docker-cluster-scaling-group selected, click on the instances tab in the bottom panel to see the status of your instances.
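To recap this whole section in CLI form (a sketch: the subnet and security group IDs are placeholders, and user-data.sh contains the ECS_CLUSTER script from step 37):
$ aws autoscaling create-launch-configuration \
    --launch-configuration-name aws-docker-cluster-launch-config \
    --image-id ami-a58760b3 \
    --instance-type t2.micro \
    --iam-instance-profile awsDockerClusterInstanceRole \
    --security-groups sg-bbbbbbbb \
    --key-name aws-docker-major-keys \
    --user-data file://user-data.sh
$ aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name aws-docker-cluster-scaling-group \
    --launch-configuration-name aws-docker-cluster-launch-config \
    --min-size 2 --max-size 2 --desired-capacity 2 \
    --vpc-zone-identifier "subnet-aaaa,subnet-cccc,subnet-dddd" \
    --tags "Key=Name,Value=AWS Docker Cluster Instance,PropagateAtLaunch=true"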
Setting up the Elastic Application Load Balancer
The term Load Balancer is pretty self-explanatory. Elastic Load Balancers, both Classic and Application, deal with spreading traffic across different instances and availability zones. They routinely check the health of all instances they load balance to in order to ensure that traffic can still be sent. If an instance repeatedly fails a health check, the Load Balancer will direct traffic away from the unhealthy instance to a healthy one.
The difference here is that we'll be using the newer version known as the Application Load Balancer. This gives us an additional HUGE benefit when working with AWS ECS. Instead of just being able to spread traffic across instances, we also spread it across our Tasks (remember, instantiations of Task Definitions) within instances.
For example, if you have 6 Tasks spread across 3 instances, the Application Load Balancer will work with our Services and Cluster to balance traffic between all 6 Tasks.
There are many other benefits as well, such as being able to namespace tasks in a particular cluster based on endpoint. For example, you can have a group of Tasks that serve web page A behind a route called /a and another group of Tasks that serve web page B behind a route called /b. This isn't currently possible with Classic Load Balancers.
One last random tidbit, since every application I see from others talks about how they'll use HTTP/2: Application Load Balancers hook this right up as long as you've got SSL set up on them. We won't do this in this tutorial, but they make it incredibly easy.
On to creating our Application Load Balancer
49) In the EC2 section of AWS, Click on Load Balancers in the left side navigation
50) Click Create a Load Balancer
51) Select Application Load Balancer and then click Continue
52) On step 1 Configure Load Balancer (as seen at the top of the screen), input the following details
Name: aws-docker-cluster-alb
Scheme: internet-facing
Listeners: leave as is with HTTP on port 80
Under availability zones make sure the default VPC is selected
Under available subnets select us-east-1a, us-east-1c and us-east-1d
Everything should look like the following:
Click Next: Configure Security Settings
53) This will complain about lack of HTTPS, but we can always come back and add that later easily. Click Next: Configure Security Groups
54) On this screen, select our security group we created aws-docker-alb and then click Next: Configure Routing
55) On the Configure Routing screen, input the following information
Target group: New Target Group
Name: aws-docker-cluster-targets
Protocol: HTTP
Port: 80
Under Health Checks
Protocol: HTTP
Path: /health-alb
Leave everything else as is
Click Next: Register Targets in the bottom right.
56) On Register Targets, leave it as is.
AWS ECS Clusters manage and register our targets for us (we'll set that up in the ECS Service shortly).
Click Next: Review
57) Review that everything is in alignment with our work above
Click Create
Click Close once it's completed
Load Balancers can sometimes take a bit to get up and running. By a bit I only mean anywhere from a few minutes to 10ish minutes. Until then, their state will be Provisioning. The Load Balancer won't be able to take traffic until it hits active.
We can continue on though. Just know that if you speed through the rest of the steps and complete them before the Load Balancer is up, you'll still have to wait until it's active.
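Before moving on, here's the same ALB setup in CLI form. A sketch: the subnet, security group, VPC IDs and the two ARNs are placeholders you'd pull from the previous calls' output:
$ aws elbv2 create-load-balancer --name aws-docker-cluster-alb \
    --scheme internet-facing \
    --security-groups sg-aaaaaaaa \
    --subnets subnet-aaaa subnet-cccc subnet-dddd
$ aws elbv2 create-target-group --name aws-docker-cluster-targets \
    --protocol HTTP --port 80 --vpc-id vpc-xxxxxxxx \
    --health-check-protocol HTTP --health-check-path /health-alb
$ aws elbv2 create-listener --load-balancer-arn <load-balancer-arn> \
    --protocol HTTP --port 80 \
    --default-actions Type=forward,TargetGroupArn=<target-group-arn>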
Making Our ECS Task Definitions and Launching Our Services
Now that our ECS Cluster has EC2 instances launched within it (now known as Container Instances), we can provide it with Services of Task Definitions to launch Tasks of our aws-docker image across our Container Instances. And yes, I am also perplexed as to why AWS chose such general names.
58) Click on the AWS Logo in the top left to head back to the main page and head over to EC2 Container Service
Just in case you were still in EC2
59) Click on Clusters and then click on our aws-docker-cluster cluster.
In here click on the ECS Instances tab and we'll see that our aws-docker-cluster has registered the two ec2 instances from our auto scaling group.
60) Now click on Repositories in the side left navigation and click on our image repository we created username/aws-docker.
In here grab the Repository URI of your image at the top of the repository page. Copy it or remember it, etc.
We need this value to tell our Task Definition which image to use when creating containers.
61) Click on Task Definitions on the left side Navigation and then click Create New Task Definition
On this page, near the bottom, there's a button that says Configure via JSON. Click on that, remove what's there, and then modify and paste the following JSON into that area:
{
  "containerDefinitions": [
    {
      "name": "aws-docker-task",
      "image": "<yourawsaccountnumber>.dkr.ecr.us-east-1.amazonaws.com/<yourusername>/aws-docker",
      "memory": 300,
      "cpu": 256,
      "essential": true,
      "portMappings": [
        {
          "containerPort": 80,
          "protocol": "tcp"
        }
      ],
      "environment": null,
      "mountPoints": null,
      "volumesFrom": null,
      "hostname": null,
      "user": null,
      "workingDirectory": null,
      "extraHosts": null,
      "logConfiguration": null,
      "ulimits": null,
      "dockerLabels": null
    }
  ],
  "volumes": [],
  "networkMode": "bridge",
  "placementConstraints": [],
  "family": "aws-docker-task",
  "taskRoleArn": ""
}
Make sure to replace <yourawsaccountnumber> and <yourusername> with the appropriate values
This is an alternative to filling it in via the user interface above. We won't deep dive into the specifics of a Task Definition due to the sheer amount there is and how much cross over there is with general Docker knowledge. For example, CPU and Memory usage aren't some special AWS Docker thing, instead those are actual values you use in normal Docker container definitions.
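Relatedly, if you save that JSON to a file, the same definition can be registered from the terminal. A sketch (the file name is arbitrary, and the JSON above maps closely to the RegisterTaskDefinition API input):
$ aws ecs register-task-definition --cli-input-json file://aws-docker-task.json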
The one thing I will mention is CPU and Memory usage. There are two types of limits: hard limits and soft limits. Hard limits are how much your container can use and will never go above. Soft limits are what it will try to stay at, but will fluctuate above if resources allow. Generally setting both is preferable; however, our Task is only managing one container, so it's not as important.
How do you decide CPU and Memory values? There are a number of ways, along with many more depending on your situation. The common few, though, would be:
a) the actual demands of your containers, derived by testing your container and watching its cpu and memory usage (see the sketch after this list)
b) the cpu and memory limits of your selected instances
c) the other types of tasks, and their limits, that will be sharing the same cluster compute space
d) the auto scaling policies (if any) and their thresholds on memory and cpu utilization / reservation.
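For option (a), Docker itself gives you a quick read on a running container's appetite. A minimal sketch, reusing the locally built image from earlier (the port and container name are arbitrary):
$ docker run --rm -d -p 8080:80 --name aws-docker-test yourusername/aws-docker
# live CPU % and memory usage, per container
$ docker stats aws-docker-test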
Back to the Task at hand (pun intended). After pasting and accepting, your screen should look similar to:
When everything looks good, click Create
62) Click on clusters in the side left navigation and then click on our aws-docker-cluster cluster.
63) With the services tab selected click Create
64) On this Create Service page, input the following information:
Task Definition: aws-docker-task:1
Cluster: aws-docker-cluster
Service name: aws-docker-cluster-service
Number of tasks: 2
Minimum healthy percent: 50%
Maximum percent: 200%
Placement Templates: AZ Balanced Spread
The first 4 of these params should be straightforward. The Healthy Percents are just bounds on how far the cluster can stray from the desired number of tasks. The Max percent is mainly there so that if it needs to kill an old task (say you've updated your task definition), it's allowed to launch new ones first and kill the old one afterwards.
The Placement Templates are how it will spread your Tasks across instances and AZs. Keeping Tasks spread across AZs again helps performance and fault tolerance.
Your screen should look similar to:
Do not click create yet. We still need to setup the load balancer.
65) Click on Configure ELB under Optional configurations
This is how and where we're going to hook up the load balancer we created to our aws-docker Tasks running in our Cluster.
66) On this screen "Elastic Load Balancing (Optional)" input the following:
ELB Type: Application Load Balancer
Select IAM Role for service: select our awsDockerClusterServiceRole
ELB Name: select our aws-docker-cluster-alb
Select Container: select our aws-docker-task:0:80
don't click save yet though.
67) Next to Select Container, with our aws-docker-task:0:80 selected, click Add to ELB. Input the following:
Listener Port: 80:HTTP
Target Group Name: our aws-docker-cluster-targets
Leave everything as is:
Click Save. This will take us back to the Create Service page.
68) Confirm that everything looks correct in accordance to our work above
Click Create Service
69) The Launch Status screen will come up. Once it's complete, click View Service. We should see our services running!
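For reference, steps 63 through 68 collapse into a single CLI call. A sketch; the target group ARN is a placeholder you'd pull from the load balancer section:
$ aws ecs create-service \
    --cluster aws-docker-cluster \
    --service-name aws-docker-cluster-service \
    --task-definition aws-docker-task:1 \
    --desired-count 2 \
    --role awsDockerClusterServiceRole \
    --deployment-configuration minimumHealthyPercent=50,maximumPercent=200 \
    --load-balancers "targetGroupArn=<target-group-arn>,containerName=aws-docker-task,containerPort=80" \
    --placement-strategy type=spread,field=attribute:ecs.availability-zone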
Finally, let's confirm that our services are actually up and running on the world wide web.
70) Head back to the main page (aws logo in top left) and then click on EC2. Click on Load Balancers on the left side navigation.
71) Select our load balancer aws-docker-cluster-alb and in the bottom panel copy the DNS name:
Paste it into your browser and BOOM your AWS Docker app is now fault tolerant and load balanced on AWS.
Testing out the Fault Tolerance
If you want to play around with the fault tolerance, we can go into Instances in the left side navigation and
a) select one of the auto scaled instances
b) click on actions in at the top
c) in the drop down hover over Instance State
d) click Terminate
e) click Yes Terminate in the dialog that pops up
Give it a few minutes. As the instance shuts down and is terminated: (a) our load balancer will only direct traffic to the working instance, and (b) our auto scaler will detect the drop in live instances and spin up a new one!
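If you'd like to watch the recovery from the terminal instead of the console, a couple of read-only calls help (the target group ARN is a placeholder):
# watch the ALB mark the dead target unhealthy, then the replacement come in
$ aws elbv2 describe-target-health --target-group-arn <target-group-arn>
# watch the auto scaling group replace the terminated instance
$ aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names aws-docker-cluster-scaling-group
# and the cluster's registered container instance count
$ aws ecs describe-clusters --clusters aws-docker-cluster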
Final Thoughts
Hopefully this guide has covered all the bases for getting AWS Docker applications up and running on ECS, with a good crash course foundation of knowledge on AWS services. While it hasn't covered everything, this basic blueprint CAN be used and extended to ...
- leverage SSL/TLS
- Auto Scale Container Instances based on traffic
- Auto Scale Task Definitions based on traffic
- Directly monitor docker logs in CloudWatch
- Directly monitor application logs of the container's application
- Automated deploys and updates
Ehhh, at this point I should probably stop typing. The point being that you can use this as a blueprint and starting point to production level architectures and extend it to your needs.
The MAIN component missing, though, is a production-ready Virtual Private Cloud, as I mentioned earlier in the tutorial. This really is a critical part of security and segmenting AWS resources.
We also probably want a continuous integration and continuous deploy pipeline setup for developers so that pushing updates is seamless.
As mentioned earlier, I'm planning to release a video series in the next couple of months that dives deep, conceptually and technically into the entirety of
- setting up a full production stack infrastructure
- working with Docker and containers
- meaning your local development environment
- a ci/cd pipeline
- version control
- deploy to AWS and logging
If that's of interest feel free to hit me up on twitter or email and I'll make sure to let you know when it's complete.
As always, please leave me a comment if you find any typos or technical glitches! Thanks!
Be sure to follow for weekly updates!