OpenVPN over AWS Systems Manager Session Manager

Introduction

AWS Systems Manager Session Manager allows you to establish a shell session to your EC2 instances and Fargate containers even when these resources don't have a public IP address. Also, with port forwarding, you can redirect any port on a remote instance to a local port on your client and interact with applications running on your private EC2 instances. A common use case is accessing a web application running on your instance from your browser.

However, Session Manager sessions are limited to a single resource — one EC2 instance or one Fargate container. So, it is not possible to use Session Manager alone to create an ingress point allowing access to all resources within your private VPC.

In this article, I will show how you can combine Session Manager with OpenVPN to allow a secure network path from your client to all resources within your private VPC.

Design

The below diagram illustrates the design for our solution.

We will launch an EC2 instance in a private subnet which will act as our OpenVPN server. We will then establish a Session Manager port forwarding session between our client and this EC2 instance. Then, using an OpenVPN client, we will tunnel to the OpenVPN server over the Session Manager session. With our VPN connection in place, we can then access all private applications in our VPC.

The configuration of the OpenVPN server will be done with the script at github.com/Angristan/OpenVPN-install. Because Session Manager does not support UDP, our OpenVPN server will be configured in TCP mode.
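
As a rough illustration of what that means (the install script generates the real configuration, so treat these directives as assumptions), the server config carries proto tcp rather than the default proto udp, and the client profile points at the local end of the port forward:

# Server side (server.conf) -- TCP instead of the default UDP
proto tcp
port 1194

# Client side (ssm.ovpn) -- connect to localhost, where the Session Manager
# port forward terminates (the port is an assumption)
proto tcp
remote 127.0.0.1 1194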

Prerequisites

To use Session Manager from the AWS CLI, you need the AWS CLI itself plus the Session Manager Plugin.

Install the OpenVPN client.

Install the Terraform CLI.

Terraform

The solution can be deployed via Terraform:

git clone https://gitlab.com/aw5academy/terraform/openvpn-ssm.git
cd openvpn-ssm
terraform init
terraform apply

This Terraform code will provision the required VPC and Session Manager resources. The OpenVPN server is not deployed here… that comes later.

Make a note of the output alb-dns. This is the DNS record for a sample application load balancer deployed to the private subnets. If you try to access this you will not be able to connect.

This is expected because as we can see from the load balancer settings, this is an internal load balancer, meaning it can only be accessed from resources within the VPC.

Session Manager Preferences

The Session Manager preferences can’t be configured via Terraform. So we must set these manually with the following steps:

  • Login to the AWS console;
  • Open the Systems Manager service;
  • Click on ‘Session Manager’ under ‘Node Management’;
  • Click on the ‘Preferences’ tab;
  • Click ‘Edit’;
  • Enable KMS Encryption and point to the alias/session-manager key;
  • Enable session logging to S3 bucket ssm-session-logs... with encryption enabled;
  • Enable session logging to CloudWatch log group /aws/ssm/session-logs with encryption enabled;
  • Save the changes;

Start VPN Script

With our infrastructure deployed via Terraform we can now try to launch our OpenVPN server. The script provided at start-vpn.sh can be used to do this. This script performs the following steps:

  • Obtains the launch template for the OpenVPN instance;
  • Starts the EC2 instance;
  • Waits for the instance to be ready for Session Manager sessions;
  • Waits for the instance to complete its user_data, which is where the OpenVPN server is installed and configured;
  • Downloads the OpenVPN client config file generated by the server;
  • Starts a port forwarding Session Manager session;
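
A minimal sketch of two of these steps (waiting for Session Manager readiness and starting the port forward), assuming the OpenVPN server listens on TCP 1194 and the instance ID is already known; all names here are illustrative, not the actual script:

INSTANCE_ID="i-0123456789abcdef0"

# Wait until the instance has registered with Systems Manager
until aws ssm describe-instance-information \
    --filters "Key=InstanceIds,Values=$INSTANCE_ID" \
    --query 'InstanceInformationList[0].PingStatus' \
    --output text | grep -q Online; do
  sleep 10
done

# Forward local port 1194 to the OpenVPN server's TCP port 1194
aws ssm start-session \
    --target "$INSTANCE_ID" \
    --document-name AWS-StartPortForwardingSession \
    --parameters '{"portNumber":["1194"],"localPortNumber":["1194"]}'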

Let’s try this script now by running:

bash start-vpn.sh

Our Session Manager session is up and awaiting connections.

OpenVPN Client

Now we need to configure our OpenVPN client. In the previous step, an ssm.ovpn file was downloaded from S3. Make a note of the location of this file. Next, launch the OpenVPN client and select to import a profile by file.

Navigate to the location of the ssm.ovpn file.

Now click on Connect.

Our tunnel is now in place.
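
If you prefer a command-line client, the same profile can typically be used directly with the openvpn binary (the path is wherever start-vpn.sh downloaded the file):

sudo openvpn --config /path/to/ssm.ovpn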

If the VPN fails to connect on Windows Subsystem for Linux try restarting WSL by running the following from a command prompt:

wsl --shutdown

Then rerun bash start-vpn.sh and try to connect from your OpenVPN client again.

Testing

Now we can test whether our solution works by trying to access the load balancer we could not connect to earlier. Try it again in your browser:

Success!

Important Points

This solution was created for fun more than as a realistic production solution. The performance of this setup has not been thoroughly tested. Indeed, we are using TCP instead of UDP because Session Manager only supports TCP. TCP is known to be sub-optimal for VPN traffic and can suffer from a phenomenon known as TCP meltdown.

However, the security of this configuration is quite strong. The communication between the client and AWS is both HTTPS and KMS encrypted. Also, no customer-managed networking ingress is required, so your VPC can be entirely private.

You may find this useful in a small development team environment. But for corporate settings, consider AWS Client VPN instead.

Cleanup

To clean-up the resources created in this guide, first destroy any EC2 instances with:

aws ec2 terminate-instances \
    --region us-east-1 \
    --instance-ids $(aws ec2 describe-instances \
      --filters "Name=tag:Name,Values=openvpn-server" \
      --query "Reservations[*].Instances[*].[InstanceId]" \
      --region us-east-1 --output text | xargs)

Then, from the root of your checkout of the Terraform code run:

terraform init
terraform destroy

Serverless Caching With AWS AppConfig and Lambda Extensions

Introduction

In this article I will show how you can deploy a simple caching solution for AWS Lambda functions by combining the AWS AppConfig service with the Lambda Extensions feature.

To demonstrate this, let's create a problem that we must solve. Suppose you have been asked to implement a solution that will allow the engineers on your team to query any IPv4 address to check if it is in the AWS IP address ranges.

Design

Our solution to this problem will be very straightforward. We will have an Amazon S3 bucket serving a single HTML page that allows the user to input an IP to check. This rudimentary web application will make a request to an Amazon API Gateway REST API. The API will use Lambda to check the input IP against the list of Amazon IP ranges. The IP ranges will be stored in AWS AppConfig.

Next, we will take advantage of the AppConfig Lambda extension so that our function does not need to call AppConfig on every invocation.

Lambda Extension

To use the AppConfig Lambda extension, we first attach a Lambda layer to our function code. The AWS documentation lists the per-region ARNs for the AppConfig Lambda extension layer.

Once attached, we modify our function code to make a request to a localhost HTTP endpoint that is created by the layer. This endpoint will regularly poll AppConfig for your configuration data and maintain a local cache of it which is available to your function.

Your function code can then query the endpoint for the configuration data. Some sample code showing this in use would be:

import json
import urllib.request

def lambda_handler(event, context):
    app_config_app_name = "foo"
    app_config_env_name = "live"
    app_config_profile_name = "data"

    # The extension serves the cached configuration on a local HTTP endpoint (port 2772)
    url = (f"http://localhost:2772/applications/{app_config_app_name}"
           f"/environments/{app_config_env_name}"
           f"/configurations/{app_config_profile_name}")
    config = json.loads(urllib.request.urlopen(url).read())
    return config

SAM

The AWS Serverless Application Model (SAM) code at gitlab.com/aw5academy/sam/awsip can be used to deploy this solution.

This code defines:

  • A lambda function;
  • The AppConfig Lambda extension layer;
  • An API Gateway API;

Deploy the code with:

git clone https://gitlab.com/aw5academy/sam/awsip.git
cd awsip
sam deploy --guided

When deployed, the API endpoint will be output. We will need this value in the next section.

Web Application

Let’s now create an S3 bucket to host our web application. From the S3 service, create a bucket, untick the Block all public access checkbox, and acknowledge the warning.

In your checkout of the sam/awsip repository you will see an HTML document at files/index.html. Edit this file and replace the variable value with the API endpoint output in the previous step.

var apiBaseUrl = "REPLACE_ME_LATER_AFTER_SAM_DEPLOY"

Next, upload this file to your bucket. Expand the “Permissions” section, tick the “Grant public-read access” radio button, and acknowledge the warning.
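
If you prefer the CLI, the same upload can be performed with something like the following (the bucket name is a placeholder):

aws s3 cp files/index.html s3://my-awsip-bucket/index.html --acl public-read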

AppConfig

We now need to deploy the IP data to AppConfig. In your checkout of the sam/awsip repository there is a Python script at scripts/create-app-config.py. Run this script to load the AWS IP ranges into AppConfig.

python3 scripts/create-app-config.py

We can confirm the script has worked by seeing the “1” and “2” configuration profiles under the “awsip” application in the AppConfig console.
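
For reference, the script is broadly equivalent to the following AWS CLI calls (all IDs, names and file paths below are placeholders; the real script derives these itself, and splitting the ranges across two profiles is presumably done to stay within AppConfig's configuration size limits):

aws appconfig create-application --name awsip
aws appconfig create-environment --application-id <app-id> --name <env-name>
aws appconfig create-configuration-profile --application-id <app-id> \
    --name 1 --location-uri hosted
aws appconfig create-hosted-configuration-version --application-id <app-id> \
    --configuration-profile-id <profile-id> --content-type application/json \
    --content fileb://ip-ranges-part1.json outfile.json
aws appconfig start-deployment --application-id <app-id> --environment-id <env-id> \
    --configuration-profile-id <profile-id> --configuration-version 1 \
    --deployment-strategy-id AppConfig.AllAtOnce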

Testing

We can now test our solution. In the S3 console, open the index.html object we uploaded and open the link under “Object URL”. Let’s try an IP that we know does not exist in the AWS ranges.

Great! That works! Now let’s also check one that does exist in the AWS ranges (52.94.76.1 can be used).

Success!

Confirming Caching

Our solution works but can we verify that we are seeing a performance improvement by using the Lambda extension?

Earlier when you ran sam deploy, two outputs were the function ARNs. The first, “CheckIpFunctionArn” is the function attached to our API which contains the Lambda extension feature. The second, “AppConfigCheckIpFunctionArn” is a separate function that does not have the Lambda extension and instead makes a request to AppConfig directly for the configuration.

In your checkout of the sam/awsip repository you will see a Bash script at scripts/lambda-metrics.sh. Run this script and provide the name of the first function. E.g.

bash scripts/lambda-metrics.sh awsip-CheckIpFunction-S7XVscISBk2q

This script invokes our function and reports how long it ran for. It invokes the function 20 times and reports the average duration of the non-cold-start invocations.
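
A rough sketch of how such a measurement can be taken (this is not the repository script itself, and the payload shape is an assumption): invoke the function with the log tail enabled and read the duration from the REPORT line.

aws lambda invoke \
    --function-name awsip-CheckIpFunction-EXAMPLE \
    --payload '{"ip": "1.2.3.4"}' \
    --cli-binary-format raw-in-base64-out \
    --log-type Tail \
    --query LogResult --output text \
    /dev/null | base64 -d | grep REPORT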

So we see an average duration of approx. 897ms. Let’s now try with the other function.

bash scripts/lambda-metrics.sh awsip-AppConfigCheckIpFunction-H2quZz4TWC28

We see now that the average duration is 959ms. So our caching saves us approx. 60ms.

Wrap-Up

This very simple solution implements serverless caching by using AWS AppConfig as a data source.

You can clean-up the resources by deleting the CloudFormation stacks created by SAM and you may also use the Python script at scripts/cleanup-app-config.py to remove the AppConfig resources.

Serverless Jenkins and Amazon ECS Exec

In this very short article I will show how you can create a serverless Jenkins instance and start a shell session in an AWS Fargate task without opening SSH ports or managing SSH keys.

Why Serverless?

No server is easier to manage than no server.

Werner Vogels, CTO @ Amazon

Managing a fleet of EC2 instances for your Jenkins slaves is cumbersome and time-consuming, even when baking the configuration into an Amazon Machine Image (AMI). By combining AWS serverless products we can run an instance of Jenkins with substantially less overhead.

Design

We will run our Jenkins master node in an AWS Fargate cluster. The JENKINS_HOME will be stored on an Amazon Elastic File System. We won’t have Jenkins slaves but will instead run jobs on AWS CodeBuild using the Jenkins plugin.

Terraform

The Terraform code at https://gitlab.com/aw5academy/terraform/sls-jenkins can be used to provision the components we need. We can utilise this code as follows:

git clone https://gitlab.com/aw5academy/terraform/sls-jenkins.git
cd sls-jenkins
terraform init
terraform apply

Once applied, we get the following:

Wait a few moments for the ECS task to fully start then open the jenkins-url output in your browser. You should see the Unlock Jenkins page:

ECS Exec

We can obtain the password from the task logs.

However, let’s take advantage of a new feature of Fargate called ECS Exec. With this feature we can start a shell session in any container without requiring SSH ports to be opened or authenticating with SSH keys. To use this feature, ensure you have the latest version of the AWS CLI as well as the latest version of the session manager plugin.

Find the task id of the sls-jenkins task in the ECS console and use it with the following command:

aws ecs execute-command  \
    --region us-east-1 \
    --cluster sls-jenkins \
    --task <task-id> \
    --container sls-jenkins \
    --command "/bin/bash" \
    --interactive

You can then find the password in the /mnt/efs/secrets/initialAdminPassword file.

Use the value to login to Jenkins and complete the setup wizard.

CodeBuild

We will run Jenkins jobs in AWS CodeBuild.

AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy. With CodeBuild, you don’t need to provision, manage, and scale your own build servers. 

AWS CodeBuild – Fully Managed Build Service (amazon.com)

First, install the CodeBuild plugin.

Next, create a new pipeline job. You can use the sample project at https://gitlab.com/aw5academy/codebuild/sample-project.git as the source of the project.

The Jenkinsfile in this sample project starts a build of the sls-jenkins-small CodeBuild project. When we run the build we get the following output:

The logs from CodeBuild are pulled into Jenkins and displayed in the console output.

Persistent Storage

To verify that our Jenkins configuration will persist, let’s stop the ECS task.

And if we open Jenkins in our browser we see an outage as expected.

ECS will now launch a new task and will remount the EFS file system that stores our JENKINS_HOME. And if successful we will see the sample-project job that we created earlier.

Success!

Wrap-Up

This solution may be a good fit for very simple Jenkins implementations. You will find that the EFS performance is not as good as EBS or ephemeral storage. There is also a queueing and provisioning time for CodeBuild which you would not experience with your own fleet of EC2 instances. These factors should be considered but if you spend a lot of time maintaining your CI/CD infrastructure, this solution could be useful to you.

Blue/Green Deployments in AWS Fargate with Automated Testing and Rollbacks

Introduction

AWS CodeDeploy makes it easy to set up Blue/Green deployments for your containerised applications running in AWS Fargate. In this article, I will show how you can configure CodeDeploy and Fargate to allow automated testing of your deployments before they receive production traffic. Additionally, I will show how you can configure automatic rollbacks if your application generates errors after receiving production traffic.

Design

For this demonstration, our container application will be a simple Apache web server. An application load balancer will route production traffic to the containers. Our Docker code will be stored in an AWS CodeCommit repository. AWS CodeBuild will be used to build the Docker image and AWS CodeDeploy will of course be used to perform the deployments. We will use AWS CodePipeline to wrap the build and deploy stages into a deployment pipeline. The below diagram represents our design.

Blue/Green Deployment Pipeline Design

During a deployment, the new v2 code is launched in a second set of one or more containers. These new containers are registered with the “green” target group. The green target group is registered to a test listener on the application load balancer (port 8080 in this demonstration). We will then perform our testing against the test listener. When testing is complete, we signal for the deployment to continue, at which point the live listener (port 80) is registered to the green target group. The security group rules for our load balancer only allow ingress on port 8080 from within our VPC, thus preventing end-users from accessing the release prematurely.
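
During that test window, the green task set can be exercised with a simple request to the test listener from inside the VPC, for example:

# Run from a host inside the VPC, since port 8080 is not reachable externally
curl -s http://<alb-dns-name>:8080/ | grep "Hello from v1"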

As we will see later, CodeDeploy automatically handles the registration of containers to the blue/green target groups and also the registration of listeners to target groups.

Prerequisites

The resources deployed in this solution are described with Terraform — an infrastructure as code software tool. Install the latest version of the Terraform CLI.

Next, ensure you have the git-remote-codecommit utility installed. Most often this can be installed with:

sudo pip install git-remote-codecommit

Terraform

The Terraform code at aw5academy/terraform/ecs-blue-green-demo can be used to provision the resources we need for this demonstration. Deploy this code to your environment by running:

git clone https://gitlab.com/aw5academy/terraform/ecs-blue-green-demo.git
cd ecs-blue-green-demo/
terraform init
terraform apply
Output From Terraform Apply

Note the “alb_dns_name” output — we will need this value later.

Docker

We now need to push our Docker code to the CodeCommit repository created by Terraform. Run the following commands to set it up:

git clone codecommit::us-east-1://ecs-blue-green-demo codecommit
git clone https://gitlab.com/aw5academy/docker/ecs-blue-green-demo.git
cp -r ecs-blue-green-demo/* codecommit/
cd codecommit/
git add .
git commit -m "v1"
git push origin master

CodePipeline

If you open the AWS Console and navigate to the CodePipeline service you will see that the “ecs-blue-green-demo” pipeline has started due to our commit to the CodeCommit repository. Wait for the pipeline to complete our first deployment.

CodePipeline Successful Release

Now let’s check that our application is working by opening the “alb_dns_name” Terraform output from earlier in our browser.

Application Response

Great! We have a working application.

CodeDeploy Hooks

Hooks are a feature of CodeDeploy which allow you to perform actions at certain points in a deployment before the deployment continues to the next step. The hooks available for ECS/Fargate deployments are defined in the CodeDeploy AppSpec documentation. The hook we are most interested in is “AfterAllowTestTraffic”. We want to run tests during this phase of the deployment to validate our deployment before sending production traffic to our release. To do this we will add an AWS Lambda function reference to our appspec.yaml. This Lambda function (source code at aw5academy/terraform/ecs-blue-green-demo/lambda-src/deploy-hook/lambda_function.py) writes the hook details to an Amazon S3 bucket for a CodeBuild project to reference. This CodeBuild project (source code at aw5academy/docker/ecs-blue-green-demo/test.sh) runs in parallel to our CodeDeploy deployment in our pipeline and performs our tests during the “AfterAllowTestTraffic” stage.
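
Whatever runs the tests must ultimately report a result back to CodeDeploy so that the deployment can continue or fail. At its core, that is a single API call, sketched below (both IDs are supplied to the hook Lambda in its invocation event):

aws deploy put-lifecycle-event-hook-execution-status \
    --deployment-id <deployment-id> \
    --lifecycle-event-hook-execution-id <hook-execution-id> \
    --status Succeeded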

Automated Testing

Let’s test our deployment process by deliberately introducing an error. If you examine our test script at aw5academy/docker/ecs-blue-green-demo/test.sh you can see that we expect our application to return “Hello from v1”. So let’s break the deployment by changing the application to return “Hello from v2” instead. Run the following commands from the CodeCommit checkout to do this:

sed -i "s,Hello from v1,Hello from v2,g" start.sh
git commit -a -m "v2"
git push origin master

This action will automatically trigger our pipeline and if you navigate to the CodeDeploy service in the AWS Console you can follow the deployment when it starts. After some time you should see a failure on the “AfterAllowTestTraffic” stage as we expected.

CodeDeploy Failure

When we check the CodeBuild logs for our test project we can see the problem. As we noted, our tests still expect the application to respond with “Hello from v1”.

CodeBuild Error Logs

CodeDeploy and CloudWatch Alarms

There is one more way we can validate our deployments. Suppose we would like to monitor our deployments for some time after we route production traffic to them. And if we notice any issues we would like to rollback. By combining CodeDeploy and CloudWatch Alarms we can do this in an automated way.

AWS CodeDeploy allows you to retain the existing containers for a period of time after a deployment. In our demonstration, for simplicity, we have configured it to 5 minutes but it can be many hours if you wish. With this setting, and properly configured CloudWatch alarms, you can monitor your application post-deployment and if any of your alarms move into the alarm state during the retention time, CodeDeploy will automatically rollback to the previous version.

In our demonstration, we have configured our Docker container to send the httpd access logs to a CloudWatch Logs group. A log metric filter will send a data point whenever our httpd access logs contain the string ” 404 ” — i.e. whenever a request is made to the server which can’t be served. Next, we have a CloudWatch alarm that will move into the alarm state when 1 or more data points are received from the log metric filter.
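
A hedged sketch of the equivalent CLI setup (the Terraform code defines the real names; everything below is a placeholder):

# Emit a data point whenever an access log line contains " 404 "
aws logs put-metric-filter \
    --log-group-name /ecs/ecs-blue-green-demo \
    --filter-name http-404 \
    --filter-pattern '" 404 "' \
    --metric-transformations metricName=Http404Count,metricNamespace=EcsBlueGreenDemo,metricValue=1

# Alarm as soon as one or more data points are received
aws cloudwatch put-metric-alarm \
    --alarm-name ecs-blue-green-demo-404 \
    --namespace EcsBlueGreenDemo \
    --metric-name Http404Count \
    --statistic Sum \
    --period 60 \
    --evaluation-periods 1 \
    --threshold 1 \
    --comparison-operator GreaterThanOrEqualToThreshold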

In the next section we will see how CodeDeploy works with this CloudWatch alarm to automatically rollback when needed.

Automated Rollbacks

Let’s go back and fix the error we introduced. In our CodeCommit checkout, run the following commands:

sed -i "s,Hello from v1,Hello from v2,g" test.sh
git commit -a -m "v2 -- fix test"
git push origin master

Our tests have been corrected to match the new response from our application. If you open the AWS CodeDeploy service you should see the deployment happening again. This time you will see that it proceeds past the “AfterAllowTestTraffic” stage and that production traffic has been routed to the new set of containers.

CodeDeploy Wait

We can verify by opening the URL from our Terraform “alb_dns_name” output.

Application Response

Our application has been fully released and is serving production traffic. Now let’s deliberately cause an error by generating a 404. You can do this by entering any random path to the end of our URL. As expected we get a 404.

Application 404 Response

When we inspect our CloudWatch logs we can see the request in the access logs.

CloudWatch Logs 404 Error

Next, if we go back to CodeDeploy we should see a reporting of the alarm and a rollback being initiated.

CodeDeploy Alarm Rollback

Looks good! Now to confirm, we open our URL from the Terraform “alb_dns_name” output again to verify that the application has been rolled back to v1.

Application Response

Success!

Wrap-Up

I hope this article has demonstrated how powerful AWS CodeDeploy can be when configured with supporting services and features.

Ensure you clean-up the resources created here by running the following from the root of your checkout of the Terraform code:

terraform init
terraform destroy

Serverless File Transfer Workload – Part 3 – CSV-To-DynamoDB

Introduction

The last piece of our overall solution is the processing of a CSV file into a data store.

Design

We will use Amazon DynamoDB as our data store and AWS Lambda to perform the CSV processing. This design was influenced by the AWS blog post at Implementing bulk CSV ingestion to Amazon DynamoDB | AWS Database Blog. In fact, our Lambda code is an extension of that provided in the blog.

Our design looks like the following.

csv-to-dynamodb design

Here, we have an EventBridge rule watching for tagging operations against S3 objects in our bucket. When detected, our Lambda function is invoked, which loads each record of the CSV as an item in a DynamoDB table.
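
The tagging event itself can be delivered through S3's EventBridge integration; a rule pattern along these lines would match it (the bucket name is a placeholder, and this is an assumption about how the Terraform code wires it up):

aws events put-rule \
    --name csv-to-dynamodb \
    --event-pattern '{
      "source": ["aws.s3"],
      "detail-type": ["Object Tags Added"],
      "detail": {"bucket": {"name": ["<bucket-name>"]}}
    }'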

Terraform

The Terraform code at aw5academy/terraform/csv-to-dynamodb can be used to create the required components.

Once applied, we have an empty DynamoDB table.

DynamoDB table – empty

Recap

We now have everything in place to test our entire solution. To recap, this is what our infrastructure now looks like.

Complete Design

So when we upload a CSV file via SFTP we expect:

  • the CSV file will be stored in S3;
  • an ECS task will launch which will scan the file with ClamAV;
  • if the file is clean, the S3 object will be tagged with av-status=CLEAN;
  • the lambda function will be invoked and the CSV records loaded into DynamoDB;

Testing

Let’s try it. We will upload a CSV file via WinSCP. You may use the sample file at aw5academy/terraform/csv-to-dynamodb/sample-file.csv.

WinSCP CSV upload

Now within a few minutes, if all is successful, we will see the items appear in our DynamoDB table.

DynamoDB Table – Filled

Success!

Wrap-Up

The requirements presented to us were complex enough. Yet, by combining many services and features within AWS, we have constructed a solution using no servers. I hope you found these articles useful.

Serverless File Transfer Workload – Part 2 – AntiVirus

Introduction

We require uploaded files to be scanned for viruses before they can be processed further.

Design

Our design for this solution can be represented in the following diagram.

AntiVirus Diagram

There is a lot in this so let’s describe all that is happening here.

  • We use ClamAV to perform the anti-virus scans.
  • ClamAV definitions are stored in an Amazon Elastic File System (EFS).
  • An Amazon EventBridge scheduled rule starts an Amazon Elastic Container Service (ECS) task periodically (every 3 hours), which runs freshclam to update the virus database on the EFS file system.
  • A bucket notification is created for the S3 bucket that stores files to be scanned.
  • When new objects are created in this bucket, the event is sent to an Amazon Simple Queue Service (SQS) queue.
  • An Amazon EventBridge scheduled rule invokes an AWS Lambda function every minute.
  • The lambda function uses an approach documented in this guide to determine the ScanBacklogPerTask by reading attributes of the SQS queue and the ECS service’s task count.
  • The lambda publishes the ScanBacklogPerTask metric to Amazon CloudWatch (see the sketch after this list).
  • An Amazon CloudWatch alarm, which monitors the ScanBacklogPerTask metric, notifies the Application Auto Scaling service.
  • Application Auto Scaling updates the running task count of an ECS service.
  • The tasks in the ECS service mount the EFS file system so that the latest ClamAV virus definitions are available.
  • The tasks then receive messages from the SQS queue.
  • Each message contains details of the S3 object to be scanned. The task downloads the object and performs a clamdscan on it.
  • The result of the virus scan (either “CLEAN” or “INFECTED”) is set as the “av-status” tag on the S3 object.
  • Note also that the ECS scan service runs in a protected VPC subnet. That is, a subnet which has no internet access.
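
A minimal sketch of the backlog calculation referenced above, expressed with the AWS CLI rather than the function's actual Python code (the queue URL, cluster and service names are placeholders):

# Messages waiting to be scanned
BACKLOG=$(aws sqs get-queue-attributes \
    --queue-url "$QUEUE_URL" \
    --attribute-names ApproximateNumberOfMessages \
    --query 'Attributes.ApproximateNumberOfMessages' --output text)

# Scan tasks currently running (avoid dividing by zero)
TASKS=$(aws ecs describe-services --cluster clamav --services clamav-scan \
    --query 'services[0].runningCount' --output text)
[ "$TASKS" -eq 0 ] && TASKS=1

# Publish the custom metric watched by the scaling alarm
aws cloudwatch put-metric-data \
    --namespace ClamAV \
    --metric-name ScanBacklogPerTask \
    --value $(( BACKLOG / TASKS ))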

Docker

The Docker code for the ECS tasks can be found at aw5academy/docker/clamav. The Docker containers built from this code poll SQS for messages and perform the ClamAV virus scan. We will come back to this later.
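
At its core, the loop inside each container boils down to something like the following simplified sketch (not the repository code; the bucket and queue variables are placeholders, and the real code also deletes the message and handles errors):

# Receive one message, scan the referenced object, tag it with the result
MSG=$(aws sqs receive-message --queue-url "$QUEUE_URL" --max-number-of-messages 1)
KEY=$(echo "$MSG" | jq -r '.Messages[0].Body' | jq -r '.Records[0].s3.object.key')

aws s3 cp "s3://$BUCKET/$KEY" /tmp/scan-target
if clamdscan /tmp/scan-target; then STATUS=CLEAN; else STATUS=INFECTED; fi

aws s3api put-object-tagging --bucket "$BUCKET" --key "$KEY" \
    --tagging "TagSet=[{Key=av-status,Value=$STATUS}]"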

Terraform

The Terraform code that will provision our infrastructure can be found at aw5academy/terraform/clamav.

When you apply the code you will be prompted for a bucket name. Enter the name of the bucket that was created in part 1 of this series.

Terraform apply

Configuration

With Terraform applied, we now need to push the Docker code to the CodeCommit repository it created. The following steps will do this:

git clone https://gitlab.com/aw5academy/docker/clamav.git clamav-aw5academy
pip3 install git-remote-codecommit
export PATH=$PATH:~/.local/bin
git clone codecommit::us-east-1://clamav
cp clamav-aw5academy/* clamav/
cd clamav
git add .
git commit -m "Initial commit"
git push origin master

You should then see the code in the CodeCommit console.

CodeCommit

Next, we need to start an AWS CodeBuild project which will clone the clamav repository, perform a Docker build and push the image to an Amazon Elastic Container Registry (ECR) repository.

Docker build in AWS CodeCommit
ECR repository

One last step: we need to trigger a run of the freshclam task so that the ClamAV database files are present on our EFS file system. The easiest way to do this is to update the schedule for the task from the ECS console and set it to run every minute.

ECS Scheduled Task

We can verify that the database is updated from the task logs.

Freshclam logs

Testing

Now let’s test our solution by uploading a file directly to the S3 bucket. When we do, we can check the metrics for our SQS queue for activity as well as the logs for the ECS scan tasks.

SQS metrics
ECS scan logs

Success! We can see from the metrics that a message was sent to the queue and deleted shortly after. And the ECS logs show the file being scanned and the S3 object being tagged.

Virus Check

As one final test, let’s see if a virus will be detected and appropriate action taken. This solution has been designed to block access to all objects uploaded to S3 unless they have been tagged with “av-status=CLEAN”. So we expect to have no access to a virus-infected file.

Rather than using a real virus we will use the EICAR test file. Let’s upload a file with this content to see what happens.
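
For example, this check can be run from the command line as follows (the bucket name is a placeholder; eicar.org hosts the standard, harmless test file):

curl -sO https://secure.eicar.org/eicar.com
aws s3 cp eicar.com s3://<bucket-name>/eicar.com

# Once the scan task has processed it, the object should carry the INFECTED tag
aws s3api get-object-tagging --bucket <bucket-name> --key eicar.com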

S3 Infected

Great! The object has been properly tagged as infected. But are we blocked from accessing the file? Let’s try downloading it.

S3 download error

We are denied as expected.

Now let’s check out part 3 where we implement the loading of our CSV data.

Serverless File Transfer Workload – Part 1 – SFTP

Introduction

Suppose a file transfer workload exists between a business and their customers. A comma-separated values (CSV) file is transferred to the business and the records are loaded into a database. The business has regulatory requirements mandating that all external assets are virus scanned before being processed. Additionally, an intrusion prevention system (IPS) must operate on all public endpoints.

In the following 3 articles I will demonstrate how we can build a serverless system that meets these requirements.

Design

We will use the Secure File Transfer Protocol (SFTP) to enable the transfer of files between the customer and the business. Using the AWS Transfer Family service we can create an SFTP endpoint with an Amazon S3 bucket to store the files. An AWS Network Firewall will sit in front of our SFTP endpoint.

AWS Network Firewall is a managed service that makes it easy to deploy essential network protections for all of your Amazon Virtual Private Clouds (VPCs).

AWS Network Firewall’s intrusion prevention system (IPS) provides active traffic flow inspection so you can identify and block vulnerability exploits using signature-based detection.

For our Network Firewall deployment, we will follow the multi-zone internet gateway architecture as described at Multi zone architecture with an internet gateway – Network Firewall (amazon.com)

A simplified view of our infrastructure is shown below.

Design diagram for SFTP

Terraform

The Terraform code at aw5academy/terraform/sftp can be used to apply the infrastructure components.

Terraform apply console output

Make a note of both the bucket-name and sftp-endpoint outputs… we will use both of these values later.

With Terraform applied we can inspect the created components in the AWS console. Let’s first check our SFTP endpoint which can be found in the AWS Transfer Family service.

SFTP endpoint

We can also see the AWS Network Firewall which is in the VPC service.

AWS Network Firewall

Testing

Let’s test out our solution. First, the root of the Terraform directory contains an example.pem file, which is the private key we will use to authenticate with the SFTP endpoint. Copy this to your Windows host machine so we can use it with WinSCP.

In WinSCP, create a new site and provide the sftp endpoint. For username we will use “example”.

WinSCP new site

Select “Advanced” and provide the path to the example.pem you copied over. It will require you to convert it to a ppk file.

WinSCP SSH

Now login and copy a file across.

WinSCP file copy

Lastly, verify the file exists in S3 from the AWS console.

S3 Console

Success!

Now let’s continue with part 2 where we will implement the anti-virus scanning.

Automated UI Testing With AWS Machine Learning

This article will be a little bit different to previous posts. Having only just recently started to check out AWS Machine Learning I am still in the early stages of my study of these services. So for this article, I wanted to post what I have learned so far in the form of a possible usage for machine learning — automated UI testing.

Web Application

Let’s suppose we have a web application that provides a listing of search results — maybe a search engine or some kind of eCommerce website. We want to ensure the listings are displaying correctly so we have humans perform UI testing. Can we train machines to do this work for us?

In https://gitlab.com/aw5academy/docker/mock-search-webapp I have created a mock web application which displays random text in a search listings view. Running the buildandrun.sh script will run this in Docker, which we can then view at http://localhost:8080.

Additionally, we can generate a random error with http://localhost:8080?bad=true.

Training Data

The most difficult part of building a machine learning model appears to be collecting the right training data. Our training data will consist of screenshots of the web page where the “good” images will be when the application is working as expected and the “bad” images are when there is some error in the display of the application.

We need a good variety of both the “good” and the “bad”. In https://gitlab.com/aw5academy/sagemaker/mock-search-webapp-train we can execute the run.sh script which will generate 100 random good images and 100 random bad images. These images are generated using PhantomJS – a headless browser.

We can then augment our training data by applying random orientation changes, contrast changes, etc. This increases the number of images in our training set.
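
For example, simple augmentations like these can be applied with ImageMagick (a sketch only; the transformations used by the training repository may differ):

# Rotate slightly and shift the contrast to create an extra variant of one screenshot
convert good-001.png -rotate 2 -brightness-contrast 0x10 good-001-variant.png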

Now we can open the Amazon SageMaker service and create our training job. We upload the training data to an Amazon S3 bucket so that SageMaker can download it.

Once created, the training job will start. We can view metrics from the job as it is working.

You can see the training accuracy improving over time.

Inference

Now that we have our model trained, we can test how good it is by deploying it to a SageMaker Model Endpoint. Once deployed, we can test it with invoke-endpoint. We provide a screenshot image to this API call and the result returned to us will be two values: the probability of the image being “good” and the probability of it being “bad”.
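
A single call might look like this (the endpoint name and image path are placeholders):

# The response body holds the two class probabilities
aws sagemaker-runtime invoke-endpoint \
    --endpoint-name mock-search-webapp \
    --content-type application/x-image \
    --body fileb://screenshot.png \
    output.json
cat output.json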

In https://gitlab.com/aw5academy/sagemaker/mock-search-webapp-train-infer we have a run.sh script which calls the invoke-endpoint API and provides it with screenshots which the model has never seen before. You can observe these with http://localhost:8080?test=0 to http://localhost:8080?test=9. Even values for the “test” query parameter are “good” images while odd values are “bad” images.

When we execute the script we see:

A partial success! The model did well for some tests and not so well for others.

Conclusions

Some thoughts and conclusions I have made after completing this experiment:

  1. The algorithm used in this model was Image Classification. I am not sure this is the best choice. Most of the “good” images are very similar. Probably too similar. We might need another algorithm which, rather than classify the image, detects abnormalities.
  2. As mentioned earlier, gathering the training data is the difficult part. It is possible that this mock application is not capable of producing enough variation. A real world application may produce better results. Additionally, actual errors observed in the past could be used to train the model.
  3. Even with the less than great results from this experiment, this solution could be used in a CI/CD pipeline. The sample errors I generated were sometimes very subtle, such as text being off by a few pixels. The model could be retrained to detect only very obvious errors. Then, an application’s build pipeline could do very quick sanity tests to detect obvious UI errors.

AWS Fargate Application Configuration With S3 Environment Files

A recent AWS Fargate feature update has added support for S3 hosted environment files. In this article I will show how you could use this to manage your application’s configuration. I will also demonstrate how changes to the configuration can be released in a blue-green deployment.

Design

The solution we will build will follow the design shown in the below diagram.

Our source code (including our configuration files) will be stored in AWS CodeCommit. We will then use AWS CodeBuild and AWS CodeDeploy to package and deploy our application to Amazon Elastic Container Service (ECS). AWS CodePipeline will be used to knit these services together into a release pipeline.

S3 will of course be used to store the application configuration so that it is available for our applications running in ECS to consume.

Terraform

The Terraform code at https://gitlab.com/aw5academy/terraform/ecs-env-file-demo will deploy our demo stack. The task definition we create uses the environmentFiles directive noted in the AWS documentation.
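
For reference, the relevant fragment of the container definition looks roughly like this (the bucket and object names are placeholders; the Terraform code sets the real values):

"environmentFiles": [
  {
    "value": "arn:aws:s3:::<config-bucket>/cfg.env",
    "type": "s3"
  }
]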

We can create the stack with:

mkdir -p terraform
cd terraform
git clone https://gitlab.com/aw5academy/terraform/ecs-env-file-demo.git
cd ecs-env-file-demo
terraform init
terraform apply

Docker

We now need to build our Docker image which will serve as our application. In https://gitlab.com/aw5academy/docker/ecs-env-file-demo we have a simple Apache server which reads the value of the CSS_BACKGROUND environment variable and sets it as the background colour of our index.html document.

You can build this Docker image and push it to the ECR repository created by Terraform with:

mkdir -p docker
cd docker
git clone https://gitlab.com/aw5academy/docker/ecs-env-file-demo.git
cd ecs-env-file-demo
bash -x build.sh

CodeCommit

Now we need to deploy our application configuration to our CodeCommit repository. https://gitlab.com/aw5academy/ecs/ecs-env-file-demo contains everything we need and we can deploy it to our CodeCommit repo with the following:

mkdir -p ecs
cd ecs
git clone https://gitlab.com/aw5academy/ecs/ecs-env-file-demo.git
cd ecs-env-file-demo
rm -rf .git
git clone codecommit::us-east-1://ecs-env-file-demo
mv *.* ecs-env-file-demo/
cd ecs-env-file-demo
git add .
git commit -m "Initial commit"
git push origin master

If you have any issues with this step, navigate to the CodeCommit service and open the ecs-env-file-demo repository for clone instructions and prerequisites.

CodePipeline

As soon as we push our code to CodeCommit, our release pipeline will trigger. Navigate to the CodePipeline service and open the ecs-env-file-demo pipeline.

Wait until this release completes.

Application Configuration Changes

We can now test our process for making configuration changes. Navigate to the CodeCommit service and open our ecs-env-file-demo repository. Then open the cfg.env file. You can see that our configuration file has a value of “blue” for our CSS_BACKGROUND variable. This is the variable that our Apache server uses for the webpage’s background colour.

Let’s change this value to “green”, enter the appropriate Author details and click “Commit changes”.

CodeDeploy

We can now use the CodeDeploy service to follow our deployment. If you first navigate to the CodePipeline service and open our ecs-env-file-demo pipeline, when the CodeDeploy stage begins, click on the Details link to bring us to the CodeDeploy service.

Our deployment has started. Note, our deployments will use a Canary release with 20% of the traffic receiving the new changes for 5 minutes. After that, 100% of the traffic will receive the new changes. In your checkout of the Terraform code, there is a deployment-tester.html file. This is a page of 9 HTML iframes with the source being the DNS name of the load balancer in our application stack. The page auto refreshes every 5 seconds.

If you open this deployment-tester.html file (you may need to open developer tools and disable cache for it to be effective) you will be able to verify our release is working as expected. It should initially show just the original blue.

Now you can wait for CodeDeploy to enter the next stage.

We now have 20% of our traffic routed to the new application configuration — the green. Let’s check this in our deployment-tester.html file:

Success!

And to complete the process, we can wait for CodeDeploy to finish and verify the application is fully green.

Looks good!

Wrap-Up

Cleanup the created resources with:

cd terraform
cd ecs-env-file-demo
terraform destroy

I hope this very simple example has effectively demonstrated the new capability in AWS Fargate.

AWS CodeBuild Local

In this article I will show how you can run your AWS CodeBuild projects locally. AWS CodeBuild is a “fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy”. By running your CodeBuild projects locally you can test code changes before committing, allowing you to rapidly develop and debug your projects.

Workstation

I recommend using Windows Subsystem for Linux 2 with Ubuntu 20.04 for your local workstation configuration. Additionally, the Chef code I have created at aw5academy/chef/workstation will bootstrap an environment for you with everything you need to follow along in this article. Note: remember to run the /home/ec2-user/codebuild_setup.sh script, which builds the Amazon Linux CodeBuild Docker image (this process can take over 60 minutes to complete).

Ubuntu App for WSL

If not using this workstation, the resources you will need are:

  1. awscb.sh # Copy this file into your PATH (without the .sh extension)
  2. codebuild_setup.sh # Download and run this script (Note: this can take over 60 minutes to complete)
  3. codebuild_build.sh # Copy this file into your PATH
  4. git-remote-codecommit # Install this Python module
  5. Docker Desktop
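
Under the hood, awscb presumably wraps AWS's codebuild_build.sh; a direct invocation looks something like this (the image name is a placeholder for whatever codebuild_setup.sh builds locally):

# Run the project's buildspec locally inside the CodeBuild image;
# add -c to make your local AWS credentials available inside the build
codebuild_build.sh -i <local-codebuild-image> -a ./artifacts -b buildspec.yaml -c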

CodeBuild Project

Let’s first create a CodeBuild project in AWS. In this example our project will be a Docker based Apache application with the built Docker image pushed to Amazon Elastic Container Registry. We will use the Terraform code at aw5academy/terraform/docker-codebuild to provision the resources we need.

git clone https://gitlab.com/aw5academy/terraform/docker-codebuild.git
cd docker-codebuild
terraform init
terraform apply

Part of this Terraform stack is an AWS CodeCommit repository that we will use to store our Docker code. We can copy the code I have created at aw5academy/docker/httpd into this CodeCommit repository.

git clone codecommit::us-east-1://aw5academy-httpd
cd aw5academy-httpd
git clone https://gitlab.com/aw5academy/docker/httpd.git
mv httpd/{Dockerfile,buildspec.yaml} .
rm -rf httpd
git add .
git commit -m "Initial commit"
git push origin master

Now let’s test that the CodeBuild project works from AWS. Navigate to the CodeBuild service and find the docker-aw5academy-httpd project. Click on “Start Build” and select the “master” branch.

CodeBuild Start build page – Configuration
CodeBuild Start build page – Source

Now if you start the build and view the build logs you will see the Docker build happening and the Docker image being pushed to ECR.

CodeBuild logs

CodeBuild Local

We can now try running CodeBuild locally. From your checkout of the “aw5academy-httpd” repository, simply run “awscb”.

CodeBuild Local
CodeBuild Local

Success! You now have a Docker image locally, built in the same way as is done by the AWS CodeBuild service.

You can also add script arguments to awscb to pass in environment variables that will be available within your builds. For example:

awscb -e "env1=foo,env2=bar"

We can also use the “-p” arg to push the Docker images we build locally into ECR. You can combine this with the “-t” arg to tag your images differently. E.g.

awscb -t develop -p

If we run the above command and view our repository in ECR we can see the “latest” image created by AWS CodeBuild and the “develop” image we created locally and pushed.

ECR image list

Wrap-Up

In this article I have demonstrated CodeBuild local for Docker. But you can use this for other build types, e.g. Maven. For more detailed information, refer to the AWS blog post at https://aws.amazon.com/blogs/devops/announcing-local-build-support-for-aws-codebuild/