Serverless File Transfer Workload – Part 3 – CSV-To-DynamoDB

Introduction

The last piece of our overall solution is the processing of a CSV file into a data store.

Design

We will use Amazon DynamoDB as our data store and AWS Lambda to perform the CSV processing. This design was influenced by the AWS Database Blog post Implementing bulk CSV ingestion to Amazon DynamoDB. In fact, our Lambda code is an extension of the code provided in that post.

Our design looks like the following.

csv-to-dynamodb design

Here, we have an EventBridge rule watching for tagging operations against S3 objects in our bucket. When a tag event is detected, our Lambda function is invoked, which loads each record of the CSV as an item in a DynamoDB table.
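
For illustration, a minimal sketch of such a handler in Python is shown below. This is not the exact code from the referenced blog post; the TABLE_NAME environment variable and the shape of the EventBridge event detail are assumptions.

import csv
import io
import os
import boto3

s3 = boto3.client("s3")
# TABLE_NAME is assumed to be injected as an environment variable by Terraform.
table = boto3.resource("dynamodb").Table(os.environ["TABLE_NAME"])

def handler(event, context):
    # S3 events delivered through EventBridge carry the bucket and key in "detail".
    # A real handler would also verify that the av-status tag value is CLEAN.
    bucket = event["detail"]["bucket"]["name"]
    key = event["detail"]["object"]["key"]
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    # Assumes the CSV header row includes the table's partition key.
    with table.batch_writer() as batch:  # batches and retries BatchWriteItem calls
        for row in csv.DictReader(io.StringIO(body)):
            batch.put_item(Item=row)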

Terraform

The Terraform code at aw5academy/terraform/csv-to-dynamodb can be used to create the required components.

Once applied, we have an empty DynamoDB table.

DynamoDB table – empty

Recap

We now have everything in place to test our entire solution. To recap, this is what our infrastructure now looks like.

Complete Design

So when we upload a CSV file via SFTP we expect:

  • the CSV file will be stored in S3;
  • an ECS task will launch which will scan the file with ClamAV;
  • if the file is clean, the S3 object will be tagged with av-status=CLEAN;
  • the Lambda function will be invoked and the CSV records loaded into DynamoDB.

Testing

Let’s try it. We will upload a CSV file via WinSCP. You may use the sample file at aw5academy/terraform/csv-to-dynamodb/sample-file.csv.

WinSCP CSV upload

Within a few minutes, if everything has succeeded, we will see the items appear in our DynamoDB table.

DynamoDB Table – Filled

Success!

Wrap-Up

The requirements presented to us were complex enough. Yet, by combining many AWS services and features, we have constructed a solution that uses no servers at all. I hope you found these articles useful.

Serverless File Transfer Workload – Part 2 – AntiVirus

Introduction

We require uploaded files to be scanned for viruses before they can be processed further.

Design

Our design for this solution can be represented in the following diagram.

AntiVirus Diagram

There is a lot going on here, so let’s walk through what is happening.

  • We use ClamAV to perform the anti-virus scans.
  • ClamAV definitions are stored in an Amazon Elastic File System (EFS).
  • An Amazon EventBridge scheduled rule periodically (every 3 hours) starts an Amazon Elastic Container Service (ECS) task, which runs freshclam to update the virus database on the EFS file system.
  • A bucket notification is created for the S3 bucket that stores files to be scanned.
  • When new objects are created in this bucket, the event is sent to an Amazon Simple Queue Service (SQS) queue.
  • An Amazon EventBridge scheduled rule invokes an AWS Lambda function every minute.
  • The Lambda function uses an approach documented in this guide to calculate a ScanBacklogPerTask value from the SQS queue’s attributes and the ECS service’s running task count (a sketch of this calculation follows this list).
  • The Lambda function publishes the ScanBacklogPerTask metric to Amazon CloudWatch.
  • An Amazon CloudWatch alarm monitoring the ScanBacklogPerTask metric notifies the Application Auto Scaling service.
  • Application Auto Scaling updates the running task count of an ECS service.
  • The tasks in the ECS service mount the EFS file system so that the latest ClamAV virus definitions are available.
  • The tasks then receive messages from the SQS queue.
  • Each message contains details of the S3 object to be scanned. The task downloads the object and performs a clamdscan on it.
  • The result of the virus scan (either “CLEAN” or “INFECTED”) is set as the “av-status” tag on the S3 object.
  • Note also that the ECS scan service runs in a protected VPC subnet, that is, a subnet with no internet access.
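
To make the backlog calculation concrete, here is a minimal Python sketch of such a metric-publishing function. The environment variable names and the CloudWatch namespace are assumptions rather than what the Terraform stack necessarily uses.

import os
import boto3

sqs = boto3.client("sqs")
ecs = boto3.client("ecs")
cloudwatch = boto3.client("cloudwatch")

def handler(event, context):
    # Approximate queue depth = messages waiting to be scanned.
    attributes = sqs.get_queue_attributes(
        QueueUrl=os.environ["QUEUE_URL"],
        AttributeNames=["ApproximateNumberOfMessages"],
    )["Attributes"]
    backlog = int(attributes["ApproximateNumberOfMessages"])
    service = ecs.describe_services(
        cluster=os.environ["CLUSTER"],
        services=[os.environ["SERVICE"]],
    )["services"][0]
    running = max(service["runningCount"], 1)  # avoid dividing by zero at scale-in
    cloudwatch.put_metric_data(
        Namespace="ClamAV",  # namespace is an assumption
        MetricData=[{
            "MetricName": "ScanBacklogPerTask",
            "Value": backlog / running,
            "Unit": "Count",
        }],
    )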

Docker

The Docker code for the ECS tasks can be found at aw5academy/docker/clamav. The Docker containers built from this code poll SQS for messages and perform the ClamAV virus scan. We will come back to this later.
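
As a rough illustration of what that polling loop does, here is a simplified Python sketch. The actual container code lives in the repository and differs in detail; the queue URL below is a placeholder and error handling is omitted.

import json
import subprocess
import boto3

sqs = boto3.client("sqs")
s3 = boto3.client("s3")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/clamav-scan"  # placeholder

def scan_once():
    messages = sqs.receive_message(
        QueueUrl=QUEUE_URL, MaxNumberOfMessages=1, WaitTimeSeconds=20
    ).get("Messages", [])
    for message in messages:
        record = json.loads(message["Body"])["Records"][0]
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        s3.download_file(bucket, key, "/tmp/scan-target")
        # clamdscan exits 0 for a clean file and 1 when a virus is found.
        result = subprocess.run(["clamdscan", "/tmp/scan-target"])
        status = "CLEAN" if result.returncode == 0 else "INFECTED"
        s3.put_object_tagging(
            Bucket=bucket,
            Key=key,
            Tagging={"TagSet": [{"Key": "av-status", "Value": status}]},
        )
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])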

Terraform

The Terraform code that will provision our infrastructure can be found at aw5academy/terraform/clamav.

When you apply the code you will be prompted for a bucket name. Enter the name of the bucket that was created in part 1 of this series.

Terraform apply

Configuration

With Terraform applied, we now need to push the Docker code to the CodeCommit repository created by Terraform. The following steps will do this:

# Clone the source from GitLab and the empty CodeCommit repo, then push the code.
git clone https://gitlab.com/aw5academy/docker/clamav.git clamav-aw5academy
pip3 install git-remote-codecommit  # enables the codecommit:: remote helper
export PATH=$PATH:~/.local/bin
git clone codecommit::us-east-1://clamav
cp clamav-aw5academy/* clamav/
cd clamav
git add .
git commit -m "Initial commit"
git push origin master

You should then see the code in the CodeCommit console.

CodeCommit

Next, we need to start an AWS CodeBuild project which will clone the clamav repository, perform a Docker build and push the image to an Amazon Elastic Container Registry (ECR) repository.

Docker build in AWS CodeBuild
ECR repository

One last step: we need to trigger a run of the freshclam task so that the ClamAV database files are present on our EFS file system. The easiest way to do this is to update the task’s schedule from the ECS console and set it to run every minute.

ECS Scheduled Task

We can verify that the database is updated from the task logs.

Freshclam logs

Testing

Now let’s test our solution by uploading a file directly to the S3 bucket. When we do, we can check the metrics for our SQS queue for activity as well as the logs for the ECS scan tasks.

SQS metrics
ECS scan logs

Success! We can see from the metrics that a message was sent to the queue and deleted shortly after. And the ECS logs show the file being scanned and the S3 object being tagged.

Virus Check

As one final test, let’s see whether a virus will be detected and the appropriate action taken. This solution has been designed to block access to all objects uploaded to S3 unless they have been tagged with av-status=CLEAN, so we expect to have no access to a virus-infected file.
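
One way to express such a restriction is an S3 bucket policy that denies s3:GetObject unless the object carries the av-status=CLEAN tag. The snippet below is only an illustration applied with boto3; the bucket name is a placeholder and the provided Terraform may implement this differently.

import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyUnscannedObjects",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-sftp-bucket/*",  # placeholder bucket
        "Condition": {
            "StringNotEquals": {"s3:ExistingObjectTag/av-status": "CLEAN"}
        },
    }],
}
# A real policy would also exempt the scan task's IAM role (e.g. with an
# aws:PrincipalArn condition) so that it can still download objects to scan them.
boto3.client("s3").put_bucket_policy(
    Bucket="example-sftp-bucket", Policy=json.dumps(policy)
)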

Rather than using a real virus, we will use the EICAR test file. Let’s upload a file with this content and see what happens.
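
If you would rather script the upload than use the console or WinSCP, something like the following works (the bucket name and key are placeholders):

import boto3

# The standard 68-byte EICAR test string.
EICAR = r"X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*"
boto3.client("s3").put_object(
    Bucket="example-sftp-bucket",  # placeholder bucket name
    Key="eicar-test.csv",
    Body=EICAR.encode("ascii"),
)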

S3 Infected

Great! The object has been properly tagged as infected. But are we blocked from accessing the file? Let’s try downloading it.

S3 download error

We are denied as expected.

Now let’s check out part 3 where we implement the loading of our CSV data.

Serverless File Transfer Workload – Part 1 – SFTP

Introduction

Suppose a file transfer workload exists between a business and their customers. A comma-separated values (CSV) file is transferred to the business and the records are loaded into a database. The business has regulatory requirements mandating that all external assets are virus scanned before being processed. Additionally, an intrusion prevention system (IPS) must operate on all public endpoints.

In the following three articles I will demonstrate how we can build a serverless system that meets these requirements.

Design

We will use the Secure File Transfer Protocol (SFTP) to enable the transfer of files between the customer and the business. Using the AWS Transfer Family service we can create an SFTP endpoint with an Amazon S3 bucket to store the files. An AWS Network Firewall will sit in front of our SFTP endpoint.

AWS Network Firewall is a managed service that makes it easy to deploy essential network protections for all of your Amazon Virtual Private Clouds (VPCs).

AWS Network Firewall’s intrusion prevention system (IPS) provides active traffic flow inspection so you can identify and block vulnerability exploits using signature-based detection.

For our Network Firewall deployment, we will follow the multi-zone internet gateway architecture described in the AWS Network Firewall documentation: Multi zone architecture with an internet gateway.

A simplified view of our infrastructure is shown below.

Design diagram for SFTP

Terraform

The Terraform code at aw5academy/terraform/sftp can be used to apply the infrastructure components.

Terraform apply console output

Make a note of both the bucket-name and sftp-endpoint outputs… we will use both of these values later.

With Terraform applied we can inspect the created components in the AWS console. Let’s first check our SFTP endpoint which can be found in the AWS Transfer Family service.

SFTP endpoint

We can also see the AWS Network Firewall which is in the VPC service.

AWS Network Firewall

Testing

Let’s test our solution. First, in the root of the Terraform directory, there is an example.pem file, which is the private key we will use to authenticate with the SFTP endpoint. Copy this to your Windows host machine so we can use it with WinSCP.

In WinSCP, create a new site and provide the sftp-endpoint value from the Terraform outputs. For the user name we will use “example”.

WinSCP new site

Select “Advanced” and provide the path to the example.pem file you copied over. WinSCP will prompt you to convert it to a .ppk file.

WinSCP SSH

Now login and copy a file across.

WinSCP file copy

Lastly, verify the file exists in S3 from the AWS console.

S3 Console

Success!

Now let’s continue with part 2 where we will implement the anti-virus scanning.

Automated UI Testing With AWS Machine Learning

This article will be a little different from previous posts. Having only recently started exploring the AWS machine learning services, I am still in the early stages of studying them. So for this article, I wanted to share what I have learned so far in the form of one possible use for machine learning: automated UI testing.

Web Application

Let’s suppose we have a web application that provides a listing of search results — maybe a search engine or some kind of eCommerce website. We want to ensure the listings are displaying correctly so we have humans perform UI testing. Can we train machines to do this work for us?

In https://gitlab.com/aw5academy/docker/mock-search-webapp I have created a mock web application which displays random text in a search listings view. Running the buildandrun.sh script will run it in Docker, and we can view it at http://localhost:8080.

Additionally, we can generate a random error at http://localhost:8080?bad=true.

Training Data

The most difficult part of building a machine learning model appears to be collecting the right training data. Our training data will consist of screenshots of the web page: the “good” images are captured while the application is working as expected, and the “bad” images are captured when there is some error in how the application is displayed.

We need a good variety of both the “good” and the “bad”. In https://gitlab.com/aw5academy/sagemaker/mock-search-webapp-train we can execute the run.sh script, which will generate 100 random good images and 100 random bad images. These images are generated using PhantomJS, a headless browser.

We can then expand our training data by applying random orientation changes, contrast adjustments and similar transformations. This increases the number of images in our training set.
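
As a rough sketch of this kind of augmentation, assuming Pillow is available (the directory layout and parameter ranges are illustrative only):

import random
from pathlib import Path
from PIL import Image, ImageEnhance

def augment(path, out_dir, copies=5):
    original = Image.open(path)
    Path(out_dir).mkdir(exist_ok=True)
    for i in range(copies):
        image = original.rotate(random.uniform(-3, 3))  # slight orientation change
        image = ImageEnhance.Contrast(image).enhance(random.uniform(0.8, 1.2))
        image.save(Path(out_dir) / f"{Path(path).stem}-aug-{i}.png")

for screenshot in Path("train/good").glob("*.png"):
    augment(screenshot, "train/good-augmented")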

Now we can open the Amazon SageMaker service and create our training job. We upload the training data to an Amazon S3 bucket so that SageMaker can download it.

Once created, the training job will start. We can view metrics from the job as it is working.

You can see the training accuracy improving over time.

Inference

Now that we have our model trained, we can test how good it is by deploying it to a SageMaker Model Endpoint. Once deployed, we can test it with invoke-endpoint. We provide a screenshot image to this API call and the result returned to us will be two values: the probability of the image being “good” and the probability of it being “bad”.
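
For reference, a minimal boto3 sketch of such a call is shown below. The endpoint name is a placeholder, and the payload format assumes the built-in image classification algorithm, which accepts raw image bytes and returns per-class probabilities as JSON.

import json
import boto3

runtime = boto3.client("sagemaker-runtime")
with open("screenshot.png", "rb") as f:
    response = runtime.invoke_endpoint(
        EndpointName="mock-search-webapp-classifier",  # placeholder endpoint name
        ContentType="application/x-image",
        Body=f.read(),
    )
probabilities = json.loads(response["Body"].read())
# The class order depends on how the labels were indexed during training.
print("good=%.3f bad=%.3f" % tuple(probabilities))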

In https://gitlab.com/aw5academy/sagemaker/mock-search-webapp-train-infer we have a run.sh script which calls the invoke-endpoint API and provides it with screenshots which the model has never seen before. You can observe these with http://localhost:8080?test=0 to http://localhost:8080?test=9. Even values for the “test” query parameter are “good” images while odd values are “bad” images.

When we execute the script we see:

A partial success! The model did well for some tests and not so well for others.

Conclusions

Some thoughts and conclusions I have made after completing this experiment:

  1. The algorithm used in this model was Image Classification. I am not sure this is the best choice. Most of the “good” images are very similar, probably too similar. We might need another algorithm which, rather than classifying the image, detects abnormalities.
  2. As mentioned earlier, gathering the training data is the difficult part. It is possible that this mock application is not capable of producing enough variation. A real world application may produce better results. Additionally, actual errors observed in the past could be used to train the model.
  3. Even with the less than great results from this experiment, this solution could be used in a CI/CD pipeline. The sample errors I generated were sometimes very subtle, such as text being off by a few pixels. The model could be retrained to detect only very obvious errors. Then, an application’s build pipeline could do very quick sanity tests to detect obvious UI errors.

AWS Fargate Application Configuration With S3 Environment Files

A recent AWS Fargate feature update has added support for S3 hosted environment files. In this article I will show how you could use this to manage your application’s configuration. I will also demonstrate how changes to the configuration can be released in a blue-green deployment.

Design

The solution we will build will follow the design shown in the below diagram.

Our source code (including our configuration files) will be stored in AWS CodeCommit. We will then use AWS CodeBuild and AWS CodeDeploy to package and deploy our application to Amazon Elastic Container Service (ECS). AWS CodePipeline will be used to knit these services together into a release pipeline.

S3 will of course be used to store the application configuration so that it is available for our applications running in ECS to consume.

Terraform

The Terraform code at https://gitlab.com/aw5academy/terraform/ecs-env-file-demo will deploy our demo stack. The task definition we create uses the environmentFiles directive noted in the AWS documentation.
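
The demo’s task definition is written in Terraform, but for reference, here is a boto3 sketch of the same idea. The environmentFiles entry is the part that matters; the names and ARNs are placeholders.

import boto3

boto3.client("ecs").register_task_definition(
    family="ecs-env-file-demo",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",
    memory="512",
    executionRoleArn="arn:aws:iam::123456789012:role/ecs-env-file-demo-execution",
    containerDefinitions=[{
        "name": "web",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/ecs-env-file-demo:latest",
        "portMappings": [{"containerPort": 80}],
        # The task execution role must be allowed s3:GetObject on this file.
        "environmentFiles": [{
            "value": "arn:aws:s3:::example-config-bucket/cfg.env",
            "type": "s3",
        }],
    }],
)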

We can create the stack with:

mkdir -p terraform
cd terraform
git clone https://gitlab.com/aw5academy/terraform/ecs-env-file-demo.git
cd ecs-env-file-demo
terraform init
terraform apply

Docker

We now need to build our Docker image which will serve as our application. In https://gitlab.com/aw5academy/docker/ecs-env-file-demo we have a simple Apache server which reads the value of the CSS_BACKGROUND environment variable and sets it as the background colour of our index.html document.

You can build this Docker image and push it to the ECR repository created by Terraform with:

mkdir -p docker
cd docker
git clone https://gitlab.com/aw5academy/docker/ecs-env-file-demo.git
cd ecs-env-file-demo
bash -x build.sh

CodeCommit

Now we need to deploy our application configuration to our CodeCommit repository. https://gitlab.com/aw5academy/ecs/ecs-env-file-demo contains everything we need and we can deploy it to our CodeCommit repo with the following:

mkdir -p ecs
cd ecs
git clone https://gitlab.com/aw5academy/ecs/ecs-env-file-demo.git
cd ecs-env-file-demo
rm -rf .git
git clone codecommit::us-east-1://ecs-env-file-demo
mv *.* ecs-env-file-demo/
cd ecs-env-file-demo
git add .
git commit -m "Initial commit"
git push origin master

If you have any issues with this step, navigate to the CodeCommit service and open the ecs-env-file-demo repository for clone instructions and prerequisites.

CodePipeline

As soon as we push our code to CodeCommit, our release pipeline will trigger. Navigate to the CodePipeline service and open the ecs-env-file-demo pipeline.

Wait until this release completes.

Application Configuration Changes

We can now test our process for making configuration changes. Navigate to the CodeCommit service and open our ecs-env-file-demo repository. Then open the cfg.env file. You can see that our configuration file has a value of “blue” for our CSS_BACKGROUND variable. This is the variable that our Apache server uses for the webpage’s background colour.

Let’s change this value to “green”, enter the appropriate Author details and click “Commit changes”.

CodeDeploy

We can now use the CodeDeploy service to follow our deployment. Navigate to the CodePipeline service, open the ecs-env-file-demo pipeline and, when the CodeDeploy stage begins, click the Details link to jump to the CodeDeploy service.

Our deployment has started. Note that our deployments use a canary release, with 20% of traffic receiving the new changes for 5 minutes; after that, 100% of the traffic receives them. In your checkout of the Terraform code, there is a deployment-tester.html file. This is a page of 9 HTML iframes whose source is the DNS name of the load balancer in our application stack. The page auto-refreshes every 5 seconds.

If you open this deployment-tester.html file (you may need to open developer tools and disable cache for it to be effective) you will be able to verify our release is working as expected. It should initially show just the original blue.

Now you can wait for CodeDeploy to enter the next stage.

We now have 20% of our traffic routed to the new application configuration — the green. Let’s check this in our deployment-tester.html file:

Success!

And to complete the process, we can wait for CodeDeploy to finish and verify the application is fully green.

Looks good!

Wrap-Up

Cleanup the created resources with:

cd terraform
cd ecs-env-file-demo
terraform destroy

I hope this very simple example has effectively demonstrated the new capability in AWS Fargate.

AWS CodeBuild Local

In this article I will show how you can run your AWS CodeBuild projects locally. AWS CodeBuild is a “fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy”. By running your CodeBuild projects locally you can test code changes before committing, allowing you to rapidly develop and debug your projects.

Workstation

I recommend using Windows Subsystem for Linux 2 with Ubuntu 20.04 for your local workstation configuration. Additionally, the Chef code I have created at aw5academy/chef/workstation will bootstrap an environment for you with everything you need to follow along in this article. Note: remember to run /home/ec2-user/codebuild_setup.sh, which builds the Amazon Linux CodeBuild Docker image (this process can take over 60 minutes to complete).

Ubuntu App for WSL

If not using this workstation, the resources you will need are:

  1. awscb.sh # Copy this file into your PATH (without the .sh extension)
  2. codebuild_setup.sh # Download and run this script (Note: this can take over 60 minutes to complete)
  3. codebuild_build.sh # Copy this file into your PATH
  4. git-remote-codecommit # Install this Python module
  5. Docker Desktop

CodeBuild Project

Let’s first create a CodeBuild project in AWS. In this example our project will be a Docker based Apache application with the built Docker image pushed to Amazon Elastic Container Registry. We will use the Terraform code at aw5academy/terraform/docker-codebuild to provision the resources we need.

git clone https://gitlab.com/aw5academy/terraform/docker-codebuild.git
cd docker-codebuild
terraform init
terraform apply

Part of this Terraform stack is an AWS CodeCommit repository that we will use to store our Docker code. We can copy the code I have created at aw5academy/docker/httpd into this CodeCommit repository.

git clone codecommit::us-east-1://aw5academy-httpd
cd aw5academy-httpd
git clone https://gitlab.com/aw5academy/docker/httpd.git
mv httpd/{Dockerfile,buildspec.yaml} .
rm -rf httpd
git add .
git commit -m "Initial commit"
git push origin master

Now let’s test that the CodeBuild project works from AWS. Navigate to the CodeBuild service and find the docker-aw5academy-httpd project. Click on “Start Build” and select the “master” branch.

CodeBuild Start build page – Configuration
CodeBuild Start build page – Source

Now if you start the build and view the build logs you will see the Docker build happening and the Docker image being pushed to ECR.

CodeBuild logs

CodeBuild Local

We can now try running CodeBuild locally. From your checkout of the “aw5academy-httpd” repository, simply run “awscb”.

CodeBuild Local
CodeBuild Local

Success! You now have a Docker image locally, built in the same way as is done by the AWS CodeBuild service.

You can also add script arguments to awscb to pass in environment variables that will be available within your builds. For example:

awscb -e "env1=foo,env2=bar"

We can also use the “-p” arg to push the Docker images we build locally into ECR. You can combine this with the “-t” arg to tag your images differently. E.g.

awscb -t develop -p

If we run the above command and view our repository in ECR we can see the “latest” image created by AWS CodeBuild and the “develop” image we created locally and pushed.

ECR image list

Wrap-Up

In this article I have demonstrated CodeBuild local for Docker. But you can use this for other build types, e.g. Maven. For more detailed information, refer to the AWS blog post at https://aws.amazon.com/blogs/devops/announcing-local-build-support-for-aws-codebuild/

Amazon RDS Proxy – Improved Application Security, Resilience and Scalability

Amazon RDS Proxy is a fully managed, highly available database proxy for Amazon Relational Database Service (RDS) that makes applications more scalable, more resilient to database failures, and more secure.

https://aws.amazon.com/rds/proxy/

In this article I will demonstrate how you can configure an Amazon RDS Proxy for an Amazon Aurora database. With the provided Terraform code, you can launch a sample database to test RDS Proxy.

This short video presentation by AWS explains the benefits of RDS Proxy and demonstrates how it can be configured with the AWS console.

Database

The Terraform code at aw5academy/terraform/rds-proxy will create the following resources:

We will use the EC2 instance as a mock for an application that needs to communicate with our Aurora database.

Note: at the time of writing, Terraform does not support RDS Proxy resources, so we will need to create this component manually from the AWS console.

Let’s first deploy our Terraform code with:

git clone https://gitlab.com/aw5academy/terraform/rds-proxy.git
cd rds-proxy
terraform init
terraform apply

Once Terraform has been applied, it is worth examining the security groups that were created.

Inbound security group rules for our Aurora database
Inbound security group rules for our RDS proxy

We can see that the Aurora database only allows connections from the Proxy and the Proxy only allows connections from the EC2 instance.

Additionally, a Secrets Manager secret was created. Our RDS Proxy will use the values from this secret to connect to our database. Note how it is the proxy alone that uses these credentials. We will see later that our application (the EC2 instance) will use IAM authentication to establish a connection with the RDS proxy and so the application never needs to know the database credentials.

Secrets Manager secret containing our database credentials

RDS Proxy

Now we can create our RDS proxy from the AWS RDS console. During the creation of the proxy, provide the following settings:

  1. Select PostgreSQL for Engine compatibility;
  2. Tick Require Transport Layer Security;
  3. Select rds-proxy-test for Database;
  4. Select the secret with prefix rds-proxy-test for Secrets Manager secret(s);
  5. Select rds-proxy-test-proxy-role for IAM role;
  6. Select Required for IAM authentication;
  7. Select rds-proxy-test-proxy for Existing VPC security groups;
Create RDS Proxy Settings
Create RDS Proxy Settings

Now wait for the proxy to be created; this can take some time. Once complete, obtain the RDS Proxy endpoint from the console, which we will use to connect from our EC2 instance.

Application

Let’s test our setup. SSH into the EC2 instance with:

ssh -i rds-proxy-test.pem ec2-user@`terraform output ec2-public-ip`

From the terminal, set the RDSHOST environment variable. E.g.

export RDSHOST=rds-proxy-test.proxy-abcdefghijkl.us-east-1.rds.amazonaws.com

We can now test our connection to the database via the RDS proxy with:

./proxy.sh
Terminal output from successful connection to the database via the RDS proxy

Success! The proxy.sh script uses the psql tool and obtains an authentication token for the proxy via the aws rds generate-db-auth-token AWS CLI command. We can also use generate_db_auth_token from boto3 for Python:

python3.8 proxy.py
Terminal output from successful connection to the database via the RDS proxy
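
For reference, a minimal Python sketch along the same lines is shown below. It is not necessarily identical to the repository’s proxy.py; the database user, database name and the use of psycopg2 are assumptions.

import os
import boto3
import psycopg2  # assumes psycopg2 (or psycopg2-binary) is installed

host = os.environ["RDSHOST"]
port = 5432
user = "postgres"  # must match a database user the proxy can authenticate as

token = boto3.client("rds", region_name="us-east-1").generate_db_auth_token(
    DBHostname=host, Port=port, DBUsername=user
)
connection = psycopg2.connect(
    host=host, port=port, user=user, password=token,
    dbname="postgres", sslmode="require",  # TLS is required by the proxy in this setup
)
with connection.cursor() as cursor:
    cursor.execute("SELECT version()")
    print(cursor.fetchone()[0])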

Wrap-Up

The RDS Proxy feature can improve application security as we have seen, with the proxy alone having access to the database credentials and the application using IAM authentication to connect to the proxy.

Application resilience is improved since RDS Proxy improves failover times by up to 66%.

Lastly, your applications will be able to scale more effectively since RDS Proxy will pool and share connections to the database.

To cleanup the resources we created, first delete the RDS Proxy from the console and then from your terminal, destroy the Terraform stack with:

terraform init
terraform destroy

Configure a Desktop Environment For an Amazon Linux EC2 Jumpbox

In this article I will show how you can launch an Amazon Linux EC2 instance with a desktop environment that will serve as a jumpbox. Connections to this jumpbox will be made through RDP via a session manager port tunneling session. By using session manager, our EC2 instance’s security group does not require ingress rules allowing RDP or other ports to connect, thus improving the security of the jumpbox.

Recommended Reading

Before continuing with this article I would strongly recommend reading my earlier article Access Private EC2 Instances With AWS Systems Manager Session Manager. That article will explain the fundamental workings of session manager and shows how to deploy resources to your AWS account that will be required for setting up the jumpbox described in this article.

Terraform

Firstly, if you haven’t already done so, deploy the Terraform code at aw5academy/terraform/session-manager to setup session manager. Be sure to also follow the Post Apply Steps documented in the README.md.

When the session-manager stack is deployed we need to read some of the Terraform outputs as we will need their values for the jumpbox stack’s input variables. We can retrieve the outputs and set them as environment variables with:

export TF_VAR_private_subnet_id=`terraform output private-subnet-id`
export TF_VAR_vpc_id=`terraform output vpc-id`

Now we can deploy the jumpbox Terraform code:

cd ../
git clone https://gitlab.com/aw5academy/terraform/jumpbox.git
cd jumpbox
terraform init
terraform apply

After the stack deploys, wait approximately 5 minutes. This allows time for the aw5academy/chef/jumpbox Chef cookbook, which is run as part of the EC2 instance’s user data, to converge. This cookbook installs the MATE desktop environment on the Amazon Linux instance. Also see here for more information on installing a GUI on Amazon Linux.

Jump

Let’s make sure we can connect to the jumpbox with a terminal session. The jump.sh script can be used:

bash jump.sh

You should see something like the following:

Now we can try a remote desktop session. Terminate the terminal session with exit and then run:

bash jump.sh -d

You should now see the port forwarding session being started:

Also printed are the connection details for RDP. Open your RDP client and enter localhost:55678 for the computer to connect to and provide the supplied user name. Check the Allow me to save credentials option and click Connect:

Provide the password at the prompt and click OK:

Success!

Behind The Scenes

An explanation of what is occurring when we use our jump.sh script…

In order to start an RDP session the client needs to know the username and password for an account on the jumpbox. Rather than creating a generic account to be shared among clients, we dynamically create temporary accounts with a one-day lifetime. This is accomplished through the following actions (a rough sketch follows below):

  • The client creates a random username using urandom;
  • The client creates a random password using urandom;
  • The client creates a SHA-512 hash of the password using openssl;
  • The client puts the hashed password into an AWS Systems Manager Parameter Store encrypted parameter with a parameter name including the username;
  • The client uses the send-command API action to run the /root/create-temp-user.sh script on the jumpbox passing the username as a parameter;
  • The jumpbox retrieves the hashed password from parameter store;
  • The jumpbox deletes the hashed password from parameter store;
  • The jumpbox creates an account with the provided username and the retrieved hash of the password;
  • The jumpbox marks the account and password to expire after 1 day;

With these steps, the password never leaves the client, is only ever stored encrypted or hashed, and is kept only for as long as it is required.
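
As a rough Python approximation of the client-side steps above (the actual jump.sh implements them in bash with urandom and openssl; the parameter name and instance id below are placeholders):

import secrets
import crypt  # Unix only; deprecated in recent Python releases
import boto3

ssm = boto3.client("ssm", region_name="us-east-1")
username = "tmp-" + secrets.token_hex(4)
password = secrets.token_urlsafe(16)
hashed = crypt.crypt(password, crypt.mksalt(crypt.METHOD_SHA512))

# Store only the hash, encrypted at rest, under a name that includes the username.
ssm.put_parameter(
    Name=f"/jumpbox/{username}",  # parameter naming is an assumption
    Value=hashed, Type="SecureString", Overwrite=True,
)
# Ask the jumpbox to create the temporary account; it reads and then deletes the hash.
ssm.send_command(
    InstanceIds=["i-0123456789abcdef0"],  # placeholder jumpbox instance id
    DocumentName="AWS-RunShellScript",
    Parameters={"commands": [f"/root/create-temp-user.sh {username}"]},
)
print(f"RDP username: {username}")
print(f"RDP password: {password}")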

Summary

That’s all there is to it. After your jumpbox is enabled you can configure your private applications to accept traffic from the jumpbox’s security group. The Chromium browser on the jumpbox can then be used to access these applications securely. I hope you find this article useful.

Access Private EC2 Instances With AWS Systems Manager Session Manager

In this article I will demonstrate how you can connect to EC2 instances located in private subnets by using AWS Systems Manager Session Manager.

Session Manager is a fully managed AWS Systems Manager capability that lets you manage your EC2 instances, on-premises instances, and virtual machines (VMs) through an interactive one-click browser-based shell or through the AWS CLI. Session Manager provides secure and auditable instance management without the need to open inbound ports, maintain bastion hosts, or manage SSH keys.

https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager.html

Terraform

The Terraform code at aw5academy/terraform/session-manager will provision the resources for us.

Deploy the stack by issuing the following commands:

git clone https://gitlab.com/aw5academy/terraform/session-manager.git
cd session-manager
terraform init
terraform apply

Post Apply Steps

Some things are not configured in Terraform and must be set manually. These are the Session Manager preferences. To set these:

  • Login to the AWS console;
  • Open the Systems Manager service;
  • Click on ‘Session Manager’ under ‘Instances & Nodes’;
  • Click on the ‘Preferences’ tab;
  • Click ‘Edit’;
  • Enable KMS Encryption and point to the alias/session-manager key;
  • Enable session logging to S3 bucket ssm-session-logs... with encryption enabled;
  • Enable session logging to CloudWatch log group /aws/ssm/session-logs with encryption enabled;
  • Save the changes;

Session Manager Plugin

To be able to use Session Manager from the AWS CLI you also need to install the Session Manager Plugin.

Start Session

Let’s try it out. First, we will use the AWS CLI to launch a new EC2 instance in the private subnet that was created by the Terraform code. This instance will have no key pair and will use the VPC’s default security group which allows no inbound traffic from outside the VPC.

aws ec2 run-instances \
    --image-id $(aws ssm get-parameters --names /aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2 --query 'Parameters[0].[Value]' --output text --region us-east-1) \
    --instance-type t3a.nano \
    --subnet-id $(terraform output private-subnet-id) \
    --iam-instance-profile Name=session-manager \
    --output json \
    --region us-east-1 \
    --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=session-manager-test}]' \
    --count 1 > /tmp/ssm-test-instance.json

Next, run this command to wait for the instance to become ready:

while true; do if [[ $(aws ssm describe-instance-information --filters "Key=InstanceIds,Values=`cat /tmp/ssm-test-instance.json |jq -r .Instances[0].InstanceId`" --region us-east-1 |jq -r .InstanceInformationList[0].PingStatus) == "Online" ]]; then echo "Instance ready." && break; else echo "Instance starting..." && sleep 5; fi; done

When the instance is ready we can connect to it with session manager:

aws ssm start-session --target $(cat /tmp/ssm-test-instance.json |jq -r .Instances[0].InstanceId) --region us-east-1

That’s it! You are now connected to a private EC2 instance in your VPC that has no public IP, no key pair, and no inbound access from outside the VPC in its security group.

Port Forwarding

As well as starting a shell session on an instance, you can also use session manager to start a port forwarding session. Suppose you have an EC2 instance with a Tomcat server running on port 8080; you could start a port forwarding session that maps local port 18080 to the instance’s port 8080:

aws ssm start-session --target <INSTANCE_ID> \
                      --region us-east-1 \
                      --document-name AWS-StartPortForwardingSession \
                      --parameters "localPortNumber=18080,portNumber=8080"

You could then access the tomcat server via http://localhost:18080 on your workstation.

Restricting Access

You can create IAM policies to define who can access which instances. For example, the following policy will permit session manager access to instances that are not tagged with Team=admins:

{
  "Version":"2012-10-17",
  "Statement":[
    {
      "Effect":"Allow",
      "Action":[
        "ssm:DescribeInstanceInformation",
        "ssm:StartSession"
      ],
      "Resource":"arn:aws:ec2:*:*:instance/*",
      "Condition":{
        "StringNotEquals":{
          "ssm:resourceTag/Team": [
            "admins"
          ]
        }
      }
    },
    {
      "Effect":"Allow",
      "Action":[
        "ssm:StartSession"
      ],
      "Resource":[
        "arn:aws:ssm:*::document/AWS-StartSSHSession"
      ]
    }
  ]
}

Run As

By default, session manager sessions are launched via a system-generated ssm-user. We can change this by opening the session manager preferences, checking the “Enable Run As support for Linux instances” option and providing the alternative user.

Now when we start a session we are logged in as this user:

Additionally, you may add a tag to IAM roles or users with the tag key being SSMSessionRunAs and the tag value being the user account to login with. This allows you to further control access to your EC2 instances. See here for more details on this.

Summary

I hope this article demonstrates both how useful session manager is and how easy it is to setup and configure. Beyond the advantages described above you also get a full log of all sessions delivered to a CloudWatch log group and an S3 bucket for auditing purposes. These are configured in the Terraform code I have provided.

Amazon Elastic File System (EFS) Integration With AWS Lambda

AWS has recently announced support for Amazon Elastic File System (EFS) within AWS Lambda. This change creates new possibilities for serverless applications. In this article I will demonstrate one such possibility — centralising the storage and updating of the ClamAV virus database.

ClamAV

ClamAV® is an open source antivirus engine for detecting trojans, viruses, malware & other malicious threats.

Like any antivirus solution, ClamAV needs to be kept up to date to be fully effective. Ordinarily the virus database can be updated by issuing the freshclam command. However, this requires that the instance running the command have internet access. When developing secure architectures in the public cloud it is sometimes necessary to have fully isolated subnets which do not have internet access. Additionally, strict security compliance requirements may dictate that virus definitions are not updated directly from the internet but instead from a centralised location within the VPC.

Combining EFS, Lambda and EC2 we can create a configuration that will meet these requirements.

Design

The below diagram represents the architecture we will implement.

Our virus database will be stored on an EFS file system. EC2 instances will be configured to use this file system for their virus definitions (we will deploy the instance in a public subnet in this example just to keep things simple). A “freshclam” Lambda function will keep the virus database stored on EFS up to date.
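
As a very rough sketch of what such a function might look like, assuming the freshclam binary and its configuration are bundled with the function (for example in the deployment package or a Lambda layer) and that the EFS access point is mounted at /mnt/clamav:

import subprocess

def handler(event, context):
    result = subprocess.run(
        [
            "./bin/freshclam",                  # bundled binary (assumption)
            "--config-file=freshclam.conf",
            "--datadir=/mnt/clamav",            # the EFS mount path (assumption)
        ],
        capture_output=True,
        text=True,
    )
    print(result.stdout)
    print(result.stderr)
    return {"returncode": result.returncode}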

Terraform

The Terraform code at aw5academy/terraform/clamav will provision the resources for us.

Deploy the stack by issuing the following commands:

git clone https://gitlab.com/aw5academy/terraform/clamav.git
cd clamav
terraform init
terraform apply

Chef

As part of the Terraform stack we create an EC2 instance. This instance’s user data clones the repository at aw5academy/chef/clamav, which contains a Chef cookbook that bootstraps the instance: installing ClamAV, mounting the EFS file system and pointing the virus database to a path on the EFS file system.

EC2 Instance

Let’s now log in to our EC2 instance to test our setup.

SSH into the EC2 instance with:

ssh -i clamav.pem ec2-user@`terraform output ec2-public-ip`

Next verify no virus definitions are present:

clamconf |grep -A 3 "Database information"

As expected, we see none because our Lambda function has not yet executed. So let’s invoke the “freshclam” Lambda function with:

aws lambda invoke --function-name freshclam /dev/null --region us-east-1

Now verify the virus definitions are present:

clamconf |grep -A 3 "Database information"

As we now have a valid database we can perform a virus scan:

clamscan .bash_profile

Success!

Cleanup

To remove the stack, from your local terminal run:

terraform destroy

Summary

This is just one example of a real world application of EFS with Lambda. I hope you find this article and the sample code useful.