The last piece of our overall solution is the processing of a CSV file into a data store.
We will use Amazon DynamoDB as our data store and AWS Lambda to perform the CSV processing. This design was influenced by the AWS Database Blog post "Implementing bulk CSV ingestion to Amazon DynamoDB"; in fact, our Lambda code extends the code provided in that post.
Our design looks like the following.
Here, we have an EventBridge rule watching for tagging operations against S3 objects in our bucket. When one is detected, our Lambda function is invoked, which loads each record of the CSV as an item into a DynamoDB table.
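As a rough illustration, an event pattern for such a rule might look like the following. This is a sketch only: it assumes EventBridge notifications are enabled on the bucket, and the bucket name is a placeholder; the actual pattern lives in the Terraform module linked below. Note that the "Object Tags Added" event does not carry the tag values themselves, so any check for av-status=CLEAN would happen in the Lambda rather than in the rule.

```json
{
  "source": ["aws.s3"],
  "detail-type": ["Object Tags Added"],
  "detail": {
    "bucket": {
      "name": ["my-sftp-bucket"]
    }
  }
}
```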
The Terraform code at aw5academy/terraform/csv-to-dynamodb can be used to create the required components.
Once applied, we have an empty DynamoDB table.
We now have everything in place to test our entire solution. To recap, this is what our infrastructure now looks like.
So when we upload a CSV file via SFTP we expect:
- the CSV file will be stored in S3;
- an ECS task will launch which will scan the file with ClamAV;
- if the file is clean, the S3 object will be tagged with av-status=CLEAN;
- the Lambda function will be invoked and the CSV records loaded into DynamoDB.
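The final step above can be sketched as a simplified Lambda handler. This is not the code from the AWS blog post; it is a minimal illustration, and the event field names, table name, and the `csv_to_items` helper are all assumptions made for the example.

```python
import csv
import io


def csv_to_items(csv_text):
    """Parse CSV text (header row first) into a list of dicts,
    one per record -- each dict becomes one DynamoDB item."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def lambda_handler(event, context):
    # Sketch of the handler. The bucket/key paths below assume the
    # EventBridge "Object Tags Added" event shape; boto3 is imported
    # here so the parsing helper can be used on its own.
    import boto3

    bucket = event["detail"]["bucket"]["name"]
    key = event["detail"]["object"]["key"]

    s3 = boto3.client("s3")
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")

    # "csv-to-dynamodb" is a placeholder table name.
    table = boto3.resource("dynamodb").Table("csv-to-dynamodb")
    with table.batch_writer() as batch:
        for item in csv_to_items(body):
            batch.put_item(Item=item)
```

In practice you would also want to confirm the object was tagged av-status=CLEAN before loading it, since the tagging event itself does not include tag values.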
Let’s try it. We will upload a CSV file via WinSCP. You may use the sample file at aw5academy/terraform/csv-to-dynamodb/sample-file.csv.
Within a few minutes, if all is successful, we will see the items appear in our DynamoDB table.
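Besides checking the console, one quick way to verify the load is a count scan with the AWS CLI. The table name here is a placeholder; use the name created by the Terraform module.

```shell
# Count the items loaded from the CSV (table name is a placeholder).
aws dynamodb scan \
  --table-name csv-to-dynamodb \
  --select COUNT
```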
The requirements presented to us were complex. Yet, by combining many services and features within AWS, we have constructed a solution that uses no servers. I hope you found these articles useful.