Serverless Event Submission Pattern using Terraform and AWS

The following article and architecture are inspired by an actual project where we designed and implemented a system that processed a large number of invoices at regular intervals.

The actual project is much more complex than what we will present in this article. Here we explain an architecture pattern for serverless applications and show how to create that infrastructure using Terraform code and AWS services.

Our situation matches an event-driven architecture pattern: a third-party system generates a couple of thousand invoices, and our application needs to process them. Since this processing happens once a month, the decision to "go serverless" was an obvious one.

You are probably familiar with the term "serverless." It represents an abstraction of your computing infrastructure to the point that you don't have to worry about or manage the physical machines your code runs on. And, of course, you don't pay for idle server time.

To give more details about the project:

  • A few thousand invoices are generated every month at regular intervals.

  • Each invoice needs to be processed: some math needs to be done, and we need to send some notifications, process some payments, call another 3rd-party accounting API, and, in the end, save the data into a database table.

As you can see, several tasks need to be done for each invoice, and sometimes, under certain conditions, you need to do "action A," while in other cases you need to do "action B."

Now that you have a general idea about our challenge, let's cover the project from a theoretical point of view; in the second part of the article we will look over the actual code.

This theoretical section discusses the architectural pattern and explains how we reached the final solution. We also show how the infrastructure evolved from a simple but problematic design to a more complicated but reliable one.

The obvious starting point for this project was to implement an API Gateway (which our event source calls) that invokes a Lambda function to process the invoice.

The Lambda function consumes the event produced by the 3rd-party system and also acts as the source for downstream processing. In our case, it saves the processed data into a DynamoDB table.

It looks like a simple serverless pattern:

  • The event comes in via an API call over HTTP
  • Lambda processes this event
  • Lambda sends the processed data to DynamoDB for persistent storage
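
In Terraform, that first design can be sketched roughly like this (a minimal sketch with hypothetical names; it assumes an aws_lambda_function.process_invoice resource defined elsewhere):

```hcl
# Synchronous starting point: an HTTP API whose POST /invoices route
# invokes the processing Lambda directly.
resource "aws_apigatewayv2_api" "invoices" {
  name          = "invoice-api"
  protocol_type = "HTTP"
}

resource "aws_apigatewayv2_integration" "invoice_lambda" {
  api_id                 = aws_apigatewayv2_api.invoices.id
  integration_type       = "AWS_PROXY"
  integration_uri        = aws_lambda_function.process_invoice.invoke_arn
  payload_format_version = "2.0"
}

resource "aws_apigatewayv2_route" "submit_invoice" {
  api_id    = aws_apigatewayv2_api.invoices.id
  route_key = "POST /invoices"
  target    = "integrations/${aws_apigatewayv2_integration.invoice_lambda.id}"
}
```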

While this sounds like a ready-to-go architecture, some problems must be addressed.

For example:

  • As a synchronous event source for the Lambda function, API Gateway expects a fast response: it has a 30-second timeout. Our Lambda function has a 15-minute timeout, and while processing an invoice will not take that long, there is a good chance it will take more than 30 seconds.
  • There are no automatic retries for synchronous invocations, such as between API Gateway and Lambda, so error handling and retries for all types of errors must be implemented in the code.
  • Any problem that happens in downstream components propagates back to the client/event source.

The solution to all these challenges is to switch from a synchronous pattern to an asynchronous one. We did that by introducing an SQS queue between the API Gateway and the Lambda function.

This change allows for a fast response to the API caller no matter how long the Lambda function takes to process the invoice.

So our architecture becomes something like this. 

SQS responds to the API Gateway with a message ID. Later on, you can use this ID for message tracking.
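
In Terraform, the API-to-queue handoff can be sketched roughly like this (hypothetical names; the credentials role must allow sqs:SendMessage on the queue):

```hcl
resource "aws_sqs_queue" "invoices" {
  name = "invoice-queue"
}

# Replaces the Lambda proxy integration: the same HTTP API now forwards
# the request body straight to SQS, which answers with a MessageId.
resource "aws_apigatewayv2_integration" "invoice_sqs" {
  api_id              = aws_apigatewayv2_api.invoices.id
  integration_type    = "AWS_PROXY"
  integration_subtype = "SQS-SendMessage"
  credentials_arn     = aws_iam_role.apigw_to_sqs.arn # hypothetical role

  request_parameters = {
    QueueUrl    = aws_sqs_queue.invoices.url
    MessageBody = "$request.body"
  }
}
```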

The invoice messages are stored durably in the queue, while the Lambda service polls SQS for new messages. The Lambda function is invoked when new items are added, with the invoice data as the input parameter.

When your Lambda function successfully processes a batch, the Lambda service deletes that batch of messages from the queue. However, if your function returns an error or doesn’t respond, the messages in that batch become visible again.
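
Wiring this up in Terraform is a single resource, sketched here with hypothetical names:

```hcl
# The Lambda service polls the queue on our behalf and invokes the
# function with batches of up to 10 invoice messages.
resource "aws_lambda_event_source_mapping" "invoice_consumer" {
  event_source_arn = aws_sqs_queue.invoices.arn
  function_name    = aws_lambda_function.process_invoice.arn
  batch_size       = 10
}
```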

In the same function, you should also implement a delete mechanism for successfully processed messages. Lambda gets data from SQS in batches, and if the processing of one message fails, the whole batch becomes visible in the queue again, and you risk double-processing the successful ones.
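
One way to get this behavior declaratively (a sketch, not necessarily the exact implementation from our project) is SQS partial batch responses: the event source mapping above opts in with ReportBatchItemFailures, the handler returns only the IDs of the failed messages, and Lambda deletes the rest of the batch for you:

```hcl
resource "aws_lambda_event_source_mapping" "invoice_consumer" {
  event_source_arn = aws_sqs_queue.invoices.arn
  function_name    = aws_lambda_function.process_invoice.arn
  batch_size       = 10

  # The handler must return
  # {"batchItemFailures": [{"itemIdentifier": "<messageId>"}, ...]};
  # only those messages become visible again, the rest are deleted.
  function_response_types = ["ReportBatchItemFailures"]
}
```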

We also added a Dead Letter Queue (DLQ) as part of the SQS implementation. This second queue is where we send messages that we fail to process. We can then investigate and fix the problems that caused the failure.
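
In Terraform, the DLQ is a second queue plus a redrive policy on the main one; a sketch:

```hcl
resource "aws_sqs_queue" "invoices_dlq" {
  name = "invoice-queue-dlq"
}

# The main queue from the earlier sketch, now with a redrive policy:
# after 3 failed receives, a message is moved to the DLQ for inspection.
resource "aws_sqs_queue" "invoices" {
  name = "invoice-queue"

  redrive_policy = jsonencode({
    deadLetterTargetArn = aws_sqs_queue.invoices_dlq.arn
    maxReceiveCount     = 3
  })
}
```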

While things look promising, we are not done yet. Remember that we have to perform a series of processing steps for each invoice: saving to DynamoDB, calling an accounting software API, notifying an AWS SNS topic, etc.

We could chain several AWS Lambda functions, but we would end up with something like this.

There are some obvious problems with this design.

  • First, we must do error checking and retry processing in each function.
  • We have to account for other steps in the chain that may take longer.
  • And if a problem occurs during this processing, we must roll back the transactions already made.

A design like the above will increase the coupling and interdependencies between functions. 

So, the final step is to replace the whole Lambda processing logic with an AWS Step Functions state machine. The state machine helps us orchestrate the workflow, track ongoing tasks, and complete the invoicing process.
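
In Terraform, the state machine is an aws_sfn_state_machine resource whose definition is Amazon States Language JSON. A minimal skeleton (hypothetical names and steps; the execution role must allow each task's actions):

```hcl
resource "aws_sfn_state_machine" "invoice_flow" {
  name     = "invoice-processing"
  role_arn = aws_iam_role.sfn_exec.arn # hypothetical execution role

  definition = jsonencode({
    Comment = "Process a single invoice"
    StartAt = "SaveInvoice"
    States = {
      SaveInvoice = {
        Type = "Task"
        # Direct service integration: no Lambda needed for this step.
        Resource = "arn:aws:states:::dynamodb:putItem"
        Parameters = {
          TableName = aws_dynamodb_table.invoices.name
          Item      = { invoiceId = { "S.$" = "$.invoiceId" } }
        }
        Next = "CallAccountingApi"
      }
      CallAccountingApi = {
        Type     = "Task"
        Resource = aws_lambda_function.call_accounting.arn # hypothetical
        End      = true
      }
    }
  })
}
```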

With this architecture, the data flow will be like this.

The client/event source sends those thousands of invoices to our API Gateway, and the API Gateway sends them to an SQS queue.

A Lambda function gets a batch of messages from the queue and initiates a Step Functions execution, passing along the invoice data. Then the tasks defined inside the state machine perform the required actions. Specific steps (like saving into DynamoDB) call additional Lambda functions, while others call different AWS services directly.
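
For that first hop, the queue-consumer Lambda's role needs permission to start executions; a sketch with hypothetical names:

```hcl
resource "aws_iam_role_policy" "start_invoice_flow" {
  name = "start-invoice-flow"
  role = aws_iam_role.lambda_consumer.id # hypothetical consumer role

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "states:StartExecution"
      Resource = aws_sfn_state_machine.invoice_flow.arn
    }]
  })
}
```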

The Step Functions flow orchestrates the invoice-processing steps. For example, it ensures the designed tasks are executed in order, covers error handling, and does retries with a looping pattern or try/catch logic for errors.

At the same time, you can implement parallel execution, or mix the patterns mentioned above, as sketched below.
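
As an illustration (not our project's exact definition), a task state inside the jsonencode definition above can declare its own retry and fallback behavior like this; for parallel work, a state with Type = "Parallel" runs independent branches, such as the SNS notification and the accounting call, at the same time:

```hcl
locals {
  # Hypothetical task state to drop into the States map above.
  call_accounting_api = {
    Type     = "Task"
    Resource = aws_lambda_function.call_accounting.arn
    Retry = [{
      ErrorEquals     = ["States.TaskFailed"]
      IntervalSeconds = 5
      MaxAttempts     = 3
      BackoffRate     = 2 # exponential backoff between attempts
    }]
    Catch = [{
      ErrorEquals = ["States.ALL"]
      Next        = "HandleFailure" # e.g. roll back and alert
    }]
    Next = "NotifySns"
  }
}
```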

There is much more to say about Step Functions, but this article is not the place for it. Here we just covered this event submission pattern.

Now that we have discussed the theoretical part, we can move on to the coding one. The link to part 2 is https://sg12.cloud/serverless-event-submission-pattern-using-terraform-and-aws-part-2/

The Terraform implementation code can be found here: https://github.com/crerem/event-submission-pattern-live
