This article describes a small serverless project we did for MlsImport – a WordPress plugin provider that handles replication and synchronization of real estate data. But before diving into explanations, here are some details to better understand the context.
The main task of MlsImport.com is to (you guessed it) import real estate data from various MLSs, transform it, and populate WordPress-based real estate websites with this information.
MLS stands for Multiple Listing Service. It is a comprehensive database and online platform used by real estate agents and brokers to share information about properties available for sale or rent.
The MLS is a powerful tool that facilitates cooperation and collaboration among real estate professionals, allowing them to access and share detailed property information. An MLS is commonly operated at the local or regional level, with multiple MLS systems (over 600 in the USA) existing across different areas.
Now, each MLS should offer a web API that its members can use to read, consume, or replicate the real estate data. These APIs are maintained by several organizations that provide technological support. While all the APIs follow the RESO standards, they differ in details such as how data is served or how authentication is handled.
MlsImport works with several hundred MLSs – their WordPress plugin connects to each API, reads the real estate data, transforms it, and then replicates it on the realtor's website. Besides that, they run an hourly synchronization routine that checks whether the data has changed and updates the information on the client website. Quite complex.
Now, about our challenge – Serverless reconciliation of MLS data
When a listing is no longer valid (for example, when a property is sold or taken off the market), the MLS operator should change its status to "deleted" and perform the actual database delete after x days.
But in real life, the MLS operator/realtors delete the property data from the database and move on. This situation would be acceptable for platforms that read data directly from the API. But in the replication context, it means that a particular website will end up with stale data, because there is no way to get notified.
The best solution for this case would be for the API maintainers to implement a webhook system: once a record is deleted, the MLS system would call a webhook and notify the client.
However, that option is not available at this point, so the only viable solution is implementing a tracking system. Every day, MlsImport.com's systems request from the APIs the listing IDs that are still available and send those to client websites for comparison.
Our challenge was to redo this reconciliation process to accommodate new API systems and develop it as a serverless application in the AWS environment.
We used Terraform, AWS Serverless Application Model, and GitHub for the code repository. While we cannot share the actual code, we can explain the whole architecture and highlight the power of a serverless application.
We start by implementing an EventBridge rule that is triggered once per day. The target of this rule is a Lambda function that creates a list of the MLSs and clients that need daily data reconciliation.
Now, this list is quite long. For each item in the list (a client-MLS pair), we have to pick the API we need to call, ask for an authentication token, check that it is valid, and then make the actual call to the MLS API. And if that is not complicated enough: the MLS API responses are paginated, so we have to implement a loop that goes through all the MLS API response pages and picks out only the listings' unique IDs.
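The pagination loop can be sketched roughly as below. This is a minimal illustration, not MlsImport's actual code: `fetch_page` is a hypothetical callable standing in for the real MLS API request, and the `value`/`@odata.nextLink` shape follows the OData conventions used by RESO Web APIs.

```python
def collect_listing_ids(fetch_page):
    """Walk a paginated MLS API response and collect the unique listing IDs.

    `fetch_page` (hypothetical) takes a skip offset and returns an
    OData-style dict with a "value" list of listings and, while more
    pages remain, an "@odata.nextLink" key.
    """
    ids = set()
    skip = 0
    while True:
        page = fetch_page(skip)
        for listing in page.get("value", []):
            ids.add(listing["ListingKey"])
        if "@odata.nextLink" not in page:  # reached the last page
            break
        skip += len(page["value"])
    return sorted(ids)
```

Injecting the fetch function keeps the loop itself independent of any particular provider's HTTP client or authentication scheme.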
The first step of the solution is to make the initial Lambda function send each item to an SQS queue. There are several advantages to this approach:
- Decoupling allows for better fault tolerance and scalability.
- Asynchronous processing – once the items are sent to the SQS queue, a second processing Lambda can work through them independently and in parallel.
- Retry and error handling – if a Lambda function fails to process a message multiple times, you can configure a dead-letter queue (DLQ) in SQS to store these failed messages for further analysis and troubleshooting.
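The fan-out step could look roughly like this. The queue URL, the `pairs` field on the event, and the message shape are all assumptions for illustration; the real function would read the client-MLS pairs from MlsImport's own data store.

```python
import json

# Placeholder URL for illustration only
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/reconcile-queue"

def build_messages(pairs):
    """Turn (client_id, mls_id) pairs into SQS message bodies (assumed shape)."""
    return [json.dumps({"clientId": c, "mlsId": m}) for c, m in pairs]

def handler(event, context):
    """Hypothetical Lambda entry point: one SQS message per client-MLS pair."""
    import boto3  # available in the AWS Lambda Python runtime
    sqs = boto3.client("sqs")
    for body in build_messages(event["pairs"]):
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=body)
```

Keeping the message-building separate from the `boto3` call makes the serialization logic easy to unit-test without touching AWS.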
To this SQS queue, we attached a second Lambda function that does some extra processing, gathers additional data, and sends the new item to a Step Function.
Now, this Step Function is the one that manages the reconciliation process. Here is a simplified version of the real workflow:
- The first thing we do is check the input event. Based on the data we find there, we decide which API provider we need to use.
- After that, we get the authentication token that needs to be attached to every MLS API request. Each API provider implements this differently, so we needed separate Lambda functions to call each authorization endpoint.
- Once we receive the authorization token, we call the API endpoints that reply with the available listing IDs. Since these responses are paginated, we also implemented a loop that checks whether we have reached the end of the data.
- SNS notifications are in place in case of failed operations
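The steps above can be sketched as an Amazon States Language definition. This is a simplified, hypothetical version: the state names, Lambda ARNs, and input fields are illustrative, not MlsImport's actual resources.

```json
{
  "Comment": "Simplified reconciliation workflow (names and ARNs are illustrative)",
  "StartAt": "ChooseApiProvider",
  "States": {
    "ChooseApiProvider": {
      "Type": "Choice",
      "Choices": [
        { "Variable": "$.provider", "StringEquals": "providerA", "Next": "GetTokenProviderA" }
      ],
      "Default": "GetTokenProviderB"
    },
    "GetTokenProviderA": { "Type": "Task", "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:auth-provider-a", "Next": "FetchListingsPage" },
    "GetTokenProviderB": { "Type": "Task", "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:auth-provider-b", "Next": "FetchListingsPage" },
    "FetchListingsPage": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:fetch-listings",
      "Catch": [ { "ErrorEquals": ["States.ALL"], "Next": "NotifyFailure" } ],
      "Next": "MorePages"
    },
    "MorePages": {
      "Type": "Choice",
      "Choices": [
        { "Variable": "$.nextLink", "IsPresent": true, "Next": "FetchListingsPage" }
      ],
      "Default": "SaveListingIds"
    },
    "SaveListingIds": { "Type": "Task", "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:save-listing-ids", "End": true },
    "NotifyFailure": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sns:publish",
      "Parameters": { "TopicArn": "arn:aws:sns:REGION:ACCOUNT:reconcile-failures", "Message.$": "$.error" },
      "End": true
    }
  }
}
```

The `Choice` states handle both the provider dispatch and the pagination loop, while the `Catch` on the fetch task routes failures to the SNS notification described above.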
We chose to save the data to S3: for each userId/MLS pair, we create a new object and store the IDs there. We prefer S3 for this data because it is inexpensive, highly scalable, durable, and secure. S3 is slower to respond than a database; however, the cost factor was decisive, since speed is not critical here (the data is needed only once daily).
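A minimal sketch of that storage step, assuming a key layout and bucket name chosen purely for illustration:

```python
import json

def listing_ids_key(user_id, mls_id):
    """Build the S3 object key for a userId/MLS pair (naming scheme assumed)."""
    return f"listing-ids/{user_id}/{mls_id}.json"

def save_listing_ids(s3_client, bucket, user_id, mls_id, listing_ids):
    """Store the day's live listing IDs as a small JSON object in S3."""
    s3_client.put_object(
        Bucket=bucket,
        Key=listing_ids_key(user_id, mls_id),
        Body=json.dumps(sorted(listing_ids)),
        ContentType="application/json",
    )
```

One small object per pair keeps each day's reconciliation payload independently addressable and cheap to overwrite.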
The MlsImport WordPress plugin has a built-in cron job that also runs daily and requests the list of live data from an API Gateway endpoint.
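The comparison on the client side reduces to a set difference: any listing ID replicated on the website but missing from the live list is stale and can be removed. A minimal sketch (function name is ours, not the plugin's):

```python
def stale_listing_ids(replicated_ids, live_ids):
    """Return IDs present on the client site but no longer reported by the MLS API."""
    return sorted(set(replicated_ids) - set(live_ids))
```

The website then deletes or unpublishes the returned listings during the same cron run.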
Our takeaway from this whole project: while Lambda functions are the key components of a serverless application, Step Functions are what manage and coordinate it.
In real life, you cannot develop a solid, scalable serverless app without step-function capabilities to orchestrate and coordinate distributed services.
Please do not hesitate to contact our team if you require assistance in seamlessly integrating MLS Reso Web API or expanding and migrating your cloud infrastructure. We provide professional support and guidance to ensure a smooth and efficient process tailored to your specific needs.