Triggering ETL from an email via SES and Lambda
This example details how the arrival of an email at an Amazon SES controlled email address can trigger a Matillion ETL job via AWS S3 and a Lambda function. The process extracts any text/csv attachments from emails sent to the SES email address and loads them into the target database.
Why Do This
Perhaps you’d like a business user to be able to create a standard CSV file and simply email it to Matillion ETL, rather than having to give them access to, and train them on, S3 buckets. Or perhaps you receive a CSV by email from a supplier and would simply like to forward it on to the target database.
This exercise requires some items set up in your AWS account before you begin:
- A domain name for which Amazon SES receives email
- Two S3 buckets: one for SES to deliver raw emails into (S3 Landing) and one for the extracted attachments (S3 Process Attachment)
- An SQS queue that your Matillion ETL instance is configured to listen to - see the Integration with Amazon SQS
The basic steps and services involved in this process are:
- User sends an email to a specific AWS SES controlled email address
- AWS SES copies the email in plain text format to the S3 Landing Bucket
- S3 Trigger calls a Lambda function to scan the email for text/csv attachments and output them to the S3 Process Attachment bucket
- The same Lambda function writes a message to an SQS Queue containing the parameters required to run a Matillion ETL job, including the bucket and filename written by the Lambda function (see the example message after this list)
- Matillion ETL consumes the message from the SQS Queue and runs a job to load the file to the target database.
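For reference, the message placed on the queue is a small JSON document following the format Matillion ETL's SQS listener expects (described in the Integration with Amazon SQS documentation). The sketch below shows the general shape only; the group, project, environment, job, queue and variable names are placeholders that must match the job you import later, and the exact fields supported can vary between Matillion versions.

```python
# Hypothetical example of the queue message that triggers the Matillion ETL job.
# Field names follow Matillion's SQS-triggered job format; job, queue and
# variable names here are placeholders.
import json
import boto3

message = {
    "group": "MyGroup",          # Project Group name in Matillion ETL
    "project": "MyProject",      # Project name
    "version": "default",
    "environment": "live",       # environment to execute against
    "job": "Load Email CSV",     # the job imported into Matillion ETL
    "variables": {
        "s3_bucket": "mtln-email-attachments",   # bucket the Lambda wrote the CSV to
        "s3_key": "extracted_attachment.csv"     # object key of the extracted CSV
    }
}

sqs = boto3.client('sqs')
queue_url = sqs.get_queue_url(QueueName='MatillionQueue')['QueueUrl']
sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps(message))
```

Sending a message like this by hand (with a file already sitting in the output bucket) is also a convenient way to test the Matillion side of the pipeline independently of SES and the Lambda function.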
The process is summarised in this diagram:
Note: example files are attached to this article should you wish to try this yourself.
- Configure an email address to be managed by SES as per the AWS documentation
- Configure an SES Receipt Rule to perform an S3 action (see here) to place all received emails into the first S3 bucket (S3 Landing), e.g.
- Any emails sent to this address will be dropped into the S3 bucket (in the example, mtln-email-test) in plain text format. Note that you may need to amend the bucket permissions to allow the SES process to write to it.
- Within the AWS Console, create a new Lambda function for Python 2.7, ideally using the s3-get-object-python Blueprint (using the Blueprint allows you to configure which bucket the trigger fires from, along with a number of the required permissions). If you are using the Blueprint, set these values on the next page:
- Name: Matillion_Email_Test
- Role: Create new role from template(s)
- Role Name: Matillion_Email_Role
- Bucket: <choose the bucket the emails land in>
- Event Type: Object Created (All)
- Enable Trigger: Yes
- Press "Create Function"
- Replace the code contents of the Lambda function generated by AWS with the contents of the file lambda_matillion_email_to_execution.py (a simplified sketch of the processing this function performs is shown after these steps)
- Amend the variables at the top of the Python code to the specifics of your environment:
- S3_OUTPUT_BUCKETNAME: The bucket created for processed files to be output to
- SQS_MATILLION_QUEUE: The SQS queue Matillion ETL is listening to
- MTLN_GROUP: The Project Group Name in Matillion ETL
- MTLN_PROJECT: The Project Name in Matillion ETL
- MTLN_ENV: The name of the environment to execute against in Matillion ETL
- Save the function (note: you may need to amend the Permissions on your S3 Process Attachment bucket to allow the Lambda function to write to it)
- Import the appropriate JSON file into your Matillion ETL instance (based on whether you’re using Redshift or Snowflake)
- Ensure your Matillion ETL instance is listening to the same SQS Queue as configured in the Python Lambda
- That’s it! Test by emailing the Data_matillion_email_to_execute_example.csv file attached to this article to your SES-configured email address - if you are watching Matillion ETL, you should see the imported job run.
- A table should be created in the target default schema containing the contents of the CSV, with a unique name.
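For orientation, below is a heavily simplified, hypothetical sketch of the kind of processing the attachment-extraction Lambda performs. It is not the attached lambda_matillion_email_to_execution.py: it assumes the Python 2.7 runtime mentioned above, and the bucket, queue, group, project, environment, job and variable names are placeholders to be replaced with your own values (and with whatever variables the imported job actually defines).

```python
# Simplified, hypothetical sketch of the attachment-extraction Lambda
# (Python 2.7 runtime, as used in this example). The CAPS variables
# correspond to those described in the steps above.
import email
import json
import uuid

import boto3

S3_OUTPUT_BUCKETNAME = 'mtln-email-attachments'   # your S3 Process Attachment bucket
SQS_MATILLION_QUEUE = 'MatillionQueue'            # the queue Matillion ETL listens to
MTLN_GROUP = 'MyGroup'
MTLN_PROJECT = 'MyProject'
MTLN_ENV = 'live'
MTLN_JOB = 'Load Email CSV'                       # the job imported into Matillion ETL

s3 = boto3.client('s3')
sqs = boto3.client('sqs')


def lambda_handler(event, context):
    # The S3 trigger identifies the raw email SES has just written to the landing bucket
    record = event['Records'][0]['s3']
    bucket = record['bucket']['name']
    key = record['object']['key']     # may need URL-decoding if keys contain special characters

    raw_email = s3.get_object(Bucket=bucket, Key=key)['Body'].read()
    msg = email.message_from_string(raw_email)

    queue_url = sqs.get_queue_url(QueueName=SQS_MATILLION_QUEUE)['QueueUrl']

    for part in msg.walk():
        # Only interested in text/csv attachments
        if part.get_content_type() != 'text/csv':
            continue

        # Write the attachment to the process bucket under a unique name
        out_key = '{0}_{1}'.format(uuid.uuid4().hex, part.get_filename() or 'attachment.csv')
        s3.put_object(Bucket=S3_OUTPUT_BUCKETNAME, Key=out_key,
                      Body=part.get_payload(decode=True))

        # Tell Matillion ETL to run the load job against the file just written
        # (same message format as the example shown earlier in this article)
        sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps({
            'group': MTLN_GROUP,
            'project': MTLN_PROJECT,
            'version': 'default',
            'environment': MTLN_ENV,
            'job': MTLN_JOB,
            'variables': {'s3_bucket': S3_OUTPUT_BUCKETNAME, 's3_key': out_key}
        }))
```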
Most problems you may experience with this will be related to permissions. Ensure your Lambda function has the required permissions to:
- Read from your source S3 bucket
- Write to your target S3 bucket
- Write messages to your SQS queue
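As a rough illustration only, an inline policy granting these three permissions might look like the sketch below, here attached to the role with boto3. The bucket and queue names are placeholders (the output bucket and queue names are not taken from this article), and your role will typically also need CloudWatch Logs access for the Lambda to log.

```python
# Hypothetical sketch: attach a minimal inline policy to the Lambda role
# created earlier. Substitute your own bucket, queue and role names.
import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # read raw emails from the landing bucket
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::mtln-email-test/*"
        },
        {   # write extracted attachments to the process bucket
            "Effect": "Allow",
            "Action": ["s3:PutObject"],
            "Resource": "arn:aws:s3:::mtln-email-attachments/*"
        },
        {   # post the trigger message to the queue Matillion ETL listens to
            "Effect": "Allow",
            "Action": ["sqs:GetQueueUrl", "sqs:SendMessage"],
            "Resource": "arn:aws:sqs:*:*:MatillionQueue"
        }
    ]
}

iam = boto3.client('iam')
iam.put_role_policy(RoleName='Matillion_Email_Role',
                    PolicyName='matillion-email-test',
                    PolicyDocument=json.dumps(policy))
```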
Do not hesitate to Contact Support for further assistance.