AWS S3 lifecycle rule
Matillion ETL uses a multipart upload when sending data to S3. If Matillion ETL crashes or is shut down during the upload, the upload is interrupted and the parts already sent remain in the S3 bucket. Matillion ETL cannot clean up this partial data as it would for uploads that complete normally.
This partial data still takes up space in the S3 bucket and is billed by AWS, so removing it can be important to the user. To this end, we recommend using a Lifecycle Rule.
A Lifecycle Rule is a simple way of instructing an S3 bucket to clean up the data left behind by failed multipart uploads after a designated amount of time.
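To check whether a bucket is currently holding partial data of this kind, the incomplete multipart uploads can be listed. The following is a minimal sketch in Python, assuming the boto3 library is installed and AWS credentials are configured. The helper names `stale_uploads` and `list_stale_uploads`, and the 7-day default, are hypothetical and used only for illustration.

```python
from datetime import datetime, timedelta, timezone


def stale_uploads(response, max_age_days, now=None):
    """Return (key, upload_id) pairs for uploads older than max_age_days.

    `response` is a dict shaped like the result of the S3
    ListMultipartUploads call: {"Uploads": [{"Key", "UploadId", "Initiated"}]}.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        (upload["Key"], upload["UploadId"])
        for upload in response.get("Uploads", [])
        if upload["Initiated"] < cutoff
    ]


def list_stale_uploads(bucket, max_age_days=7):
    """Query S3 for incomplete uploads in `bucket` (first page of results only)."""
    import boto3  # imported lazily so the filter above works without boto3

    s3 = boto3.client("s3")
    response = s3.list_multipart_uploads(Bucket=bucket)
    return stale_uploads(response, max_age_days)
```

Calling `list_stale_uploads` with the name of your staging bucket returns the keys and upload IDs of uploads that were started more than `max_age_days` ago and never completed. This sketch reads only the first page of results, so a bucket with a very large number of incomplete uploads would need pagination.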
Adding a new Lifecycle Rule
- Log into your AWS account dashboard.
- Browse to Services → S3.
- Select the S3 Bucket to add a Lifecycle Rule to (usually one used by Matillion ETL for staging data).
- Go to the Management tab.
- Click Add lifecycle rule.
Configuring the Lifecycle Rule
- Give your lifecycle rule a name. A filter is not required. Click Next.
- On Transitions, click Next.
- On Expiration, check only 'Current version' and 'Clean up incomplete multipart uploads'.
Your lifecycle rule is now in effect and will automatically clean up failed uploads from Matillion ETL.
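The multipart cleanup part of the console steps above can also be set up programmatically. The following is a minimal sketch in Python, assuming boto3 is installed and AWS credentials are configured. The rule ID, the 7-day default, and the helper names `build_lifecycle_config` and `apply_lifecycle_config` are hypothetical. The sketch covers only the cleanup of incomplete multipart uploads, not the 'Current version' expiration option.

```python
def build_lifecycle_config(rule_id="abort-incomplete-multipart-uploads", days=7):
    """Build a lifecycle configuration with one rule that aborts
    incomplete multipart uploads `days` days after they were started."""
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                # An empty prefix applies the rule to the whole bucket.
                "Filter": {"Prefix": ""},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }


def apply_lifecycle_config(bucket, config):
    """Apply `config` to `bucket`.

    Note: this call replaces any lifecycle configuration already on the bucket.
    """
    import boto3  # imported lazily so the config can be built without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=config,
    )
```

To use it, build the configuration with `build_lifecycle_config()` and pass it to `apply_lifecycle_config` along with the name of your staging bucket. Because the call overwrites the bucket's existing lifecycle configuration, any rules already defined on the bucket would need to be included in the same configuration to be kept.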