
Azure Blob Storage lifecycle policy for file management

When you use Azure Blob Storage as a streaming destination, you can automate the deletion of objects based on their tags by following the process described below.

By combining Azure Blob Storage lifecycle management with Azure Functions and Event Grid, you can automatically delete objects written by the pre-built pipelines based on their tags. The process requires some custom implementation, described below.

The following steps configure a solution that deletes objects the pre-built pipelines have tagged with matillion_cdc_processed = true:

Enable lifecycle management for the storage account

  1. Log in to the Azure Portal.
  2. Navigate to Storage Accounts.
  3. Choose a BlobStorage storage account.
  4. In the left-hand sidebar, type "lifecycle" into the search bar and click Lifecycle management.
  5. Click Enable.

Create an Azure Function

  1. Create an Azure Function that will handle the deletion of objects with the specified tag matillion_cdc_processed = true.
  2. In the Azure Function, implement the logic to query the blob container using the Azure Blob Storage SDK or Azure Blob Storage REST API.
  3. Iterate through the objects in the blob container and check for the matillion_cdc_processed = true tag.
  4. Delete the objects that meet the tag criteria.
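The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a definitive implementation: it assumes the `azure-storage-blob` SDK, and the function name `delete_processed_blobs` is hypothetical. The function takes any object that behaves like the SDK's `ContainerClient`, so the listing and deletion logic can be exercised without a live storage account.

```python
TAG_KEY = "matillion_cdc_processed"
TAG_VALUE = "true"


def delete_processed_blobs(container_client):
    """Delete every blob tagged matillion_cdc_processed = true.

    `container_client` is assumed to behave like
    azure.storage.blob.ContainerClient (list_blobs and delete_blob).
    Returns the names of the blobs that were deleted.
    """
    deleted = []
    # include=["tags"] asks the service to return each blob's tags in the listing.
    for blob in container_client.list_blobs(include=["tags"]):
        tags = blob.tags or {}  # tags may be None when a blob has no tags
        if tags.get(TAG_KEY) == TAG_VALUE:
            container_client.delete_blob(blob.name)
            deleted.append(blob.name)
    return deleted
```

Inside the Azure Function, a real client could be created with `ContainerClient.from_connection_string(...)` and passed to this function. The connection string and container name depend on your environment and are not specified here.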

Trigger the Azure Function with Azure Event Grid

  1. Set up Azure Event Grid to monitor the blob container for objects that are created or changed.
  2. Configure an Event Grid subscription that triggers the Azure Function whenever an object is created or changed in the blob container.
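When the Azure Function receives an event, it needs to know which blob the event refers to. The sketch below shows one way to extract the container and blob name, assuming the event is delivered in the Event Grid event schema (with `eventType` and `subject` fields) and that only `Microsoft.Storage.BlobCreated` events are of interest. The function name `blob_from_event` is hypothetical.

```python
BLOB_EVENT_TYPES = {"Microsoft.Storage.BlobCreated"}
SUBJECT_PREFIX = "/blobServices/default/containers/"


def blob_from_event(event):
    """Return (container, blob_name) for a blob-created event, else None.

    `event` is a dict in the Event Grid event schema. The subject of a
    storage event is expected to look like
    /blobServices/default/containers/<container>/blobs/<blob name>.
    """
    if event.get("eventType") not in BLOB_EVENT_TYPES:
        return None
    subject = event.get("subject", "")
    if not subject.startswith(SUBJECT_PREFIX):
        return None
    # Container names cannot contain "/", so the first "/blobs/" ends the container name.
    container, sep, blob_name = subject[len(SUBJECT_PREFIX):].partition("/blobs/")
    if not sep or not container or not blob_name:
        return None
    return container, blob_name
```

If the subscription is configured to deliver events in the CloudEvents schema instead, the field names differ and this sketch would need to be adapted.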

Add the tag during object upload

Ensure that the new pre-built pipelines add the tag matillion_cdc_processed = true to the objects they upload to the blob container. Tags can be added using the Azure Blob Storage SDK or Azure Blob Storage REST API during the object upload process.
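As an illustration of tagging during upload, the following sketch uses the `tags` keyword of `upload_blob` in the `azure-storage-blob` Python SDK. The helper name `upload_with_processed_tag` is hypothetical, and `container_client` is assumed to behave like the SDK's `ContainerClient`. How the pre-built pipelines apply the tag internally is not shown here.

```python
PROCESSED_TAGS = {"matillion_cdc_processed": "true"}


def upload_with_processed_tag(container_client, blob_name, data):
    """Upload `data` as `blob_name` and attach the tag in the same call.

    `container_client` is assumed to behave like
    azure.storage.blob.ContainerClient.
    """
    blob_client = container_client.get_blob_client(blob_name)
    # `tags` sets the blob's tags at upload time;
    # overwrite=True replaces any existing blob with the same name.
    blob_client.upload_blob(data, tags=dict(PROCESSED_TAGS), overwrite=True)
    return blob_client
```

Setting the tag in the upload call avoids a second request to tag the blob afterwards.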

When the pre-built pipelines upload objects with the matillion_cdc_processed = true tag, Event Grid triggers the Azure Function. The Azure Function then deletes the objects that match the tag criteria from the blob container.
