
Snowflake Azure Storage Integration setup guide

Overview

This topic covers how to set up a storage integration for use in Snowflake instances on Microsoft Azure. Matillion ETL for Snowflake on Azure requires users to select a storage integration when configuring data staging components.

A storage integration is a named Snowflake object that removes the need to pass explicit cloud provider credentials such as secret keys or access tokens. An integration object references an Azure storage account.

Note

  • Only users with the Snowflake role ACCOUNTADMIN, or a role with the global CREATE INTEGRATION privilege, can execute the SQL commands mentioned in this topic.
  • Any identifiers not enclosed in double quotes ( " " ) aren't case-sensitive. For example, if you create an integration called DocsTeam, Snowflake stores it as DOCSTEAM.
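
To illustrate (DocsTeam is a hypothetical name, and the remaining required parameters are elided for brevity):

CREATE STORAGE INTEGRATION DocsTeam ...    -- unquoted: stored as DOCSTEAM
CREATE STORAGE INTEGRATION "DocsTeam" ...  -- quoted: case is preserved as DocsTeam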

Create a storage integration in Snowflake

  1. Log in to your Snowflake account.
  2. In either an existing Snowflake worksheet or a new worksheet, use the CREATE STORAGE INTEGRATION command.

The form of the command is as follows:

CREATE STORAGE INTEGRATION <integration_name>
    TYPE = EXTERNAL_STAGE
    STORAGE_PROVIDER = AZURE
    AZURE_TENANT_ID = '<tenant_id>'
    ENABLED = TRUE
    STORAGE_ALLOWED_LOCATIONS = ('azure://<account>.blob.core.windows.net/<container>/<path>/', 'azure://<account>.blob.core.windows.net/<container>/<path>/')
    [ STORAGE_BLOCKED_LOCATIONS = ('azure://<account>.blob.core.windows.net/<container>/<path>/', 'azure://<account>.blob.core.windows.net/<container>/<path>/') ]

This uses the following variables:

  • <integration_name> The name of the new integration.
  • <tenant_id> The ID of the Office 365 tenant that the allowed and blocked storage accounts belong to. A storage integration can authenticate to only one tenant, so the allowed and blocked storage locations must all refer to storage accounts that belong to this tenant.
  • <account> The name of your Azure storage account. Use the blob.core.windows.net endpoint for all supported types of Azure blob storage accounts, including Data Lake Storage Gen2.
  • <container> The name of an Azure blob storage container that stores your data files.
  • <path> An optional file path used to provide granular control over objects in the Azure container.

To find your <tenant_id>, log in to the Azure portal and click Azure Active Directory, then click Properties. The tenant ID is displayed in the Directory ID field.

The STORAGE_ALLOWED_LOCATIONS and STORAGE_BLOCKED_LOCATIONS parameters specify the containers to permit access to and to block access from, respectively. As shown in the template above, locations are enclosed in single quotation marks, and multiple locations are separated by commas. The STORAGE_BLOCKED_LOCATIONS parameter is optional.

Additional stages that use this integration can also reference the permitted containers.
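
For example, a complete command might look like the following sketch, where the integration name, tenant ID, storage account, containers, and paths are all hypothetical values for illustration:

CREATE STORAGE INTEGRATION azure_int
    TYPE = EXTERNAL_STAGE
    STORAGE_PROVIDER = AZURE
    AZURE_TENANT_ID = 'a123b4c5-1234-123a-a12b-1a23b45678c9'
    ENABLED = TRUE
    STORAGE_ALLOWED_LOCATIONS = ('azure://myaccount.blob.core.windows.net/mycontainer1/path1/', 'azure://myaccount.blob.core.windows.net/mycontainer2/path2/')
    STORAGE_BLOCKED_LOCATIONS = ('azure://myaccount.blob.core.windows.net/mycontainer1/path1/sensitivedata/');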


Grant Snowflake access to the storage locations

  1. In your Snowflake worksheet, execute the Describe Integration command to retrieve the consent URL.

    DESC STORAGE INTEGRATION <integration_name>
    

    Where <integration_name> is the name of the integration you created in Create a storage integration in Snowflake.

  2. In the output table created by the DESC command, locate the AZURE_CONSENT_URL and AZURE_MULTI_TENANT_APP_NAME entries in the property column, and make note of the values given for them in the property_value column.

  3. In a web browser, navigate to the AZURE_CONSENT_URL you noted earlier. A Microsoft permissions request page is displayed.
  4. Click Accept. This allows the Azure service principal created for your Snowflake account to obtain an access token on any resource inside your tenant. Obtaining an access token succeeds only if you grant the service principal the appropriate permissions on the container. The page then redirects to the Snowflake corporate site (snowflake.com).
  5. Log in to the Microsoft Azure portal.
  6. Navigate to Azure Services then Storage Accounts. Click on the name of the storage account you are granting the Snowflake service principal access to.
  7. Click Access Control (IAM), then Add role assignment.
  8. Select the desired role to grant to the Snowflake service principal:
    • Storage Blob Data Reader grants read-only access. This allows loading data from files staged in the storage account.
    • Storage Blob Data Contributor grants read and write access. This allows loading data from or unloading data to files staged in the storage account. The role also allows the removal of files staged in the storage account.
  9. Search for the Snowflake service principal. This is everything before the underscore in the AZURE_MULTI_TENANT_APP_NAME property noted earlier.

    It can take an hour or longer for Azure to create the Snowflake service principal. If the service principal is not available immediately, we recommend waiting an hour or two and then searching again.

  10. Click Save to complete the role assignment.

    Role assignments may take up to five minutes to propagate.


Create an external stage

Creating a stage that uses a storage integration requires a role that has the CREATE STAGE privilege for the schema, as well as the USAGE privilege on the storage integration.
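For example, granting both privileges to a role might look like the following sketch, where the role etl_role, the integration azure_int, and the schema mydb.public are hypothetical names:

GRANT CREATE STAGE ON SCHEMA mydb.public TO ROLE etl_role;

GRANT USAGE ON INTEGRATION azure_int TO ROLE etl_role;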

Create an external Azure stage that references the storage integration you created in Create a storage integration in Snowflake using the following commands:

use schema <schema>;

create stage <my_azure_stage>
    storage_integration = <integration_name>
    url = 'azure://<account>.blob.core.windows.net/<container>/<path>'
    file_format = <format>;

Where:

  • <schema> is the current database and schema for the user session.
  • <my_azure_stage> is the name of the stage you are creating.
  • <integration_name> is the name of the integration you created in Create a storage integration in Snowflake.
  • <account> is the name of your Azure storage account. Use the blob.core.windows.net endpoint for all supported types of Azure blob storage accounts, including Data Lake Storage Gen2.
  • <container> is the name of an Azure blob storage container that stores your data files.
  • <path> is an optional file path used to provide granular control over objects in the Azure container.
  • <format> is a named file format object.
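
For example, a filled-in version of the commands above might look like the following sketch, where mydb.public, my_azure_stage, azure_int, myaccount, and my_csv_format are hypothetical names for illustration:

use schema mydb.public;

create stage my_azure_stage
    storage_integration = azure_int
    url = 'azure://myaccount.blob.core.windows.net/load/files/'
    file_format = my_csv_format;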

Using the integration in Matillion ETL

  1. In Matillion ETL, create a new orchestration job, and drag your chosen data staging component onto the canvas.
  2. Click on the data staging component, then click its Properties tab and set the following properties:
    • Stage Platform: Select Existing Azure Blob Storage Location.
    • Storage Account: Select a storage account with your desired blob container to be used for staging the data.
    • Blob Container: Select a blob container to be used for staging the data.
    • Stage Authentication: Select Storage Integration.

Contact support

If you require assistance creating a storage integration in Snowflake, or with any other aspect of Matillion ETL, please visit our Getting Support page.

Please consult the Snowflake documentation to explore these topics in more depth, for example, to learn how to modify an existing stage to use a storage integration.