Setup guide - Hybrid SaaS Databricks on Azure

This document describes the steps required to set up your first working project in the Data Productivity Cloud for the following configuration options:


Prerequisites

Azure requirements

  • An Azure subscription with permissions to provision cloud resources in the Azure environment and to manage access control, specifically for:
    • Resource groups.
    • Virtual networks.
    • Key vaults.
    • Container apps.
  • A suitable resource group already defined in your Azure environment.
  • A suitable virtual network already defined in your Azure environment. The virtual network must:
    • Be fully configured, including routing to on-premises resources.
    • Allow egress to Matillion's Data Productivity Cloud IP ranges.
    • Have room for one additional subnet with an IP range of at least /27.
  • A suitable key vault. You can use an existing key vault, or have a new one created as part of the setup process.
  • Minimum permissions, which must include the following:
    • Create subnet.
    • Create managed identity.
    • Create key vault.
    • Create log analytics workspace.
    • Create container app and container app environment.
    • Modify subnet delegation.
  • Role assignments in the resource group and key vault.
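
The subnet requirement above can be checked before deployment. The following is a minimal sketch, not part of the official setup process, that uses Python's standard ipaddress module to look for a free /27 block inside a virtual network's address space. The function name find_free_subnet and the example address ranges are hypothetical; substitute the address space and existing subnets of your own virtual network.

```python
import ipaddress


def find_free_subnet(vnet_cidr, used_cidrs, new_prefix=27):
    """Return the first unused block of size /new_prefix in vnet_cidr, or None.

    vnet_cidr and used_cidrs are illustrative inputs: the virtual network's
    address space and the ranges of the subnets that already exist in it.
    """
    vnet = ipaddress.ip_network(vnet_cidr)
    used = [ipaddress.ip_network(cidr) for cidr in used_cidrs]

    # A virtual network smaller than the requested block can never hold it.
    if new_prefix < vnet.prefixlen:
        return None

    # Walk every aligned block of the requested size and return the first
    # one that does not overlap an existing subnet.
    for candidate in vnet.subnets(new_prefix=new_prefix):
        if not any(candidate.overlaps(existing) for existing in used):
            return candidate
    return None


if __name__ == "__main__":
    # Example values only: a /24 address space with two subnets in use.
    free = find_free_subnet("10.0.0.0/24", ["10.0.0.0/26", "10.0.0.64/27"])
    print(free)                # 10.0.0.96/27
    print(free.num_addresses)  # 32
```

A /27 block contains 32 addresses. Azure reserves some addresses in every subnet, so the number available to resources is lower than 32.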

Databricks requirements

Connectivity requirements

Git requirements

If you choose to use your own Git provider instead of the Matillion-hosted Git option, you need the following:


Setup steps

  1. Register for a Data Productivity Cloud account.
  2. Create accounts for users and admins who will be active in the Data Productivity Cloud.
  3. Create an agent in the Data Productivity Cloud.
  4. Deploy a Container App agent in Azure using the recommended ARM template.
  5. Create a project, making the following choices:
    • Select Advanced settings.
    • Select the agent you created and deployed previously.
    • Select the Git provider you wish to use.
  6. Create an environment using your Databricks credentials.
  7. Set up secret definitions for passwords, API keys, and tokens.
  8. Create a Git branch in which to begin pipeline work.
  9. Create your first pipeline.