Connect your Databricks account to the Data Productivity Cloud via Partner Connect
This guide will help you connect your Databricks workspace to Matillion's Data Productivity Cloud using Databricks Partner Connect. Once connected, you can design and run data pipelines in the Data Productivity Cloud that interact with your Databricks workspace.
Start in Databricks Partner Connect
This section describes the steps you need to take in your Databricks workspace to connect to the Data Productivity Cloud.
Choose the Data Productivity Cloud listing
- Log in to your Databricks workspace.
- In the left-hand navigation pane, click Partner Connect.
- Search for "Matillion" and select the "Data Productivity Cloud" listing.
- Click Connect.
Select a compute resource
Choose a compute resource that the Data Productivity Cloud will use to execute your data pipelines. This can be an existing Databricks cluster or a new one.
Select a Unity Catalog and schema
Select the Unity Catalog and schema that the Data Productivity Cloud will use to create and manage your data. If you don't yet have a suitable catalog or schema, you will need to create one first.
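If you want to prepare the catalog and schema ahead of time, you can use standard Unity Catalog SQL (`CREATE CATALOG` and `CREATE SCHEMA`). The sketch below only builds the two statements; the names `matillion_catalog` and `matillion_schema` are illustrative placeholders, and actually running the statements (for example, against a Databricks SQL warehouse) is left to you.

```python
# Sketch: build the Unity Catalog SQL you might run before connecting.
# "matillion_catalog" and "matillion_schema" are placeholder names --
# substitute your own catalog and schema.

def unity_catalog_setup_sql(catalog: str, schema: str) -> list[str]:
    """Return CREATE statements for a Unity Catalog catalog and schema."""
    return [
        f"CREATE CATALOG IF NOT EXISTS {catalog}",
        f"CREATE SCHEMA IF NOT EXISTS {catalog}.{schema}",
    ]

for stmt in unity_catalog_setup_sql("matillion_catalog", "matillion_schema"):
    print(stmt)
```

Run each generated statement in a Databricks SQL editor or notebook attached to a Unity Catalog-enabled workspace.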
Review and confirm setup
At this stage, review your selections and accept the terms and conditions. Then, click Connect to launch the Data Productivity Cloud.
Complete the setup in the Data Productivity Cloud
This section describes the steps you need to take in the Data Productivity Cloud to complete the connection to your Databricks workspace.
Register or log in
- If you're new to Matillion, read Registration to get started with a new account.
- For existing users, log in to the Data Productivity Cloud and select your account.
Resource provisioning
If you're signing up for the first time, a short introductory video will play, giving you an overview of the Data Productivity Cloud. During this time, Matillion is creating the necessary infrastructure for your account and securely connecting to your Databricks workspace. This typically takes a couple of minutes.
Get started with the Data Productivity Cloud
Once setup is finished, your page will refresh and you will see Designer, Matillion's UI for building data pipelines. Read Designer UI basics to learn more about Designer and its capabilities.
You will see a default project with the Databricks logo in the top left of the Designer project bar, confirming you're connected to your Databricks workspace. The project name is followed by a forward slash (/) and the name of the environment, which uses your chosen Unity Catalog and schema.
Click Schemas in the project bar to open the Schemas panel. If this panel displays a list of existing Databricks schemas, your connection is working correctly. If the panel is empty, try clicking the refresh icon to reload the list of schemas.
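You can also check that your chosen schema exists from outside the UI, for example by running `SHOW SCHEMAS IN <catalog>` in your Databricks workspace and confirming the schema appears in the result. A minimal sketch of that membership check, assuming Unity Catalog's case-insensitive identifier matching:

```python
# Sketch: confirm the schema chosen during Partner Connect appears in a
# list of schema names (e.g. the result of `SHOW SCHEMAS IN <catalog>`).
# Unity Catalog identifiers are case-insensitive, so compare lowercased.

def schema_is_listed(schemas: list[str], expected: str) -> bool:
    """Return True if `expected` appears in `schemas`, ignoring case."""
    return expected.lower() in {s.lower() for s in schemas}

# Illustrative names; replace with the real listing from your catalog.
listed = ["default", "information_schema", "matillion_schema"]
print(schema_is_listed(listed, "Matillion_Schema"))  # True
```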
Next steps
You're now ready to build your first data pipeline in the Data Productivity Cloud.
- Use orchestration pipelines to move data into Databricks from your source databases.
- Use transformation pipelines to shape, clean, and prepare data that is already in Databricks.
Note
Many components let you sample the data output from that component, giving you the option to observe the state of data at various stages in your pipeline. Sampling also helps you to confirm that a component is set up correctly before you use it in a production pipeline.
Video guides
Matillion's YouTube channel has a range of video guides to help you get started with the Data Productivity Cloud. Visit How-to video guides to build your Data Productivity Cloud skills.
Got feedback or spotted something we can improve?
We'd love to hear from you. Join the conversation in the Documentation forum!