Launching Matillion ETL for BigQuery - GCP
Matillion ETL for BigQuery is available through the Google Cloud Launcher service, hosted on the Google Cloud Platform Marketplace. It's capable of using select Google Cloud Platform services, such as Storage and BigQuery. If you haven't already done so, refer to the documentation on GCP Account Setup for BigQuery and Storage for more information.
Warning
The Hub doesn't support launching Matillion ETL for BigQuery. This guide describes how to launch this product directly from the Google Cloud Platform Marketplace.
If you're setting up a new Matillion ETL instance, we strongly advise reviewing and following the steps in GCP Account Setup for BigQuery and Storage.
Prerequisites
Before launching a Matillion ETL instance you will require:
- Working knowledge of the cloud service account (AWS, Azure, GCP) and cloud data warehouse (Snowflake, Redshift, Google BigQuery, or Delta Lake on Databricks) you intend to use.
- A user with the correct permissions who can access the intended cloud service account. For more information, read GCP Roles for Launching Matillion ETL.
- Access to a cloud storage bucket (S3, Azure Blob Storage, or Google Cloud Storage) to house the transient staging files that Matillion ETL uses to load data to the cloud.
- A network path to the intended data sources. This may involve working with your network team to enable access to on-premises databases.
GCP Roles for Launching Matillion ETL
To avoid a permission error when launching a Matillion ETL instance for BigQuery, make sure you have the GCP role "App Engine Deployer" [roles/appengine.deployer] or above. This role grants adequate permissions to deploy using the App Engine Admin API. For further information about GCP roles, refer to the Google Cloud IAM documentation.
You can find out what roles are granted on a resource by using the GCP Console, the getIamPolicy() method, or the gcloud command-line tool. For instructions, see Granting, Changing, Revoking Access to Project Members.
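As a rough illustration of the getIamPolicy() route, the sketch below inspects a policy that has already been fetched and parsed into a Python dictionary. The bindings/role/members layout follows the IAM policy JSON format. The function name roles_for_member, the sample policy, and the email addresses are hypothetical. The check only looks for the exact role named above, because this guide does not enumerate which broader roles count as "or above".

```python
from typing import Any

# Role named in this guide as the minimum for launching Matillion ETL.
REQUIRED_ROLE = "roles/appengine.deployer"


def roles_for_member(policy: dict[str, Any], member: str) -> set[str]:
    """Return every role that the policy binds directly to `member`."""
    return {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }


# Hypothetical sample policy, shaped like the JSON an IAM policy uses.
sample_policy = {
    "bindings": [
        {
            "role": "roles/appengine.deployer",
            "members": ["user:alice@example.com"],
        },
        {
            "role": "roles/viewer",
            "members": ["user:alice@example.com", "user:bob@example.com"],
        },
    ]
}

alice_roles = roles_for_member(sample_policy, "user:alice@example.com")
has_required_role = REQUIRED_ROLE in alice_roles
```

One way to obtain such a dictionary is to save the output of `gcloud projects get-iam-policy PROJECT_ID --format=json` and load it with `json.load`. Note that this sketch only sees direct bindings; roles inherited through group membership or from a parent folder or organization would not appear.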
Launching Matillion ETL on GCP
Use the following steps to launch Matillion ETL for BigQuery on the Google Cloud Platform:
- Access the Google Cloud Platform Marketplace. Use the magnifying glass icon at the top-right of the Marketplace homepage to search for the product: type "Matillion" or "Matillion ETL" into the search field to view the Matillion ETL products. Additional information about the product is also available, such as an Overview, Pricing, Documentation, and Support.
- Click Launch. A new page will open that lets you configure the initial setup for your Matillion ETL instance.
- Complete the following information for your deployment:
  - Deployment Name
  - Zone
  - Machine Type
  - Disk Type
  - Disk Size in GB
  - Network Name
  - Subnet Name
  - Firewall rules and tags to allow specific network traffic from the internet

  Note
  - You can use the default values in these fields if you're unsure about the specifications.
  - When completing the Machine Type field, click Customize to refine the settings of your host machine.
- When all the information has been completed, click Deploy at the bottom of the page to launch your Matillion ETL instance.
- After deploying, Google Cloud Launcher will begin the process of setting up the instance. In this transitional period, all details for the instance are likely to be listed as Pending. This process typically takes a minute or two.
- Once the deployment succeeds, the fields listed below change to show their completed values. Before leaving this page, make a note of the Admin User and Admin Password (Temporary) for your Matillion ETL instance, because you will need them for the initial login. You can now access your instance by using the link next to Site Address.
  - Site Address
  - Admin User
  - Admin Password (Temporary)
  - Instance
  - Instance Zone
  - Instance Machine Type
- To revisit this page, log in to your Google Cloud Platform console, and select Deployment Manager under the Tools heading in the top-left menu of the homepage.
- Log in to your Matillion ETL instance with the username "gcp-user" and the default password given to you by Google when you launched the instance in your GCP console earlier.
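
The deployment fields listed in the steps above can be mirrored in a small pre-flight check before you fill in the form. This is only a sketch: the DeploymentSettings class, its field names, and the example values are hypothetical stand-ins for the form fields (firewall rules are left out), and the name pattern is an assumption based on the RFC 1035 style naming rule that Google Cloud applies to many resource names. Confirm the exact rule and sensible sizes in the console.

```python
import re
from dataclasses import dataclass

# Assumed naming rule (RFC 1035 style): a lowercase letter first, then
# lowercase letters, digits, or hyphens, no trailing hyphen, 63 chars max.
NAME_PATTERN = re.compile(r"[a-z]([-a-z0-9]{0,61}[a-z0-9])?")


@dataclass
class DeploymentSettings:
    """Hypothetical mirror of the fields on the deployment form."""
    deployment_name: str
    zone: str
    machine_type: str
    disk_type: str
    disk_size_gb: int
    network_name: str
    subnet_name: str


def problems(settings: DeploymentSettings) -> list[str]:
    """Return human-readable issues; an empty list means none were found."""
    issues = []
    if not NAME_PATTERN.fullmatch(settings.deployment_name):
        issues.append("deployment name does not match the assumed naming rule")
    if settings.disk_size_gb <= 0:
        issues.append("disk size must be a positive number of GB")
    for field_name in ("zone", "machine_type", "disk_type",
                       "network_name", "subnet_name"):
        if not getattr(settings, field_name).strip():
            issues.append(f"{field_name} is empty")
    return issues


# Illustrative values only, not recommended or default settings.
example = DeploymentSettings(
    deployment_name="matillion-etl-1",
    zone="europe-west1-b",
    machine_type="n1-standard-2",
    disk_type="pd-ssd",
    disk_size_gb=20,
    network_name="default",
    subnet_name="default",
)
example_issues = problems(example)
```

An empty list from problems() only means the values pass these basic checks; it does not guarantee that the zone, machine type, or network exist in your project.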