
Agent (GCE) template

This article details how to install the Matillion CDC agent on Google Compute Engine using templates.

Users who choose the Terraform template are expected to have a working knowledge of infrastructure as code (IaC) using Terraform in Google Cloud, and should familiarize themselves with the official documentation before continuing.

Terraform template files can be found in the Templates section of this article.

The template provides a blueprint for installation that you may use verbatim, but you may need to modify it to suit your own needs and rules governing your cloud infrastructure.


Created Resources

This template will create the following resources in your Google Cloud account:

  • A Google Compute Engine instance.
  • A Google service account (to be used by the Compute Engine instance).
  • A custom IAM role with permissions required to access the Cloud Storage bucket and Secret Manager.
  • A grant giving the service account the custom IAM role on the Cloud Storage bucket.
  • A grant giving the service account the custom IAM role on the platform key secret and the database password secret.
  • A grant giving the service account the Logs Writer role.
  • A grant giving the service account the Metric Writer role.
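
As a rough illustration of the service account and the two predefined role grants listed above, a Terraform sketch might look like the following. This is not the contents of the actual template: the resource labels (`cdc_agent`, `logs_writer`, `metric_writer`) and the `account_id` value are hypothetical, and the role IDs are the standard Google Cloud identifiers for Logs Writer and Monitoring Metric Writer.

```hcl
# Illustrative sketch only; the shipped template may name and structure these differently.
variable "project_id" {
  type        = string
  description = "Google Cloud Project ID where resources will be created."
}

# Service account used by the Compute Engine instance (account_id is a hypothetical value).
resource "google_service_account" "cdc_agent" {
  project      = var.project_id
  account_id   = "matillion-cdc-agent"
  display_name = "Matillion CDC agent"
}

# Logs Writer: lets the agent write log entries to Cloud Logging.
resource "google_project_iam_member" "logs_writer" {
  project = var.project_id
  role    = "roles/logging.logWriter"
  member  = "serviceAccount:${google_service_account.cdc_agent.email}"
}

# Metric Writer: lets the agent write metrics to Cloud Monitoring.
resource "google_project_iam_member" "metric_writer" {
  project = var.project_id
  role    = "roles/monitoring.metricWriter"
  member  = "serviceAccount:${google_service_account.cdc_agent.email}"
}
```

`google_project_iam_member` is additive, so it does not remove other members that already hold these roles in the project.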

Prerequisites

Edit the template

Users should inspect the template in a text editor and ensure all the values are specified before proceeding. In particular, set the Agent ID, Organization ID, and Platform Websocket Endpoint URL environment variables, and make sure the endpoint matches the region of your Data Loader account. See Environment Variables for more information.
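
One way to catch a mistyped endpoint before anything is deployed is a Terraform variable validation. The sketch below assumes the endpoint follows the `ws-<region>` pattern described under Environment Variables, with `eu` and `us` as the only regions. If the template already declares this variable, the `validation` block would be added to that existing declaration rather than declared a second time.

```hcl
# Illustrative sketch only; assumes eu and us are the only valid Data Loader regions.
variable "platform_websocket_endpoint" {
  type        = string
  description = "Platform Websocket Endpoint URL for the Data Loader region (eu or us)."

  validation {
    # can() turns a failed regex match into false instead of raising an error.
    condition     = can(regex("^wss://ws-(eu|us)\\.matillion-cdc-prod\\.matillion\\.com/ws$", var.platform_websocket_endpoint))
    error_message = "Expected wss://ws-<region>.matillion-cdc-prod.matillion.com/ws, where <region> is eu or us."
  }
}
```

With this in place, `terraform plan` fails with the error message above if the value does not match the pattern.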

Resources

The template assumes you have certain resources already set up in your GCP stack. You will also need to provide details on these resources, such as names and paths.

  • Google Cloud Project ID where resources will be created.
  • Google Cloud Region and Google Cloud Zone.
  • Google Cloud Storage bucket name (the template doesn't create the Cloud Storage bucket).
  • Google Secret Manager secrets:
      • Platform Key secret name (for information on the Platform Key see here).
      • Database Password secret name.
  • Name of the customer private cloud network to be attached to the Compute Engine instance.
  • The customer private cloud network and associated firewall rules should be configured so that the agent can communicate with the Matillion CDC platform.
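
Because the template does not create these resources, it can help to confirm that they exist before applying it. The following sketch uses Google provider data sources to look each one up, so `terraform plan` fails early if a name is wrong. The data source labels (`landing`, `platform_key`, `database_password`, `agent`) are hypothetical, and the sketch assumes the Google provider is already configured with a default project. If the template already declares these variables, the declarations here would be dropped.

```hcl
# Illustrative sketch only; relies on the provider's default project.
variable "storage_bucket_name" { type = string }
variable "platform_key_secret_name" { type = string }
variable "database_password_secret_name" { type = string }
variable "network_name" { type = string }

# Existing Cloud Storage bucket where the agent lands data.
data "google_storage_bucket" "landing" {
  name = var.storage_bucket_name
}

# Existing Secret Manager secrets for the platform key and database password.
data "google_secret_manager_secret" "platform_key" {
  secret_id = var.platform_key_secret_name
}

data "google_secret_manager_secret" "database_password" {
  secret_id = var.database_password_secret_name
}

# Existing network to attach to the Compute Engine instance.
data "google_compute_network" "agent" {
  name = var.network_name
}
```

These lookups only check that the resources exist. They do not verify that the firewall rules allow the agent to reach the Matillion CDC platform.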

User Access

You will also need the following access and permissions:

  • Access to a valid Terraform installation.
  • Access to the Hub account and Data Loader.
  • CDC agent environment variables (generated in Data Loader when creating a new agent): Agent ID, Organization ID, and the Platform Websocket Endpoint URL.
  • Data Loader platform key (generated once per Data Loader account the first time you make an agent).
  • Google Cloud account with the ability to create an instance on a billable account and create/grant IAM Roles. You may require an administrator from your organization to either give access or perform this process with you.

Recommendations

This template is intended for users who are accustomed to setting up their own GCE instances and who have sufficient permissions and access to the required resources.

  • Create a new project specifically for CDC resources.
  • Use an installation template, if possible.
  • Consult your cloud/network administrator for advice on customer private clouds, subnets, and other resources such as Google Cloud regions.
  • Keep resources in the same Google Cloud region.

Environment Variables

The template needs the following environment variables in order for Data Loader to recognize the agent.

Environment Variable           Description
project_id                     Your Google Cloud Project ID.
region                         Google Cloud region where the Compute Engine instance will be deployed.
zone                           Google Cloud zone where the Compute Engine instance will be deployed.
network_name                   Google Cloud network to attach to the Compute Engine instance.
instance_name                  Name of the Compute Engine instance. For example, matillion-cdc-agent.
storage_bucket_name            Name of the Google Cloud Storage bucket where the agent will land the data.
organization_id                Provided by the Data Loader client when setting up a new agent.
agent_id                       Provided by the Data Loader client when setting up a new agent.
platform_websocket_endpoint    Set this to wss://ws-<region>.matillion-cdc-prod.matillion.com/ws, where <region> is either eu or us depending on the Data Loader region you are building the pipeline in. For example, wss://ws-us.matillion-cdc-prod.matillion.com/ws.
platform_key_secret_name       Name of the Platform Key secret stored in Google Secret Manager.
database_password_secret_name  Name of the source Database Password secret stored in Google Secret Manager.
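
If the template exposes these as Terraform input variables, one common way to supply them is a `terraform.tfvars` file alongside the template. The sketch below is an assumption about how the values might be provided, not part of the template itself. Every value is a placeholder: the project, region, zone, network, and bucket names are invented for illustration, and the angle-bracket entries stand for values you obtain from Data Loader or Secret Manager.

```hcl
# terraform.tfvars (example values only; replace every placeholder)
project_id    = "my-cdc-project"
region        = "us-central1"
zone          = "us-central1-a"
network_name  = "my-network"
instance_name = "matillion-cdc-agent"

storage_bucket_name = "my-cdc-landing-bucket"

# Generated in Data Loader when creating a new agent.
organization_id = "<organization-id-from-data-loader>"
agent_id        = "<agent-id-from-data-loader>"

# Use the endpoint for the Data Loader region you are building the pipeline in.
platform_websocket_endpoint = "wss://ws-us.matillion-cdc-prod.matillion.com/ws"

# Names of the secrets in Google Secret Manager, not the secret values themselves.
platform_key_secret_name      = "<platform-key-secret-name>"
database_password_secret_name = "<database-password-secret-name>"
```

Terraform loads a file named `terraform.tfvars` automatically. Because only secret names are stored here, the platform key and database password themselves stay in Secret Manager.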

Templates