Launching Matillion ETL using CloudFormation Templates
Note
New customers must go through the Hub to select their preferred cloud provider and data warehouse to begin their Matillion ETL journey.
Overview
This guide explains how to create a Matillion ETL instance using the Hub, launching on AWS via a CloudFormation template. You can launch Matillion ETL using one of three templates:
- Single Node
- Single Node and RDS
- Clustered
The process for launching a Matillion ETL instance using these templates is the same regardless of your chosen cloud data warehouse, except where you specify parameters for your AWS stack. This guide therefore applies to all three templates, and any differences are described where they apply.
Prerequisites
Prior to launching a Matillion ETL instance you will need to register for a Hub account. You will also require:
- Adequate knowledge about the cloud service account (AWS, Azure, GCP) and Cloud Data Warehouse (Snowflake, Redshift or Google BigQuery) you want to launch.
- A user with admin permissions who can access the intended cloud service account.
- Access to a cloud storage bucket (S3, Azure Blob Storage, or Google Cloud Storage) to house the transient staging files Matillion ETL uses to load data to the cloud.
- A network path to access the intended data sources. This may involve working with your network team to enable access to on-premise databases.
Before you create a stack from a CloudFormation template in AWS, you must ensure that all dependent parameters that the template requires are already defined. See our List of CloudFormation templates for links to the templates.
Launching Matillion ETL using an AWS CloudFormation template
Use the following steps to access the Hub to create and launch a Matillion ETL instance through AWS, using a CloudFormation template:
- Sign in to the Hub. Follow the steps in Matillion ETL Instance Creation. Make sure you select AWS as your cloud provider and CloudFormation Template as the delivery method for your Matillion ETL instance, then continue through the remaining steps in the Hub.
- Based on your selections, click Continue in AWS. The Hub and AWS will pre-populate the CloudFormation template for you to review. You will be redirected to the AWS console, where the Quick create stack page will be displayed.
Note
- A template can use or refer to both existing AWS parameters, and parameters declared in the template itself. AWS CloudFormation checks references to parameters in the template, and also checks references to existing parameters to ensure that they exist in the region where you are creating the stack.
- Although you may see some references to Amazon Redshift in the screenshots, this process applies regardless of the cloud data platform you choose.
- In AWS, the Stack Name will be pre-populated. You can enter a custom name for your stack if you wish.
- Specify your stack's parameters. For detailed information about these, see Specifying stack parameters.
Specifying stack parameters
The parameters at this stage will vary according to the template you selected at the beginning of this process. Use the parameter drop-down menus to complete the details in AWS for the relevant parameters below.
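When a stack is created outside the console (for example through the AWS CLI or an SDK), CloudFormation expects parameters as a list of ParameterKey/ParameterValue pairs. The following is a minimal sketch of that shape. Only KeyName is named in this guide; the other keys and all the values are hypothetical placeholders, so check the template you launched for the real parameter names.

```python
import json

# Hypothetical keys and values: only "KeyName" is named in this guide.
# Replace them with the parameter names declared in your chosen template.
selections = {
    "KeyName": "my-keypair",
    "VpcId": "vpc-0123456789abcdef0",
    "PublicSubnet": "subnet-0123456789abcdef0",
    "InboundCidr": "0.0.0.0/0",
}


def to_stack_parameters(values):
    """Convert a plain mapping into CloudFormation's ParameterKey/ParameterValue list."""
    return [
        {"ParameterKey": key, "ParameterValue": value}
        for key, value in sorted(values.items())
    ]


stack_parameters = to_stack_parameters(selections)
print(json.dumps(stack_parameters, indent=2))
```

Keeping the selections in a plain mapping makes it easy to review them against the drop-down menus described below before anything is submitted.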
Network and security configuration
Keypair name
The Keypair Name drop-down menu will be pre-populated. Select the key pair you want to add to the set of keys authorized for your intended Matillion ETL instance.
The CloudFormation template contains an input parameter, KeyName, that specifies the key pair used for the Amazon EC2 instance that is declared in the template. The template depends on the user who creates a stack from the template to supply a valid Amazon EC2 key pair for the KeyName parameter. If you supply a valid key pair name, the stack creates successfully. If you don't supply a valid key pair name, the stack is rolled back.
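For illustration, a template parameter of this kind is commonly declared with the AWS-specific parameter type AWS::EC2::KeyPair::KeyName, which asks CloudFormation to check that the supplied name exists in the account and region. The sketch below builds such a fragment as JSON. It is an assumption about how a KeyName parameter may be declared, not a copy of the Matillion template, and the Description text is invented.

```python
import json

# Illustrative template fragment only; the real Matillion template may differ.
template_fragment = {
    "Parameters": {
        "KeyName": {
            # AWS-specific parameter type for an existing EC2 key pair name.
            "Type": "AWS::EC2::KeyPair::KeyName",
            # Hypothetical description text.
            "Description": "Name of an existing EC2 key pair",
        }
    }
}

print(json.dumps(template_fragment, indent=2))
```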
Make sure you have a valid Amazon EC2 key pair and record the key pair name before you create the stack. To see your key pairs, open the Amazon EC2 console, then click Key Pairs in the navigation pane.
Note
If you don't have an Amazon EC2 key pair, you must create the key pair in the same region where you are creating the stack. For information about creating a key pair, see Create a key pair using Amazon EC2.
Primary Subnet
The Primary Subnet drop-down menu will be pre-populated. Use the drop-down menu to select an existing public subnet to launch the Matillion ETL EC2 instance into. For more information, read Subnets for your VPC.
Private Subnets
The Private Subnets drop-down menu will be pre-populated. Use the drop-down menu to select two or more private subnets across multiple Availability Zones, so they can be used by secondary resources such as Postgres failover. For more information, read Subnets for your VPC.
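The requirement for two or more subnets across multiple Availability Zones can be checked before you submit the form. The following is a minimal sketch; the subnet IDs and zone names are hypothetical, and in practice you would read them from the VPC console.

```python
# Hypothetical subnet-to-Availability-Zone mapping; replace with your own values.
subnet_zones = {
    "subnet-aaa111": "eu-west-1a",
    "subnet-bbb222": "eu-west-1b",
    "subnet-ccc333": "eu-west-1a",
}


def spans_multiple_zones(selected, zones):
    """Return True if at least two subnets are selected and they cover two or more zones."""
    distinct_zones = {zones[subnet_id] for subnet_id in selected}
    return len(selected) >= 2 and len(distinct_zones) >= 2


good_choice = spans_multiple_zones(["subnet-aaa111", "subnet-bbb222"], subnet_zones)
bad_choice = spans_multiple_zones(["subnet-aaa111", "subnet-ccc333"], subnet_zones)
print(good_choice, bad_choice)
```

The second selection fails because both subnets sit in the same zone, so a zone outage would affect both.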
Public subnet
The Public Subnet drop-down menu will be pre-populated, because the subnet was configured when you created the VPC. The public subnet must be within the VPC CIDR range. Use the drop-down menu to select a public subnet to launch the Matillion ETL EC2 instance into.
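Whether a subnet falls within the VPC CIDR range can be verified with Python's standard ipaddress module. This is a minimal sketch, and the CIDR values are example addresses rather than values taken from any Matillion template.

```python
import ipaddress


def subnet_in_vpc(subnet_cidr, vpc_cidr):
    """Return True if the subnet's address range lies entirely inside the VPC's range."""
    subnet = ipaddress.ip_network(subnet_cidr)
    vpc = ipaddress.ip_network(vpc_cidr)
    return subnet.subnet_of(vpc)


inside = subnet_in_vpc("10.0.1.0/24", "10.0.0.0/16")
outside = subnet_in_vpc("10.1.0.0/24", "10.0.0.0/16")
print(inside, outside)
```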
VPC Id
A virtual private cloud (VPC) is a virtual network dedicated to your AWS account. It is logically isolated from other virtual networks in the AWS Cloud. You can launch your AWS resources, such as Amazon EC2 instances, into your VPC.
The VPC Id drop-down menu will be pre-populated. Use the drop-down menu to select the VPC in which to create security groups. This must be the VPC containing the subnet.
VPC IPv4 CIDR
The VPC IPv4 CIDR drop-down menu is pre-populated, because the value is configured in the CloudFormation template and passed into the field. Use the drop-down menu to make your selection.
When you create a VPC, you must specify a range of IPv4 addresses for the VPC in the form of a Classless Inter-Domain Routing (CIDR) block; for example, 10.0.0.0/16. You can't specify an IPv4 CIDR block larger than /16. This is the primary CIDR block for your VPC.
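A block "larger than /16" means a shorter prefix, such as /8, which covers more addresses. The sketch below checks a candidate block with the standard ipaddress module. The /16 upper bound comes from this guide; the /28 lower bound is taken from the AWS VPC documentation and is not stated here.

```python
import ipaddress


def valid_vpc_block(cidr):
    """Return True if cidr is a well-formed IPv4 block sized between /16 and /28."""
    try:
        # Strict parsing rejects blocks with host bits set, e.g. 10.0.0.1/16.
        network = ipaddress.ip_network(cidr)
    except ValueError:
        return False
    return network.version == 4 and 16 <= network.prefixlen <= 28


print(valid_vpc_block("10.0.0.0/16"))   # the example block from this guide
print(valid_vpc_block("10.0.0.0/8"))    # too large for a VPC
print(ipaddress.ip_network("10.0.0.0/16").num_addresses)
```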
AZ 1
The AZ 1 drop-down menu is pre-populated. Use the drop-down menu to select an Availability Zone from your current region. For more information, read Regions and Zones.
Inbound IPv4 CIDR
Enter the Inbound IPv4 CIDR range for your Matillion ETL instance in the field provided. The range given for a Matillion ETL instance is 0.0.0.0/0, which matches every IPv4 address.
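To see what an inbound range permits, you can test a source address against it with the standard ipaddress module. This is a minimal sketch; the source address and the narrower range are documentation example addresses, not recommended values.

```python
import ipaddress


def allowed(source_ip, inbound_cidr):
    """Return True if source_ip falls inside the inbound CIDR range."""
    return ipaddress.ip_address(source_ip) in ipaddress.ip_network(inbound_cidr)


open_range = allowed("198.51.100.7", "0.0.0.0/0")          # matches any IPv4 address
narrow_range = allowed("198.51.100.7", "203.0.113.0/24")   # hypothetical restricted range
print(open_range, narrow_range)
```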
Note
If you have chosen to use Amazon Redshift as your cloud data platform, you must configure a Redshift Cluster as part of your Stack. To complete this, refer to Redshift Configuration for Matillion ETL.
Capabilities
- Tick the Capabilities checkbox at the bottom of the page to confirm you acknowledge that AWS CloudFormation might create IAM resources.
- Click Create Stack. Upon completion, launch your Matillion ETL instance.
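Ticking the Capabilities checkbox in the console corresponds to passing a capabilities acknowledgement when a stack is created through the CloudFormation API. The sketch below only assembles the request arguments and does not call AWS. The stack name, template URL, and key pair name are placeholders, and CAPABILITY_IAM is an assumption: templates that create custom-named IAM resources need CAPABILITY_NAMED_IAM instead, and this guide does not say which applies.

```python
def build_create_stack_request(stack_name, template_url, parameters):
    """Assemble the arguments for a CloudFormation CreateStack request."""
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        "Parameters": parameters,
        # Acknowledges that the stack might create IAM resources (assumed value).
        "Capabilities": ["CAPABILITY_IAM"],
    }


request = build_create_stack_request(
    "matillion-etl-example",                          # placeholder stack name
    "https://example.com/placeholder-template.yaml",  # placeholder, not a real template URL
    [{"ParameterKey": "KeyName", "ParameterValue": "my-keypair"}],
)
print(request["Capabilities"])
```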
Next steps
For the next steps, refer to the following documentation for further guidance: