Installing Matillion ETL using the Universal Installer (RPM install)

Matillion ETL is distributed as a preconfigured platform image for each supported combination of cloud platform and cloud data warehouse. Each image contains all of the services and configuration needed to use the product. However, a preconfigured image may not be the most suitable way to deploy Matillion ETL in your environment. If you have particular requirements, you can build your own image. These requirements may include:

  • A desire to use a different base operating system than our default.
  • Additional operating system services or security hardening features—typically additional logging or monitoring functionality.
  • Features not exposed in our preconfigured images, such as particular partitioning requirements.

We have made our Universal Installer (URPM) repository publicly accessible. It is the same repository that we use for preparing our preconfigured images.

It is not possible to use marketplace licensing with URPM images. All images must be either bring your own license (BYOL) or Hub licensed. For simple Hub images, we suggest using the subscription manager in Hub to select your preferred cloud provider and cloud data warehouse. For further information, read Hub - Instance Creation.

To update Matillion ETL to a specific version, read Updating to a specific release.

Note

We recommend opening port 443 for outgoing traffic only for the duration of the installation.


Select a base virtual machine

You will need to have a base virtual machine image available. Matillion ETL can be installed directly 'on the metal', straight onto an unvirtualized machine, but this is a rare use case and we assume that you are not going to do this. Matillion verifies URPM installations on the following base operating systems:

  • Amazon Linux 2
  • CentOS Linux 7 (default before version 1.68)
  • Red Hat Enterprise Linux 7

Since the release of version 1.69, the following are also supported:

  • CentOS Stream 8 (default after version 1.69)
  • Red Hat Enterprise Linux 8
  • Rocky Linux 8

Since the release of version 1.75, the following are also supported:

  • CentOS Stream 9 (default after version 1.75)
  • Red Hat Enterprise Linux 9
  • Rocky Linux 9

Other operating systems may also install and run correctly. Matillion ETL is distributed as an RPM package, so Yum-based operating systems should be able to carry out all of the installation steps. However, running on an unverified operating system may make it harder for us to respond to your support cases.
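As a rough illustration of the lists above, the following sketch compares the values in /etc/os-release against the verified base operating systems. The function name is_verified_os is hypothetical, and the ID values (amzn, centos, rhel, rocky) are assumptions about what each distribution reports, so confirm them on your own image. The sketch does not take the Matillion ETL version into account, so the version 8 and version 9 entries only apply from 1.69 and 1.75 respectively.

```shell
# Sketch only: is_verified_os is a hypothetical helper, not part of the installer.
# Usage: is_verified_os ID VERSION_ID  (exit status 0 if the pair is on the verified list)
is_verified_os() {
    os_id=$1
    os_major=${2%%.*}   # keep only the major version, for example 8.9 becomes 8
    case "$os_id:$os_major" in
        amzn:2) return 0 ;;                       # Amazon Linux 2
        centos:7|centos:8|centos:9) return 0 ;;   # CentOS Linux 7, CentOS Stream 8 and 9
        rhel:7|rhel:8|rhel:9) return 0 ;;         # Red Hat Enterprise Linux 7, 8 and 9
        rocky:8|rocky:9) return 0 ;;              # Rocky Linux 8 and 9
        *) return 1 ;;
    esac
}

# Check the current machine, in a subshell so os-release variables do not leak
if [ -r /etc/os-release ]; then
    (
        . /etc/os-release
        if is_verified_os "${ID:-unknown}" "${VERSION_ID:-unknown}"; then
            echo "verified base OS: ${ID:-unknown} ${VERSION_ID:-unknown}"
        else
            echo "unverified base OS: ${ID:-unknown} ${VERSION_ID:-unknown}"
        fi
    )
fi
```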

Our instance size recommendations also apply to your own base image. The absolute minimum is a CPU with 4 threads, 8 GB of RAM, and a disk image of at least 16 GB. Your use case may require additional resources, and we recommend provisioning more than this minimum.
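The minimums above can be checked before installing with a short script such as the sketch below. The function name meets_minimums and the slack thresholds are assumptions: an instance sold as 8 GB usually reports slightly less usable memory, and a 16 GB disk yields a slightly smaller filesystem, so the comparison values are set a little below the nominal sizes. Only the root filesystem is measured.

```shell
# Sketch only: meets_minimums is a hypothetical helper, not part of the installer.
# Usage: meets_minimums CPU_THREADS MEM_TOTAL_KB DISK_TOTAL_KB
meets_minimums() {
    cpu_threads=$1
    mem_total_kb=$2
    disk_total_kb=$3
    result=0
    if [ "$cpu_threads" -lt 4 ]; then
        echo "CPU threads: found $cpu_threads, need at least 4"
        result=1
    fi
    # Assumed threshold: a little under 8 GB to allow for reserved memory
    if [ "$mem_total_kb" -lt 7500000 ]; then
        echo "RAM: found ${mem_total_kb} kB, need about 8 GB"
        result=1
    fi
    # Assumed threshold: a little under 16 GB to allow for filesystem overhead
    if [ "$disk_total_kb" -lt 15000000 ]; then
        echo "Disk: found ${disk_total_kb} kB, need at least 16 GB"
        result=1
    fi
    return "$result"
}

# Measure the current machine (Linux only, because it reads /proc/meminfo)
if [ -r /proc/meminfo ]; then
    live_threads=$(getconf _NPROCESSORS_ONLN)
    live_mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    live_disk_kb=$(df -Pk / | awk 'NR==2 {print $2}')
    if meets_minimums "$live_threads" "$live_mem_kb" "$live_disk_kb"; then
        echo "minimum resources met"
    else
        echo "below the documented minimum"
    fi
fi
```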

Log in to your virtual machine with an administrator account to continue.

Configure the installation process

The installation process itself is configured using an installation.vars file. This specifies the versions of services that are going to be installed, as well as some additional configuration options. Usually, the default settings are sufficient, but you can change them as you wish.

To download the default installation.vars file, run the following command:

curl https://bitbucket.org/matillion/matillion-etl-system-configuration/raw/production/example-config/installation.vars --output /tmp/installation.vars

For example, to change the versions of Python that are installed, find the following line:

# PYTHON_VERSIONS: 2.7.18 3.12.2

To skip installation of Python 2 completely and use the latest version of Python 3.10 instead (perhaps you have in-house modules that depend on this version), uncomment the line and change it to:

PYTHON_VERSIONS: 3.10.14
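If you prefer to script this change rather than edit the file by hand, a sed substitution along the following lines should work. This is a sketch that assumes the line looks exactly like the example above; set_python_versions is a hypothetical helper name. A backup copy with a .bak suffix is left next to the edited file.

```shell
# Sketch only: set_python_versions is a hypothetical helper, not part of the installer.
# Usage: set_python_versions FILE "VERSIONS"
set_python_versions() {
    vars_file=$1
    versions=$2
    # Match the line whether or not it is still commented out, then rewrite it
    sed -i.bak -E "s/^#?[[:space:]]*PYTHON_VERSIONS:.*/PYTHON_VERSIONS: ${versions}/" "$vars_file"
}

# Apply the change from the example above if the file has been downloaded
if [ -f /tmp/installation.vars ]; then
    set_python_versions /tmp/installation.vars "3.10.14"
    grep '^PYTHON_VERSIONS:' /tmp/installation.vars
fi
```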

Configure the Matillion ETL basic properties

The Emerald.properties file (which will be installed at /usr/share/emerald/WEB-INF/classes/Emerald.properties following installation) defines a number of Matillion ETL configuration options. At a minimum, it must contain entries for your chosen combination of cloud platform and cloud data warehouse and, if you are using Hub billing, the entries that enable it.

It can also contain any other environment variables that can be used in Matillion ETL and that you would usually set on your instances, as per the other documentation on this site.

PLATFORM_TYPE=one of [ aws | azure | gcp ]
DATABASE_ENVIRONMENT=one of [ bigquery | deltalake | redshift | snowflake | synapse ]

# details on the Postgres persistence store, which default to the following:
PERSISTENCE_PASSWORD_POSTGRES=postgres
PERSISTENCE_USERNAME_POSTGRES=postgres
PERSISTENCE_STORE_NAME=postgres

# if billing is to be enabled, it will also contain the following:
SEC_ENABLE_SAAS_BILLING=true
BILLING_API_BASE_URL=https://api.billing.matillion.com
BILLING_URL=https://billing.matillion.com

We have prepared 'default' examples of these for every supported cloud platform and cloud data warehouse combination:

Warehouse                  AWS         Azure       GCP
Snowflake                  BYOL, Hub   BYOL, Hub   BYOL, Hub
Delta Lake on Databricks   BYOL, Hub   BYOL, Hub   n/a
Redshift                   BYOL, Hub   n/a         n/a
BigQuery                   n/a         n/a         BYOL, Hub
Synapse                    n/a         BYOL, Hub   n/a

Save the example for your combination as /tmp/Emerald.properties. For instance, to save the default configuration for AWS Snowflake using Hub licensing, copy the corresponding link from the table above and run a command like this:

curl https://bitbucket.org/matillion/matillion-etl-system-configuration/raw/production/example-config/Emerald.properties-AWS-SF-HUB --output /tmp/Emerald.properties
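Before continuing, it may be worth confirming that the saved file names a platform and warehouse combination from the table above. The sketch below does this with standard tools. The helper names get_property and check_emerald_properties are hypothetical, and the sketch assumes the values are written in lowercase exactly as listed earlier and that a later duplicate key overrides an earlier one.

```shell
# Sketch only: both helpers are hypothetical and not part of the installer.
# Usage: get_property FILE KEY  (prints the value of the last KEY=value line)
get_property() {
    tr -d '\r' < "$1" | sed -n "s/^$2=//p" | tail -n 1
}

# Usage: check_emerald_properties FILE  (exit status 0 for a supported combination)
check_emerald_properties() {
    props_file=$1
    platform=$(get_property "$props_file" PLATFORM_TYPE)
    warehouse=$(get_property "$props_file" DATABASE_ENVIRONMENT)
    case "$platform:$warehouse" in
        aws:snowflake|azure:snowflake|gcp:snowflake) ;;
        aws:deltalake|azure:deltalake) ;;
        aws:redshift) ;;
        gcp:bigquery) ;;
        azure:synapse) ;;
        *)
            echo "unsupported or missing combination: platform='$platform' warehouse='$warehouse'"
            return 1
            ;;
    esac
    echo "platform=$platform warehouse=$warehouse"
}

# Check the saved file if it exists
if [ -f /tmp/Emerald.properties ]; then
    check_emerald_properties /tmp/Emerald.properties || echo "review /tmp/Emerald.properties"
fi
```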

Execute the install script

Finally, run the following commands to download and run the installation script:

curl https://bitbucket.org/matillion/matillion-etl-system-configuration/raw/production/orchestrate-install.sh --output /tmp/orchestrate-install.sh
sudo bash /tmp/orchestrate-install.sh /tmp/installation.vars

Depending on the instance size and installation options that you have selected, the installer script typically takes between twenty and sixty minutes to run. Once it has completed, your Matillion ETL instance will be ready to use.
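Because the installer runs for a long time, it can help to confirm that the files prepared in the earlier steps are in place before starting it. The following is a minimal sketch; require_inputs is a hypothetical helper, and it assumes the installer expects the files at the /tmp paths used on this page.

```shell
# Sketch only: require_inputs is a hypothetical helper, not part of the installer.
# Usage: require_inputs FILE...  (exit status 0 only if every file exists and is non-empty)
require_inputs() {
    for input_file in "$@"; do
        if [ ! -s "$input_file" ]; then
            echo "missing or empty: $input_file" >&2
            return 1
        fi
    done
    return 0
}

# Example: only start the installer when all three files are present
# require_inputs /tmp/installation.vars /tmp/Emerald.properties /tmp/orchestrate-install.sh \
#     && sudo bash /tmp/orchestrate-install.sh /tmp/installation.vars
```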


Send the universal installer output to a file on disk

If you need support, capturing the output of the Universal Installer helps our support team work with you to resolve the issue.

These steps can also be found here.

  1. When executing the installer's orchestration shell script, orchestrate-install.sh, pipe the standard output of the script (and child Ansible playbook) to a file in the desired location. In the following example, the installer log file can be located at /var/log/metl-installer.log.

    sudo bash /tmp/orchestrate-install.sh /tmp/installation.vars | sudo tee /var/log/metl-installer.log
    
  2. If necessary, review the log for signs of a critical failure and remediate where possible.

  3. If assistance is required, submit a support case at https://support.matillion.com and include the log file for review.
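A plain pipe to tee records only standard output and hides the installer's exit status. If you also want error messages in the log and a reliable exit status, a wrapper along these lines is one option. It is a sketch: run_and_log is a hypothetical helper, and the log path in the usage comment is an example chosen because it is writable without root.

```shell
# Sketch only: run_and_log is a hypothetical helper, not part of the installer.
# Usage: run_and_log LOGFILE COMMAND [ARGS...]
run_and_log() {
    log_file=$1
    shift
    status_file=$(mktemp)
    # Merge stderr into stdout so errors reach the log, and save the command's
    # exit status to a file because the pipe would otherwise report tee's status
    { "$@" 2>&1; echo "$?" > "$status_file"; } | tee "$log_file"
    status=$(cat "$status_file")
    rm -f "$status_file"
    return "$status"
}

# Example:
# run_and_log "$HOME/metl-installer.log" sudo bash /tmp/orchestrate-install.sh /tmp/installation.vars
# echo "installer exit status: $?"
```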