Configuring a high availability cluster (Azure)
Overview
This guide explains how to configure a clustered, High Availability (HA) Matillion ETL instance with an external PostgreSQL database on Microsoft Azure. Configuring HA allows Matillion ETL to continue running even if certain components fail.
Prerequisites
Before following the HA configuration process, you must create:
- One Azure Database for PostgreSQL. Alternatively, you can use a different PostgreSQL database.
- Two Matillion ETL instances. Make sure both have Enterprise mode enabled. For more information, read Instance Sizes Guide.
- A load balancer. This provides high availability by distributing incoming traffic across your Matillion ETL instances, and lets you point users to one or more endpoints. You can use any of several load balancers, but an Azure Application Gateway is recommended. For more information, read the Azure Application Gateway documentation.
Note
Session "stickiness" in an application gateway lasts until the user closes the browser; unlike an AWS Application Load Balancer (ALB), it has no timeout. You can use another load balancer if this does not meet your requirements.
Note
Your Matillion ETL instances must be in the same region, because cross-region HA instances can suffer from network latency issues.
Configuring the database
You need to configure your two Matillion ETL instances by pointing their database configuration to the Azure Database for PostgreSQL server that you created earlier.
For these servers to communicate with each other, ports 5701-5703 must be open on your Matillion ETL instances, and port 5432 must be open for the PostgreSQL server. Read Matillion ETL access ports to learn more.
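As a rough way to confirm that the ports above are reachable, the following sketch uses bash's built-in /dev/tcp redirection. The function name check_port and the bracketed host placeholders are illustrative assumptions, not part of Matillion ETL. A "closed" result can also mean that nothing is listening on that port yet, so treat it as a hint rather than proof of a firewall problem.

```shell
#!/usr/bin/env bash
# Sketch: report whether a TCP port accepts connections.
# check_port is a hypothetical helper; hosts below are placeholders.

check_port() {
  local host="$1" port="$2"
  # timeout (coreutils) bounds the attempt to 3 seconds; the child bash
  # opens a socket via /dev/tcp and closes it again when it exits.
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "open   ${host}:${port}"
  else
    echo "closed ${host}:${port}"
    return 1
  fi
}

# Example usage (replace the placeholders before running):
# for port in 5701 5702 5703; do check_port "[OTHER INSTANCE IP]" "$port"; done
# check_port "[SERVER NAME]" 5432
```

Run the instance-to-instance checks from each Matillion ETL instance in turn, since a rule may allow traffic in only one direction.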
Follow these steps to configure your database.
- Start by connecting to your Matillion ETL instances via SSH.
-
Once you establish a connection, run the following command to switch to a root shell with administrator privileges:
sudo -i
-
Open the following file in a text editor:
/usr/share/emerald/WEB-INF/classes/Emerald.properties
-
Comment out the existing PERSISTENCE_ lines:
#PERSISTENCE_STORE_NAME=postgres
#PERSISTENCE_USERNAME_POSTGRES=postgres
#PERSISTENCE_PASSWORD_POSTGRES=postgres
-
To generate the base64-encoded password, run the following command in your terminal, substituting [PASSWORD] with the PostgreSQL password:
echo -n '[PASSWORD]' | base64
-
Add the following lines to the end of the Emerald.properties file:
PERSISTENCE_STORE_NAME=postgres
PERSISTENCE_USERNAME_POSTGRES=[ADMIN USERNAME]
PERSISTENCE_PASSWORD_POSTGRES={enc:base64}[BASE64PASSWORD]
PERSISTENCE_URL_POSTGRES=jdbc:postgresql://[SERVER NAME]:5432/postgres?ssl=true&sslmode=require
The following is an example:
MTLN_PERSISTENCE_STORE_NAME=postgres
MTLN_PERSISTENCE_USERNAME_POSTGRES=username@shorthostname
MTLN_PERSISTENCE_PASSWORD_POSTGRES={enc:base64}bjMxMEthR2frght
MTLN_PERSISTENCE_URL_POSTGRES=jdbc:postgresql://shorthostname.postgres.database.azure.com:5432/postgres?ssl=true&sslmode=require
Note
The value that follows PERSISTENCE_PASSWORD_POSTGRES={enc:base64} is the base64-encoded password generated in step 5.
-
Use the [PASSWORD] you set when you created the database earlier. To look up the [ADMIN USERNAME] and [SERVER NAME] values, or to reset your database password, log in to the Azure Portal. In the portal search bar, type Azure Database for PostgreSQL servers, and select this service. Click your database server, and then click Overview to view these server details.
-
Once you have saved your changes to Emerald.properties, restart your Matillion ETL instances by running the following command on each instance:
systemctl restart tomcat
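The file edits in steps 4 to 6 can also be scripted. The following is a minimal sketch, assuming a Linux instance with the standard sed and base64 tools and the PERSISTENCE_ key names shown in step 6. The function name configure_persistence and the .bak backup file are illustrative assumptions, not part of Matillion ETL. Review the result before restarting Tomcat.

```shell
#!/usr/bin/env bash
# Sketch of steps 4-6: comment out the old PERSISTENCE_ lines and append new ones.
# configure_persistence is a hypothetical helper; run it as root on each instance.

configure_persistence() {
  local props="$1" user="$2" password="$3" server="$4"
  local encoded

  # printf writes no trailing newline (like `echo -n`); tr strips any
  # line wrapping that base64 adds for long input.
  encoded="$(printf '%s' "$password" | base64 | tr -d '\n')" || return 1

  # Keep a backup, then rewrite the file with active PERSISTENCE_ lines commented out.
  cp "$props" "${props}.bak" || return 1
  sed 's/^PERSISTENCE_/#PERSISTENCE_/' "${props}.bak" > "$props" || return 1

  # Append the new settings; the leading blank line guards against a file
  # that does not end with a newline.
  cat >> "$props" <<EOF

PERSISTENCE_STORE_NAME=postgres
PERSISTENCE_USERNAME_POSTGRES=${user}
PERSISTENCE_PASSWORD_POSTGRES={enc:base64}${encoded}
PERSISTENCE_URL_POSTGRES=jdbc:postgresql://${server}:5432/postgres?ssl=true&sslmode=require
EOF
}

# Example usage (replace the placeholders before running):
# configure_persistence /usr/share/emerald/WEB-INF/classes/Emerald.properties \
#   '[ADMIN USERNAME]' '[PASSWORD]' '[SERVER NAME]'
# systemctl restart tomcat
```

Passing the password as an argument may leave it in your shell history, so consider reading it from a prompt or a protected file instead.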