Configuring a high availability cluster (Azure)
This guide explains how to configure a High Availability (HA) Matillion ETL cluster with an external PostgreSQL database on Microsoft Azure. With HA configured, your Matillion ETL instance continues running even if certain components fail.
Prerequisites
Before following the HA configuration process, you will be required to create:
- One Azure Database for PostgreSQL - Flexible Server. Alternatively, you can use a different PostgreSQL database.
Note
Matillion ETL relies on a stable connection to its backend PostgreSQL database. Disruption to this connection may cause an application outage.
- Two Matillion ETL instances. Make sure both have Enterprise mode enabled. For more information, read Instance Sizes Guide.
- A load balancer. This provides high availability by distributing incoming traffic across your Matillion ETL instances, and lets you point users to one or more endpoints. Several types of load balancer are available. We recommend an Azure Application Gateway. For more information, read the Azure Application Gateway documentation.
Note
In an Azure Application Gateway, session "stickiness" ends when the user closes their browser. Unlike an AWS Application Load Balancer (ALB), it doesn't support a configurable timeout. If this behavior doesn't meet your needs, you can use an alternative load balancer.
When setting up the Azure Application Gateway, make sure that Override with new host name is disabled under Backend settings. You'll also need to update the SSL certificates on each node to include the Application Gateway domain.
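One way to confirm that a node's certificate includes the Application Gateway domain is to inspect its subject alternative names with openssl. The following is a minimal sketch, not an official Matillion procedure: the function name cert_covers_domain, the certificate path, and the domain gateway.example.com are all placeholders, and the check looks only for an exact DNS entry, so a wildcard certificate would need a different comparison.

```shell
# cert_covers_domain PEM_FILE DOMAIN
# Hypothetical helper: succeeds if the certificate in PEM_FILE lists DOMAIN
# as an exact DNS entry in its subject alternative names.
cert_covers_domain() {
  cert_file="$1"
  domain="$2"
  # Print the certificate as text, put each comma-separated SAN entry on its
  # own line, trim surrounding whitespace, then look for an exact match.
  openssl x509 -in "$cert_file" -noout -text 2>/dev/null \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -Fxq "DNS:$domain"
}

# Example usage (placeholder path and domain):
# cert_covers_domain /path/to/node-certificate.pem gateway.example.com \
#   && echo "certificate includes the gateway domain"
```

Run the check on each node after updating its certificate.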
Note
Your Matillion ETL instances must be in the same region, because network latency issues can occur when HA instances are cross-region.
Configuring the database
You need to configure your two Matillion ETL instances by pointing their database configuration to the Azure Database for PostgreSQL server that you created earlier.
For these servers to communicate with each other, ports 5701-5703 must be open on your Matillion ETL instances, and port 5432 must be open on the PostgreSQL server. Read Matillion ETL access ports to learn more.
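Before editing any configuration, it can help to confirm that these ports are reachable from each node. The following is a minimal sketch that assumes bash and the GNU timeout command are available on the instance; the function names port_open and check_ha_ports and the addresses in the usage comment are placeholders. Note that a port reports as closed when nothing is listening on it yet, so a failed check does not always point to a firewall rule.

```shell
# port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened within 3 seconds.
# Uses bash's /dev/tcp redirection, so bash must be installed.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# check_ha_ports PEER_HOST DB_HOST
# Checks ports 5701-5703 on the other Matillion ETL node and port 5432 on the
# PostgreSQL server. Returns non-zero if any port is unreachable.
check_ha_ports() {
  peer_host="$1"
  db_host="$2"
  status=0
  for target in "$peer_host:5701" "$peer_host:5702" "$peer_host:5703" "$db_host:5432"; do
    host="${target%:*}"
    port="${target##*:}"
    if port_open "$host" "$port"; then
      echo "open:   $target"
    else
      echo "closed: $target"
      status=1
    fi
  done
  return "$status"
}

# Example usage (placeholder addresses), run from each Matillion ETL node:
# check_ha_ports 10.0.0.5 shorthostname.postgres.database.azure.com
```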
Follow these steps to configure your database.
- Start by connecting to your Matillion ETL instances via SSH.
- Once you establish a connection, run the following command to switch to a root shell with administrator privileges:
sudo -i
- Navigate to the following file:
/usr/share/emerald/WEB-INF/classes/Emerald.properties
- Comment out the PERSISTENCE_ lines:
#PERSISTENCE_STORE_NAME=postgres
#PERSISTENCE_USERNAME_POSTGRES=postgres
#PERSISTENCE_PASSWORD_POSTGRES=postgres
- To generate the base64-encoded password, run the following command in your terminal, substituting [PASSWORD] with the PostgreSQL password:
echo -n '[PASSWORD]' | base64
- Add the following lines to the end of the Emerald.properties file:
PERSISTENCE_STORE_NAME=postgres
PERSISTENCE_USERNAME_POSTGRES=[ADMIN USERNAME]
PERSISTENCE_PASSWORD_POSTGRES={enc:base64}[BASE64PASSWORD]
PERSISTENCE_URL_POSTGRES=jdbc:postgresql://[SERVER NAME]:5432/postgres?ssl=true&sslmode=require
The following is an example for user reference:
PERSISTENCE_STORE_NAME=postgres
PERSISTENCE_USERNAME_POSTGRES=username@shorthostname
PERSISTENCE_PASSWORD_POSTGRES={enc:base64}bjMxMEthR2frght
PERSISTENCE_URL_POSTGRES=jdbc:postgresql://shorthostname.postgres.database.azure.com:5432/postgres?ssl=true&sslmode=require
Note
The value that follows PERSISTENCE_PASSWORD_POSTGRES={enc:base64} is the base64 password retrieved in step 5.
- Use the [PASSWORD] you set when you created the database earlier. To retrieve the [ADMIN USERNAME] and [SERVER NAME] values, or to reset your database password, log in to the Azure Portal. Within the portal, search for Azure Database for PostgreSQL servers, and select this service. Click into one of your databases, and then click Overview to view these server details.
- Once the Emerald.properties file has been updated, restart your Matillion ETL instances by running the following command:
systemctl restart tomcat
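The file edits in the steps above (commenting out the existing PERSISTENCE_ lines, encoding the password, and appending the new settings) can be sketched as a single shell function. This is an illustrative sketch, not a Matillion-supplied script: the function name configure_persistence and its argument order are hypothetical, and it assumes that every line beginning with PERSISTENCE_ should be commented out. It also keeps a .bak copy of the original file, which the steps above do not mention.

```shell
# configure_persistence FILE ADMIN_USER PASSWORD SERVER_NAME
# Hypothetical helper that applies the Emerald.properties edits to FILE.
configure_persistence() {
  props_file="$1"
  admin_user="$2"
  db_password="$3"
  server_name="$4"

  # Keep a copy of the original file so the change can be rolled back.
  cp -p "$props_file" "$props_file.bak" || return 1

  # Comment out every line that starts with PERSISTENCE_.
  sed 's/^PERSISTENCE_/#&/' "$props_file.bak" > "$props_file" || return 1

  # Make sure the file ends with a newline before appending to it.
  if [ -n "$(tail -c 1 "$props_file")" ]; then
    echo >> "$props_file"
  fi

  # Base64-encode the password. printf avoids a trailing newline in the
  # input, and tr removes the line wrapping base64 adds to long output.
  encoded=$(printf '%s' "$db_password" | base64 | tr -d '\n')

  # Append the new settings to the end of the file.
  cat >> "$props_file" <<EOF
PERSISTENCE_STORE_NAME=postgres
PERSISTENCE_USERNAME_POSTGRES=$admin_user
PERSISTENCE_PASSWORD_POSTGRES={enc:base64}$encoded
PERSISTENCE_URL_POSTGRES=jdbc:postgresql://$server_name:5432/postgres?ssl=true&sslmode=require
EOF
}

# Example usage as root on each instance (substitute the bracketed values):
# configure_persistence /usr/share/emerald/WEB-INF/classes/Emerald.properties \
#   '[ADMIN USERNAME]' '[PASSWORD]' '[SERVER NAME]'
# systemctl restart tomcat
```

Single-quote the password argument so that the shell does not interpret any special characters in it, and review the resulting file before restarting Tomcat.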