Skip to content

Connecting to an external PostgreSQL database

Matillion ETL stores metadata in a PostgreSQL database, which is automatically installed with the Matillion ETL instance.

Optionally, you can elect to use your own PostgreSQL database, external to the Matillion ETL instance. This may be useful if you want to separate the application and underlying data storage in an existing environment, so they can be maintained (backed up, restored, archived etc.) independently.

Read Software Versions for details of which PostgreSQL versions are supported.

Warning

The process of connecting to an external database requires the creation of a temporary instance of Matillion ETL. Depending on your licensing, this temporary instance may incur Matillion charges in addition to the cost of the cloud resources used (virtual machine, hard disk, etc.).


Prerequisites

Your original and temporary instances must both be eligible for Enterprise licensing to take advantage of external Matillion ETL data storage.

The external database must be network-accessible from the Matillion ETL instance. Specifically, open outbound networking on the PostgreSQL service's port (5432 by default).


Configuring the connection

  1. Create a new PostgreSQL database instance on an external virtual machine or in a database service in your cloud platform, ensuring that it's network-accessible to your Matillion ETL instance.
  2. Create a new temporary Matillion ETL instance of the same version as your primary instance.

    Note

    If you are unable to locate a machine image with a matching version, you can create an instance of a lower version and upgrade it to the specific version of the original instance.

  3. Configure the temporary Matillion ETL instance to use external persistence, as described in Setting up an external connection to a Matillion database.

  4. Migrate content from the original instance to the new instance, as described in Migration.
  5. Shut down the new instance.
  6. Reconfigure the original instance to use the external database, as described in Setting up an external connection to a Matillion database.
  7. Test your instance by logging in, browsing a few projects, and running a few sample jobs. If you observe any unexpected results, revert the original instance to use the onboard PostgreSQL server and attempt to migrate any missing content.
  8. Permanently disable the onboard PostgreSQL database in the original instance, as follows:

    1. SSH to the Matillion ETL instance as the root user.
    2. List the PostgreSQL service details, noting the version:

      systemctl status | grep -E "postgres.*.service"
      
    3. Stop the PostgreSQL service (if not already):

      systemctl stop postgresql-<version>
      

      Where <version> is the version you noted above.

    4. Open the Matillion ETL login page for the instance to ensure the application is still running and able to use the external PostgreSQL database.

    5. Disable the service permanently, providing the version noted above.

      systemctl disable postgresql-<version>
      

      Where <version> is the version you noted above.

  9. Shut down and terminate the temporary instance.

  10. You should create firewall rules for the external Postgres database, minimizing which network connections to it can be made. The default Postgres port is 5432. Connections only need to be made from the Matillion ETL nodes, and occasionally from designated database administration machines. All other ports can be locked down.