Adding a third-party JDBC driver
Overview
Matillion ETL's Database Query uses JDBC drivers to query relational databases that are acting as data sources for your application.
A number of commonly used JDBC drivers are provided with Matillion ETL, and can be used out of the box. Some others, notably Oracle, Teradata, DB2, and MySQL, have licencing restrictions that require you to manually upload an individual copy of the JDBC driver to Matillion ETL. This can be done by clicking Admin → Manage Database Drivers.
Many other JDBC drivers can be used with Matillion ETL, requiring extra steps that are described in this document.
Matillion ETL ships with a certain version of the PostgreSQL JDBC driver (currently 9.4 as of June 2021). You cannot force Matillion ETL to use a newer version of this particular JDBC driver.
Downloading your JDBC driver
Acquire a suitable JDBC driver for your source database. This will be in the form of one or more .jar files that you can download. The download package may actually be a .zip or .tgz file, in which case you will need to un-zip or decompress it to extract the individual .jar files.
RDBMS vendors often offer many different drivers, in which case you should look for:
- JDBC Type 4, sometimes known as "Pure Java" or "Direct to Database", as other types of driver require additional client software and will not work.
- Compatible with Java 7, sometimes known as JDBC version 4, as some drivers are simply provided in the form of a single .jar file, which is ideal.
Some drivers are implemented as multiple .jar files, and you will need to download all of them to use the driver.
If a driver has further non-JDBC dependencies, such as logging libraries, the RDBMS vendor should provide documentation for these cases. You must download all of the dependency .jar files before continuing.
The RDBMS vendor should also provide documentation on some of the main features of the driver, such as the Java class name, and the URL format. You will need this information to configure Matillion ETL.
Configuring parameters for third-party drivers
The list of JDBC drivers that you can see in the Database Type property of the Database Query component is governed by a single configuration file. To add a new entry to this list, you will need to log in to the Matillion ETL instance and edit the file to add parameters for the driver you are adding.
- SSH into your Matillion ETL instance as root, locate the file
jdbc-providers.properties
, and edit it:
sudo su -
cd /usr/share/emerald/WEB-INF/classes
vi jdbc-providers.properties
-
The file is made up of a single JSON array, with one entry per supported database. Add a new entry, remembering to end the new entry with a comma if it's partway down the list. Follow the syntax of one of the existing entries and set values for:
-
name: The display name.
- driver: The Java class name of the driver, which should be in the vendor's documentation.
- url: A sample URL, used for documentation only.
- allowUpload: Always set this to true.
- fetchSize: Set this to 500 initially, but the value can be tuned.
-
limit: This is a string that Matillion ETL uses internally to create an SQL command for sampling. Accepted values include:
- fetch-first-n: Uses a JDBC loop (this is the best default option).
- limit-inline: Uses the LIMIT keyword.
- limit-outer: Uses LIMIT outside a nested SELECT.
- rownum: Uses a ROWNUM rowstop (Oracle style).
- top-n: Uses the TOP keyword (SQL Server style).
-
Depending on the driver, additional properties may also be set, including:
-
autocommit: Normally true, but can be set to false.
-
defaultProperties: Another JSON array with properties that are meaningful to the driver in use.
-
Once you have finished editing the file, it's worth checking that the JSON syntax is still:
cat jdbc-providers.properties | jq "."
The command will fail with an error message if the JSON is invalid. Note that syntax errors in the file will prevent the Matillion ETL instance from restarting.
Restart Matillion ETL
After editing the jdbc-providers.properties
file, restart the Matillion ETL instance using the following command:
sudo service tomcat restart
Wait a few minutes while the new configuration comes into effect.
Upload the JAR files into Matillion ETL
- Once Matillion ETL has restarted, click Admin → Manage Database Drivers. This should open a dialog that allows .jar files to be uploaded for all of the JDBC drivers that have the allowUpload property set to true.
- Upload the .jar files one at a time, remembering to include any dependency-related .jar files.
- Once all of the .jar files are uploaded, everything required to begin to use the new JDBC driver is in place.
Test the Database Query component
- Drag a Database Query component onto an orchestration job, and open its Database Type dialog box. The new driver should be available for selection.
- The example URL should help you to configure the correct Connection URL for your database, normally by configuring the host, port, and database name.
- The SQL syntax of the SQL Query statement is in the dialect of your new source database.
Troubleshooting
These common issues are caused by incorrect driver configuration:
- The Database Query component won't accept a selection, instead reverting to the default setting.
- The component complains that "Parameters contain errors".
- Errors occur during execution of the component, even though the SQL is valid.
- If an "Unable to find driver of database type [x]" error is displayed, ensure the URL in the
jdbc-providers.properties
file is a valid connection string pattern.
If these errors occur, acquire the Matillion ETL server log and contact support. The server log is a file called catalina.out, and can be obtained from Matillion ETL by clicking Admin → Download Server Log.