SCM integration
Overview
For many customers, the best-practice topology requires multiple instances of Matillion ETL (for development, testing and live) combined with a Source Control Management (SCM) system and a build server or script environment to manage migration between instances. This allows Matillion ETL to support DevOps scenarios where all configuration is held centrally.
Example
The following steps demonstrate a typical application of SCM to Matillion ETL. The example uses the following job:
Group: Matillion
Project: Marketplace Data Warehouse
Job: Load Marketplace Data
Version: BW_21-7-2016
Please Note
In the example, Git is used. However, the steps are generic and can be adapted to almost any SCM system.
Parameters and Description
Below is the list of endpoint parameters and their brief description:
Parameter | Description |
---|---|
<InstanceURI> | The server IP address or domain name of the Matillion ETL instance |
<GroupName> | The name of the group within which the Matillion ETL project is located |
<ProjectName> | The name of the project within which the Matillion ETL job is located |
<VersionName> | The version of the Matillion ETL job |
<FileName> | The name of the file into which the Matillion ETL project will be exported |
Step 1: Export a Project
Exporting all the data for a given version is simple: identify the version to export and ensure all the jobs are included in the result using export=true. This is done via a standard REST-based API using GET.
Additionally, if required, the JSON output can be pretty-printed and saved to a file by adding | jq '.' > and the target file name after the URL.
https://<InstanceURI>/rest/v0/projects?groupName=<GroupName>&projectName=<ProjectName>&versionName=<VersionName>&export=true | jq '.' > <FileName>.json
In this example, this would be interpreted as follows:
https://10.12.1.28/rest/v0/projects?groupName=Matillion&projectName=Marketplace%20Data%20Warehouse&versionName=BW_21-7-2016&export=true | jq '.' > MatillionMarketplace.json
Please Note
- The username and password of a user with access to the Project within the Matillion ETL instance will need to be used to authenticate the connection. These may need to be specified before the command when using a command line tool, for example: -u user:password
- In some cases, the URL may need to be placed inside quotes ("") to protect & characters from being misinterpreted by the shell.
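As a minimal sketch of the export call, the request might be wrapped as shown below. It assumes curl is the command line tool (the page only shows the -u option, so this is an assumption) and that jq is installed. The function names urlencode, build_export_url and export_project are hypothetical helpers introduced for illustration, and user:password is a placeholder.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of Matillion ETL.

# Percent-encode a query value so spaces and '&' survive inside the URL.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character
    case $c in
      [A-Za-z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    s=$rest
  done
  printf '%s' "$out"
}

# Assemble the export URL.
# Arguments: <InstanceURI> <GroupName> <ProjectName> <VersionName>
build_export_url() {
  printf 'https://%s/rest/v0/projects?groupName=%s&projectName=%s&versionName=%s&export=true' \
    "$1" "$(urlencode "$2")" "$(urlencode "$3")" "$(urlencode "$4")"
}

# Assumed curl invocation; the fifth argument is <FileName>.
# The URL is quoted so the shell does not interpret '&'.
export_project() {
  curl -s -u "user:password" "$(build_export_url "$1" "$2" "$3" "$4")" | jq '.' > "$5.json"
}

# Print the URL for the example job without contacting any server.
build_export_url 10.12.1.28 Matillion "Marketplace Data Warehouse" BW_21-7-2016
```

Run as written, the script only prints the URL for the example job. Calling export_project with the same four arguments plus MatillionMarketplace would perform the actual request and write MatillionMarketplace.json.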
Step 2: Commit to Git
The following commands will need to be executed from within Git:
-
Create a workspace with Git:
mkdir MatillionWorkspace
-
Change to the newly created directory:
cd MatillionWorkspace
-
Initialise the Git repository:
git init
-
Create a branch to match the version in Matillion ETL:
git checkout -b BW_21-7-2016
-
Copy the file retrieved from the API earlier into the newly created directory, then add it to the Git repository:
git add *
-
Commit the files:
git commit -m "Matillion Marketplace initial commit"
-
Push changes to the repository:
git remote add origin https://matillion-admin@bitbucket.org/matillion/matillion-etl-scm.git
git push origin --all
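For use from a build server, the local part of this sequence (everything except the push) could be collected into one function. This is a minimal sketch, assuming the exported file is reachable from the directory the script runs in. The function name commit_export and the user.name and user.email values are placeholders introduced for illustration.

```shell
#!/bin/sh
# Sketch: commit an exported project file on a branch named after its version.
# commit_export is a hypothetical helper.
# Arguments: <workspace directory> <branch name> <exported file>
# The -c identity values are placeholders so that the commit also works on a
# machine where Git has no user configured.
commit_export() {
  workspace=$1
  branch=$2
  file=$3
  mkdir -p "$workspace" &&
  cp "$file" "$workspace/" &&
  (
    cd "$workspace" &&
    git init -q &&
    git checkout -q -b "$branch" &&
    git add -- "$(basename "$file")" &&
    git -c user.name="Matillion Admin" -c user.email="admin@example.com" \
      commit -q -m "Matillion Marketplace initial commit"
  )
}
```

For the example job this would be called as commit_export MatillionWorkspace BW_21-7-2016 MatillionMarketplace.json, followed by the remote and push commands from the final step. Running the Git commands in a subshell leaves the caller's working directory unchanged.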
Step 3: Checkout a Project and Push it to a New Matillion ETL Instance
-
Check out the project from Git:
git clone https://matillion-admin@bitbucket.org/matillion/matillion-etl-scm.git
-
Change to the project directory:
cd matillion-etl-scm/
-
Ensure the correct branch (or version) is being used:
git branch
-
Once again, using a standard REST-based API, push the checked-out JSON data onto a new server using POST:
http://10.12.1.28/rest/v0/projects -H "Content-Type: application/json" --data-binary @MatillionMarketplace.json
Please Note
- The username and password of a user with access to the Project within the Matillion ETL instance will need to be used to authenticate the connection. These may need to be specified before the command when using a command line tool, for example: -u user:password
- The content type and target file will also need to be specified.
-
If successful, the following response should be returned:
{"success":true,"msg":"Import successful","id":-1}
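A build script will usually want to stop when the import does not succeed. The sketch below assumes curl as the client and checks the response body for "success":true. The names import_project and import_succeeded are hypothetical helpers, and user:password is a placeholder.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of Matillion ETL.

# POST an exported JSON file to an instance and print the raw response.
# Arguments: <InstanceURI> <file>
import_project() {
  curl -s -u "user:password" -X POST "http://$1/rest/v0/projects" \
    -H "Content-Type: application/json" --data-binary "@$2"
}

# Succeed only if the response body reports "success": true.
# A plain pattern match is used here; a JSON parser would be stricter.
import_succeeded() {
  printf '%s' "$1" | grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Check the sample response shown above without contacting any server.
response='{"success":true,"msg":"Import successful","id":-1}'
if import_succeeded "$response"; then
  echo "import ok"
else
  echo "import failed" >&2
fi
```

In a real pipeline the response variable would be filled with the output of import_project, for example response=$(import_project 10.12.1.28 MatillionMarketplace.json).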
Error
If an invalid ID is returned from this call (this operation generates many new IDs for projects, environments, versions, jobs and so on), the current project structure can be displayed by making a GET request to the same URL but without the export=true option.