Migrating from Matillion ETL - process

Public preview

This article is aimed at current users of Matillion ETL who wish to migrate their workloads to Matillion's Data Productivity Cloud.

Before beginning any migration project, read Migration considerations.

The migration process copies your Matillion ETL jobs as Data Productivity Cloud pipelines. It won't create any other required Data Productivity Cloud elements, such as projects or environments, so you must create these manually before beginning the migration. The full list of things you must create and configure in advance of migration is given in Prerequisites, below.


Prerequisites

Hub account

If you do not have a Hub account, you must create one in order to use Data Productivity Cloud. Read Registering for a Hub account for details.

When you log into the Hub as an admin user, you should see the Design data pipelines tile. If you do not see it, the feature may not yet have been enabled; contact your Matillion Account Manager for assistance.


Contact your Matillion Account Manager to discuss potential billing changes when you begin using the Data Productivity Cloud to run pipelines.


Any members of your organization who will be using Data Productivity Cloud must be added as users to the Hub. Read Manage account users for details.


Create a project where you want to import your migrated pipelines, as described in Projects. Later you can add additional projects if you want to segregate your pipelines.


Create an environment as described in Environments.


Your project will have a main branch created by default, but good practice is to perform all development work in a different branch, and only merge the work into main when ready for production. Therefore, we recommend you create a new branch with an appropriate name, such as metl-migrations, to hold the migrated pipelines. Read Branches for details.


For security reasons, we do not migrate credentials such as secrets, passwords, or OAuths from Matillion ETL to the Data Productivity Cloud. Any secrets or other credentials you have set up in Matillion ETL will have to be recreated manually in Data Productivity Cloud to allow your pipelines to run. Read Secret definitions, Cloud provider credentials, and OAuth for details.

Export from Matillion ETL

Migration is a two-stage process. The first stage is performed from within Matillion ETL, and involves exporting jobs by following the process given in Exporting.

Exported job information is saved as a JSON file in the default download location used by your browser and operating system. This JSON file will be used by the import function of the Data Productivity Cloud.

You can select which jobs you want to export; it doesn't have to be an entire project. You can also export environment variables (which will become project variables in Data Productivity Cloud).


The maximum size of a JSON export file that the Data Productivity Cloud can import is 5MB. If your export file is larger than this, you should split the export into several smaller export operations, for example by exporting individual jobs instead of an entire project group.
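Before importing, you can sanity-check the downloaded export against the 5 MB limit and confirm it is well-formed JSON. A minimal Python sketch (the file path is a placeholder for whatever your browser downloaded; nothing about the export's internal schema is assumed):

```python
import json
import os

# Import size limit stated in the documentation: 5 MB.
MAX_IMPORT_BYTES = 5 * 1024 * 1024

def check_export(path: str) -> bool:
    """Return True if the export file fits the import limit and parses as JSON."""
    if os.path.getsize(path) > MAX_IMPORT_BYTES:
        print(f"{path} exceeds 5 MB; split the export into smaller operations.")
        return False
    with open(path, encoding="utf-8") as f:
        json.load(f)  # raises an error if the file is not valid JSON
    return True

# Hypothetical filename; substitute the JSON file exported from Matillion ETL.
# check_export("metl-export.json")
```

If the check fails on size, re-run the export in Matillion ETL with fewer jobs selected, as described above.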

Import to Data Productivity Cloud


If you are an existing Data Productivity Cloud user, we strongly recommend you create a new, dedicated project and branch in which to perform the migration, in order to avoid interfering with your existing projects. If you try to import a job into a branch that has an existing pipeline of the same name, the import will overwrite the existing pipeline, and imported variables will also overwrite existing variables of the same name.

Remember that you can copy pipelines between projects later, after you have completed any required refactoring and verified that they will work.

  1. Open your Data Productivity Cloud project and branch.
  2. In the Pipelines tab, click ... next to the root folder, then click Import.
  3. In the file navigator, browse to the JSON file you exported from Matillion ETL and click Open.
  4. Before completing the import, the migration tool will analyse the Matillion ETL export and produce a report on its compatibility with the Data Productivity Cloud. The results are presented under the following headings, and can be viewed in the import dialog or exported to a report:

    • Jobs to be converted without changes: These are the jobs that can be imported without any changes.
    • Jobs to be auto-converted: These are jobs that will be converted automatically into a form suitable for importing, using acceptable substitute parameters.
    • Jobs to be manually refactored: These are typically jobs with components that have no equivalent component in Data Productivity Cloud. These jobs will be imported, but you will have to manually refactor them before they will work. The exact method of doing this will depend entirely on the design and function of the job.
    • Project variables to be converted: This lists every environment variable exported from Matillion ETL that is compatible with the Data Productivity Cloud and will be imported as a project variable.
    • Project variables to be auto-converted: These are environment variables that will be converted automatically into a form suitable for importing as project variables, for example, converting shared variables to copied variables.
    • Project variables to be manually refactored: These variables will be imported, but you will have to manually refactor the variables before they will work. The exact method of doing this will depend entirely on the function of the variable and the type of incompatibility.
    • Project variables that will not be imported: These variables aren't compatible with the Data Productivity Cloud and won't be migrated. The import can still happen, but without these variables being included. This requires careful consideration, as missing variables will almost certainly cause pipelines to fail, and mitigating measures will need to be taken, either in the Matillion ETL job before exporting, or in the Data Productivity Cloud pipeline after importing.
  5. If satisfied with the eligibility report, click Import to complete the import process.

Regardless of the result of the job eligibility report, it is critical that you review all pipelines after importing, to ensure that they behave as expected and are suitable for production use.

After import

After import, there are a few additional manual changes you may need to make before the pipeline is production ready.

  • You will need to manually add any component-level credentials from Matillion ETL as secrets.
  • You will need to manually specify environments to be used.
  • For the variable types the Data Productivity Cloud doesn't support, you will need to test the new behavior, for example ensuring that your dates still work in expressions when they have been converted to strings.
  • Unsupported components will be identified in the pipeline as "unknown component", and will need to be replaced with alternatives, or the pipeline will need to be refactored to make them unnecessary.
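To illustrate the date-to-string point above in generic terms (this is plain Python, not Data Productivity Cloud expression syntax): ISO-8601 date strings happen to sort correctly when compared as strings, but other date formats do not, so any expression that compared date variables should be re-tested after migration.

```python
from datetime import datetime

# ISO-8601 dates compare correctly even as plain strings.
iso_a, iso_b = "2024-01-09", "2024-01-10"
print(iso_a < iso_b)  # True

# Other formats can compare incorrectly as strings:
# "01/09/2024" sorts before "12/10/2023" even though it is a later date.
us_a, us_b = "01/09/2024", "12/10/2023"
print(us_a < us_b)  # True as strings, but wrong as dates

# Safer: parse explicitly before comparing.
parsed_a = datetime.strptime(us_a, "%m/%d/%Y").date()
parsed_b = datetime.strptime(us_b, "%m/%d/%Y").date()
print(parsed_a < parsed_b)  # False: 12/10/2023 is actually the earlier date
```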