
Convert workloads from other platforms🔗

Public preview

This section provides a technical overview of how Maia, Matillion's integrated agentic data team, automates the conversion of existing ETL/ELT workloads from other platforms into production-ready Data Productivity Cloud pipelines.

Maia is designed to simplify complex and resource-intensive platform migration projects. Its integrated capabilities help accelerate the conversion of data workloads from existing platforms into optimized, cloud-native Data Productivity Cloud pipelines.


Prerequisites🔗

Before you begin, ensure the following requirements are met:

  • Project files are exported from the source platform and stored locally or in an accessible Git repository (such as GitHub or GitLab). For more information on integrating your Git account with the Data Productivity Cloud, read Connect your own Git repo.
  • You have an active Matillion Data Productivity Cloud account.
  • Maia is enabled and accessible in your Data Productivity Cloud environment. Read Edit account details to learn how to enable Maia if it isn't enabled for your account.
  • Your Data Productivity Cloud role includes the required permissions to create and deploy pipelines in the target Data Productivity Cloud project and environment. For more information on account roles, read Role-based Access Control overview.

    Note

    • Maia supports a growing set of data workload conversions from major data platforms. Support for additional platforms such as SSIS, Oracle ODI, Ab Initio, IBM DataStage, AWS Glue, Azure Data Factory, and WhereScape is coming soon.
    • If you need to convert from a platform that isn't currently supported, contact your account team to discuss your requirements.

Supported file types for data workload conversions🔗

Platform | Supported file types
Alteryx | .yxmd, .yxmc, .yxwz, .yxzp, .yxdb, .zip
Qlik Sense | .qvf, .qvs, .qvd, .qvx, .zip
Informatica PowerCenter | .xml, .zip
Informatica IDMC | .json, .zip
Talend | .item, .xml, .properties, .cntxt, .zip
SAS Enterprise Guide | .egp, .sas, .sas7bdat, .sas7bcat, .zip
dbt | .sql, .yml, .yaml, .csv, .json, .zip
Palantir Foundry | .py, .sql, .java, .r, .yaml, .yml, .md, .zip, .gz, .parquet, .avro, .csv

Convert workloads with Maia🔗

  1. Log in to the Data Productivity Cloud and select the project you want to use.
  2. Click the Files icon, select Add (+) from the dropdown menu, and then select Convert workloads.
  3. In the Convert workloads with Maia dialog, select a workload type. All supported platforms are listed under Workload type.
  4. Choose a file loading method: Git repository files or Upload files.

    Note

    • If you choose Git repository files, make sure all workload files are already available in your Git repository and that each file is 10 MB or smaller, then use the search feature to select the files you want Maia to convert.
    • If you choose Upload files, you can upload up to 10 files at a time, provided the combined size of all selected files does not exceed 20 MB. A quick way to pre-check these limits locally is sketched after the steps below.
  5. Click Submit to begin the analysis. Maia analyzes the workload files; analysis time varies with file size.

  6. After the analysis is complete, choose how you want to proceed:
    • Present conversion plan in Maia: Maia displays a conversion plan for review. No changes are made in the Data Productivity Cloud.
    • Begin conversion in Maia: Maia immediately creates pipelines in the Data Productivity Cloud environment with no review required.
  7. Click Submit.
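
If you want to confirm locally that your exported files fit the limits described in the note above before submitting, the following Python sketch shows one way to do so. It is not part of the Data Productivity Cloud; the limits mirror those stated above (10 MB per Git repository file; 10 files and 20 MB combined for uploads), and the file names are placeholders.

```python
from pathlib import Path

# Limits stated in the note above: per-file limit for Git repository files,
# and count/combined-size limits for direct uploads.
MAX_GIT_FILE_MB = 10
MAX_UPLOAD_FILES = 10
MAX_UPLOAD_TOTAL_MB = 20

def check_upload_batch(paths):
    """Report whether a batch of local workload files fits the stated limits."""
    sizes_mb = {p: Path(p).stat().st_size / (1024 * 1024) for p in paths}
    total_mb = sum(sizes_mb.values())

    for path, size_mb in sizes_mb.items():
        print(f"{path}: {size_mb:.2f} MB")

    if len(paths) > MAX_UPLOAD_FILES:
        print(f"Too many files: {len(paths)} > {MAX_UPLOAD_FILES}")
    if total_mb > MAX_UPLOAD_TOTAL_MB:
        print(f"Combined size {total_mb:.2f} MB exceeds {MAX_UPLOAD_TOTAL_MB} MB")

    oversized = [p for p, s in sizes_mb.items() if s > MAX_GIT_FILE_MB]
    if oversized:
        print(f"Files over the {MAX_GIT_FILE_MB} MB per-file Git limit: {oversized}")

check_upload_batch(["orders.yxmd", "customers.yxmd"])  # placeholder file names
```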

Validate Data Productivity Cloud pipeline outputs🔗

When you use Maia to create data pipelines in a Data Productivity Cloud environment, validation helps ensure that the output from the newly created Data Productivity Cloud pipelines is consistent and comparable with the output from the source platform workloads.

Load source outputs into the cloud data warehouse🔗

If the output from the source platform workloads is not already in the same cloud data warehouse used by the Data Productivity Cloud, load it into that cloud data warehouse to simplify comparison. Load the output into a different schema within the same database as the Data Productivity Cloud output. Files such as .txt, .csv, or .xlsx can be saved to cloud storage and loaded using a Data Productivity Cloud orchestration pipeline. Using fixed input data makes side-by-side validation easier.
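
As an illustration only, the sketch below shows one way to stage a source-platform CSV output into a separate validation schema using pandas and SQLAlchemy, outside the Data Productivity Cloud. The connection URL, schema, table, and file names are hypothetical, and you would need the appropriate SQLAlchemy dialect for your cloud data warehouse; an orchestration pipeline, as described above, is usually the simpler route.

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection URL: replace with the URL for your cloud data
# warehouse and install the matching SQLAlchemy dialect.
engine = create_engine("snowflake://user:password@account/analytics")

# Load a source-platform output file exported as CSV into a separate
# validation schema in the same database as the pipeline output.
source_output = pd.read_csv("legacy_orders_output.csv")  # placeholder file
source_output.to_sql(
    name="orders_legacy",        # placeholder table name
    con=engine,
    schema="legacy_validation",  # separate schema in the same database
    if_exists="replace",
    index=False,
)
```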

Note

You can choose not to load the output into the cloud data warehouse; however, this approach requires more manual validation and is generally less efficient.

Compare pipeline outputs🔗

Follow these steps to compare the output of a Data Productivity Cloud pipeline with the output from another product.

  1. Ensure that the output from both the Data Productivity Cloud and the other product is available in the same cloud data warehouse.
  2. Create a new transformation pipeline in the Data Productivity Cloud and add the Table Input component.
  3. Use the Table Input component to load the output tables from both the Data Productivity Cloud and the other product.
  4. Add components such as Detect Changes, Assert View, or Aggregate to compare the results.
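
Alternatively, if both output tables are small enough to read into memory, you can do a quick row-level comparison outside the Data Productivity Cloud. The following is a minimal Python sketch using pandas; the connection URL and table names are hypothetical, and it assumes both tables share the same column names and types.

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("snowflake://user:password@account/analytics")  # placeholder URL

# Placeholder table names: the Data Productivity Cloud output and the
# source-platform output loaded into a separate validation schema.
dpc = pd.read_sql("SELECT * FROM public.orders", engine)
legacy = pd.read_sql("SELECT * FROM legacy_validation.orders_legacy", engine)

# An outer merge with an indicator column flags rows present in only one output.
diff = dpc.merge(legacy, how="outer", indicator=True)
mismatches = diff[diff["_merge"] != "both"]

print(f"Rows present in only one output: {len(mismatches)}")
print(mismatches.head())
```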

Note

You can also ask Maia to compare the two tables and review the results.


Troubleshoot workload conversion errors🔗

When converting workloads from other platforms to Data Productivity Cloud pipelines, you may encounter the following error messages. Review the advice column for each message to resolve the issue.

Error message | Advice
No convertible pipelines found in archive {fileName} | Check the archive contains valid pipeline files for the selected platform
Nested archive depth limit exceeded (max: 3) in file {fileName} | Extract nested archives before uploading
Zip contains too many entries ({count} > {limit}) | Split into smaller archives or remove unnecessary files
File exceeded size limit while reading: {actual} MB > {limit} MB | Reduce file size or split into multiple uploads
Zip slip attack detected: path traversal in entry {entryName} | Re-create the archive without path traversal in filenames
File {fileName} size {actual} bytes exceeds maximum allowed size of {max} bytes | Reduce file size or split into multiple uploads
Failed to chunk file {fileName}: No chunks could be created from pipeline content | Check the file is a valid pipeline for the selected platform; if persistent, contact support
File {fileName} is empty or contains only whitespace | Check the file has actual content
Too many concurrent conversion requests. Please try again later. | Wait a moment and retry; if persistent, contact support
Failed to analyze chunk {name}: {cause} | Retry the conversion; if persistent, contact support
Failed to generate intent analysis: {cause} | Retry the conversion; if persistent, contact support
Bedrock rate limit exceeded | Wait a moment and retry; if persistent, contact support
File {fileName} not found | Check the selected git file still exists in the repository
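
Several of the errors above relate to the structure of an uploaded archive. If you want to spot these issues locally before uploading, the following Python sketch (standard library only) illustrates the kind of checks involved; the entry limit is a placeholder, since the actual limit is reported in the error message itself, and the archive name is hypothetical.

```python
import zipfile
from pathlib import PurePosixPath

def inspect_archive(path, entry_limit=1000):
    """Flag common archive problems before uploading for conversion.

    entry_limit is a placeholder; the real limit is reported in the
    'Zip contains too many entries' error message.
    """
    with zipfile.ZipFile(path) as archive:
        entries = archive.infolist()

        if len(entries) > entry_limit:
            print(f"Too many entries: {len(entries)} > {entry_limit}")

        for entry in entries:
            name = entry.filename
            # Path traversal ("zip slip") shows up as '..' segments or absolute paths.
            if name.startswith("/") or ".." in PurePosixPath(name).parts:
                print(f"Suspicious path in entry: {name}")
            # Nested archives count toward the depth limit (max: 3).
            if name.lower().endswith(".zip"):
                print(f"Nested archive found: {name}")
            # Empty files are rejected as containing no content.
            if entry.file_size == 0 and not name.endswith("/"):
                print(f"Empty file: {name}")

inspect_archive("alteryx_workflows.zip")  # placeholder archive name
```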
