
Upgrade: Bash scripts

Unlike Matillion ETL, which runs on your own Linux virtual machine (VM), the Data Productivity Cloud uses a SaaS operating model, so there is no underlying VM on which to run Bash scripts. Instead, you can use the Bash Pushdown component to run your Bash scripts on a Linux VM that you provide and connect to over SSH.

Advantages of Bash Pushdown are:

  • You can set the CPU and memory on the Linux VM as needed.
  • You can install any packages or third-party applications you need on the Linux VM.

However, before choosing Bash Pushdown, also consider the following:

  • You need to set up, secure, update, and manage the Linux VM yourself.
  • You need network access from the Data Productivity Cloud agent to the VM.

Upgrade path

If you have a job that contains a Bash Script component, follow these steps in the Data Productivity Cloud to configure the imported pipeline to use Bash Pushdown instead:

  1. Click Edit preferences in the Importing files panel.
  2. Toggle Convert Bash Pushdown to On. The default setting for this option is Off, meaning that no scripts will be converted unless you explicitly enable it.
  3. A set of parameters for configuring the connection to your Linux VM is now displayed. Complete these as follows:

    • Host: The hostname or IP address of your Linux VM.
    • User Name: The username to connect to your Linux VM.
    • Connection Timeout (ms): The timeout period for the connection, in milliseconds. The default is 3000 (3 seconds).
    • Port: The SSH port on your Linux VM. The default is 22.
    • Authentication Type: Choose between Password or Key Pair authentication.

      • If you choose Password, use the Password drop-down to select the secret that contains the value of your password.
      • If you choose Key Pair, use the Private Key drop-down to select the secret that contains the value of your private key. If your private key is passphrase protected, set Require Passphrase to Yes and use the Passphrase drop-down to select the secret that contains the value of your passphrase.

      Note

      Read Secrets and secret definitions to learn how to create a new secret definition.

    For a description of the other properties that can be configured for the component, see the Bash Pushdown documentation.

  4. Click Apply & re-run.

  5. Examine the migration report. It will now show "Bash Script component has been converted to a Bash Pushdown component."
  6. Click Import to complete the import process, and continue as described in Import to the Data Productivity Cloud.
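
Before entering the connection parameters from step 3, it can help to confirm that the SSH port on the VM is reachable at all. The following is a minimal sketch, not part of the Data Productivity Cloud: the function names `ms_to_seconds` and `check_ssh_port` and the example hostname are hypothetical, and the check only reflects connectivity from the machine it runs on, so it is most useful when run from a host on the same network as your agent. It assumes Bash (for `/dev/tcp`) and the GNU coreutils `timeout` command are available.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the Host, Port, and Connection Timeout (ms)
# parameters. This is an illustration only, not a Matillion tool.

# ms_to_seconds MS
# Converts milliseconds to whole seconds, rounding up, because `timeout`
# is given a value in seconds here.
ms_to_seconds() {
  local ms="$1"
  echo $(( (ms + 999) / 1000 ))
}

# check_ssh_port HOST [PORT] [TIMEOUT_MS]
# Returns 0 if a TCP connection to HOST:PORT opens within the timeout,
# and a non-zero status otherwise.
check_ssh_port() {
  local host="$1"
  local port="${2:-22}"          # same default as the Port parameter
  local timeout_ms="${3:-3000}"  # same default as Connection Timeout (ms)
  local timeout_s
  timeout_s="$(ms_to_seconds "$timeout_ms")"

  # /dev/tcp/HOST/PORT is a Bash redirection feature, not a real file.
  # The child shell receives HOST as $0 and PORT as $1.
  timeout "$timeout_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example usage (hostname is a placeholder):
#   check_ssh_port vm.example.internal 22 3000 && echo "SSH port reachable"
```

A successful check shows only that the port accepts TCP connections. It does not validate the username, password, or private key stored in your secrets.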

Alternative approaches

If you choose not to use the Bash Pushdown component, you will need to refactor any jobs that use Bash scripts to run in a Data Productivity Cloud pipeline. The following guidelines can help you decide how to do this:

  • See if the functionality has a native component equivalent, such as the Print Variables component.
  • See if the workload can be written in a Python script using the Python Pushdown component. This option is only available for Snowflake environments.
  • See if the workload can be written in a Python script using the Python Script component. This option is only available for Hybrid SaaS deployments.

If none of the above options are suitable, you can continue to use the Bash Script component, without converting to Bash Pushdown, but we don't recommend this option. The script won't run in a Full SaaS deployment, and may run with problems in a Hybrid SaaS deployment. If you wish to continue with this option, we recommend you discuss the matter with Matillion Support.

Automatic variables

The Data Productivity Cloud doesn't support directly accessing automatic variables through the Bash Script component.

If you require this functionality, you can use an Update Scalar component to write the values to user-defined variables, which can then be passed to the script.
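
On the script side, this workaround might look like the sketch below. It assumes that an Update Scalar component earlier in the pipeline has copied the automatic values into two user-defined variables, here given the hypothetical names `v_run_id` and `v_pipeline_name`, and that the pipeline makes those values available to the script. How the values reach the script depends on your component configuration, so the function takes them as ordinary arguments. The function name `log_prefix` is also an illustration.

```shell
#!/usr/bin/env bash
# Hypothetical example: build a log prefix from values that were copied from
# automatic variables into user-defined variables by an Update Scalar component.

# log_prefix RUN_ID PIPELINE_NAME
# Prints a prefix such as "[nightly_load run=42]".
# Returns 1 if either value is missing, so an unset variable is caught early.
log_prefix() {
  local run_id="${1:-}"
  local pipeline_name="${2:-}"
  if [ -z "$run_id" ] || [ -z "$pipeline_name" ]; then
    echo "log_prefix: run id and pipeline name are required" >&2
    return 1
  fi
  printf '[%s run=%s]' "$pipeline_name" "$run_id"
}

# Example usage, assuming the user-defined variables are visible to the
# script as shell variables with the names below:
#   echo "$(log_prefix "$v_run_id" "$v_pipeline_name") starting export"
```

Failing when a value is empty makes it easier to notice that the Update Scalar step was skipped or misconfigured.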