Data Transfer

Data Transfer is an orchestration component that enables users to transfer files from a chosen source to a chosen target.

This component can use a number of common network protocols to transfer data from a variety of sources to a variety of targets. The component copies the source file rather than moving it. Setting up this component requires selecting a source type and a target type; the component's other properties will change to reflect those choices.

Currently supported data sources include:

  • Azure Blob Storage
  • FTP
  • SFTP
  • HTTP
  • HTTPS
  • Amazon S3
  • Windows Fileshare
  • Microsoft SharePoint
  • Google Cloud Storage

Currently supported targets include:

  • Azure Blob Storage
  • Amazon S3
  • Windows Fileshare
  • SFTP
  • Google Cloud Storage

Currently, only the following SSH key algorithms are supported by the component:

  • ssh-ed25519
  • ecdsa-sha2-nistp256
  • ecdsa-sha2-nistp384
  • ecdsa-sha2-nistp521
  • rsa-sha2-512
  • rsa-sha2-256

Note

  • To ensure that access via instance credentials is managed correctly at all times, we advise that customers limit scopes (permissions) wherever applicable.
  • When reading from or writing to Windows Fileshare, the SMB2 protocol is preferred. SMB1 is still supported.

If the component requires access to a cloud provider, it will use credentials as follows:

  • If using Matillion Full SaaS: The component will use the cloud credentials associated with your environment to access resources.
  • If using Hybrid SaaS: By default, the component will inherit the agent's execution role (service account role). However, if there are cloud credentials associated with your environment, these will override the role.

Properties

Name = string

A human-readable name for the component.


Source Type = drop-down

Select the type of data source. The source type will determine which source properties are required.

Currently supported data sources include:

  • Azure Blob Storage
  • FTP
  • Google Cloud Storage
  • HTTP
  • HTTPS
  • S3
  • Windows Fileshare
  • Microsoft SharePoint
  • SFTP

Unpack ZIP File = boolean

Select Yes if the source data is a ZIP file that you wish to unpack before being transferred.
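As an illustrative sketch only (not the component's actual implementation), unpacking a ZIP archive before its contents are transferred can be modelled with Python's standard zipfile module. All file names below are hypothetical:

```python
# Conceptual model of "Unpack ZIP File = Yes": expand the archive,
# then transfer each member individually. Illustrative sketch only.
import io
import zipfile

def unpack_zip(data: bytes) -> dict:
    """Return a mapping of member name -> file bytes for a ZIP payload."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}

# Build a small in-memory ZIP to demonstrate the round trip.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("report.txt", "hello")

files = unpack_zip(buf.getvalue())
print(sorted(files))        # ['report.txt']
print(files["report.txt"])  # b'hello'
```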


Target Type = drop-down

Select the target type for the new file. The target type will determine which target properties are required.

Currently supported targets:

  • Azure Blob Storage
  • Google Cloud Storage
  • S3
  • Windows Fileshare
  • SFTP

Gzip Data = boolean

Select Yes if you wish to gzip the transferred data when it arrives at the target.
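Conceptually, gzipping the data on arrival is equivalent to the following Python sketch (illustrative only; the component handles this internally):

```python
# Conceptual model of "Gzip Data = Yes": the bytes written to the target
# are the gzip-compressed form of the source bytes. Illustrative sketch only.
import gzip

payload = b"example file contents"
compressed = gzip.compress(payload)

# Round trip: decompressing the target bytes restores the source bytes.
print(gzip.decompress(compressed) == payload)  # True
```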


Target Object Name = string

The filename of the new file.


Source properties

Blob Location = string

The URL that points to the source file that exists on Azure Blob Storage.

Clicking this property will open the Blob Location dialog. This displays a list of all existing storage accounts. Select a storage account, then a container, and then a subfolder if required. This constructs a URL with the following format:

AZURE://<account>/<container>/<path>

You can also type the URL directly into the Storage Accounts path field, instead of selecting listed elements. This is particularly useful when using variables in the URL, for example:

AZURE://${jv_blobStorageAccount}/${jv_containerName}

Special characters used in this field must be URL-safe.
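If you construct such URLs yourself, Python's urllib.parse.quote shows one way to make path segments URL-safe. The account and container names below are hypothetical:

```python
# Percent-encode special characters in a blob URL. Illustrative only;
# "mystorageaccount" and "landing zone" are made-up example names.
from urllib.parse import quote

account = "mystorageaccount"
container = "landing zone"     # contains a space, so it must be encoded
path = "2024/raw data.csv"     # quote() leaves "/" intact by default

url = f"AZURE://{account}/{quote(container)}/{quote(path)}"
print(url)  # AZURE://mystorageaccount/landing%20zone/2024/raw%20data.csv
```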

Set Home Directory as Root = boolean

  • Yes: URLs are relative to the user's home directory.
  • No: (default) URLs are relative to the server root.

Source URL = string

The URL, including full path and file name, that points to the source file. The source URL should follow this format, where appropriate:

ftp://[username[:password]@]hostname[:port][path]

Example:

ftp://johndoe@example.com/home/johndoe/documents/report.txt

Wherever possible, use the Source Username and Source Password component properties for credentials instead of the URL. If you require SFTP instead of FTP, use the SFTP option in the Source Type property.
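The URL structure above can be checked with Python's standard urlparse, which splits out the username, host, and path (example values only):

```python
# Parse the documented ftp:// URL shape into its components.
# Illustrative sketch; the address below is an example value.
from urllib.parse import urlparse

parts = urlparse("ftp://johndoe@example.com/home/johndoe/documents/report.txt")
print(parts.username)  # johndoe
print(parts.hostname)  # example.com
print(parts.path)      # /home/johndoe/documents/report.txt
```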


Source Username = string

This is your URL connection username. It is optional and will only be used if the data source requests it.


Source Password = string

This is your URL connection password. It is optional and will only be used if the data source requests it.

Source URL = string

The URL that points to the source file that exists on Google Cloud Storage.

Clicking this property will open the Source URL dialog. This displays a list of all existing buckets. Select a bucket, then a folder, and then a subfolder if required. This constructs a URL with the following format:

GS://<bucket>/<folder-path>

You can also type the URL directly into the GCS Buckets path field, instead of selecting listed elements. This is particularly useful when using variables in the URL, for example:

GS://${jv_gcsAccount}/${jv_folderName}

Special characters used in this field must be URL-safe.

Source URL = string

The URL, including full path and file name, that points to the source file. The source URL should follow this format, where appropriate:

http://[username[:password]@]hostname[:port][absolute-path]

For example:

http://user:password@www.example.com/products/list

Wherever possible, use the Source Username and Source Password component properties for credentials instead of the URL.


Source Username = string

This is your URL connection username. It is optional and will only be used if the data source requests it.


Source Password = string

This is your URL connection password. It is optional and will only be used if the data source requests it.

Perform Certificate Validation = drop-down

Check that the SSL certificate for the host is valid before transferring data.


Source URL = string

The URL, including full path and file name, that points to the source file. The source URL should follow this format, where appropriate:

https://[username[:password]@]hostname[:port][absolute-path]

For example:

https://user:password@www.example.com/products/list

Wherever possible, use the Source Username and Source Password component properties for credentials instead of the URL.


Source Username = string

Your URL connection username. It is optional and will only be used if the data source requests it.


Source Password = string

The corresponding password. It is optional and will only be used if the data source requests it.

Source URL = string

The URL, including full path and file name, that points to the source file.

Clicking this property will open the Source URL dialog. This displays a list of all existing buckets. Select a bucket, then a folder, and then a subfolder if required. This constructs a URL with the following format:

S3://<bucket>/<folder-path>

When you enter a forward slash character (/) after a folder name, validation of the file path is triggered.
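As a hypothetical illustration of this kind of path check (this is not the component's actual validation logic), a simple pattern test might look like the following:

```python
# Hypothetical S3 URL shape check: bucket name plus a non-empty object path.
# Illustrative sketch only; not the component's real validation.
import re

S3_URL_RE = re.compile(r"^S3://(?P<bucket>[a-z0-9.-]+)/(?P<path>.+)$", re.IGNORECASE)

def is_valid_s3_url(url):
    return S3_URL_RE.match(url) is not None

print(is_valid_s3_url("S3://my-bucket/folder/file.csv"))  # True
print(is_valid_s3_url("S3://my-bucket"))                  # False (no object path)
```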

Source URL = string

The URL, including full path and file name, that points to the source file. This preferably follows the format:

smb://[[authdomain;]user@]host[:port][/share[/dirpath][/name]][?context]

Not all fields are required, depending on your exact setup. This component also accepts other protocols, such as HTTP and HTTPS.

Examples:

smb://mydomain;jdoe@fileserver01/documents/projects
http://11.22.3.4/share5/mydata.txt
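As an illustration of the smb:// structure above, the optional authdomain, user, host, port, and path parts can be picked apart with a regular expression (a sketch, not the component's parser):

```python
# Decompose the documented smb:// URL shape. Illustrative sketch only;
# the example URL uses made-up domain, user, and host names.
import re

SMB_RE = re.compile(
    r"smb://(?:(?:(?P<authdomain>[^;/]+);)?(?P<user>[^@/]+)@)?"
    r"(?P<host>[^:/]+)(?::(?P<port>\d+))?(?P<path>/.*)?$"
)

m = SMB_RE.match("smb://mydomain;jdoe@fileserver01/documents/projects")
print(m.group("authdomain"))  # mydomain
print(m.group("user"))        # jdoe
print(m.group("host"))        # fileserver01
print(m.group("path"))        # /documents/projects
```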

Source Domain = string

The domain that the source file is located on.


Source Username = string

Your URL connection username. It is optional and will only be used if the data source requests it.


Source Password = string

The corresponding password. It is optional and will only be used if the data source requests it.

Service Type = drop-down

The type of API that the component connects to. REST requires OAuth authentication, whereas SOAP requires basic authentication (username and password).


Authentication = drop-down

If connecting to a REST Service Type, select a configured OAuth profile from the drop-down menu. Click Manage to open the OAuth tab in a new browser tab, where you can review your existing OAuth profiles or create new ones. To set up a new OAuth profile:

  1. Click Add OAuth connection.
  2. Provide a unique, descriptive OAuth name.
  3. Select your provider. In this case, SharePoint.
  4. Set the authentication type to OAuth 2.0 Authorization Code Grant.
  5. Provide your SharePoint URL. Refer to URL below for more information about the URL format you need to use.
  6. Click Authorize. A new browser tab will open to log in to https://login.microsoftonline.com/.
  7. Provide your username and click Accept.
  8. The browser tab will close and you will be returned to the Data Productivity Cloud.
  9. Return to the Authentication parameter in your Data Transfer component and locate your new OAuth profile in the drop-down menu. If the profile has not propagated, click out of the dialog and then try again.

URL = string

If connecting to a SOAP Service Type, enter the web address that you visit to log in to your SharePoint account. For example, https://companyname.sharepoint.com.


User = string

If connecting to a SOAP Service Type, enter a valid SharePoint username to use for authentication.


Password = drop-down

If connecting to a SOAP Service Type, select the secret definition storing the password for your SharePoint account. Your password should be saved as a secret definition before using this component.


SharePoint Edition = drop-down

Select your edition of Microsoft SharePoint. Options are SharePoint Online and SharePoint On-Premise.


Connection Options = column editor

  • Parameter: A JDBC parameter supported by the database driver. The available parameters are explained in the data model. Manual setup isn't usually required, since sensible defaults are assumed.
  • Value: A value for the given parameter.

File Type = drop-down

Select whether the file type is a Document or Attachment.


Library = drop-down

The SharePoint library where the file is stored. The drop-down will be populated with libraries contained within your SharePoint instance.

For more information, read Introduction to libraries.


File URL = string

The URL that points to the file.

Set Home Directory as Root = boolean

  • Yes: URLs are relative to the user's home directory.
  • No: URLs are relative to the server root.

Source URL = string

The URL, including full path and file name, that points to the source file. The source URL should follow this format, where appropriate:

sftp://[username[:password]@]hostname[:port][path]

Example:

sftp://johndoe@example.com/home/johndoe/documents/report.txt

Wherever possible, use the Source Username and Source Password component properties for credentials instead of the URL. If you require FTP instead of SFTP, use the FTP option in the Source Type property.
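Because embedding credentials in the URL is discouraged, it can help to split them out for use in the separate username and password properties. A small Python sketch of that split, using example values only:

```python
# Separate credentials from an sftp:// URL so they can be supplied via the
# Source Username and Source Password properties instead. Illustrative sketch.
from urllib.parse import urlparse, urlunparse

def strip_credentials(url):
    """Split a URL into a credential-free URL plus username and password."""
    p = urlparse(url)
    host = p.hostname or ""
    if p.port:
        host += ":%d" % p.port
    bare = urlunparse((p.scheme, host, p.path, "", "", ""))
    return bare, p.username, p.password

url, user, pwd = strip_credentials(
    "sftp://johndoe:s3cret@example.com/home/johndoe/documents/report.txt"
)
print(url)   # sftp://example.com/home/johndoe/documents/report.txt
print(user)  # johndoe
```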


Source Username = string

Your URL connection username. It is optional and will only be used if the data source requests it.


Source Password = string

The corresponding password. It is optional and will only be used if the data source requests it.


Source SFTP Key = string

Your SFTP private key. It is optional and will only be used if the data source requests it.

This must be the complete private key, beginning with the appropriate header (for example, "-----BEGIN RSA PRIVATE KEY-----" for an RSA key) and conforming to the standard PEM structure.

The following private key formats are currently supported:

  • DSA
  • RSA
  • ECDSA
  • Ed25519

In a Hybrid SaaS configuration, you need to manually convert the private key into a format that allows it to be stored in your AWS Secrets Manager. You can do this with the following command:

ssh-keygen -p -f YOUR_PRIVATE_KEY -m pem

Target properties

Target Object Name = string

The filename of the new file.


Blob Location = string

The URL that points to the target file location on Azure Blob Storage.

Clicking this property will open the Blob Location dialog. This displays a list of all existing storage accounts. Select a storage account, then a container, and then a subfolder if required. This constructs a URL with the following format:

AZURE://<account>/<container>/<path>

You can also type the URL directly into the Storage Accounts path field, instead of selecting listed elements. This is particularly useful when using variables in the URL, for example:

AZURE://${jv_blobStorageAccount}/${jv_containerName}

Special characters used in this field must be URL-safe.

Target Object Name = string

The filename of the new file.


Target URL = string

The URL that points to the target file location on Google Cloud Storage.

Clicking this property will open the Target URL dialog. This displays a list of all existing buckets. Select a bucket, then a folder, and then a subfolder if required. This constructs a URL with the following format:

GS://<bucket>/<folder-path>

You can also type the URL directly into the GCS Buckets path field, instead of selecting listed elements. This is particularly useful when using variables in the URL, for example:

GS://${jv_gcsAccount}/${jv_folderName}

Special characters used in this field must be URL-safe.

Target Object Name = string

The filename of the new file.


Target URL = string

The URL (without file name) that points to where the new file will be created.

Clicking this property will open the Target URL dialog. This displays a list of all existing buckets. Select a bucket, then a folder, and then a subfolder if required. This constructs a URL with the following format:

S3://<bucket>/<folder-path>

Access Control List Options = drop-down

Choose from the access control list (ACL) settings that Amazon provides. Leaving this empty doesn't change the current settings. A full list is available in the Amazon S3 canned ACL documentation.


Encryption = drop-down

Decide how the files are encrypted inside the S3 bucket. This property is available when using an existing Amazon S3 location for staging.


KMS Key ID = drop-down

The ID of the KMS encryption key you have chosen to use in the Encryption property.

Target Object Name = string

The filename of the new file.


Target URL = string

The URL (without file name) that points to where the new file will be created. This should follow this format, where appropriate:

smb://[[authdomain;]user@]host[:port][/share[/dirpath]][?context]

Other common protocols are also accepted. Examples:

smb://mydomain;jdoe@fileserver01/documents/projectfiles/
http://11.22.3.4/share5/

Target Domain = string

The domain that the newly created file is to be located on.


Target Username = string

Your URL connection username. It is optional and will only be used if the target system requests it.


Target Password = string

The corresponding password. It is optional and will only be used if the target system requests it.

Set Home Directory as Root = boolean

  • Yes: URLs are relative to the user's home directory.
  • No: URLs are relative to the server root.

Target URL = string

The URL (without file name) that points to where the new file will be created. This should follow this format, where appropriate:

sftp://[username[:password]@]hostname[:port][path]

Example:

sftp://johndoe@example.com/home/johndoe/documents/

Wherever possible, use the Target Username and Target Password component properties for credentials instead of the URL.


Target Username = string

Your SFTP connection username. This is optional and will only be used if the target system requests it.


Target Password = string

Your SFTP password. This is optional and will only be used if the target system requests it. Select the secret definition storing the password for your SFTP username; the password should be saved as a secret definition before using this component.


Target SFTP Key = string

Your SFTP private key. This is optional and will only be used if the target system requests it.

Select the secret definition storing the private key for your SFTP account. The private key should be saved as a secret definition before using this component.

This must be the complete private key, beginning with the appropriate header (for example, "-----BEGIN RSA PRIVATE KEY-----" for an RSA key) and conforming to the standard PEM structure.

The following private key formats are currently supported:

  • DSA
  • RSA
  • ECDSA
  • Ed25519

In a Hybrid SaaS configuration, you need to manually convert the private key into a format that allows it to be stored in your AWS Secrets Manager. You can do this with the following command:

ssh-keygen -p -f YOUR_PRIVATE_KEY -m pem
