Data Transfer
Data Transfer is an orchestration component that enables users to transfer files from a chosen source to a chosen target.
This component can use a number of common network protocols to transfer data from a variety of sources to a variety of targets. The component copies the source file; it does not move it. Setting up this component requires selecting a source type and a target type. The component's other properties will change to reflect those choices.
Currently supported data sources include:
- Azure Blob Storage
- FTP
- SFTP
- HTTP
- HTTPS
- Amazon S3
- Windows Fileshare
- Microsoft SharePoint
- Google Cloud Storage
Currently supported targets include:
- Azure Blob Storage
- Amazon S3
- Windows Fileshare
- SFTP
- Google Cloud Storage
Currently, only the following SSH public key algorithms are supported by the component:
- ssh-ed25519
- ecdsa-sha2-nistp256
- ecdsa-sha2-nistp384
- ecdsa-sha2-nistp521
- rsa-sha2-512
- rsa-sha2-256
Note
- To ensure that credential access is managed correctly at all times, we advise that customers limit scopes (permissions) where applicable.
- When reading from or writing to Windows Fileshare, the SMB2 protocol will be preferred. SMB1 is still supported.
If the component requires access to a cloud provider, it will use credentials as follows:
- If using Matillion Full SaaS: The component will use the cloud credentials associated with your environment to access resources.
- If using Hybrid SaaS: By default, the component will inherit the agent's execution role (service account role). However, if there are cloud credentials associated with your environment, these will override the role.
Properties
Name
= string
A human-readable name for the component.
Source Type
= drop-down
Select the type of data source. The source type will determine which source properties are required.
Currently supported data sources include:
- Azure Blob Storage
- FTP
- Google Cloud Storage
- HTTP
- HTTPS
- S3
- Windows Fileshare
- Microsoft SharePoint
- SFTP
Unpack ZIP File
= boolean
Select Yes if the source data is a ZIP file that you wish to unpack before it is transferred.
Target Type
= drop-down
Select the target type for the new file. The target type will determine which target properties are required.
Currently supported targets:
- Azure Blob Storage
- Google Cloud Storage
- S3
- Windows Fileshare
- SFTP
Gzip Data
= boolean
Select Yes if you wish to gzip the transferred data when it arrives at the target.
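The effect of this option can be sketched with the Python standard library; the payload below is a made-up example, not data the component produces:

```python
# Sketch: what the Gzip Data option does conceptually. The transferred
# bytes arrive gzip-compressed at the target, so downstream readers must
# decompress them first. The payload is hypothetical.
import gzip

payload = b"id,name\n1,alpha\n2,beta\n"
compressed = gzip.compress(payload)

# Decompressing restores the original bytes exactly.
assert gzip.decompress(compressed) == payload
```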
Target Object Name
= string
The filename of the new file.
Source properties
Blob Location
= string
The URL that points to the source file that exists on Azure Blob Storage.
Clicking this property will open the Blob Location dialog. This displays a list of all existing storage accounts. Select a storage account, then a container, and then a subfolder if required. This constructs a URL with the following format:
AZURE://<account>/<container>/<path>
You can also type the URL directly into the Storage Accounts path field, instead of selecting listed elements. This is particularly useful when using variables in the URL, for example:
AZURE://${jv_blobStorageAccount}/${jv_containerName}
Special characters used in this field must be URL-safe.
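As a sketch of making a path URL-safe before pasting it into this field (the account and container names below are hypothetical):

```python
# Sketch: URL-encoding special characters in a blob path before entering
# it in the Blob Location field. Account/container names are hypothetical.
from urllib.parse import quote

account = "mystorageaccount"
container = "raw-data"
path = "monthly reports/2024 Q1.csv"  # contains spaces

# quote() keeps "/" intact but percent-escapes other unsafe characters.
url = f"AZURE://{account}/{container}/{quote(path)}"
print(url)  # AZURE://mystorageaccount/raw-data/monthly%20reports/2024%20Q1.csv
```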
Set Home Directory as Root
= boolean
- Yes: URLs are relative to the user's home directory.
- No: (default) URLs are relative to the server root.
Source URL
= string
The URL, including full path and file name, that points to the source file. This source URL should follow the below logic where appropriate:
ftp://[username[:password]@]hostname[:port][path]
Example:
ftp://johndoe@example.com/home/johndoe/documents/report.txt
Wherever possible, use the Source Username and Source Password component properties for credentials instead of the URL. If you require SFTP instead of FTP, use the SFTP option in the Source Type property.
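The URL pattern above can be decomposed with the Python standard library, which is a handy way to check that a URL you have built is well-formed (this is an illustrative sketch, not part of the component):

```python
# Sketch: decomposing the ftp:// URL pattern shown above. The URL is the
# example from the text.
from urllib.parse import urlparse

parts = urlparse("ftp://johndoe@example.com/home/johndoe/documents/report.txt")

print(parts.scheme)    # ftp
print(parts.username)  # johndoe
print(parts.hostname)  # example.com
print(parts.path)      # /home/johndoe/documents/report.txt
```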
Source Username
= string
This is your URL connection username. It is optional and will only be used if the data source requests it.
Source Password
= string
This is your URL connection password. It is optional and will only be used if the data source requests it.
Source URL
= string
The URL that points to the source file that exists on Google Cloud Storage.
Clicking this property will open the Source URL dialog. This displays a list of all existing buckets. Select a bucket, then a folder, and then a subfolder if required. This constructs a URL with the following format:
GS://<bucket>/<folder-path>
You can also type the URL directly into the GCS Buckets path field, instead of selecting listed elements. This is particularly useful when using variables in the URL, for example:
GS://${jv_gcsAccount}/${jv_folderName}
Special characters used in this field must be URL-safe.
Source URL
= string
The URL, including full path and file name, that points to the source file. The source URL should follow the below logic where appropriate:
http://[username[:password]@]hostname[:port][absolute-path]
For example:
http://user:password@www.example.com/products/list
Wherever possible, use the Source Username and Source Password component properties for credentials instead of the URL.
Source Username
= string
This is your URL connection username. It is optional and will only be used if the data source requests it.
Source Password
= string
This is your URL connection password. It is optional and will only be used if the data source requests it.
Perform Certificate Validation
= drop-down
Check that the SSL certificate for the host is valid before transferring data.
Source URL
= string
The URL, including full path and file name, that points to the source file. The source URL should follow the below logic where appropriate:
https://[username[:password]@]hostname[:port][absolute-path]
For example:
https://user:password@www.example.com/products/list
Wherever possible, use the Source Username and Source Password component properties for credentials instead of the URL.
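One reason to prefer the component properties is that credentials embedded in a URL are trivially recoverable from the string. A sketch of separating them out, using only the standard library:

```python
# Sketch: extracting embedded credentials from a URL and rebuilding it
# credential-free, so the username/password can go in the Source
# Username/Password properties instead.
from urllib.parse import urlparse, urlunparse

raw = "https://user:password@www.example.com/products/list"
p = urlparse(raw)

# Replace the netloc (user:password@host) with just the hostname.
clean = urlunparse(p._replace(netloc=p.hostname))
print(clean)       # https://www.example.com/products/list
print(p.username)  # user
print(p.password)  # password
```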
Source Username
= string
Your URL connection username. It is optional and will only be used if the data source requests it.
Source Password
= string
The corresponding password. It is optional and will only be used if the data source requests it.
Source URL
= string
The URL, including full path and file name, that points to the source file.
Clicking this property will open the Source URL dialog. This displays a list of all existing buckets. Select a bucket, then a folder, and then a subfolder if required. This constructs a URL with the following format:
S3://<bucket>/<folder-path>
Entering a forward slash character (/) after a folder name triggers validation of the file path.
Source URL
= string
The URL, including full path and file name, that points to the source file. This preferably follows the format:
smb://[[authdomain;]user@]host[:port][/share[/dirpath][/name]][?context]
Not all fields are required, depending on your exact setup. This component also accepts other URL schemes, such as HTTP and HTTPS.
Examples:
smb://mydomain;jdoe@fileserver01/documents/projects
http://11.22.3.4/share5/mydata.txt
Source Domain
= string
The domain that the source file is located on.
Source Username
= string
Your URL connection username. It is optional and will only be used if the data source requests it.
Source Password
= string
The corresponding password. It is optional and will only be used if the data source requests it.
Service Type
= drop-down
The type of API that the component connects to. REST requires OAuth authentication, whereas SOAP requires basic authentication (username and password).
Authentication
= drop-down
If connecting to a REST Service Type, select a configured OAuth profile from the drop-down menu. Click Manage to open the OAuth tab in a new browser tab, where you can review your existing OAuth profiles and create new ones. To set up a new OAuth profile:
- Click Add OAuth connection.
- Provide a unique, descriptive OAuth name.
- Select your provider. In this case, SharePoint.
- Set the authentication type to OAuth 2.0 Authorization Code Grant.
- Provide your SharePoint URL. Refer to URL below for more information about the URL format you need to use.
- Click Authorize. A new browser tab will open to log in at https://login.microsoftonline.com/.
- Provide your username and click Accept.
- The browser tab will close and you will be returned to the Data Productivity Cloud.
- Return to the Authentication parameter in your Data Transfer component and locate your new OAuth profile in the drop-down menu. If the profile has not propagated, click out of the dialog and then try again.
URL
= string
If connecting to a SOAP Service Type, enter the web address that you visit to log in to your SharePoint account. For example, https://companyname.sharepoint.com.
User
= string
If connecting to a SOAP Service Type, enter a valid SharePoint username to use for authentication.
Password
= drop-down
If connecting to a SOAP Service Type, select the secret definition storing the password for your SharePoint account. Your password should be saved as a secret definition before using this component.
SharePoint Edition
= drop-down
Select your edition of Microsoft SharePoint. Options are SharePoint Online and SharePoint On-Premise.
Connection Options
= column editor
- Parameter: A JDBC parameter supported by the database driver. The available parameters are explained in the data model. Manual setup isn't usually required, since sensible defaults are assumed.
- Value: A value for the given parameter.
File Type
= drop-down
Select whether the file type is a Document or Attachment.
Library
= drop-down
The SharePoint library where the file is stored. The drop-down will be populated with libraries contained within your SharePoint instance.
For more information, read Introduction to libraries.
File URL
= string
The URL that points to the file.
Set Home Directory as Root
= boolean
- Yes: URLs are relative to the user's home directory.
- No: URLs are relative to the server root.
Source URL
= string
The URL, including full path and file name, that points to the source file. This source URL should follow the below logic where appropriate:
sftp://[username[:password]@]hostname[:port][path]
Example:
sftp://johndoe@example.com/home/johndoe/documents/report.txt
Wherever possible, use the Source Username and Source Password component properties for credentials instead of the URL. If you require FTP instead of SFTP, use the FTP option in the Source Type property.
Source Username
= string
Your URL connection username. It is optional and will only be used if the data source requests it.
Source Password
= string
The corresponding password. It is optional and will only be used if the data source requests it.
Source SFTP Key
= string
Your SFTP private key. It is optional and will only be used if the data source requests it.
This must be the complete private key, beginning with "-----BEGIN RSA PRIVATE KEY-----" and conforming to the same structure as an RSA private key.
The following private key formats are currently supported:
- DSA
- RSA
- ECDSA
- Ed25519
In a Hybrid SaaS configuration, you need to manually convert the private key into a format that allows it to be stored in your AWS Secrets Manager. You can do this with the following command:
ssh-keygen -p -f YOUR_PRIVATE_KEY -m pem
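As a hedged sketch, a simple framing check like the following can catch an incomplete paste before the key is saved as a secret. It checks PEM-style framing only, not cryptographic validity; the helper name is hypothetical:

```python
# Sketch: sanity-check that a pasted private key has complete PEM-style
# framing before storing it. This does not validate the key material.
def looks_like_pem_private_key(text: str) -> bool:
    lines = text.strip().splitlines()
    return (
        len(lines) >= 2
        and lines[0].startswith("-----BEGIN ")
        and lines[0].endswith("PRIVATE KEY-----")
        and lines[-1].startswith("-----END ")
        and lines[-1].endswith("PRIVATE KEY-----")
    )

sample = "-----BEGIN RSA PRIVATE KEY-----\nMIIE...\n-----END RSA PRIVATE KEY-----"
print(looks_like_pem_private_key(sample))    # True
print(looks_like_pem_private_key("MIIE...")) # False
```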
Target properties
Target Object Name
= string
The filename of the new file.
Blob Location
= string
The URL that points to the target file location on Azure Blob Storage.
Clicking this property will open the Blob Location dialog. This displays a list of all existing storage accounts. Select a storage account, then a container, and then a subfolder if required. This constructs a URL with the following format:
AZURE://<account>/<container>/<path>
You can also type the URL directly into the Storage Accounts path field, instead of selecting listed elements. This is particularly useful when using variables in the URL, for example:
AZURE://${jv_blobStorageAccount}/${jv_containerName}
Special characters used in this field must be URL-safe.
Target Object Name
= string
The filename of the new file.
Target URL
= string
The URL that points to the target file location on Google Cloud Storage.
Clicking this property will open the Target URL dialog. This displays a list of all existing buckets. Select a bucket, then a folder, and then a subfolder if required. This constructs a URL with the following format:
GS://<bucket>/<folder-path>
You can also type the URL directly into the GCS Buckets path field, instead of selecting listed elements. This is particularly useful when using variables in the URL, for example:
GS://${jv_gcsAccount}/${jv_folderName}
Special characters used in this field must be URL-safe.
Target Object Name
= string
The filename of the new file.
Target URL
= string
The URL (without file name) that points to where the new file will be created.
Clicking this property will open the Target URL dialog. This displays a list of all existing buckets. Select a bucket, then a folder, and then a subfolder if required. This constructs a URL with the following format:
S3://<bucket>/<folder-path>
Access Control List Options
= drop-down
Choose from the ACL settings that Amazon provides. Leaving this property empty leaves the current settings unchanged. A full list can be found here.
Encryption
= drop-down
Decide how the files are encrypted inside the S3 bucket. This property is available when using an existing Amazon S3 location for staging.
- None: No encryption.
- SSE KMS: Encrypt the data according to a key stored on KMS. Read AWS Key Management Service [AWS KMS] to learn more.
- SSE S3: Encrypt the data according to a key stored on an S3 bucket. Read Using server-side encryption with Amazon S3-managed encryption keys [SSE-S3] to learn more.
KMS Key ID
= drop-down
The ID of the KMS encryption key you have chosen to use in the Encryption property.
Target Object Name
= string
The filename of the new file.
Target URL
= string
The URL (without file name) that points to where the new file will be created. This should follow the below logic where appropriate:
smb://[[authdomain;]user@]host[:port][/share[/dirpath]][?context]
Other expected methods are also available. Examples:
smb://mydomain;jdoe@fileserver01/documents/projectfiles/
http://11.22.3.4/share5/
Target Domain
= string
The domain that the newly created file is to be located on.
Target Username
= string
Your URL connection username. It is optional and will only be used if the target system requests it.
Target Password
= string
The corresponding password. It is optional and will only be used if the target system requests it.
Set Home Directory as Root
= boolean
- Yes: URLs are relative to the user's home directory.
- No: URLs are relative to the server root.
Target URL
= string
The URL (without file name) that points to where the new file will be created. This should follow the below logic where appropriate:
sftp://[username[:password]@]hostname[:port][path]
Example:
sftp://johndoe@example.com/home/johndoe/documents/
Wherever possible, use the Target Username and Target Password component properties for credentials instead of the URL.
Target Username
= string
Your SFTP connection username. This is optional and will only be used if the target system requests it.
Target Password
= string
Your SFTP password. This is optional and will only be used if the target system requests it.
Select the secret definition storing the password for your SFTP username. The password should be saved as a secret definition before using this component.
Target SFTP Key
= string
Your SFTP private key. This is optional and will only be used if the target system requests it.
Select the secret definition storing the private key for your SFTP account. The private key should be saved as a secret definition before using this component.
This must be the complete private key, beginning with "-----BEGIN RSA PRIVATE KEY-----" and conforming to the same structure as an RSA private key.
The following private key formats are currently supported:
- DSA
- RSA
- ECDSA
- Ed25519
In a Hybrid SaaS configuration, you need to manually convert the private key into a format that allows it to be stored in your AWS Secrets Manager. You can do this with the following command:
ssh-keygen -p -f YOUR_PRIVATE_KEY -m pem
Snowflake | Databricks | Amazon Redshift
---|---|---
✅ | ✅ | ✅