Bash Script
Overview
Run a Bash script.
The script is executed in an external Bash process hosted by the Matillion ETL instance. Any errors encountered while running the script will immediately halt it.
All the usual variables are made available in the bash environment and any changes made to such variables will never be visible outside of the current script execution.
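As a minimal sketch of this behavior, the script below reads a variable from the environment, changes a copy of it, and prints the result. The variable name `target_schema` and its fallback value are hypothetical; they stand in for any variable defined in your job and are assumed to be exported under that name.

```bash
#!/bin/bash
# Hypothetical variable name, assumed to be exported into the environment by the job.
# The fallback value lets the sketch run standalone as well.
target_schema="${target_schema:-public}"

describe_schema() {
  # Work on a local copy; the change is confined to this script's process
  # and is not written back to the job variable.
  local schema="$1"
  schema="${schema}_staging"
  echo "Using schema: ${schema}"
}

describe_schema "$target_schema"
```

Because the script runs in its own Bash process, any value that must persist beyond the script has to be passed on by some other means.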
Matillion ETL runs as a Tomcat user and care must be taken to ensure this user has sufficient access to resources and does not uninstall any customer-installed Bash libraries.
Note
To ensure that access to instance credentials is managed correctly, we advise that customers limit scopes (permissions) where applicable.
Note
Since Matillion ETL is based on the latest Linux, the command line tools are all installed. Furthermore, the credentials stored in your current environment are exported into the shell, so you may (and indeed should) omit security keys from your scripts when calling the APIs.
Note
If you cancel a task while a Bash script is running, then it is killed. If the timeout is exceeded, the script is also killed. The purpose of the timeout is to ensure scripts will never run forever even if they enter an infinite loop, or are blocked by an external resource.
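One way to avoid relying on the component timeout alone is to give slow commands their own, shorter limit. The sketch below assumes the GNU coreutils `timeout` command is available on the instance; the helper name `run_with_limit` is hypothetical.

```bash
#!/bin/bash
# Run a command with its own time limit, shorter than the component Timeout.
# GNU `timeout` exits with status 124 when the limit is reached.
run_with_limit() {
  local limit_seconds="$1"
  shift
  local status=0
  timeout "${limit_seconds}" "$@" || status=$?
  if [ "${status}" -eq 124 ]; then
    echo "Command timed out after ${limit_seconds}s: $*" >&2
  fi
  return "${status}"
}

# Example: allow a slow command at most 1 second.
run_with_limit 1 sleep 5 || echo "sleep was stopped early"
```

With this pattern the script can report which step was blocked instead of being killed without explanation.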
Properties
Name
= string
A human-readable name for the component.
Script
= code editor
The Bash script to execute. Output from commands should be brief, as it is sent into the Task Status message.
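To keep the task message short, verbose output can be written to a log file and only a summary printed. The following is a sketch under that assumption; the helper name `run_quietly` is hypothetical, and the log location comes from `mktemp`.

```bash
#!/bin/bash
# Run a noisy command, keep its full output in a log file,
# and print only a short summary to standard output.
run_quietly() {
  local log_file="$1"
  shift
  if "$@" >"${log_file}" 2>&1; then
    echo "OK: $1 (log: ${log_file})"
  else
    local status=$?
    echo "FAILED (${status}): $1"
    # Show only the last few lines of the captured output.
    tail -n 5 "${log_file}"
    return "${status}"
  fi
}

log_file="$(mktemp)"
run_quietly "${log_file}" ls -la /
```

On failure, the summary line is followed by at most five lines of the captured output, which is usually enough to identify the problem.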
Timeout
= integer
The number of seconds to wait for script termination. After the set number of seconds has elapsed, the script is forcibly terminated. The default is 300 seconds (5 minutes).
User
= drop-down
Set the user type. This property is not available for legacy Jobs. The default setting is Restricted, which helps prevent coding errors, whether accidental or due to inexperience, from damaging the Matillion ETL instance.
- Privileged: A privileged user will have access to credentials and Tomcat folders.
- Restricted: Restricted users do not have access to credentials stored in the instance, nor any Tomcat folders. When creating new jobs, this is the default setting.
Strategy
Runs the Bash script, redirecting any output it produces into the task message.
Enabling the User Property
To enable the User property, follow these steps:
- SSH into the Matillion ETL instance.
- Create a .sh file and paste the following script into it:
#!/bin/bash
useradd -r restricteduser
usermod -a -G restricteduser tomcat
cat <<HEREDOC > /etc/sudoers.d/matillion-sudo
# User rules for tomcat
Cmnd_Alias EMD = /sbin/service tomcat restart, /sbin/service tomcat restart, /bin/systemctl restart tomcat.service, /bin/systemctl restart tomcat.service, /sbin/sv restart tomcat, /sbin/sv restart tomcat, /bin/mkdir /usr/share/*, /usr/bin/yum -y check-update matillion-*, /usr/bin/yum -y update matillion-*, /bin/touch /usr/share/*, /bin/chmod 755 /usr/share/*, /bin/chown tomcat\\:tomcat /usr/share/emerald/oom, /bin/chown tomcat\\:tomcat /usr/share/emerald/oom/*, /usr/sbin/logrotate --force /etc/logrotate.d/tomcat, /usr/sbin/logrotate --force /etc/logrotate.d/tomcat, /usr/bin/su restricteduser -c /bin/bash /tmp/interpreter-input-*.tmp, /usr/bin/su restricteduser -c /usr/bin/python /tmp/interpreter-input-*.tmp, /usr/bin/su restricteduser -c /usr/bin/python3 /tmp/interpreter-input-*.tmp
tomcat ALL=(ALL) NOPASSWD:EMD
HEREDOC
- Save and close the .sh file, then make it executable by running the following command:
chmod +x {FILE_NAME}
- Run the script with the following command:
./{FILE_NAME}
- The newly created .sh file should run successfully.
- In Matillion ETL, click Admin and then click Restart Server. Once the server restarts, the User property should be selectable on the Bash Script component.
Restricting component availability
You can, if required, configure the Bash Script component to be unavailable on your Matillion ETL instance. To do this, follow these steps:
- SSH into the instance.
- Open the file emerald.properties as a root user (sudo).
- Locate MTLN_ALLOW_BASH_COMPONENTS and set the value to false.
- Save and close the file and restart the server.
Note
By default, MTLN_ALLOW_BASH_COMPONENTS is set to true.
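The edit described in the steps above could also be scripted. The sketch below assumes the property is stored as a plain `KEY=value` line with no surrounding spaces; the function name `set_allow_bash_components` is hypothetical. The path to emerald.properties is passed in as an argument because its location is not stated here.

```bash
#!/bin/bash
# Set MTLN_ALLOW_BASH_COMPONENTS in a properties file.
# Assumes a plain KEY=value line; the file path is supplied by the caller.
set_allow_bash_components() {
  local properties_file="$1"
  local value="$2"   # "true" or "false"
  if ! grep -q '^MTLN_ALLOW_BASH_COMPONENTS=' "${properties_file}"; then
    echo "MTLN_ALLOW_BASH_COMPONENTS not found in ${properties_file}" >&2
    return 1
  fi
  # Replace the existing line in place, keeping a .bak copy of the original file.
  sed -i.bak "s/^MTLN_ALLOW_BASH_COMPONENTS=.*/MTLN_ALLOW_BASH_COMPONENTS=${value}/" "${properties_file}"
}

# Usage (run as root, then restart the server):
#   set_allow_bash_components <path to emerald.properties> false
```

The function refuses to modify the file when the property is missing, so a file in an unexpected format is left untouched.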
| Snowflake | Delta Lake on Databricks | Amazon Redshift | Google BigQuery | Azure Synapse Analytics |
|---|---|---|---|---|
| ✅ | ✅ | ✅ | ✅ | ✅ |