Matillion ETL features a scheduler that will launch orchestration jobs automatically at a pre-defined, regular time interval. For scheduled or queued Matillion ETL jobs to run, the Matillion ETL instance must be running. Schedules are created against a project, and users can set up multiple schedules. The scheduler is based on Quartz technology.
Scheduling is not supported on t3.xlarge instance sizes in AWS.
Click Project → Manage Schedules to access the Manage Schedules dialog.
Existing schedules are listed at the top of the dialog with the following properties:
- Schedule: The name of the schedule.
- Last Run: The timestamp this schedule was last run.
- Next Run: The timestamp of the next time this schedule is to be run.
- Status: Whether this schedule is configured properly and activated for runs, or stopped.
- Run Now: Run the schedule. This schedule's Status must be Enabled before it is run.
- Edit: Allows you to edit details of the schedule. See the Creating Schedules section for more information.
- Delete: The delete icon, X, deletes and disables the schedule.
Clicking a schedule will display the information on the times this schedule has executed:
- Version: The name of the version the job existed in at the time of this run.
- Job: The job the selected schedule executed.
- Environment: The environment in which the job was executed.
- Started: The time this schedule run began.
- Duration: The time taken for the job to complete.
- Run Status: The job end status for this schedule run. Either failure (red icon) or success (green icon).
- Task Info: Click for a breakdown of the components in this job run, their run times, row counts and messages.
And the following options:
- Refresh: The Refresh button updates the dialog for any unloaded runs.
- Full History: The schedule history for this specific schedule.
Click + in the Manage Schedules dialog to access the Create Schedule dialog.
Complete the following fields on this dialog, then click OK to save the new schedule.
- Name: A name for the schedule.
- Enabled: Select to enable the schedule. Clear to disable the schedule.
- Timezone: The timezone to which Matillion ETL's scheduler should correspond.
- Hours: Select the hours of the day the scheduled job should run in. Hours are 0–23 inclusive. If you wish to run the scheduled job across multiple hours, separate hours with a comma, for example: 0,3,7,12,16,19. Use the wildcard character, *, to signify every hour.
- Minutes: Select the minutes of an hour that the scheduled job should run. Minutes are 0–59 inclusive. To run the job multiple times in an hour, separate minutes with a comma, for example: 0,15,30,45. Use the wildcard character, *, to signify every minute in an hour.
- Days of Week: The days of the week to run the job. Click Select All if you wish to select all days.
- Days of Month: A comma-separated list of days within the month to run the scheduled job. Use the wildcard character, *, to signify every day of the month. Use L to denote the last day of the month. You can use an expression such as 15W to designate the closest weekday to the 15th.
- Version: Select the Matillion ETL project version to use in this schedule.
- Job: Select the job to schedule.
- Environment: Select the Matillion ETL environment.
- Run job in clone: Select this option to schedule the job to run in a clone of the environment or database. An Environment must be selected before this option can be selected, and this environment will be cloned at the point the job is run. Note that this feature is only available to Enterprise users of Matillion ETL for Snowflake. For a full description of this feature, read Creating a Snowflake Zero-Copy Clone.
- Configure 'Run options': If the job is scheduled to run in a clone (see above), selecting this will open a dialog that allows you to configure the clone's run options. All run options will be selected by default if you don't configure them differently here. For a full description of clone run options, read Running a job in a clone.
- Prevent duplicate job: When this option is selected, Matillion ETL detects whether the scheduled job is already running in the selected environment. If the job is already running, Matillion ETL prevents the queuing of a duplicate of that job. For more information, see Job start control, below.
- Ignore misfires: A "misfire" occurs when a job is unable to run at the scheduled time, for example if the Matillion ETL instance isn't running at that time. The scheduler will normally attempt to re-run misfired jobs, with the result that a large number of missed jobs may all be triggered when the Matillion ETL instance is next started. To prevent this behavior, Ignore misfires should be selected. If this option is selected, jobs that don't run at the specified time won't run at all.
- Test: Click this button to test that the schedule is configured correctly. Testing the setup won't create the schedule.
Create the schedule by clicking Confirm.
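The L and 15W expressions in the Days of Month field can be illustrated with a minimal Python sketch. It assumes the scheduler follows the Quartz convention that a W expression never crosses a month boundary (so 1W on a Saturday resolves to Monday the 3rd). The function name resolve_day_of_month is hypothetical and is not part of Matillion ETL.

```python
import calendar
from datetime import date


def resolve_day_of_month(token: str, year: int, month: int) -> date:
    """Resolve one Days of Month token ('L', '15W', or a plain day) to a date.

    Illustrative only: assumes Quartz-style semantics for L and W.
    """
    last_day = calendar.monthrange(year, month)[1]
    if token == "L":
        return date(year, month, last_day)
    if token.endswith("W"):
        target = date(year, month, int(token[:-1]))
        weekday = target.weekday()  # Monday is 0, Sunday is 6
        if weekday == 5:
            # Saturday: use the Friday before, unless that leaves the month
            shift = -1 if target.day > 1 else 2
        elif weekday == 6:
            # Sunday: use the Monday after, unless that leaves the month
            shift = 1 if target.day < last_day else -2
        else:
            shift = 0
        return target.replace(day=target.day + shift)
    return date(year, month, int(token))


# 15 June 2024 is a Saturday, so 15W resolves to Friday 14 June 2024.
print(resolve_day_of_month("15W", 2024, 6))
# L resolves to the last day of the month, including leap days.
print(resolve_day_of_month("L", 2024, 2))
```

Use the Test button in the Create Schedule dialog to confirm how the scheduler itself interprets a given expression.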
Additional setup guidance
- A time increment can be specified using / to separate two values. The first value is the number to begin from and the second is the size of the increment. For example, 10/5 means start at 10 and increase in steps of 5 (10,15,20,25...). You can use increments in the Hours and Minutes fields to run a schedule at regular intervals. For example, 0/15 in the Minutes field schedules the task every 15 minutes.
- The job itself is run by Matillion ETL, which is always based on UTC and is offset to local time using the Timezone dropdown menu. The Next Run for all job times shows a similar time offset, but in the timezone of the browser, rather than the one selected in the dropdown menu. Recent changes to timezone and/or daylight saving rules may not be reflected until you perform an update of the instance that Matillion ETL is running on. Update instructions can be found in Updating and migrating overview.
- You'll see the tasks run in the Tasks panel.
- If you choose a timezone that observes daylight saving time, you might find unexpected results for schedules up to three hours after midnight on the day the clocks are adjusted.
- Schedules can also be created for any given job by right-clicking the job in the Explorer panel and selecting Schedule Job.
- The scheduling system is based on Quartz, a powerful scheduling system with some advanced features that are covered in more detail in the Quartz Documentation.
- Scheduled tasks can't occur during the maintenance window of the EC2 cluster that Matillion ETL is running on. Inputting a time during the maintenance window will show an error message when the schedule is tested.
- If you select the option Run job in clone, the selected environment will be cloned at the point the scheduled job is run. If the clone would cause you to exceed the permitted number of environments at run time, the clone won't be created and therefore the scheduled job will be cancelled. An error message will be displayed in the Tasks panel.
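As a rough illustration of how the comma, wildcard, and increment syntax in the Hours and Minutes fields expands into concrete values, consider the following Python sketch. The function expand_field is hypothetical and only models the syntax described on this page; it isn't the scheduler's own parser.

```python
def expand_field(expression: str, low: int, high: int) -> list[int]:
    """Expand an Hours or Minutes expression into the sorted values it matches.

    Supports comma-separated values, the * wildcard, and start/step increments.
    """
    values = set()
    for part in expression.split(","):
        part = part.strip()
        if part == "*":
            values.update(range(low, high + 1))
        elif "/" in part:
            start_text, step_text = part.split("/")
            start = low if start_text == "*" else int(start_text)
            values.update(range(start, high + 1, int(step_text)))
        else:
            values.add(int(part))
    out_of_range = sorted(v for v in values if not low <= v <= high)
    if out_of_range:
        raise ValueError(f"values outside {low}-{high}: {out_of_range}")
    return sorted(values)


print(expand_field("0/15", 0, 59))            # [0, 15, 30, 45]
print(expand_field("0,3,7,12,16,19", 0, 23))  # [0, 3, 7, 12, 16, 19]
```

Passing 0 and 23 as the bounds models the Hours field, and 0 and 59 models the Minutes field.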
Enabling and disabling schedules
There are times when you need to temporarily disable the scheduler to prevent all scheduled jobs from running. This may be, for example, in response to some emergency event, or because you are performing a migration to a new instance. Although it's possible to disable individual schedules one-by-one by editing their properties, it's quicker and easier to disable the scheduler, which will disable all scheduled jobs.
Because schedules are specific to a project, disabling the scheduler will only apply to the currently active project. To disable all schedules for a Matillion ETL instance, you will have to disable the scheduler for each project you have created.
To disable the scheduler, click Project, then click Manage Schedules to access the Manage Schedules dialog. Clear the Schedules are enabled for project checkbox at the top of the dialog. You will be asked to confirm that you want to pause all schedules currently running. Click Pause Service to confirm. All pending schedules will now be disabled until you re-enable them by selecting Schedules are enabled for project again.
Note that it's also possible to disable individual schedules by clearing the Enabled checkbox in the Edit Schedule dialog.
Disabling the scheduler overrides the Enabled or Disabled status of every individual schedule: if the scheduler is disabled, all schedules will be halted, regardless of their individual status.
The current status of each schedule is shown in the Manage Schedules dialog. There are three possible values for this:
- Enabled: The scheduler is enabled and this specific schedule is enabled; the job will run as scheduled.
- Disabled: The scheduler is enabled but this specific schedule is disabled; the job won't run as scheduled.
- Enabled (*): This specific schedule is enabled but the scheduler is disabled; the job won't run as scheduled.
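The relationship between the project-level scheduler setting and an individual schedule's Enabled checkbox can be summarized in a short Python sketch. The function names are hypothetical. The case where both the scheduler and the schedule are disabled isn't described above, so the label returned for it here is an assumption.

```python
def displayed_status(scheduler_enabled: bool, schedule_enabled: bool) -> str:
    """Return the Status label described above for one schedule."""
    if not schedule_enabled:
        # Assumption: a disabled schedule shows Disabled even when the
        # scheduler is also disabled; this page doesn't cover that case.
        return "Disabled"
    return "Enabled" if scheduler_enabled else "Enabled (*)"


def will_run(scheduler_enabled: bool, schedule_enabled: bool) -> bool:
    """A job runs as scheduled only when both settings are enabled."""
    return scheduler_enabled and schedule_enabled


for scheduler_on in (True, False):
    for schedule_on in (True, False):
        print(scheduler_on, schedule_on,
              displayed_status(scheduler_on, schedule_on),
              will_run(scheduler_on, schedule_on))
```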
When migrating a Matillion ETL instance, you will be given the option to Disable schedules. Selecting this option will cause the scheduler to be disabled in the target instance, and you will have to re-enable it in the Manage Schedules dialog of the target instance after migration, if required.
Job start control
Any high-frequency scheduled jobs risk developing a backlog of queued duplicate jobs. If a Matillion ETL job is scheduled to run frequently (for example, every ten minutes) it's conceivable that before one run of the job finishes, another ten minutes will have elapsed, thus queuing a duplicate run of the same job.
You can prevent these duplicate jobs as follows:
Near the bottom of the Create Schedule dialog is a field labelled Prevent duplicate job. When this is selected, Matillion ETL detects whether the scheduled job is already running in the selected environment. If the job is already running, Matillion ETL prevents the queuing of a duplicate of that job.
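To see why this option matters, the following Python sketch simulates a job that fires every 10 minutes but takes 25 minutes to finish. It's a simplified model under stated assumptions (one run at a time, queued runs start as soon as the previous run ends) and doesn't represent how Matillion ETL is implemented. The function simulate_schedule is hypothetical.

```python
def simulate_schedule(interval: int, duration: int, horizon: int,
                      prevent_duplicates: bool):
    """Return (start_times, skipped_fire_times, queued_at_horizon) in minutes."""
    starts, skipped = [], []
    queue = []       # fire times waiting for the running job to finish
    busy_until = 0   # minute at which the current run ends
    for fire_time in range(0, horizon, interval):
        # Start any queued runs that could have begun before this firing.
        while queue and busy_until <= fire_time:
            queue.pop(0)
            starts.append(busy_until)
            busy_until += duration
        if busy_until <= fire_time:
            # Nothing is running, so the job starts immediately.
            starts.append(fire_time)
            busy_until = fire_time + duration
        elif prevent_duplicates:
            # The job is still running, so this firing is dropped.
            skipped.append(fire_time)
        else:
            # The job is still running, so a duplicate is queued.
            queue.append(fire_time)
    return starts, skipped, len(queue)


print(simulate_schedule(10, 25, 60, prevent_duplicates=True))
# ([0, 30], [10, 20, 40, 50], 0)
print(simulate_schedule(10, 25, 60, prevent_duplicates=False))
# ([0, 25, 50], [], 3)
```

In this model, leaving the option cleared leaves three runs still queued after an hour, and the backlog keeps growing for as long as the job takes longer than the schedule interval.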
If you require assistance preventing duplicate jobs, read Getting Support.
Normally when a job is running automatically, it's desirable to have some notification as to when it has completed. The optimal way to configure this is using the SNS Message component in your orchestration.