
Components overview

Components are the basic building blocks of pipelines. Each component represents a single task within the pipeline. For example, the Create Table component executes a task that creates a table when the pipeline runs, while the Salesforce Query component executes a task that returns data from Salesforce. Multiple components can be linked together to build complex pipelines involving multiple tasks. For example, a pipeline may pull data from Salesforce, filter the data according to some criteria, and write the data to files in an S3 bucket, using the Salesforce Query, Filter, and S3 Unload components executed sequentially.
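As a purely conceptual sketch (this is not Designer's API), a pipeline can be thought of as an ordered chain of tasks in which each task receives the output of the previous one. The function names and sample records below are hypothetical stand-ins for the Salesforce Query, Filter, and S3 Unload components.

```python
# Conceptual model only: each function stands in for one component's task.
# None of these names belong to the product; they are illustrative assumptions.

def salesforce_query():
    # Stand-in for pulling records from Salesforce (hard-coded sample data).
    return [
        {"name": "Acme", "country": "US"},
        {"name": "Globex", "country": "DE"},
    ]

def filter_rows(rows):
    # Stand-in for a Filter component keeping rows that match a criterion.
    return [row for row in rows if row["country"] == "US"]

def s3_unload(rows):
    # Stand-in for writing rows to files; here it only reports a row count.
    return f"wrote {len(rows)} row(s)"

def run_pipeline():
    # Components execute sequentially, each consuming the previous output.
    rows = salesforce_query()
    rows = filter_rows(rows)
    return s3_unload(rows)

result = run_pipeline()
```

In Designer itself no code is required: the same chain is built by dragging components onto the canvas and connecting them.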

Each component belongs to exactly one type of pipeline, orchestration or transformation, and cannot be added to the other type. However, see the Run Transformation component for a way to include a transformation pipeline (and its components) in an orchestration pipeline.


View components

To view the components that make up a pipeline, open the pipeline from the Pipelines panel by double-clicking its name or right-clicking its name and then clicking Open Pipeline. This will display the full pipeline on the canvas. Each component is represented by a separate icon on the canvas, and the connectors between the component icons show the pipeline flow, that is, the order in which components are executed.

When you have a pipeline open, the components available to use in that pipeline are shown in the Components panel at the bottom-left of the Designer screen.


Add components to a pipeline

With the pipeline open in the canvas, locate the component you want to add in the Components panel. Components are arranged in a tree view that groups similar components together. For example, expand the Scripting group to show the two scripting components, Bash Script and Python Script.

The Components panel only shows the components that can be used in the type of pipeline currently open. For example, Create Table is an orchestration component and can only be added to orchestration pipelines, never to transformation pipelines, and so is only visible in the Components panel when you have an orchestration pipeline open.
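The rule above can be modeled with a small sketch. This is an illustrative assumption about the behavior, not Designer's implementation: the `Component` and `Pipeline` classes and the "Example Transform" component are hypothetical, and only Create Table's type is taken from the text.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Component:
    name: str
    pipeline_type: str  # "orchestration" or "transformation"

@dataclass
class Pipeline:
    pipeline_type: str
    components: list = field(default_factory=list)

    def available(self, catalog):
        # Mirrors the Components panel: list only components of this type.
        return [c for c in catalog if c.pipeline_type == self.pipeline_type]

    def add(self, component):
        # Reject components that belong to the other pipeline type.
        if component.pipeline_type != self.pipeline_type:
            raise ValueError(
                f"{component.name} cannot be added to a "
                f"{self.pipeline_type} pipeline"
            )
        self.components.append(component)

catalog = [
    Component("Create Table", "orchestration"),
    Component("Example Transform", "transformation"),  # hypothetical component
]

orchestration = Pipeline("orchestration")
visible = [c.name for c in orchestration.available(catalog)]
orchestration.add(catalog[0])
```

Here `visible` contains only Create Table, and calling `orchestration.add(catalog[1])` would raise a `ValueError`.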

To add a component to the pipeline, simply drag it from the Components panel to the canvas, and drop it in some convenient position. You can subsequently drag components around to new positions on the canvas if you wish.

The component begins with a default name on the canvas, which can be changed in the component's Properties.

To delete a component from the canvas, select the component and click Delete on the context menu. You can undo the delete operation by pressing CTRL + Z (Windows) or CMD + Z (Mac). After undoing, you can redo the delete operation by pressing CTRL + Y (Windows) or CMD + Y (Mac).

Copying components

You can copy a component to create a duplicate elsewhere in the pipeline. Use the component's context menu to copy the component, and then use the context menu to paste a copy of the component, including all configured properties, elsewhere on the pipeline. Connect the copy into the pipeline as required.

You can also copy a component between pipelines. Use the component's context menu to copy the component, switch to a different pipeline, and use the context menu to paste the component onto the new pipeline. The copied component is identical to the original, including all configured properties. The two pipelines must be of the same type; you can't paste from an orchestration pipeline to a transformation pipeline, or vice versa.

Select multiple components by holding down the SHIFT key while you click all the components you want to select, or hold down SHIFT while you drag the mouse pointer over the components you want to select. You can then copy all selected components as a single operation. To clear the selection, click on any empty part of the canvas.

Components context menu

| Function | Description | Windows shortcut | Mac shortcut |
| --- | --- | --- | --- |
| Copy | Right-click on any component, and select Copy. | CTRL + C | CMD + C |
| Paste | Right-click anywhere on the blank canvas and click Paste to create a copy of the component at that location. The copied component is identical to the original, including all configured properties. The Name will be slightly modified, and can be changed as normal. | CTRL + V | CMD + V |
| Delete | Right-click on any component, and select Delete to remove the component from the pipeline canvas. | DELETE | FN + BACKSPACE or DELETE |

Connect components

Once components are on the canvas, connect them to one another in the order in which they will be executed in the pipeline. Each component is surrounded by connection rings: usually (but not always) one on the left and three on the right. To link two components, hover over a connection ring on the right of one component until the pointer changes to a + symbol. Then click and drag a line to the connection ring on the left of the second component. This will draw a permanent connection between the components.

To disconnect components, click the line that joins them and press Backspace or Delete.

The order in which components are joined determines the order in which tasks are executed in the pipeline. The flow of pipeline execution proceeds from left to right, so the connection rings at the left of a component represent the input to the component, passed from the previous component, while the rings on the right represent the output of the component, which is passed to the next component.

For example, consider a pipeline in which a Salesforce Query component takes data from Salesforce and an S3 Unload component writes it to an S3 bucket. The pipeline starts with the Salesforce Query, so place that component on the left side of the canvas. Then connect one of the output rings on its right to the input ring on the left of the S3 Unload component, which is most conveniently placed on the right side of the canvas. This tells the pipeline the order in which to execute these components, and that the data generated by the first component should be made available to the second.

You can create complex, branching pipelines by linking two or more of a component's outputs to two or more other components.

The connection rings on the right of a component are color-coded, to denote the following specific functions:

  • Green: Continue on to the next component if this component executed successfully.
  • Red: Continue on to the next component if this component failed to execute.
  • Grey: Continue on to the next component regardless of the success or failure of this component.

These connection options allow you to direct the pipeline flow according to whether the component task is successful or not, creating different branches for each outcome.
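The branching rules for the three connector colors can be expressed as a short sketch. This is a conceptual model under stated assumptions, not how Designer is implemented: the `next_components` function and the "Send Alert" and "Write Log" component names are hypothetical.

```python
def next_components(succeeded, connections):
    """Return the downstream components to run after a component finishes.

    `connections` maps a connector color ("green", "red", "grey") to a list
    of downstream component names. This data structure is an assumption
    made for illustration.
    """
    # Grey is unconditional; green applies on success, red on failure.
    followed = ["grey", "green" if succeeded else "red"]
    targets = []
    for color in followed:
        targets.extend(connections.get(color, []))
    return targets

connections = {
    "green": ["S3 Unload"],
    "red": ["Send Alert"],   # hypothetical component name
    "grey": ["Write Log"],   # hypothetical component name
}

on_success = next_components(True, connections)
on_failure = next_components(False, connections)
```

The order of names within each returned list is an artifact of the sketch and implies nothing about execution order between branches.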


Component properties

Every component has properties that can be configured to tailor the component's behavior. When a component is selected on the canvas, its properties are displayed in the Properties panel below the canvas. Each component's properties are described in that component's documentation.

Changing a component's properties won't affect the properties of any other component, even components of the same type.


Sampling output

Most transformation components give you the ability to sample the data output by that component. This allows you to see the data the component is working with, to confirm that the component is set up correctly before using it in a live pipeline.

Before you can sample a component's output, you must first validate the pipeline. Read The pipeline canvas for more information.

To sample, select the component on the canvas and click the Sample tab in the panel below the canvas. Then click Sample data. The tab will display a table of data that the component would return when the pipeline runs, up to a maximum of 1000 rows.
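The row cap can be illustrated with a minimal sketch. How Designer actually retrieves the sample is not described here, so the `sample` function below is only an assumption showing the general idea of taking at most 1000 rows from a larger result.

```python
from itertools import islice

MAX_SAMPLE_ROWS = 1000  # the maximum stated in the text above

def sample(rows, limit=MAX_SAMPLE_ROWS):
    # Take at most `limit` rows without reading the whole source.
    return list(islice(rows, limit))

# A hypothetical component output of 5000 rows, produced lazily.
preview = sample({"id": i} for i in range(5000))
```

A source with fewer rows than the cap is returned in full.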

Component SQL

Most components produce SQL statements that act upon your data. You can view (but not alter) these SQL statements to learn how the component is acting. This may be useful in debugging failed pipelines, for example.

To view the SQL statements, select the component on the canvas and click the SQL tab in the panel below the canvas.