Grid variables

Grid variables are a special type of variable. A grid variable is a two-dimensional array that holds multiple values in named columns.

A grid variable can only be declared as a pipeline variable, not a project variable.


Creating grid variables

Creating a grid variable follows a similar process to creating any other variable, but has some important differences as described here.

  1. Open a pipeline on the pipeline canvas.
  2. Click the Manage variable icon, (x), on the control box in the top-right of the canvas. The Manage your variables dialog will open, listing all pipeline and project variables available to the current pipeline. In this dialog, you can edit or delete any existing variables, or create a new variable.
  3. Click Create variable to open the Create a variable dialog.
  4. Select Pipeline variable.
  5. Select Grid in the Type drop-down.
  6. Complete the following fields, and then click Next:

    • Variable name: Each variable (grid or otherwise) must have a unique name within this pipeline, though variables with the same name can be defined in a different pipeline. Variable names can include letters, digits, and underscores, but can't begin with a digit. For example, my_table_02 and _02_my_table are valid names, but 02_my_table is not. A variable name must not be a JavaScript reserved word (e.g. var or const). If you want to use the variable in a Python or Bash script, its name must also not be a reserved word in that language.
    • Description: A description to inform other users of the purpose of the variable. This is optional, but its use is recommended.
    • Visibility: Select Public or Private. Read Grid variable visibility, below, for a full explanation.

    A private grid variable is only visible to the pipeline it is defined in. If the pipeline is called from another pipeline, the calling pipeline can't "see" the private variable and so can't use its value, reset its value, or otherwise interact with it in any way. A public grid variable is visible outside the pipeline it is defined in, so it can be "seen" and used by any pipeline that calls the pipeline where it's set.

  7. Create columns in the grid variable, clicking + to add additional columns as needed. See Columns, below, for an explanation of how to use columns. Click Next when all the required columns are defined.

  8. Add default values to the variable if required. This isn't mandatory, but adding sensible defaults can help reduce errors and aid in debugging when the pipeline runs. See Row values, below, for an explanation of using a grid variable to store values.
  9. Click Finish to save the grid variable.
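The naming rules in step 6 can be checked before you type a name into the dialog. The following is a minimal sketch, not part of the product: the function name is hypothetical, the pattern assumes ASCII letters only, and the JavaScript reserved-word set is a small illustrative sample rather than the full list.

```python
import keyword
import re

# Assumption: ASCII letters only; the rules say "letters, digits, and underscores".
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Illustrative subset of JavaScript reserved words, not a complete list.
JS_RESERVED_SAMPLE = {"var", "const", "let", "function", "class", "return", "new", "this"}


def is_valid_variable_name(name, used_in_python=False):
    """Return True if `name` passes the naming rules sketched here (hypothetical helper)."""
    if not NAME_PATTERN.fullmatch(name):
        return False  # wrong characters, or starts with a digit
    if name in JS_RESERVED_SAMPLE:
        return False  # clashes with a JavaScript reserved word
    if used_in_python and keyword.iskeyword(name):
        return False  # clashes with a Python keyword
    return True
```

With this sketch, my_table_02 and _02_my_table are accepted, while 02_my_table and var are rejected. A name such as lambda is rejected only when used_in_python is set.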

Columns

A grid variable is a two-dimensional array, or table, consisting of one or more named columns with one or more values in each column. The columns in a grid variable are defined and fixed when the variable is created, but the values they contain can be changed dynamically at runtime.

Each column in the grid can be a different variable type, so you could have a column of text and another column of numbers, for example.

To define the columns, enter the following details when the grid variable is created:

  • Column Name: A name to identify the column. This must be unique within the grid variable, but can be the same as column names defined in other grid variables. The name can only contain letters, numbers, underscores _ and dollar symbols $. It can't start with a number.
  • Column Type: The data type that the column will contain. Select Text or Number. The values stored in each column must be suitable for the data type of the column.

Row values

Each row of a grid variable holds a single value for each of the defined columns. This allows you to construct a "table" of related information in a single grid variable. For example, a grid variable with three columns defined as "Name" (text), "Salary" (number) and "Location" (text) could have three rows of values as follows:

Name    | Salary | Location
Alice   | 35000  | New York
Bob     | 40000  | San Francisco
Charlie | 30000  | Chicago
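As an illustration, the same table can be modeled in Python as a list of rows, with a check that each value suits its column type. This is a sketch under assumptions: the list-of-rows layout and the helper name are chosen for the example and are not taken from the product.

```python
from numbers import Number

# Column definitions from the example: (column name, column type).
columns = [("Name", "Text"), ("Salary", "Number"), ("Location", "Text")]

# One inner list per row, one value per column.
rows = [
    ["Alice", 35000, "New York"],
    ["Bob", 40000, "San Francisco"],
    ["Charlie", 30000, "Chicago"],
]


def row_matches_columns(row, columns):
    """Hypothetical check: one value per column, each suitable for its type."""
    if len(row) != len(columns):
        return False
    for value, (_, col_type) in zip(row, columns):
        if col_type == "Text" and not isinstance(value, str):
            return False
        # bool is technically a number in Python, so exclude it explicitly.
        if col_type == "Number" and (isinstance(value, bool) or not isinstance(value, Number)):
            return False
    return True


all_valid = all(row_matches_columns(row, columns) for row in rows)
```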

When a pipeline run begins, all of its grid variables are initialized with the default values that were specified when the variable was created. Variable values can be assigned or reassigned dynamically while the pipeline runs. Values set during a run are valid only for that run. On the next run of the pipeline, the variable reverts to its default values unless new values are assigned.

If you construct a branching pipeline and update a grid variable value in one branch, the value only changes in that branch; the other branch will use the original value of the variable. In effect, each branch of a complex pipeline has its own "local copy" of a grid variable, each of which can be updated independently. If the branches later rejoin, for example through the use of an And or Or component, the grid variable will revert to its default values, regardless of any updates made in either branch.
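The branching behavior can be pictured with a small simulation. This is only a conceptual model of the description above, not how the product is implemented, and all names are illustrative. For simplicity, the value at the branch point is assumed to be the default.

```python
import copy

# Default values of the grid variable.
defaults = [["Alice", 35000], ["Bob", 40000]]

# At the branch point, each branch works on its own independent copy.
branch_a = copy.deepcopy(defaults)
branch_b = copy.deepcopy(defaults)

# Branch A updates its copy; branch B still sees the original values.
branch_a[0][1] = 36000

# After the branches rejoin, the variable is back at its default values.
after_rejoin = copy.deepcopy(defaults)
```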


Grid variable visibility

Setting the correct visibility for a pipeline grid variable is important when you want to call a pipeline from another pipeline, using the Run Orchestration or Run Transformation components.

A Private pipeline variable is only visible to the pipeline it is defined in. If the pipeline is called from another pipeline, the calling pipeline can't "see" the Private variable and so can't use its value, reset its value, or otherwise interact with it in any way.

A Public pipeline variable is visible outside the pipeline it is defined in, so it can be "seen" and used by any pipeline that calls the pipeline where it is set.


Using grid variables

The following components are specifically for use with grid variables:

  • Grid Iterator implements a loop over the rows of a grid variable, running an attached component multiple times, each time with a different row of values from the grid.
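Conceptually, the loop performed by Grid Iterator resembles the following sketch. It is an analogy only: the function names, the column-to-variable mapping, and the stand-in for the attached component are assumptions made for the example.

```python
def iterate_grid(rows, column_names, mapping, run_component):
    """Run `run_component` once per grid row (hypothetical model of the loop).

    mapping: grid column name -> name of the variable that receives the value.
    """
    results = []
    for row in rows:
        by_column = dict(zip(column_names, row))
        variables = {var: by_column[col] for col, var in mapping.items()}
        results.append(run_component(variables))
    return results


column_names = ["Name", "Salary", "Location"]
rows = [
    ["Alice", 35000, "New York"],
    ["Bob", 40000, "San Francisco"],
    ["Charlie", 30000, "Chicago"],
]
mapping = {"Name": "employee_name", "Location": "office"}


def run_component(variables):
    # Stand-in for the attached component; it just formats a message.
    return f"{variables['employee_name']} works in {variables['office']}"


results = iterate_grid(rows, column_names, mapping, run_component)
```

Each call to run_component sees the values from exactly one row, which mirrors how the attached component runs once per row of the grid.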

Grid variables may also be used to populate property values in many different components. Where grid variables can be used, the dialog which sets the property value will include a checkbox labelled Use Grid Variable.

Some common use cases for grid variables in components are described below.

Data selection

Query components have a Data Selection property that lets you select the columns you want from the data source. Instead of manually selecting the columns when you configure the query component, you can specify that a grid variable will provide the column names. By using a variable instead of hard-configuring the component, you can re-use the same column names across all components in the pipeline, making it easier to consistently set the property and make global changes to all such properties.

To use a grid variable in a Data Selection property:

  1. Create a grid variable with a single column of type Text.
  2. Populate the default values of the grid variable with the names of the columns you want to include in the Data Selection property.

    Note

    These default values could be set or changed dynamically at runtime by use of other components such as Python Script.

  3. In the Data Selection property of your query component, select Use Grid Variable. This will change the dialog to display Grid Variable and Grid Column drop-downs instead of the column selection listboxes.

  4. Use the drop-downs to select the grid variable and the name of the grid variable column.
  5. Click Save.
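If the column names are produced by a script rather than typed in as defaults, they need to be shaped as a single-column grid. The following is a minimal sketch, assuming a grid is represented as a list of rows with one value per column. The helper name and the column names are hypothetical.

```python
def to_single_column_grid(names):
    """Wrap each name in its own row: one row per name, one column per row."""
    return [[name] for name in names]


selected_columns = ["id", "name", "created_at"]  # hypothetical source column names
grid_rows = to_single_column_grid(selected_columns)

# Inside a Python Script component, the rows could then be written back with
# context.updateGridVariable('<GridName>', grid_rows). That call is omitted
# here so the sketch runs outside a pipeline.
```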

Populate metadata

In the Create Table and Create External Table components, the metadata required to define table columns in the Columns property can be assigned from a grid variable, as follows:

  1. Create a grid variable with one column for each piece of metadata you want to populate (Column Name, Data Type, Size, Precision, etc.).
  2. Populate the default values of each column with the values of the metadata.

    Note

    These default values could be set or changed dynamically at runtime by use of other components such as Python Script.

  3. In the Columns dialog, select Use Grid Variable. This will change the dialog to display Grid variable and Column Mapping drop-downs instead of the column definition fields.

  4. In the Grid variable drop-down, select which grid variable to use for metadata.
  5. In each Column Mapping drop-down, select the grid variable column that will be used to provide a value to each piece of column metadata.
  6. Click Save.
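The column mapping in the steps above can be pictured as a lookup from each piece of metadata to a grid variable column. The sketch below is illustrative only: the grid column names, data types, and helper function are assumptions, not product defaults.

```python
# Hypothetical grid variable: one column per piece of metadata.
grid_columns = ["col_name", "col_type", "col_size", "col_precision"]
grid_rows = [
    ["id", "NUMBER", 10, 0],
    ["name", "VARCHAR", 255, 0],
]

# Column mapping: piece of metadata -> grid variable column that supplies it.
column_mapping = {
    "Column Name": "col_name",
    "Data Type": "col_type",
    "Size": "col_size",
    "Precision": "col_precision",
}


def resolve_metadata(grid_columns, grid_rows, column_mapping):
    """Build one metadata record per grid row, following the mapping."""
    index = {name: i for i, name in enumerate(grid_columns)}
    return [
        {field: row[index[col]] for field, col in column_mapping.items()}
        for row in grid_rows
    ]


table_columns = resolve_metadata(grid_columns, grid_rows, column_mapping)
```

Each row of the grid becomes the definition of one table column, so adding a row to the grid adds a column to the table being created.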

Passing variables between pipelines

Grid variable values can be set when you call one pipeline from another using the Run Orchestration and Run Transformation components, which allows you to pass variable values from the "parent" pipeline to the "child" pipeline it is running. Use the Set Grid Variables property on those components to configure which values will be set in the "child" grid variables and which will be passed to them from the "parent".


Using grid variables in scripts

Python scripts

Grid variables are visible to Python scripts created in the Python Script or Python Pushdown components, and in those scripts they will act as any other Python variable.

In the Python script, you can use the context object to read and write grid variable values. Because grid variables are two-dimensional arrays, you can hold the grid variable data in Python arrays (lists).

To get the values from a grid variable, use:

context.getGridVariable('<GridName>')

To write values into a grid variable, use:

context.updateGridVariable('<GridName>', <values>)

The following Python script example takes data from one grid variable, "people", puts the data into a Python array called "p_array", then copies it to a different grid variable, "names":

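# Read all rows of the "people" grid variable into a Python array.
p_array = context.getGridVariable('people')
# Write those rows into the "names" grid variable.
context.updateGridVariable('names', p_array)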

New values written to a grid variable by Python will be used in all subsequent components that use that grid variable as the pipeline continues to run. This gives you a powerful tool for creating complex yet flexible pipelines, as some very simple Python scripts can dynamically change the behavior of the pipeline at runtime by manipulating variable values.

Note that you can manipulate the variable array in any desired way within the Python script, but the value of the variable within the pipeline itself won't change unless you pass it back with the context object.
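The difference between changing the array in the script and changing the variable in the pipeline can be tried outside a pipeline with a stand-in for the context object. The StubContext class below is hypothetical and exists only for this sketch. It assumes a grid is a list of rows and that reading a grid returns a copy. Inside a real script, context is already provided and the stub is not needed.

```python
class StubContext:
    """Hypothetical stand-in for the pipeline's context object (local testing only)."""

    def __init__(self, grids):
        self._grids = {name: [list(row) for row in rows] for name, rows in grids.items()}

    def getGridVariable(self, name):
        # Return a copy so that local edits do not touch the stored value.
        return [list(row) for row in self._grids[name]]

    def updateGridVariable(self, name, values):
        self._grids[name] = [list(row) for row in values]


# Only for this sketch; in a pipeline, `context` already exists.
context = StubContext({"people": [["Alice", 35000], ["Bob", 40000]]})

p_array = context.getGridVariable("people")
p_array.append(["Charlie", 30000])             # local change only
unchanged = context.getGridVariable("people")  # still two rows

context.updateGridVariable("people", p_array)  # pass the change back
updated = context.getGridVariable("people")    # now three rows
```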

Warning

Python will allow you to change the type of a variable, for example by assigning a string to a variable that previously held an integer. When you intend to pass the variable back to the pipeline using the context object, take care not to change the variable type in your Python script. Changing the type is likely to cause the pipeline to fail.
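One way to guard against accidental type changes is to compare the types in the modified array with the types that were read, before passing the array back. The following is a minimal sketch with hypothetical helper names. It assumes a grid is a list of rows and takes the expected type of each column from the first row of the original data.

```python
def column_types(grid):
    """Python type of each column, read from the first row (empty grid -> [])."""
    return [type(value) for value in grid[0]] if grid else []


def types_preserved(original, updated):
    """True if every row of `updated` has the same column types as `original`."""
    expected = column_types(original)
    if not expected:
        return True  # nothing to compare against
    return all(
        len(row) == len(expected)
        and all(type(value) is t for value, t in zip(row, expected))
        for row in updated
    )


original = [["Alice", 35000], ["Bob", 40000]]
good = [["Alice", 36000], ["Bob", 40000]]
bad = [["Alice", "36000"], ["Bob", 40000]]  # salary became a string
```

A script could call types_preserved(original, updated) and only pass the array back when it returns True. The check is deliberately strict, so it also flags an integer that has become a float. Relax it if that conversion is acceptable in your pipeline.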

A full discussion of Python scripting is beyond the scope of this article. Refer to any good Python reference for further information.