Unite

The Unite transformation component lets you combine two or more datasets. You can choose whether to combine all columns, or only the columns that appear in all datasets.

For columns with the same name, the data from one dataset is appended to the data in the other dataset(s). When all columns are combined, a column that is missing from one or more datasets contains null values in the rows that come from those datasets.

Use case

This component is useful if you need to combine two or more datasets without using a more complex Join operation, particularly when the datasets have a similar structure. For example, you could use it to combine identically structured sales data from multiple retail stores, so that you can analyze all your sales data in one dataset.


Properties

Name = string

A human-readable name for the component.


Method = drop-down

  • All Columns: All columns from all inputs are included in the output. A column that does not exist in an input source has the SQL NULL value in every row that comes from that source.
  • Overlapping Columns: Only columns that appear in all input sources are included in the output. Columns that do not exist in all of the input sources are dropped.
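
The difference between the two methods can be illustrated with a minimal Python sketch that models each input as a list of dictionaries. The function name unite_rows, the method values "all" and "overlapping", and the sample store data are hypothetical. This is not the component's implementation, only an illustration of the column handling described above, with None standing in for SQL NULL.

```python
def unite_rows(inputs, method="all"):
    """Combine several row lists; method is "all" or "overlapping" (hypothetical names)."""
    # Collect the column names of each input, preserving first-seen order.
    per_input = []
    for rows in inputs:
        cols = []
        for row in rows:
            for name in row:
                if name not in cols:
                    cols.append(name)
        per_input.append(cols)

    # Every column seen in any input, in first-seen order.
    all_cols = []
    for cols in per_input:
        for name in cols:
            if name not in all_cols:
                all_cols.append(name)

    if method == "overlapping":
        # Keep only the columns that are present in every input.
        columns = [c for c in all_cols if all(c in cols for cols in per_input)]
    else:
        columns = all_cols

    # A column missing from a row becomes None, standing in for SQL NULL.
    return [{c: row.get(c) for c in columns} for rows in inputs for row in rows]


store_a = [{"sku": "A1", "qty": 2, "region": "north"}]
store_b = [{"sku": "B7", "qty": 5}]

all_columns = unite_rows([store_a, store_b], method="all")
overlapping = unite_rows([store_a, store_b], method="overlapping")
```

In this sketch, all_columns keeps the region column and fills it with None for the row from store_b, while overlapping drops region because store_b does not provide it.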

Cast Types = boolean

  • Yes: If same-named columns from multiple inputs have differing types, the component attempts to cast them to a common type. The cast is not guaranteed to succeed, so check your data carefully.
  • No: If same-named columns from multiple inputs have differing types, the component reports an error and does not continue.
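
The two Cast Types behaviors can be sketched in Python for a single same-named column. The function unite_column is hypothetical, and the choice of text as the common type is an assumption made purely for illustration. The common type the component actually selects, and which casts succeed, are not specified here.

```python
def unite_column(values_by_input, cast_types):
    """Append one same-named column from several inputs (illustrative only)."""
    # Find the distinct Python types in use, ignoring None (standing in for NULL).
    types = {type(v) for values in values_by_input for v in values if v is not None}

    if len(types) <= 1:
        # The types already agree, so the values are simply appended.
        return [v for values in values_by_input for v in values]

    if not cast_types:
        # Mirrors the "No" option: stop with an error on mismatched types.
        names = sorted(t.__name__ for t in types)
        raise TypeError(f"same-named column has differing types: {names}")

    # Assumption for this sketch only: use text as the common type.
    return [None if v is None else str(v) for values in values_by_input for v in values]


quantities = unite_column([[1, 2], ["3", None]], cast_types=True)
```

With cast_types=True the mismatched integer and text values are all returned as text. With cast_types=False the same call raises TypeError instead of producing output.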

Add Source Component Column = boolean

  • Yes: Adds a column called "source_table" that contains, for each output row, the name of the input component that provided it.
  • No: Does not add a column identifying the input component.
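
As a rough illustration of the "Yes" option, the following Python sketch tags each row with the name of the input it came from. The function unite_with_source and the component names store_a and store_b are hypothetical. Only the column name "source_table" comes from the description above.

```python
def unite_with_source(named_inputs):
    """Append rows from each named input, recording where each row came from."""
    output = []
    for component_name, rows in named_inputs.items():
        for row in rows:
            # Copy the row and add the source_table column described above.
            output.append({**row, "source_table": component_name})
    return output


tagged = unite_with_source({
    "store_a": [{"sku": "A1", "qty": 2}],
    "store_b": [{"sku": "B7", "qty": 5}],
})
```

Each row in tagged carries the name of its originating input, which makes it possible to filter or group the combined data by source later.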

Remove Duplicates = boolean

  • Yes: Removes duplicate rows so that only one copy of each remains and every row in the output is unique.
  • No: Does not remove duplicate rows, so duplicates can appear in the output table.
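
The effect of the "Yes" option can be sketched in Python as keeping the first occurrence of each distinct row. The function remove_duplicates and the sample rows are hypothetical, and the sketch assumes that two rows count as duplicates only when every column value matches.

```python
def remove_duplicates(rows):
    """Keep the first occurrence of each distinct row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        # Build a hashable key from the row's column/value pairs.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


combined = [
    {"sku": "A1", "qty": 2},
    {"sku": "B7", "qty": 5},
    {"sku": "A1", "qty": 2},
]
deduplicated = remove_duplicates(combined)
```

Here deduplicated contains two rows, because the third row repeats the first exactly. With the "No" option, all three rows would remain in the output.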