Skip to content

Transpose Rows

The Transpose Rows transformation component lets you combine multiple input rows into a single output row. One row will be produced for each value of the columns specified in the Groupings parameter. This is especially helpful when you want to restructure long datasets into a wide format for reporting, comparisons, or aggregations.

The output of this component differs slightly depending on your cloud data warehouse environment:

  • In a Snowflake environment, this component concatenates each value into a delimited string with the format [ value, value, value ]. You could then use a Split Field component after the Transpose Rows component to convert the delimited strings into a set of new columns, if required.
  • In a Databricks environment, this component returns values as Databricks ARRAY <String> types. These values can't be manipulated by a Split Field component.
  • In an Amazon Redshift environment, this component concatenates each value into a delimited string with the format value,value,value. You could then use a Split Field component after the Transpose Rows component to convert the delimited strings into a set of new columns, if required.

Note

  • For an alternative approach to transforming rows into columns, consider the Pivot component instead.
  • If you want to map columns from the input dataset into rows in the output data, use the Transpose Columns component.

Use case

This component can be used to restructure long datasets into a wide format for reporting, comparisons, or aggregations. Some typical uses of this are:

  • Pivoting long-form time-based data into a wide format for period-over-period comparisons.
  • Condensing repeated rows (for example, in survey responses or log files) into a single record.

Properties

Name = string

A human-readable name for the component.


Aggregations = dual listbox

The input column to aggregate.

To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Aggregations dialog.


Sort Order = dual listbox

Rows included in the "groupings" are ordered using this Sort Order parameter. If you're performing multiple transpositions, ordering will ensure that the items are in a consistent order.

To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Sort Order dialog.


Groupings = dual listbox

One or more source columns that form the groupings. The output will have one row for every combination of grouping column values.

To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Groupings dialog.

Name = string

A human-readable name for the component.


Aggregations = dual listbox

The input column to aggregate.

To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Aggregations dialog.


Sort Order = dual listbox

Rows included in the "groupings" are ordered using this Sort Order parameter. If you're performing multiple transpositions, ordering will ensure that the items are in a consistent order.

To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Sort Order dialog.


Groupings = dual listbox

One or more source columns that form the groupings. The output will have one row for every combination of grouping column values.

To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Groupings dialog.

Name = string

A human-readable name for the component.


Aggregations = dual listbox

The input column to aggregate.

To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Aggregations dialog.


Sort Order = dual listbox

Rows included in the "groupings" are ordered using this Sort Order parameter. If you're performing multiple transpositions, ordering will ensure that the items are in a consistent order.

To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Sort Order dialog.


Groupings = dual listbox

One or more source columns that form the groupings. The output will have one row for every combination of grouping column values.

To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Groupings dialog.


Delimiter = string

A delimiting character used to separate concatenated values. The default is a comma.


Null Replace = column editor

  • Field: Select a field from the drop-down menu. This field will have any NULL values replaced by a value the user specifies in the "Replacement" column of this parameter. This may be useful for retaining the order of elements, since otherwise NULL values will not be included.
  • Replacement: The value used to replace NULL values in the corresponding field (row).

Click the Text Mode toggle at the bottom of the Null Replace dialog to open a multi-line editor that lets you add items in a single block. For more information, read Text mode.

To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Null Replace dialog.


Distinct Aggregations = Boolean

When "Yes", any duplicates from the selected aggregation columns are removed. Default setting is "No".


Example

Note

This example uses a Snowflake project. Other cloud data warehouses will give slightly different output formats.

We have a table of products and yearly sales information. However, the format gives us a lot of repeated data:

Input Data

+------------+------+---------+
| product_id | year | sales   |
+------------+------+---------+
| P123       | 2023 | 1000.00 |
| P124       | 2023 | 2500.00 |
| P125       | 2023 | 1800.00 |
| P126       | 2023 | 3500.00 |
| P123       | 2024 | 1500.00 |
| P124       | 2024 | 3000.00 |
| P125       | 2024 | 2200.00 |
| P126       | 2024 | 4000.00 |
+------------+------+---------+

We want to transpose the data into a condensed format that puts the data for each product into a single row:

Output Data

+------------+--------------------+----------------+
| product_id | year               | sales          |
+------------+--------------------+----------------+
| P126       | [ "2023", "2024" ] | [ 3500, 4000 ] |
| P123       | [ "2023", "2024" ] | [ 1000, 1500 ] |
| P124       | [ "2023", "2024" ] | [ 2500, 3000 ] |
| P125       | [ "2023", "2024" ] | [ 1800, 2200 ] |
+------------+--------------------+----------------+

To achieve this transformation, we connect the input table to a Transpose Rows component, which we will configure as follows:

  • Aggregations: year, sales
  • Sort Order: Left blank. In a long dataset, sorting on product_id might be appropriate.
  • Groupings: product_id

Got feedback or spotted something we can improve?

We'd love to hear from you. Join the conversation in the Documentation forum!