Transpose Columns
The Transpose Columns transformation component rotates a table by transforming columns from the input dataset into rows in the output data. This reshapes the data by outputting multiple rows for each input row. Each set of input columns is mapped to an output column, and each output row is labelled to identify which input column its value originated from.
This component effectively performs the reverse of a pivot operation on the data. Consider the Unpivot component for an alternative way of obtaining similar results.
Use case
This component is useful when you have "wide" data (many columns) that you want to make "long" or normalized to help with analytics, modeling, and reporting. Some typical uses of this are:
- Unpivoting time-based data to allow time-series analysis. For example, convert a table with one column per year into a table with one row per year.
- Converting multiple response columns from a survey into a normalized format for easier filtering and aggregation.
Properties
Name
= string
A human-readable name for the component.
Ordinary Columns
= dual listbox
Choose the ordinary columns, those that are not going to be transposed but are still required in the output. These are effectively a set of grouping columns that are passed to the output unchanged.
To use grid variables, select the Use Grid Variable checkbox at the bottom of the dialog. For more information, read grid variables.
Row Label Name
= string
Provide the name of a new column here. It will contain the constants you enter into the Column to Row Mapping property, which identify the original column that each new row originated from.
Output Columns
= column editor
- Name: A new column name to hold the output of multiple input columns.
- Type: Specify the data type for the column. The type should be compatible with every input column that will be mapped into this column, and is used to validate that those input columns conform to it. Choose from the following data types:
  - VARCHAR: This type is suitable for numbers and letters. A varchar or Variable Character field is a set of character data of indeterminate length.
  - NUMBER: This type is suitable for numeric types, with or without decimals.
  - FLOAT: This type is suitable for approximate numeric values with fractional components.
  - BOOLEAN: This type is suitable for data that is either "true" or "false".
  - DATE: This type is suitable for dates without times.
  - TIMESTAMP: This type is a timestamp left unformatted (exists as Unix/Epoch Time).
  - TIME: This type is suitable for time, independent of a specific date and timezone.
  - VARIANT: Variant is a tagged universal type that can hold up to 16 MB of any data type supported by Snowflake.
To use grid variables, select the Use Grid Variable checkbox at the bottom of the dialog. For more information, read grid variables.
Click the Text mode toggle at the bottom of the dialog to open a multi-line editor that lets you add items in a single block. For more information, read Text mode.
Column to Row Mapping
= editor
- Row Label Name: This editor column is headed with the name you provided in the Row Label Name property. Enter an identifier to specify what the rows represent.
- Output Column-1: Each defined output column will appear as a column in this mapping. Add a row to this grid for each input column you want to map into an output column.
- Output Column-n: As above, if you are mapping multiple sets of input columns. When you map data into multiple output columns, there should be a set of similar input columns for each output column. For example, you may have a set of input columns for each quarterly revenue amount, and another set of input columns for quarterly profits.
To use grid variables, select the Use Grid Variable checkbox at the bottom of the dialog. For more information, read grid variables.
Name
= string
A human-readable name for the component.
Ordinary Columns
= dual listbox
Choose the ordinary columns, those that are not going to be transposed but are still required in the output. These are effectively a set of grouping columns that are passed to the output unchanged.
To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Ordinary Columns dialog. For more information, read Grid Variables.
Row Label Name
= string
Provide the name of a new column here. It will contain the constants you enter into the Column to Row Mapping property, which identify the original column that each new row originated from.
Output Columns
= column editor
- Name: A new column name to hold the output of multiple input columns.
- Type: Specify the data type for the column. The type should be compatible with every input column that will be mapped into this column, and is used to validate that those input columns conform to it. Choose from the following data types:
  - TEXT: A string that can hold any kind of data, subject to a maximum size.
  - INTEGER: An integer data type is suitable for whole numbers (no decimals).
  - NUMERIC: The numeric data type accepts numbers, with or without decimals.
  - REAL: This type is suitable for single-precision floating-point numbers.
  - DOUBLE PRECISION: This type is suitable for double-precision floating-point numbers.
  - BOOLEAN: Data with a Boolean data type can be either "true" or "false".
  - DATE: This type is suitable for dates without times.
  - DATETIME: This type is suitable for dates, times, or timestamps (both date and time).
  - SUPER: Uses the SUPER data type to store semi-structured data or documents as values.
To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Output Columns dialog. For more information, read Grid Variables.
Click the Text mode toggle at the bottom of the Output Columns dialog to open a multi-line editor that lets you add items in a single block. For more information, read Text mode.
Column to Row Mapping
= editor
- Row Label Name: This editor column is headed with the name you provided in the Row Label Name property. Enter an identifier to specify what the rows represent.
- Output Column-1: Each defined output column will appear as a column in this mapping. Add a row to this grid for each input column you want to map into an output column.
- Output Column-n: As above, if you are mapping multiple sets of input columns. When you map data into multiple output columns, there should be a set of similar input columns for each output column. For example, you may have a set of input columns for each quarterly revenue amount, and another set of input columns for quarterly profits.
To use grid variables, tick the Use Grid Variable checkbox at the bottom of the Column to Row Mapping dialog. For more information, read Grid Variables.
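The mapping into multiple output columns can be illustrated with a minimal Python sketch. This is not how the component is implemented; it only mimics the behavior described above, and the function name `transpose_columns`, the `region` column, and the quarterly column names are hypothetical. Each tuple in `quarter_mapping` stands for one row of the Column to Row Mapping grid.

```python
def transpose_columns(rows, ordinary_columns, row_label_name, mapping):
    """Return one output row per (mapping entry, input row) pair.

    mapping is a list of (label, {output_column: input_column}) tuples,
    one tuple per row of the Column to Row Mapping grid.
    """
    output = []
    for label, sources in mapping:
        for row in rows:
            # Ordinary columns pass through unchanged.
            out = {col: row[col] for col in ordinary_columns}
            # The row label records which set of input columns was used.
            out[row_label_name] = label
            for output_column, input_column in sources.items():
                out[output_column] = row[input_column]
            output.append(out)
    return output


# Hypothetical wide input: one revenue and one profit column per quarter.
wide_rows = [
    {"region": "North", "revenue_q1": 100, "revenue_q2": 120,
     "profit_q1": 10, "profit_q2": 15},
    {"region": "South", "revenue_q1": 200, "revenue_q2": 210,
     "profit_q1": 30, "profit_q2": 25},
]

# Two output columns (revenue, profit), each fed by a similar set of inputs.
quarter_mapping = [
    ("Q1", {"revenue": "revenue_q1", "profit": "profit_q1"}),
    ("Q2", {"revenue": "revenue_q2", "profit": "profit_q2"}),
]

long_rows = transpose_columns(wide_rows, ["region"], "quarter", quarter_mapping)
for row in long_rows:
    print(row)
```

Each input row yields one output row per mapping entry, so two input rows and two quarters give four output rows, each carrying both a revenue and a profit value.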
Example
We have a table of products and yearly sales information. However, the format doesn't allow us to easily do analysis by year:
Input data
+------------+------------+------------+
| product_id | sales_2023 | sales_2024 |
+------------+------------+------------+
| P123       | 1000.00    | 1500.00    |
| P124       | 2500.00    | 3000.00    |
| P125       | 1800.00    | 2200.00    |
| P126       | 3500.00    | 4000.00    |
+------------+------------+------------+
We want to transpose the data into a form that has a single "year" column, which will allow us to sort and filter on that column. Our output table should look like this:
Output data
+------------+------+---------+
| product_id | year | sales   |
+------------+------+---------+
| P123       | 2023 | 1000.00 |
| P124       | 2023 | 2500.00 |
| P125       | 2023 | 1800.00 |
| P126       | 2023 | 3500.00 |
| P123       | 2024 | 1500.00 |
| P124       | 2024 | 3000.00 |
| P125       | 2024 | 2200.00 |
| P126       | 2024 | 4000.00 |
+------------+------+---------+
To achieve this transformation, we connect the input table to a Transpose Columns component, which we will configure as follows:
- Ordinary Columns: product_id
- Row Label Name: year
- Output Columns:
  - Name: sales
  - Type: NUMBER
- Column to Row Mapping:
  - year: 2023, sales: sales_2023
  - year: 2024, sales: sales_2024
The Column to Row Mapping property is the key to this transformation. It specifies that, for each product_id in the input, data from the original sales_2023 column goes into an output row with "2023" in the year column, while data from the original sales_2024 column goes into a separate output row with "2024" in the year column.
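As a rough illustration of the logic in this example, the following Python sketch reproduces the same reshaping outside the component. It is an assumption-laden sketch, not the component's implementation: the variable names `input_rows`, `year_to_column`, and `output_rows` are hypothetical, and the row order shown here is not something the component is stated to guarantee.

```python
# Input data from the example, one dict per input row.
input_rows = [
    {"product_id": "P123", "sales_2023": 1000.00, "sales_2024": 1500.00},
    {"product_id": "P124", "sales_2023": 2500.00, "sales_2024": 3000.00},
    {"product_id": "P125", "sales_2023": 1800.00, "sales_2024": 2200.00},
    {"product_id": "P126", "sales_2023": 3500.00, "sales_2024": 4000.00},
]

# Mirrors the Column to Row Mapping: row label value -> input column.
year_to_column = {"2023": "sales_2023", "2024": "sales_2024"}

# One output row per (year, input row) pair; product_id is the ordinary
# column and is carried through unchanged.
output_rows = [
    {"product_id": row["product_id"], "year": year, "sales": row[column]}
    for year, column in year_to_column.items()
    for row in input_rows
]

for row in output_rows:
    print(row)
```

Four input rows and two mapped columns produce eight output rows, and the result can then be sorted or filtered on the single year column.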