Pivot

Project Availability:

Snowflake ✅ Databricks ✅ Amazon Redshift ❌

Component Type:

Transformation

Connection Inputs:

One

Connection Outputs:

Unlimited

The Pivot transformation component lets you turn your rows of data into columns. This generally means reducing the number of rows and increasing the number of columns.

The Pivot column you choose contains the data that you want to aggregate. The Value column you choose contains labels for that data, which define the groups by which the data in the Pivot column can be aggregated. The Pivot Values you specify (values in the Value column) are used as the new column names in the output.

This component is equivalent to an SQL PIVOT function.

Use case

This component can be used to analyze data by aggregating values from a single dataset in different categories. For example, you can use it to:

Calculate total sales per product using a dataset containing data from multiple individual sales.
Monitor average product downtime per day using a dataset containing downtime data for multiple products over a longer period of time.

Properties

Name = string

A human-readable name for the component.

Aggregate Function = drop-down

Select which aggregate function to call to combine the grouped values from the Pivot Column. For more information, read the following:

Pivot Column = drop-down

Specify the column from the source table or sub-query that will be aggregated. This requires an input component before configuring and uses the column names from the input component.

The data type expected in this column depends on the aggregate function you are depending to use. For example, you cannot average non-numeric values.

Value Column = drop-down

Specify the column from the source table or sub-query that contains the values that will group data for aggregations and from which new column names will be generated. This requires an input component before configuring and uses the column names from the input component.

The chosen Value Column will be missing from the output, replaced by a new column for each of the Pivot Values specified.

Pivot Values = multiple strings

A list of values that exist in the Value Column in the input data that are used to group data for aggregation. These values become new columns in the output, with data from the Pivot Column being aggregated into these new columns.

Quote Pivot Value Columns = drop-down

Choose whether to enclose the new column names (the selected pivot values) in quote marks. This can avoid problems with the component if your pivot values contain spaces or special characters, if your pivot values are numeric data, or if your data is in mixed case.

If you use Snowflake, we recommend selecting Yes because Snowflake has specific rules for column identifiers. Enclosing your pivot values in quotes helps to meet these rules.

Examples

Example 1Example 2

Goal: Calculate total monthly sales for each product

We have the following table, containing data from individual sales of three different products (caps, T-shirts, and hoodies) in January, February, and March. Let's use the Pivot component to group the sales data by month and calculate the total sales for each month.

Input data

Product	Month	Sales
T-Shirt	Jan	100
T-Shirt	Jan	50
T-Shirt	Feb	150
T-Shirt	Mar	200
Hoodie	Jan	200
Hoodie	Jan	100
Hoodie	Mar	250
Hoodie	Mar	180
Cap	Jan	60
Cap	Feb	90
Cap	Mar	120

Component configuration

Aggregate Function: Sum
Pivot Column: Month
Value Column: Sales
Pivot Values: Jan, Feb, Mar

Output data

Product	Jan	Feb	Mar
T-Shirt	150	150	200
Hoodie	300		430
Cap	60	90	120

Result: The pivot values Jan, Feb, and Mar are used as the new column names. The values in the value column Sales are aggregated using the Sum function and grouped by the values in the pivot column Month. There was no sales data for hoodies in February, so this value is null.

Goal: Calculate average student exam results by subject

We have the following table, containing different students' exam results for a range of subjects. Let's use the Pivot component to calculate each student's average score in each subject. The students are identified by their student ID number.

Input data

Student_ID	Subject	Score
S001	Science	85
S001	Science	90
S001	History	78
S002	Geography	88
S002	History	92
S002	History	86

Component configuration

Aggregate Function: Average
Pivot Column: Subject
Value Column: Score
Pivot Values: Science, History, Geography

Output data

Student_ID	Science	History	Geography
S001	87.5	78
S002		89	88

Result: The pivot values Science, History, and Geography are used as the new column names. The values in the value column Score are aggregated using the Average function and grouped by the values in the pivot column Subject. There are two null values here, as student S001 has no score for Geography and S002 has no score for Science.