Pivot
Pivot is a transformation component that uses an SQL PIVOT function to transform rows into columns. Generally this means reducing the number of rows and increasing the number of columns.
The component expects a Pivot column and a Value column to be specified from the input. The Pivot column contains data to be aggregated. The Value column contains labels for that data (groups by which data can be aggregated), which are then specified in the Pivot Values property.
Note that Pivot Values must be the names of values that appear in the Value Column of the input and these values are used as new column names in the pivoted table output.
Properties
Name
= string
A human-readable name for the component.
Aggregate Function
= drop-down
Select which aggregate function to call to combine the grouped values from the Pivot Column. For more information, read the following:
Pivot Column
= drop-down
Specify the column from the source table or sub-query that will be aggregated. This requires an input component before configuring and uses the column names from the input component.
The data type expected in this column depends on the aggregate function you are depending to use. For example, you cannot average non-numeric values.
Value Column
= drop-down
Specify the column from the source table or sub-query that contains the values that will group data for aggregations and from which new column names will be generated. This requires an input component before configuring and uses the column names from the input component.
The chosen Value Column will be missing from the output, replaced by a new column for each of the Pivot Values specified.
Pivot Values
= multiple strings
A list of values that exist in the Value Column in the input data that are used to group data for aggregation. These values become new columns in the output, with data from the Pivot Column being aggregated into these new columns.
Examples
We have a table of salespeople and how much of each product they've sold. However, it's a bit cumbersome to look at since the salesperson is often repeated.
Input Data
+-------------+---------+----------+
| SALESPERSON | PRODUCT | QUANTITY |
+-------------+---------+----------+
| John | Apples | 30 |
| John | Oranges | 20 |
| Jane | Apples | 15 |
| Jane | Oranges | 25 |
| Mike | Apples | 20 |
| Mike | Oranges | 30 |
| John | Apples | 5 |
+-------------+---------+----------+
Pivot the PRODUCT column and aggregate sales of each product.
Since a salesperson could have any number of entries (for example, John has 3 entries, two of which are for apple sales), we'd like to sum up their total sales for each fruit. This is a good way of summarising sales receipts.
Pivot component properties
- Aggregate Function: Sum
- Pivot Column: QUANTITY
- Value Column: PRODUCT
- Pivot Values:
- Apples
- Oranges
Output Data
+-------------+--------+---------+
| SALESPERSON | APPLES | ORANGES |
+-------------+--------+---------+
| John | 35 | 20 |
| Jane | 15 | 25 |
| Mike | 20 | 30 |
+-------------+--------+---------+
Pivot the SALESPERSON column and find average sales of each product.
It's interesting to see that a similar (but in some ways opposite) result can be gained by using SALESPERSON as our Value Column and changing the Pivot Values accordingly.
To demonstrate how the resulting table data can be reflective of the chosen aggregation function, we've set the Aggregate Function property to 'Average'.
In both of these examples, we've aggregated the same data. But it's important to think of which columns you want to end up with and what groupings you require to make that happen.
Pivot component properties
- Aggregate Function: Average
- Pivot Column: QUANTITY
- Value Column: SALESPERSON
- Pivot Values:
- John
- Jane
- Mike
Output Data
+---------+--------+--------+--------+
| PRODUCT | MIKE | JANE | JOHN |
+---------+--------+--------+--------+
| Apples | 20.000 | 15.000 | 35.000 |
| Oranges | 30.000 | 25.000 | 20.000 |
+---------+--------+--------+--------+
Snowflake | Databricks | Amazon Redshift |
---|---|---|
✅ | ✅ | ❌ |