Rewrite Table
The Rewrite Table transformation component writes the input dataset to a new table in your cloud data warehouse. If a table of the same name already exists, it is replaced. Rewrite Table is typically the last component in a pipeline, since the dataset is written out after all other processing is complete.
This is equivalent to the SQL CREATE OR REPLACE TABLE command.
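The exact SQL is generated by the component, but a minimal sketch of its shape looks like the following, with hypothetical names (`stg_input` standing in for the output of the upstream components):
```sql
-- Illustrative shape only; table names are hypothetical.
CREATE OR REPLACE TABLE target_table AS
SELECT *
FROM stg_input;
```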
When using this component, be aware of the following:
- Because the output table is overwritten each time the component runs, don't use this component to output permanent data that must be preserved between runs.
- Runtime errors may occur, for example if a data value overflows the maximum allowed size of a field.
- The component isn't suitable for incremental updates, as all existing data in the table is deleted during the rewrite.
To write the dataset to a table without replacing the entire target table, use Table Output or Table Update instead.
Use case
The Rewrite Table component will completely replace the contents of a target table in your data warehouse with new data. Some common use cases for this include:
- Full refresh of dimension tables. When the data in a dimension table changes infrequently or is small enough to reload entirely, use Rewrite Table to replace the entire contents with the latest snapshot.
- Refresh staging tables. You can overwrite a staging table using Rewrite Table before running further transformation pipelines on it.
- Initial load of target tables. Create and populate a new table to perform a clean initial load of data.
Properties
The available properties depend on your cloud data warehouse. The Snowflake, Databricks, and Amazon Redshift configurations are listed in turn below.
Snowflake
Name
= string
A human-readable name for the component.
Warehouse
= drop-down
The Snowflake warehouse used to run the queries. The special value `[Environment Default]` uses the warehouse defined in the environment. Read Overview of Warehouses to learn more.
Database
= drop-down
The Snowflake database. The special value `[Environment Default]` uses the database defined in the environment. Read Databases, Tables and Views - Overview to learn more.
Schema
= drop-down
The Snowflake schema. The special value `[Environment Default]` uses the schema defined in the environment. Read Database, Schema, and Share DDL to learn more.
Target Table
= string
A name for the new table.
This component uses a CREATE OR REPLACE TABLE statement, and when replacing the table it also applies the COPY GRANTS clause. When you create a new object (such as a table, view, schema, or database) from an existing one, the new object doesn't automatically inherit the original's grants (privileges). The COPY GRANTS clause transfers object-level privileges from the replaced table to the new one, maintaining consistent access control and simplifying permission management. For more information, read Snowflake COPY GRANTS.
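A minimal sketch of the effect, with a hypothetical table and role; the grant survives the replace only because COPY GRANTS is applied:
```sql
GRANT SELECT ON TABLE dim_customer TO ROLE analyst;

-- Replacing the table would normally discard the grant above;
-- COPY GRANTS carries it over to the new table.
CREATE OR REPLACE TABLE dim_customer COPY GRANTS AS
SELECT customer_id, customer_name
FROM stg_customer;
```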
Order By
= column editor
Select the columns to order by. Set the corresponding column to be ordered ascending (default) or descending.
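If Order By is configured, the rows are sorted as they are written. A sketch with hypothetical table and column names, ordering by date descending and ID ascending:
```sql
CREATE OR REPLACE TABLE sales_snapshot COPY GRANTS AS
SELECT *
FROM stg_sales
ORDER BY sale_date DESC, sale_id ASC;
```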
Databricks
Name
= string
A human-readable name for the component.
Catalog
= drop-down
Select a Databricks Unity Catalog. The special value `[Environment Default]` uses the catalog defined in the environment. Selecting a catalog determines which schemas (databases) are available in the next parameter.
Schema (Database)
= drop-down
The Databricks schema. The special value `[Environment Default]` uses the schema defined in the environment. Read [Create and manage schemas](https://docs.databricks.com/en/data-governance/unity-catalog/create-schemas.html) to learn more.
Table
= string
A name for the new table.
Partition Keys
= dual listbox
Select any input columns to be used as partition keys.
Table Properties
= column editor
- Key: A metadata property within the table. These are expressed as key=value pairs.
- Value: The value of the corresponding row's key.
Comment
= string
A descriptive comment for the table.
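Taken together, these properties map onto clauses of the generated statement. A hedged sketch, assuming the component emits a CREATE OR REPLACE TABLE, with all names and values hypothetical:
```sql
CREATE OR REPLACE TABLE main.analytics.web_events
PARTITIONED BY (event_date)
COMMENT 'Full daily rewrite of web event data'
TBLPROPERTIES ('delta.appendOnly' = 'false')
AS
SELECT *
FROM stg_web_events;
```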
Redshift
Name
= string
A human-readable name for the component.
Schema
= drop-down
Select the table schema. The special value `[Environment Default]` uses the schema defined in the environment. For more information on using multiple schemas, read Schemas.
Target Table
= string
A name for the new table.
Table Sort Key
= column editor
Specify the input columns to be set as the table's sort key.
Sort keys are critical to good performance. Read the Amazon Redshift documentation for more information.
Table Distribution Style
= drop-down
Select how the data in the table will be distributed. Choose from ALL, EVEN, or KEY. For more information, read Distribution styles.
Table Distribution Key
= drop-down
Only displayed when Table Distribution Style is set to KEY. This is the column used to determine which cluster node each row is stored on.
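Amazon Redshift has no CREATE OR REPLACE TABLE statement, so a full rewrite is presumably implemented as a drop-and-recreate. A hedged sketch of the resulting DDL, with all names hypothetical, showing how the sort key and distribution properties map onto table attributes:
```sql
DROP TABLE IF EXISTS fact_sales;

CREATE TABLE fact_sales
DISTSTYLE KEY
DISTKEY (customer_id)
COMPOUND SORTKEY (sale_date, customer_id)
AS
SELECT *
FROM stg_sales;
```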