Skip to content

Convert Type

The Convert Type transformation component lets you change the data types of the input flow by generating a SELECT clause that casts column types to a new specified type.

If possible, it's better to change the source data so that it already has the correct types. However, sometimes it's necessary to convert the types explicitly.

  • Snowflake types include: VARCHAR, NUMBER, FLOAT, BOOLEAN, DATE, TIMESTAMP, TIME, and VARIANT. The use of these data types is detailed in the Snowflake documentation.
  • Databricks types include: INTEGER, NUMBER, FLOAT, TEXT, TIMESTAMP, DATE, BOOLEAN, and BINARY. The use of these data types is detailed in the Databricks documentation.
  • Amazon Redshift types include: TEXT, INTEGER, NUMERIC, REAL, DOUBLE PRECISION, BOOLEAN, DATE, DATETIME, and SUPER. The use of these data types is detailed in the Amazon Redshift documentation.

Note

  • When appropriate, values are first rounded to the requested decimal places before being cast to the requested size.
  • Users may experience casting errors if using binary values.
  • Although syntax is checked at validation time, runtime errors may occur during type conversion if the input data cannot fit into the requested target type.

Use case

This component is useful when you need to ensure that the data types in your dataset match the requirements of your cloud data warehouse or when you want to standardize data types across different datasets. For example, you might use this component to:

  • Standardize data types across multiple datasets for auditing and regulation.
  • Align mismatched types across disparate systems for downstream join or merge operations.
  • Convert string representations of dates or numbers into their respective date or numeric types for accurate calculations and comparisons.

Properties

Conversions = column editor

Enter the following details for each table column:

  • Column: The name of the column to convert.
  • Type: The type to cast the column to. Choose one of the cloud data platform tabs below for data types applicable to that platform.
  • Size: For Text types, this is the maximum length. This is a limit on the number of bytes, not characters. For Numeric types, this is the total number of digits allowed, whether before or after the decimal point.
  • Precision: The precision of the data in the column. Will be 0 (zero) for non-applicable types.
  • Format: (Amazon Redshift only) The DateTime format. This is only required if Type is set to DATE or DATETIME. If your input column is TEXT and you want to convert to DATE or DATETIME, specify the input format of the input column. For an exhaustive list of possible formats, read the Amazon Redshift documentation

Select your cloud data warehouse below to read about available data types.

  • Varchar: This type is suitable for numbers and letters. A varchar or Variable Character Field is a set of character data of indeterminate length. For more information read, Snowflake VARCHAR types.
  • Number: This type is suitable for numeric types, with or without decimals. For more information, read Snowflake Numeric Data types.
  • Float: These types of values are approximate numeric values with fractional components. For more information, read Snowflake Float type.
  • Boolean: This type is suitable for data that is either "true" or "false", or "1" or "0", respectively. For more information, read Snowflake Logical Operators.
  • Date: This type is suitable for dates without times. For more information, read Snowflake Date/Time types.
  • Timestamp: This type is a timestamp left unformatted (exists as Unix/Epoch Time). For more information, read Snowflake Date and Time Data types.
  • Time: This type is suitable for time, independent of a specific date and timezone. For more information, read Snowflake Time type.
  • Variant: Variant is a tagged universal type that can hold up to 16 MB of any data type supported by Snowflake. For more information, read Snowflake Variant type.
  • Integer: This type is suitable for whole-numbers (no decimals). For more information, read Databricks INT type.
  • Number: This is suitable for numeric types, with or without decimals. For more information, read Databricks Numeric Data types.
  • Float: These types of values are approximate numeric values with fractional components. For more information, read Databricks Float type.
  • Text: Represents character string values. For more information, read Databricks String type.
  • Boolean: This type is suitable for data that is either "true" or "false", or "1" or "0", respectively. For more information, read Databricks Boolean type.
  • Binary: This data type represents byte sequence values. For more information, read Databricks Binary type.
  • Date: This type is suitable for dates without times. For more information, read Databricks Date type.
  • Timestamp: This type is a timestamp left unformatted (exists as Unix/Epoch Time). For more information, read Databricks timestamp type.
  • Variant: This is a semi-structured data type that can store data of multiple types within a single column—similar to how JSON objects work. For more information, read Databricks Variant type.

Note

The VARIANT data type is supported only in Databricks Runtime version 15.4 and above. It's recommended to verify the runtime version you're using in your default environment, as VARIANT isn't compatible with versions earlier than 15.4.

  • TEXT: Represents character string values, subject to a maximum size. TEXT and BPCHAR types.
  • INTEGER: This type is suitable for whole-number data (no decimals). Integer types.
  • NUMERIC: This type is suitable for numeric data values, with or without decimals. DECIMAL or NUMERIC type.
  • REAL: This type is suitable for data of a single precision floating-point number. Floating-Point types.
  • DOUBLE PRECISION: This type is suitable for data of a double precision floating-point number. Floating-Point types.
  • BOOLEAN: This type is suitable for data that is either true or false. Boolean type.
  • DATE: This type is suitable for dates without times. Date type.
  • DATETIME: This type is suitable for dates, times, or timestamps (both date and time). Datetime types.
  • SUPER: Use the SUPER data type to store semi-structured data or documents as values. SUPER type.