Skip to content

Create External Table

This component lets users to create an "external" table that references externally stored data. This creates a table that references the data that is held externally, meaning the table itself does not hold the data. External tables can be queried but are read-only. To learn more, read Working with External Tables.

External tables require an external stage to stage data. External stages can be managed via the Manage Stages menu.

Referencing externally held data can be valuable when wanting to query large datasets without resorting to storing that same volume of data in Snowflake.

Data loaded in these tables takes the form of a single column ("VALUE") with a variant-type value with multiple properties. Each column of target data becomes a property within the variant on its respective row. Thus, the data will usually require some transformation to be in a desirable format. See also: Flatten Variant.

External Tables must be refreshed before use whenever the externally held data is changed using the Refresh External Table component. We recommend always refreshing an External Table before using it in a job.


Properties

Name = string

A human-readable name for the component.


Create/Replace = drop-down

  • Create: Create the new table with the given name. Will fail if a table of that name already exists.
  • Create if not exists: Will create the new table with the given name unless one already exists. Will succeed and continue in either case.
  • Replace: Will create the new table, potentially overwriting any existing table of the same name.

Database = drop-down

The Snowflake database. The special value, [Environment Default], will use the database defined in the environment. Read Databases, Tables and Views - Overview to learn more.


Schema = drop-down

The Snowflake schema. The special value, [Environment Default], will use the schema defined in the environment. Read Database, Schema, and Share DDL to learn more.


New Table Name = string

The name of the external table to be created or used.


Partition Columns = column editor

A partition is defined by an expression applied to a set of data, resulting in a partition that only contains rows that satisfy the expression. For example, partitioning a dataset by the value in the 'year' column.

For more information about setting up partitions with Snowflake's external tables, read Managing Regular Data Loads.

  • Name: The name of the new partition column.
  • Type: The data type of the partition column.
  • Size:The data size of the new column values.
  • Precision: The precision of the new column values.
  • Expression: The expression used to partition the data.

Stage Database = drop-down

Select a Snowflake database for the external stage.


Stage Schema = drop-down

Select a Snowflake schema for the external stage.


Stage = drop-down

Select an external stage for the data. Staging areas can be managed via the Manage Stages menu or created through Snowflake using the CREATE STAGE command.


Relative Path = string

The directory path to follow to the target data. File names cannot be specified. The path is relative to the storage location given in the external stage setup.


Pattern = string

Specify, using regular expression (RegEx) pattern syntax, files to be matched on the external stage.

For example: '.*flight.*[.]csv'


Format = drop-down

Select a pre-made file format that will automatically set many of the component properties accordingly. These formats can be created through the Create File Format component.


File Type = drop-down

The type of expected data to load. Some data may require additional formatting, explained in Preparing to Load Data.

Available options are: AVRO, CSV, JSON, ORC, PARQUET, and XML.

Component properties will change to reflect the choice made here and give options based on the specific file type.


Enable Octal = boolean

Enables parsing of octal numbers.


Compression = drop-down

Whether the input file is compressed in gzip format, BROTLI, BZ2, DEFLATE, RAW_DEFLATE, ZSTD, or not compressed at all.


Record Delimiter = string

The delimiter to be used that separates records (rows) in the file. Defaults to newline.

\ can also signify a newline. \\r can signify a carriage return.


Field Delimiter = string

The delimiter that separates columns. The default is a comma. A [TAB] character can be specified as \.


Skip Header = integer

The number of rows at the top of the file to ignore. The default setting is 0.


Escape = string

When this option is specified, the backslash character \ in input data is treated as an escape character.


Escape Unenclosed Field = string

Single character string used as the escape character for unenclosed field values. Accepts common escape sequences, octal values, or hex values. Also accepts a value of NONE (default). If a character is specified in the Escape field, it will override this field.


Trim Space = boolean

Removes trailing and leading whitespace from the input data.


Field Optionally Enclosed = string

A character that is used to enclose strings. Can be a single quote ' or a double quote " or NONE (default). Escape the character with an instance of the same character.


Null If = string

This option replaces the specified string with NULL in the output table. Use this if your data has a particular representation of missing data.


Error On Column Count Mismatch = boolean

Generate an error if the number of delimited columns in the input does not match that of the table. If false, extra columns are not loaded into the table and missing columns are recorded as NULL in the table.


Empty Field As Null = boolean

If this is set, empty columns in the input file will become NULL.


Encoding Type = drop-down

The type of encoding that has been applied to the data. The default setting is UTF-8.


Strip Outer Array = boolean

Enable removal of outer square brackets from JSONs.


Strip Null Values = boolean

Enables removal of NULL values.


Ignore UTF-8 Errors = boolean

When true, replaces invalid UTF-8 sequences with the Unicode replacement character (U+FFFD), instead of throwing an error.


Preserve Space = boolean

When parsing XML files, leading and trailing spaces in elements are preserved if set to true.


Strip Outer Element = boolean

When parsing XML files, this will strip the outermost XML element, exposing the second-level elements as separate documents.


Disable Snowflake Data = boolean

When parsing XML files, will disable recognition of Snowflake semi-structured data tags if set to true.


Disable Auto Convert = boolean

When parsing XML files, enables conversion of numeric and Boolean values from text to their native types if set to true.


Snowflake Delta Lake on Databricks Amazon Redshift Google BigQuery Azure Synapse Analytics