
Google Cloud Storage Unload

The Cloud Storage Unload component writes data from a table or view into a specified Google Cloud Storage (GCS) bucket using your specified file format (CSV, JSON, or Parquet).

This component requires working Google Cloud Storage credentials with "write" access to the target GCS bucket.


Properties

Name = string

A human-readable name for the component.


Stage = drop-down

Select a staging area for the data. Staging areas can be created through Snowflake using the CREATE STAGE command. Internal stages can be set up this way to store staged data within Snowflake. Selecting [Custom] will allow the user to specify a custom staging area.


Google Storage URL Location = file explorer

Use the file explorer to enter the bucket path to which the unloaded files will be written, or select from the list of GCS buckets.

This must have the format GS://<bucket>/<path>.
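As an illustrative check (not part of the component itself), a short Python sketch can verify that a location string follows the GS://<bucket>/<path> shape; the bucket naming rules here are simplified assumptions:

```python
import re

def is_valid_gcs_location(url: str) -> bool:
    """Return True if the string follows the GS://<bucket>/<path> shape.

    Illustrative only: bucket naming is simplified here to lowercase
    letters, digits, dots, dashes, and underscores.
    """
    pattern = r"^gs://[a-z0-9][a-z0-9._-]*(/.*)?$"
    return re.match(pattern, url, flags=re.IGNORECASE) is not None

print(is_valid_gcs_location("GS://my-bucket/unload/path"))  # True
print(is_valid_gcs_location("s3://my-bucket/unload/path"))  # False
```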


File Prefix = string

A prefix applied to the filenames created in the GCS bucket. Each created file is named using the prefix followed by a number denoting the node from which it was unloaded. All unloads are parallel and will use the maximum number of nodes available at the time.

The default prefix is data.
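As a sketch of how prefixed filenames might look. The exact numbering scheme is an assumption here; the real unload may append further indices (for example, thread or part numbers):

```python
def unload_filenames(prefix: str = "data", nodes: int = 4, ext: str = "csv") -> list:
    # Hypothetical naming: the prefix followed by the node number.
    # The actual unload may append additional indices.
    return [f"{prefix}_{n}.{ext}" for n in range(nodes)]

print(unload_filenames("data", 3))
# ['data_0.csv', 'data_1.csv', 'data_2.csv']
```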


Storage Integration = drop-down

Select the storage integration. Storage integrations are required to permit Snowflake to read from and write to a cloud storage location. Integrations must be set up in advance and configured to support Google Cloud Storage.


Warehouse = drop-down

The Snowflake warehouse used to run the queries. The special value, [Environment Default], will use the warehouse defined in the environment. Read Overview of Warehouses to learn more.

Database = drop-down

The Snowflake database. The special value, [Environment Default], will use the database defined in the environment. Read Databases, Tables and Views - Overview to learn more.


Schema = drop-down

The Snowflake schema. The special value, [Environment Default], will use the schema defined in the environment. Read Database, Schema, and Share DDL to learn more.


Target Table = string

Select the table from which data will be unloaded into the GCS bucket.

Warning

This table will be recreated and any existing table of the same name will be dropped.


Format = drop-down

Select a pre-made file format that will automatically set many of the Google Cloud Storage Unload component properties. These formats can be created through the Create File Format component. Select [Custom] to specify a custom format using the properties available in this component.


File Type = drop-down

Select the file type for the unloaded data. Available file types are: CSV, JSON, and PARQUET. For additional information on file type options, read the Snowflake documentation.

Component properties will change to reflect the selected file type. The properties applicable to each file type are listed below.

CSV

Compression = drop-down

Select the compression method if you wish to compress your data. If you do not wish to compress at all, select NONE. The default setting is AUTO.


Record Delimiter = string

Input a delimiter for records. This can be one or more single-byte or multibyte characters that separate records in the output file.

Accepted values include: leaving the field empty; a newline character \n or its hex equivalent 0x0a; a carriage return \r or its hex equivalent 0x0d. Also accepts a value of NONE.

The specified delimiter must be a valid UTF-8 character and not a random sequence of bytes.

Do not specify characters used for other file type options such as Escape or Escape Unenclosed Field.

If the field is left blank, the default record delimiter is a newline character.

This delimiter is limited to a maximum of 20 characters.

While multi-character delimiters are supported, the record delimiter cannot be a substring of the field delimiter, and vice versa. For example, if the record delimiter is "aa", the field delimiter cannot be "aabb".


Field Delimiter = string

Input a delimiter for fields. This can be one or more single-byte or multibyte characters that separate fields in the output file.

Accepted characters include common escape sequences, octal values (prefixed by \), or hex values (prefixed by 0x). Also accepts a value of NONE.

This delimiter is limited to a maximum of 20 characters.

While multi-character delimiters are supported, the field delimiter cannot be a substring of the record delimiter, and vice versa. For example, if the field delimiter is "aa", the record delimiter cannot be "aabb".

The specified delimiter must be a valid UTF-8 character and not a random sequence of bytes.

Do not specify characters used for other file type options such as Escape or Escape Unenclosed Field.

The default setting is a comma (,).
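The substring constraint on the two delimiters can be sketched as a simple validation. This is an illustration of the documented rules, not the component's implementation:

```python
def delimiters_ok(record_delim: str, field_delim: str) -> bool:
    """Check the documented constraints: each delimiter is non-empty,
    at most 20 characters, and neither is a substring of the other."""
    if not (0 < len(record_delim) <= 20 and 0 < len(field_delim) <= 20):
        return False
    return record_delim not in field_delim and field_delim not in record_delim

print(delimiters_ok("\n", ","))     # True
print(delimiters_ok("aa", "aabb"))  # False
```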


Date Format = string

Define the format of date values in the data. If a value is not specified or is AUTO, the value for the DATE_INPUT_FORMAT session parameter is used. The default setting is AUTO.


Time Format = string

Define the format of time values in the data. If a value is not specified or is AUTO, the value for the TIME_INPUT_FORMAT session parameter is used. The default setting is AUTO.


Timestamp Format = string

Define the format of timestamp values in the data. If a value is not specified or is AUTO, the value for the TIMESTAMP_INPUT_FORMAT session parameter is used. The default setting is AUTO.


Escape = string

Specify a single character to be used as the escape character for field values that are enclosed. Default is NONE.


Escape Unenclosed Field = string

Specify a single character to be used as the escape character for unenclosed field values only. Default is a backslash (\). If you have set a value in the property Field Optionally Enclosed, all fields will become enclosed, rendering the Escape Unenclosed Field property redundant, in which case it will be ignored.


Field Optionally Enclosed = string

Specify a character used to enclose strings. The value can be NONE, single quote character ', or double quote character ". To use the single quote character, use the octal or hex representation 0x27 or the double single-quoted escape ''. Default is NONE.

When a field contains one of these characters, escape the field using the same character. For example, to escape a string like this: 1 "2" 3, use double quotation to escape, like this: 1 ""2"" 3.
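The enclosure-escaping rule above can be sketched in Python. This is an illustration of the rule, not the component's implementation:

```python
def enclose_field(value: str, quote: str = '"') -> str:
    # Double any embedded quote characters, then wrap the field
    # in the enclosing character.
    return quote + value.replace(quote, quote * 2) + quote

print(enclose_field('1 "2" 3'))  # "1 ""2"" 3"
```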


Null If = editor

Specify one or more strings (one string per row in the dialog) to convert to NULL values. When one of these strings is encountered in the file, it is replaced with an SQL NULL value for that field in the unloaded data. Click + to add a string.
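Entries in this dialog correspond to Snowflake's NULL_IF file-format option. A hedged sketch of how such a clause might be rendered (the rendering details are an assumption for illustration):

```python
def null_if_clause(strings) -> str:
    # Render each entry as a single-quoted SQL string literal,
    # escaping embedded single quotes by doubling them.
    quoted = ", ".join("'" + s.replace("'", "''") + "'" for s in strings)
    return f"NULL_IF = ({quoted})"

print(null_if_clause(["NULL", ""]))
# NULL_IF = ('NULL', '')
```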

JSON

Compression = drop-down

Select the compression method if you wish to compress your data. If you do not wish to compress at all, select NONE. The default setting is AUTO.


Nest Columns = drop-down

When Yes, the table columns will be nested into a single JSON object so that the file can be configured correctly. A table with a single variant column will not require this setting to be Yes. The default setting is No.


Null If = editor

Specify one or more strings (one string per row in the dialog) to convert to NULL values. When one of these strings is encountered in the file, it is replaced with an SQL NULL value for that field in the unloaded data. Click + to add a string.


Trim Space = boolean

When Yes, removes leading and trailing whitespace from fields. Default setting is No.

PARQUET

Compression = drop-down

Select the compression method if you wish to compress your data. If you do not wish to compress at all, select NONE. The default setting is AUTO.


Null If = editor

Specify one or more strings (one string per row in the dialog) to convert to NULL values. When one of these strings is encountered in the file, it is replaced with an SQL NULL value for that field in the unloaded data. Click + to add a string.


Trim Space = boolean

When Yes, removes leading and trailing whitespace from fields. Default setting is No.


Overwrite = drop-down

When Yes, overwrite existing data (if the target file already exists) instead of generating an error. Default setting is No.


Single File = drop-down

When Yes, the unload operation will work in serial rather than parallel. This results in a slower unload but a single, complete file.

The default setting is No.

When Yes, no file extension is used in the output filename (regardless of the file type, and regardless of whether the file is compressed). When No, a filename prefix must be included in the path.


Max File Size = integer

A number, greater than 0, that specifies the upper size limit (in bytes) of each file to be generated in parallel per thread.

The actual file size and number of files unloaded are determined by the total amount of data and number of nodes available for parallel processing.


Include Headers = drop-down

When Yes, write column names as headers at the top of the unloaded files. Default is No.

