Create External Table

Create a table that references data stored in an external storage system, such as Google Cloud Storage.

For full information on working with tables on Google Cloud Platform, read REST Resource: tables.


Properties

Name = string

A human-readable name for the component.


Project = drop-down

Select the Google Cloud project. The special value, [Environment Default], will use the project defined in the environment. For more information, read Creating and managing projects.


Dataset = drop-down

Select the Google BigQuery dataset to load data into. The special value, [Environment Default], will use the dataset defined in the environment. For more information, read Introduction to datasets.


New Table Name = string

The name of the external table to be created or used.


Table Metadata = column editor

  • Column Name: The name of the new column.
  • Data Type: For more information on available BigQuery data types, read Data types.
    • String: This type can hold any kind of data, subject to a maximum size.
    • Integer: Suitable for whole-number types (no decimals).
    • Float: Suitable for numeric types, with or without decimals.
    • Numeric: Suitable for data of an exact numeric value, allowing 38 digits of precision and 9 decimal digits of scale.
    • Boolean: Suitable for data whose value is either 'true' or 'false'.
    • Date: A formatted date object without time.
    • Time: A formatted time object without date.
    • DateTime: A formatted timestamp containing both date and time that is easily readable by the user.
    • Timestamp: This type is a timestamp left unformatted (exists as Unix/Epoch time).
  • Mode: The field mode. Default is 'NULLABLE'.
    • NULLABLE: Field allows null values.
    • REQUIRED: Field does not accept null values.
    • REPEATED: Field can accept multiple values.
  • Table Metadata (Nested): When the Define Nested Metadata checkbox is ticked inside the Table Metadata property, a tree structure can be defined for metadata.
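
The same column metadata, including nested and repeated fields, can be sketched with the google-cloud-bigquery Python client. This is an illustrative sketch only, not the component's own code, and the column names are hypothetical; mode defaults to NULLABLE when omitted.

    from google.cloud import bigquery

    # Hypothetical columns for an external table.
    schema = [
        bigquery.SchemaField("id", "INTEGER", mode="REQUIRED"),
        bigquery.SchemaField("name", "STRING", mode="NULLABLE"),
        bigquery.SchemaField("created_at", "TIMESTAMP"),
        # A nested, repeated structure, as described by a Define Nested Metadata tree.
        bigquery.SchemaField(
            "addresses", "RECORD", mode="REPEATED",
            fields=[
                bigquery.SchemaField("city", "STRING"),
                bigquery.SchemaField("postcode", "STRING"),
            ],
        ),
    ]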

Create/Replace = drop-down

  • Create: Create the new table with the given name. Will fail if a table of that name already exists.
  • Create if not exists: Will create the new table with the given name unless one already exists. Will succeed and continue in either case.
  • Replace: Will create the new table, potentially overwriting any existing table of the same name.

Since other database objects may depend upon this table, the Replace option uses drop ... cascade, which may also remove those dependent objects.
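
The three options map onto the following client-library behaviour. A minimal sketch, assuming the schema list from the previous sketch; the project, dataset, and table names are hypothetical, and the external data configuration (see File Format below) would be attached to the table before it is created.

    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")        # hypothetical project
    table_id = "my-project.my_dataset.external_orders"    # hypothetical dataset.table
    table = bigquery.Table(table_id, schema=schema)

    # Create: fails if a table of that name already exists.
    client.create_table(table)

    # Create if not exists: succeeds and continues in either case.
    client.create_table(table, exists_ok=True)

    # Replace: drop any existing table of the same name, then create it anew.
    client.delete_table(table_id, not_found_ok=True)
    client.create_table(table)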


Google Storage URL Location = string

The URL of the Google Storage bucket to get the files from. This follows the format gs://bucket-name/location, where location is optional.
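
For example, a location with a wildcard in the object name can be written as below; the bucket and path are hypothetical, and the list is reused as the source URIs in the later sketches.

    # Hypothetical bucket/location; "*" matches multiple objects under the prefix.
    source_uris = ["gs://my-bucket/exports/orders_*.csv"]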


Compression = drop-down

Whether the input file is compressed in gzip format or not compressed at all.


File Format = drop-down

  • Cloud Datastore Backup
  • CSV
  • JSON (Newline delimited): requires an additional "JSON Format" property.
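
In the Python client sketch, the file format, compression, and storage location from the properties above come together in an ExternalConfig object that is attached to the table before it is created. This continues the earlier hypothetical names; format strings other than "CSV" include "NEWLINE_DELIMITED_JSON" and "DATASTORE_BACKUP".

    external_config = bigquery.ExternalConfig("CSV")  # or "NEWLINE_DELIMITED_JSON", "DATASTORE_BACKUP"
    external_config.source_uris = source_uris         # gs:// location(s) from above
    external_config.compression = "GZIP"               # or "NONE" for uncompressed input
    external_config.schema = schema                    # column metadata from the earlier sketch

    table = bigquery.Table(table_id)
    table.external_data_configuration = external_config
    client.create_table(table, exists_ok=True)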

Number of Errors Allowed = integer

The maximum number of individual parsing errors allowed before the whole load fails. Bad values within this limit are substituted with null values. Defaults to 0.
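
Continuing the sketch, this corresponds to the max_bad_records setting on the external configuration; the value shown is hypothetical.

    # Tolerate up to 10 bad records before the whole load fails.
    external_config.max_bad_records = 10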


Ignore Unknown Values = boolean

  • Yes: Accept rows that contain values that do not match the schema. Unknown values are ignored. Will ignore extra values at the end of a line for CSV files.
  • No: Omit any rows with invalid values.
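
In the client-library sketch this is a single flag on the external configuration.

    # True: accept rows whose values do not match the schema (unknown values are ignored).
    # False: such rows are rejected as invalid.
    external_config.ignore_unknown_values = True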

Delimiter = drop-down

The delimiter that separates columns. The default is a comma. A [TAB] character can be specified as \.


CSV Quoter = string

Specifies the character to be used as the quote character when using the CSV option.
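
In the Python sketch, both the delimiter and the quote character are CSV options on the external configuration; the values shown are the defaults, and in this client a tab delimiter is written as "\t".

    external_config.options.field_delimiter = ","   # use "\t" for tab-separated files
    external_config.options.quote_character = '"'   # quote character for CSV values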


Encoding = drop-down

The encoding format. This defaults to UTF-8.


Header Rows To Skip = integer

The number of rows at the top of the file to ignore. Default is 0.
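
In the sketch, encoding and header skipping are likewise CSV options; the values are hypothetical.

    external_config.options.encoding = "UTF-8"        # or "ISO-8859-1"
    external_config.options.skip_leading_rows = 1     # ignore a single header row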


Allow quoted newlines = boolean

  • Yes: Allow a CSV value to contain a newline character when the value is encased in quotation marks.
  • No: A newline character, even inside quotation marks, is always treated as the start of a new row.

Allow Jagged Rows = boolean

  • Yes: Missing values are treated as 'null' but accepted.
  • No: Rows with missing data are treated as bad records. A bad record will count toward the Maximum Errors count.
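
These two booleans map directly onto CSV options in the client-library sketch.

    external_config.options.allow_quoted_newlines = True   # newlines allowed inside quoted values
    external_config.options.allow_jagged_rows = True       # missing trailing values become nulls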
