Skip to content

Google Sheets connector

The Matillion Connector for Google Sheets is a Snowflake Native Application that lets you load and schedule data pipelines that ingest data from Google Sheets into Snowflake. Each time your pipeline runs, your data is truncated and reloaded.

The Snowflake database that hosts the Native App is not the same database that data is written to.

Note

This app is currently in private preview, and may evolve after further development and feedback. Additionally, certain parts of the experience may take a few seconds to load.

Note

You'll need to enable the Google Sheets API.

Warning

If you wish to uninstall an instance of this app, consider first transferring ownership of the database and its objects, or cloning the database to avoid losing any objects.

Warning

Do not refresh the page while installing this app, as it may cause the installation to fail.


Event monitoring and logging

To view logs and troubleshoot errors in your app, you'll need to set up an Events table for your Snowflake account. If you wish for Matillion to help you troubleshoot, you'll also need to set up sharing. Read Enabling Logging and Event Sharing for an Application in Snowflake for more details.

  1. Set up an Events table to enable logging.
  2. Enabling logging and event sharing for an application.

Get started

  1. Log in to your Snowflake account.
  2. In the left sidebar, click Apps.
  3. In the Recently Shared with You section, you will see Matillion Google Sheets Connector Demo. Click the Get button.
  4. Open the Options menu.
    1. Specify a name for the app or keep the existing name. This app name will be used in Snowflake worksheets later in this guide and is referenced here as <app_name>.
    2. Specify a warehouse that the app will use for installation.
  5. Click Get.
  6. Click Configure.

Authentication setup

If you don't have a database set up to store the secrets for your account, you'll need to create a new database. If you already have a database, you can proceed to step 2 and create your new secret there.

  1. In a fresh Snowflake worksheet, run the following commands to begin setting up your secret.

    CREATE DATABASE <database_name>;
    CREATE SCHEMA <database_name.schema_name>;
    USE <app_name>;
    
  2. Create an OAuth in Google Cloud Platform first.

    1. Log in to the Google Cloud console.
    2. Click the menu in the upper-left.
    3. Navigate to APIs and servicesCredentials.
    4. Click CREATE CREDENTIALSOAuth client ID.
    5. Choose Web application as the application type.
    6. Name the app appropriately.
    7. Scroll down to Authorised redirect URIs and click ADD URI.
    8. Paste your redirect URI into the field. Your redirect URI should take the format https://apps-api.c1.<region>.<cloud-vendor>.app.snowflake.com/oauth/complete-secret.
    9. Click CREATE.
    10. Make note of your client ID and client secret values and include these credentials in OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET respectively.

    If you prefer to read the Google documentation directly, read OAuth client credentials.

  3. Create a Google Sheets security integration.

    CREATE OR REPLACE SECURITY INTEGRATION <security_integration_name>
        TYPE = API_AUTHENTICATION
        AUTH_TYPE = OAUTH2
        OAUTH_CLIENT_AUTH_METHOD = CLIENT_SECRET_POST
        OAUTH_CLIENT_ID = '<Client ID from Google Cloud Workspace>'
        OAUTH_CLIENT_SECRET = '<Client secret from Google Cloud Workspace>'
        OAUTH_TOKEN_ENDPOINT = 'https://oauth2.googleapis.com/token'
        OAUTH_AUTHORIZATION_ENDPOINT = 'https://accounts.google.com/o/oauth2/auth?access_type=offline&prompt=consent'
        OAUTH_ALLOWED_SCOPES = ('https://www.googleapis.com/auth/spreadsheets.readonly')
        ENABLED = TRUE;
    
  4. Link your newly created security integration with a secret object:

    CREATE OR REPLACE SECRET <database_name.schema_name.new_secret_name>
        TYPE = OAUTH2
        API_AUTHENTICATION = <security_integration_name>;
    
  5. Next, link an API endpoint to your secret using an API integration:

    CREATE OR REPLACE API INTEGRATION <api_integration_name>
        API_ALLOWED_PREFIXES = ('https://sheets.googleapis.com/v4/spreadsheets/')
        DEFAULT_AUTHENTICATION_SECRET = <database_name.schema_name.secret_name>
        ENABLED = TRUE;
    
  6. You can now perform the token requests. Run the following command:

    SELECT SYSTEM$START_OAUTH_FLOW('<database_name.schema_name.secret_name>');
    

    This will request the token.

  7. The above command will return a clickable link in the Results tab of the worksheet. Click the link and complete the OAuth2 consent screen.

    Note

    Take note of the final URL in the consent flow, because you will need to copy and paste everything from state= onwards to complete the flow.

  8. Next, run:

    SELECT SYSTEM$FINISH_OAUTH_FLOW('<Full string from state=....>');
    

    This should complete the flow, providing a token for the application to use. The Results tab should read "OAuth flow successfully completed."

    If the OAuth2 client is changed or the refresh token is revoked, these steps will need to be re-run.

  9. If you wish to view metadata about your secret, run:

    DESCRIBE SECRET <database_name.schema_name.secret_name>;
    
  10. Grant usage of the API integration to the app:

    GRANT USAGE ON INTEGRATION <api_integration_name> TO APPLICATION <app_name>;
    
  11. Connect the app to the endpoint.

    CALL <app_name>.APP_CONFIG.CREATE_GOOGLE_SHEETS_EXTERNAL_FUNCTION('api_integration_name');
    
  12. Grant the app all the required Account level permissions:

    GRANT CREATE DATABASE ON ACCOUNT TO APPLICATION <app_name>;
    GRANT CREATE WAREHOUSE ON ACCOUNT TO APPLICATION <app_name>;
    GRANT EXECUTE TASK ON ACCOUNT TO APPLICATION <app_name>;
    
  13. In a separate browser tab, return to the Google Sheets connector app and click Configure Pipeline Resources. Specify your database, schema, and warehouse, and click Create.

    Note

    The names of your pipeline resources must be unique.

  14. Return to the worksheet.

  15. Grant permissions to read future and existing tables.

    GRANT ALL PRIVILEGES ON FUTURE TABLES IN SCHEMA <database_name.schema_name> TO ROLE <role_name>;
    GRANT ALL PRIVILEGES ON SCHEMA <database_name.schema_name> TO ROLE <role_name>;
    
  16. Grant usage on database permission to the role.

    GRANT USAGE ON DATABASE <database_name> TO ROLE <role_name>;
    

Note

Steps 15 and 16 must apply to any role that you wish to be able to view the data.


Create a new pipeline

  1. Return to the Google Sheets connector app.
  2. In the upper-right, ensure a warehouse is selected.
  3. Select the Create new pipeline radio button.
  4. Complete the properties (explained below).

Properties

Spreadsheet ID = string

The ID of the primary Google Sheets object. In the example below, spreadsheetID is the value required for this field.

https://docs.google.com/spreadsheets/d/spreadsheetId/edit#gid=0


Sheet Name = string

The name of the sheet (page/tab) object within the spreadsheet.


Header Row = checkbox

When this box is checked, values in the first row of the sheet will be used as column names for your new pipeline.


Warehouse = string

The Snowflake warehouse you wish to use.


Database = string

The Snowflake database you wish to use. A database is a logical grouping of schemas.


Schema = string

The Snowflake schema to use. A schema is a logical grouping of database objects.


Table = string

A human-readable name for the table.


Pipeline Name = string

A human-readable name for the pipeline.


Complete the pipeline

Once you have completed each property, click Create. Your new pipeline will be added to the Your Pipelines menu, available when you select the Your Pipelines radio button.


Manage your pipelines

Sync now

You can override a pipeline schedule and run an immediate sync of the pipeline by clicking Sync Now. This should not affect the pipeline schedule.

Note

If two syncs clash (e.g. a scheduled sync and a manual Sync Now click), the second sync request should be ignored:

  • If a scheduled pipeline run is already active when you click Sync Now, the manual sync will be ignored.
  • If you click Sync Now very shortly before a scheduled run, the scheduled run will be ignored if the manual run is still running.

Delete a pipeline

If you wish to delete a pipeline, click Delete on the corresponding pipeline.

Schedule a pipeline

To schedule a pipeline, click the drop-down field that will either display Not Scheduled or Every X Days/Hours/Minutes.

  1. Select the unit of frequency from days, hours, or minutes.
  2. Type the increment or use the + and - buttons.
  3. Click Save to confirm the schedule frequency.