Pinecone Vector Query

Use the Pinecone Vector Query component to ingest text search strings and return the text data associated with similar vectors in your Pinecone vector database. The search strings are read from a table in your cloud data warehouse.


Properties

Name = string

A human-readable name for the component.


Database = drop-down

The Snowflake database. The special value, [Environment Default], will use the database defined in the environment. Read Databases, Tables and Views - Overview to learn more.


Schema = drop-down

The Snowflake schema. The special value, [Environment Default], will use the schema defined in the environment. Read Database, Schema, and Share DDL to learn more.


Table = string

The Snowflake table that holds your source data.


Key Column = drop-down

Set a column as the primary key.


Query Column = drop-down

Set the column from which to load query search strings. The Pinecone Vector Query component will run these query search strings against your Pinecone index and retrieve the raw data that most closely matches each query search string.


Limit = integer

Set a limit for the number of rows to load from the table. The default is 1000.


Embedding Provider = drop-down

The embedding provider is the API service used to convert the search term into a vector. Choose either OpenAI or Amazon Bedrock. The embedding provider receives a search term (e.g. "How do I log in?") and returns a vector.

Choose your provider. The OpenAI properties are listed first, followed by the Amazon Bedrock properties.

OpenAI API Key = drop-down

Use the drop-down menu to select the corresponding secret definition that denotes the value of your OpenAI API key.

Read Secret definitions to learn how to create a new secret definition.

To create a new OpenAI API key:

  1. Log in to OpenAI.
  2. Click your avatar in the top-right of the UI.
  3. Click View API keys.
  4. Click + Create new secret key.
  5. Give a name for your new secret key and click Create secret key.
  6. Copy your new secret key and save it. Then click Done.

Embedding Model = drop-down

Select an embedding model.

Currently supports:

  • text-embedding-ada-002
  • text-embedding-3-small
  • text-embedding-3-large

Embedding AWS Region = drop-down

Select your AWS region.


Embedding Model = drop-down

Select an embedding model.

Currently supports:


API Batch Size = integer

Set the number of rows sent in each API call. The default is 10. With a batch size of 10, 1000 rows would therefore require 100 API calls.

Consider reducing this number if each row contains a high volume of data, and increasing it if each row contains a low volume of data.
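
The batch-size arithmetic above can be sketched in a few lines of Python. This is only an illustration of how rows might be grouped per call; the function names `batches` and `api_call_count` are hypothetical and are not part of the component.

```python
import math
from typing import Iterator, List


def batches(rows: List[str], batch_size: int = 10) -> Iterator[List[str]]:
    """Yield consecutive slices of `rows`, each holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def api_call_count(row_count: int, batch_size: int = 10) -> int:
    """Number of calls needed to cover `row_count` rows (the last batch may be partial)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return math.ceil(row_count / batch_size)


# Hypothetical search strings standing in for the rows loaded from the table.
rows = [f"search string {i}" for i in range(1000)]

print(api_call_count(len(rows), 10))   # 100
print(len(list(batches(rows, 25))))    # 40
```

A larger batch size means fewer calls but a larger payload in each call, which is why rows with a lot of text may call for a smaller value.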


Pinecone API Key = drop-down

Use the drop-down menu to select the corresponding secret definition that denotes the value of your Pinecone API key.

Read Secret definitions to learn how to create a new secret definition.


Pinecone Index Name = drop-down

The name of the Pinecone vector search index to connect to. The list is generated once you provide a valid Pinecone API key.


Pinecone Namespace = string

The name of the Pinecone namespace. Pinecone lets you partition records in an index into namespaces. To retrieve a namespace name:

  1. Log in to Pinecone.
  2. Click PROJECTS in the left sidebar.
  3. Click a project tile. This action will open the list of vector search indexes in your project.
  4. Click on your vector search index tile.
  5. Click the NAMESPACES tab. Your namespaces will be listed.

Top K = integer

The number of results to return from the vector database query for each search string. Must be between 1 and 100. The default is 3.
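
To make the meaning of Top K concrete, here is a toy sketch that ranks a handful of stored vectors against a query vector and keeps the K best matches. It assumes cosine similarity and a brute-force scan purely for illustration; a real Pinecone index uses whichever metric it was configured with and its own search machinery. The record IDs and vectors are invented.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: List[float], index: Dict[str, List[float]], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k (record_id, score) pairs most similar to `query`, best first."""
    if not 1 <= k <= 100:
        raise ValueError("k must be between 1 and 100")
    scored = [(record_id, cosine_similarity(query, vector)) for record_id, vector in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Invented three-dimensional vectors; real embeddings have hundreds or thousands of dimensions.
toy_index = {
    "doc-login": [0.9, 0.1, 0.0],
    "doc-billing": [0.1, 0.9, 0.0],
    "doc-password": [0.8, 0.2, 0.1],
    "doc-shipping": [0.0, 0.1, 0.9],
}
query_vector = [1.0, 0.0, 0.0]

for record_id, score in top_k(query_vector, toy_index, k=2):
    print(record_id, round(score, 3))
```

With K set to 2, only the two closest records are returned for the query, so each search string in the source table would yield at most two result rows.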


Data Lookup Strategy = drop-down

Select the data lookup strategy. Pinecone stores only the vector associated with your text data, plus a JSON metadata blob. While the text data can be stored in the metadata blob, size limitations can affect coverage, for example when a large block of text is converted to a vector.

  • Raw data in metadata: Choosing this option adds an additional property, Data Path, to provide the path to the text data within the metadata JSON blob.
  • Table details in metadata: Database, schema, table, key column, key, and data column key:value pairs in the metadata are used to look up the text data in your warehouse table. If you upserted data using the Pinecone Vector Upsert component, the default path values below match the metadata that component sets automatically.

To view the metadata of your upserted records:

  1. Log in to Pinecone.
  2. Click PROJECTS in the left sidebar.
  3. Click a project tile. This action will open the list of vector search indexes in your project.
  4. Click on your vector search index tile.
  5. While in the BROWSER tab, observe the metadata for a relevant record.

Database Path = string

The value of your database path. By default this is mtln_database.


Schema Path = string

The value of your schema path. By default this is mtln_schema.


Table Path = string

The value of your table path. By default this is mtln_table.


Key Column Path = string

The value of your key column path. By default this is mtln_keyColumn.


Key Path = string

The value of your key path. By default this is mtln_key.


Data Column Path = string

The value of your data column path. By default this is mtln_dataColumn.
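
As a rough illustration of the Table details in metadata strategy, the sketch below reads the six default metadata keys from a record and assembles a lookup statement for the warehouse table. How the component performs this lookup internally is not documented here, so treat this as an assumption; `build_lookup`, `quote_ident`, the `%s` placeholder style, and the sample metadata values are all hypothetical.

```python
from typing import Any, Dict, Tuple

# Default metadata keys as listed in the path properties above.
DEFAULT_PATHS = {
    "database": "mtln_database",
    "schema": "mtln_schema",
    "table": "mtln_table",
    "key_column": "mtln_keyColumn",
    "key": "mtln_key",
    "data_column": "mtln_dataColumn",
}


def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_lookup(metadata: Dict[str, Any],
                 paths: Dict[str, str] = DEFAULT_PATHS) -> Tuple[str, Tuple[Any, ...]]:
    """Return (sql, params) that select the text for one matched record."""
    values = {field: metadata[path] for field, path in paths.items()}
    table = ".".join(
        quote_ident(values[part]) for part in ("database", "schema", "table")
    )
    sql = (
        f"SELECT {quote_ident(values['data_column'])} "
        f"FROM {table} "
        f"WHERE {quote_ident(values['key_column'])} = %s"
    )
    # The key is passed as a bound parameter rather than spliced into the SQL text.
    return sql, (values["key"],)


# Invented metadata, shaped like a record that carries the default keys.
sample_metadata = {
    "mtln_database": "DOCS_DB",
    "mtln_schema": "PUBLIC",
    "mtln_table": "ARTICLES",
    "mtln_keyColumn": "ID",
    "mtln_key": "42",
    "mtln_dataColumn": "BODY",
}

sql, params = build_lookup(sample_metadata)
print(sql)
print(params)
```

If your records use different metadata keys, the path properties play the role of the `paths` argument in this sketch: each one names the metadata key that holds the corresponding value.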


Data Path = string

Set the path to the data in the metadata JSON blob.

Default is data.
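
For the Raw data in metadata strategy, the lookup amounts to reading one value out of the record's metadata. The sketch below shows that idea with the default path, data. Support for nested, dot-separated paths is an assumption made for illustration only, and `read_data_path` and the sample metadata are hypothetical.

```python
from typing import Any, Dict


def read_data_path(metadata: Dict[str, Any], data_path: str = "data") -> Any:
    """Follow `data_path` through the metadata and return the value found there.

    Assumption: a dot in the path descends into a nested object.
    """
    current: Any = metadata
    for part in data_path.split("."):
        if not isinstance(current, dict) or part not in current:
            raise KeyError(f"path {data_path!r} not found in metadata")
        current = current[part]
    return current


# Invented metadata blob holding the original text under the default key.
sample_metadata = {"data": "To log in, open the sign-in page and enter your email.", "source": "faq"}

print(read_data_path(sample_metadata))
```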


Database = drop-down

The Snowflake destination database. The special value, [Environment Default], will use the database defined in the environment. Read Databases, Tables and Views - Overview to learn more.


Schema = drop-down

The Snowflake destination schema. The special value, [Environment Default], will use the schema defined in the environment. Read Database, Schema, and Share DDL to learn more.


Table Name = string

The Snowflake table to load the query results into. A new table is created if one does not exist. Otherwise, any existing table of the same name is replaced.

