Vector Search

Editions

Production use of this feature is available for specific editions only. Contact our sales team for more information.

Project Availability:

Snowflake ✅ Databricks ❌ Amazon Redshift ❌

Component Type:

Transformation

Connection Inputs:

Two

Connection Outputs:

Unlimited

The Vector Search transformation component performs a search on an input table to find content that best answers specific questions, using vector embeddings to identify suitable answers within the input data.

The component requires two input tables: an Index Table that contains the data you are querying, and a Query Table that contains the questions you are asking, one question per row. Each table must have at least one column of embeddings, which the component uses to perform the similarity search. These embeddings must be Cortex vector embeddings. You can convert English-language text into Cortex embeddings using the Cortex Embed component, and then use the output from that component as an input to Vector Search.

The component outputs the best-fitting answers for each query, returning as many answers per query as the Top K property specifies. Answers are output as plain-language text, either as a multi-column table or as a column of JSON objects.

To learn more about Cortex vector embeddings, read Vector Embeddings.
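
The search described above can be illustrated with a minimal Python sketch. This is not the component's implementation: the row layout, the field names (`id`, `question`, `text`, `embedding`), the `vector_search` helper, and the tiny three-dimensional embeddings are all hypothetical, chosen only to show how each query row is compared against every index row and the best-scoring rows are kept.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def vector_search(index_rows, query_rows, top_k=5):
    """For each query row, return the top_k most similar index rows."""
    results = []
    for query in query_rows:
        # Score every index row against this query's embedding.
        scored = [
            (cosine_similarity(query["embedding"], row["embedding"]), row)
            for row in index_rows
        ]
        # Higher cosine similarity means a closer match.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        for score, row in scored[:top_k]:
            results.append(
                {
                    "query_id": query["id"],
                    "question": query["question"],
                    "answer": row["text"],
                    "score": score,
                }
            )
    return results


# Hypothetical Index Table: plain text plus a made-up embedding per row.
index_rows = [
    {"text": "Reset your password from the login page.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Invoices are emailed monthly.", "embedding": [0.0, 0.2, 0.9]},
    {"text": "Contact support to change your email.", "embedding": [0.7, 0.6, 0.1]},
]

# Hypothetical Query Table: one question per row, with a key column.
query_rows = [
    {"id": 1, "question": "How do I reset my password?", "embedding": [1.0, 0.0, 0.0]},
]

matches = vector_search(index_rows, query_rows, top_k=2)
for match in matches:
    print(match["query_id"], round(match["score"], 3), match["answer"])
```

In this sketch the password entry ranks first and the support entry second, because their made-up embeddings point in a direction closest to the query embedding. Real Cortex embeddings have many more dimensions, but the ranking idea is the same.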

Use case

Typical use cases for a vector search include the following:

Performing a semantic text search to return the most contextually relevant documents, even if they don't share exact keywords.
Personalizing content retrieval by matching users to relevant content based on their interests or behavior embeddings.
Powering support systems by finding the closest pre-written response or FAQ entry for a customer's question.

Properties

Name = string

A human-readable name for the component.

Index Table = drop-down

The table that contains the data to be searched.

Query Table = drop-down

The table that contains the questions you want to have answered.

Similarity Function = drop-down

A Snowflake Cortex vector similarity function measures the similarity between vectors. Choose which of the three supported functions the search will use:

Cosine Similarity
L2 Distance
Inner Product

For descriptions of these functions, read About vector similarity functions.
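
As a rough illustration of how the three measures differ, the following Python sketch computes each one from its standard mathematical definition. The helper names and sample vectors are illustrative assumptions, not the Snowflake functions themselves.

```python
import math


def inner_product(a, b):
    """Sum of element-wise products; sensitive to vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Inner product of the vectors after scaling each to unit length."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


def l2_distance(a, b):
    """Euclidean (straight-line) distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
same_direction = [2.0, 0.0]  # same direction as the query, twice the length
orthogonal = [0.0, 1.0]      # at a right angle to the query

for name, vec in [("same_direction", same_direction), ("orthogonal", orthogonal)]:
    print(
        name,
        cosine_similarity(query, vec),
        inner_product(query, vec),
        l2_distance(query, vec),
    )
```

Note the difference in direction: for cosine similarity and inner product, a larger value means a closer match, whereas for L2 distance a smaller value means a closer match. Cosine similarity ignores vector length, so `same_direction` scores 1.0 even though it is twice as long as the query.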

Output Mode = drop-down

You can choose to output the results of the search as table Columns or JSON objects.
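
The difference between the two modes can be pictured with a short Python sketch. The field names (`query_id`, `answer`, `score`) and the exact layout are assumptions made for illustration; this page does not specify the column names or JSON structure the component actually produces.

```python
import json

# Hypothetical search results for a single query.
matches = [
    {"query_id": 1, "answer": "Reset your password from the login page.", "score": 0.99},
    {"query_id": 1, "answer": "Contact support to change your email.", "score": 0.75},
]

# Columns-style output: each field becomes its own table column.
column_names = ("query_id", "answer", "score")
column_rows = [tuple(m[name] for name in column_names) for m in matches]

# JSON-style output: each result is packed into a single JSON value.
json_rows = [json.dumps(m) for m in matches]

print(column_rows[0])
print(json_rows[0])
```

Column output is usually easier to feed into downstream transformation components, while JSON output keeps each result together as a single value.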

Index Table Embedding Column = drop-down

Select the column of the Index Table that contains the embeddings you want to query. The component operates on a single input column only. If you have multiple embedding columns in the table, you'll need to perform additional transformations on your data to reduce them to a single column before querying.

Columns other than the embedding column selected here are also retrieved from the index table and included in the output.

Query Table Embedding Column = drop-down

Select the column of the Query Table that contains the question embeddings. The component operates on a single input column only. If you have multiple embedding columns in the table, you'll need to perform additional transformations on your data to reduce them to a single column before querying.

Columns other than the embedding column selected here are also retrieved from the query table and included in the output.

Query Table Key Column = drop-down

Select the column that functions as the query table's primary key.

Top K = string

The number of results to return for each query. Enter a value from 1 to 100. The default is 5, which returns the five best-fitting answers to each query.
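
Because this property is entered as text, a pipeline that sets it programmatically may want to check the value first. The following Python sketch shows one way to do that; the `parse_top_k` helper is hypothetical, and falling back to the default for an empty value is an assumption rather than documented component behavior.

```python
DEFAULT_TOP_K = 5


def parse_top_k(raw):
    """Convert Top K property text to an int in the documented 1-100 range."""
    # Assumption: an empty or missing value falls back to the default of 5.
    if raw is None or raw.strip() == "":
        return DEFAULT_TOP_K
    value = int(raw)  # raises ValueError for non-numeric text
    if not 1 <= value <= 100:
        raise ValueError(f"Top K must be between 1 and 100, got {value}")
    return value


print(parse_top_k(""))
print(parse_top_k("10"))
```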