Vector Search

Public preview

The Vector Search transformation component performs a search on an input table to find content that best answers specific questions, using vector embeddings to identify suitable answers within the input data.

The component requires two input tables: an Index Table that contains the data you are querying, and a Query Table that contains the questions you are asking, one question per row. Each table must have at least one column that contains an embedding, which is used to perform the similarity search. The embeddings in these tables must be Cortex vector embeddings. You can convert English-language text into Cortex embeddings using the Cortex Embed component, and use the output from that component as an input to Vector Search.

The component outputs the best-fit answers to each query input, with as many answers per query as specified in the Top K property. Each answer is output as plain-language text, either as a multi-column table or as a column of JSON objects.

To learn more about Cortex vector embeddings, read Vector Embeddings.


Properties

Name = string

A human-readable name for the component.


Index Table = drop-down

The table that contains the data to be searched.


Query Table = drop-down

The table that contains the questions you want to have answered.


Similarity Function = drop-down

Similarity between vectors is measured by a Snowflake Cortex vector similarity function. Choose which of the three supported functions the search will use:

  • Cosine Similarity
  • L2 Distance
  • Inner Product

For descriptions of these functions, read About vector similarity functions.
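
As a rough illustration of what the three measures compute, the sketch below implements them in plain Python. It is not the Snowflake implementation, and the function names are chosen for this example only. Note that with L2 Distance a smaller value means the vectors are closer, whereas with Cosine Similarity and Inner Product a larger value means the vectors are closer.

```python
import math

# Illustrative helpers only; names are hypothetical, not Snowflake's API.

def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Inner product divided by the product of the vector lengths."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

def l2_distance(a, b):
    """Euclidean (straight-line) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Cosine Similarity ignores vector length and compares direction only, while Inner Product and L2 Distance are both affected by length.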


Output Mode = drop-down

You can choose to output the results of the search as table Columns or JSON objects.
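
The exact output schema is not described on this page. The sketch below only illustrates the general difference between the two shapes: the same result spread across several columns, or packed into a single JSON value. All field names and values are hypothetical.

```python
import json

# Hypothetical single search result; field names are assumptions for illustration.
match = {"query_id": "q1", "answer": "doc A", "score": 0.995}

# Columns-style shape: one value per named column.
columns_row = (match["query_id"], match["answer"], match["score"])

# JSON-style shape: the whole result serialized into one column value.
json_row = (json.dumps(match),)
```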


Index Table Embedding Column = drop-down

Select the column of the Index Table that contains the embeddings you want to query. The component operates on a single input column only. If you have multiple embedding columns in the table, you'll need to perform additional transformations on your data to reduce them to a single column before querying.

Other columns in the index table (that is, columns other than the embedding column selected here) are also retrieved and included in the output.


Query Table Embedding Column = drop-down

Select the column of the Query Table that contains the question embeddings. The component operates on a single input column only. If you have multiple embedding columns in the table, you'll need to perform additional transformations on your data to reduce them to a single column before querying.

Other columns in the query table (that is, columns other than the embedding column selected here) are also retrieved and included in the output.


Query Table Key Column = drop-down

Select the column that functions as the query table's primary key.


Top K = string

The number of results to return from the vector database query, between 1 and 100. The default is 5, which returns the five best-fitting answers to the query.
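
To make the effect of Top K concrete, here is a minimal sketch of a top-K search in plain Python, assuming cosine similarity as the ranking measure. It is not how the component is implemented. The row keys (`id`, `text`, `embedding`), the function name `top_k_matches`, and the sample data are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_matches(query_rows, index_rows, k=5):
    """For each query row, return the k index rows with the highest similarity."""
    results = {}
    for query in query_rows:
        scored = [
            (cosine_similarity(query["embedding"], row["embedding"]), row["text"])
            for row in index_rows
        ]
        # Higher cosine similarity means a better match, so sort descending.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        results[query["id"]] = scored[:k]
    return results

# Hypothetical two-dimensional embeddings, kept tiny for readability.
index_rows = [
    {"text": "doc A", "embedding": [1.0, 0.0]},
    {"text": "doc B", "embedding": [0.0, 1.0]},
    {"text": "doc C", "embedding": [1.0, 1.0]},
]
query_rows = [{"id": "q1", "embedding": [1.0, 0.1]}]

matches = top_k_matches(query_rows, index_rows, k=2)
```

With `k=2`, each query keeps only its two highest-scoring index rows. If the ranking used a distance measure such as L2 Distance instead, the sort would need to be ascending, because smaller distances indicate better matches.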

