
Databricks Vector Search

Public preview

The Databricks Vector Search transformation component uses vector embeddings to search data in your Databricks account for the content that best answers the questions supplied in an input table.

The component takes input from a single table that contains the questions you are asking in plain text, and searches a pre-existing index of data to find the best-fit answers.

For each input row, the component outputs the best-fit answers to that row's query, returning as many answers per query as the Top K property specifies. The answers are output as a column of JSON objects.
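To illustrate the general idea of ranking candidate answers by embedding similarity and returning the top K as JSON, here is a minimal, self-contained sketch. It is not the Databricks implementation: the three-dimensional vectors, the `toy_index` contents, the cosine-similarity ranking, and the `text` and `score` field names are all assumptions made for illustration, and the actual JSON structure produced by the component may differ.

```python
import json
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_answers(query_vec, index, k):
    """Return the k index entries most similar to query_vec, as a JSON string."""
    scored = [
        {"text": entry["text"],
         "score": round(cosine_similarity(query_vec, entry["vector"]), 4)}
        for entry in index
    ]
    # Highest similarity first, then keep only the first k entries.
    scored.sort(key=lambda item: item["score"], reverse=True)
    return json.dumps(scored[:k])


# Toy 3-dimensional "embeddings" (hypothetical); real embeddings are much larger.
toy_index = [
    {"text": "Reset your password from the login page.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Invoices are emailed monthly.", "vector": [0.0, 0.2, 0.9]},
    {"text": "Use the account settings to change your email.", "vector": [0.7, 0.6, 0.1]},
]
# Hypothetical embedding of the question "How do I reset my password?"
query_vector = [1.0, 0.2, 0.0]

answers_json = top_k_answers(query_vector, toy_index, k=2)
print(answers_json)
```

With `k=2`, the sketch keeps the two entries whose vectors point in the direction closest to the query vector, which mirrors how the Top K property limits the number of answers returned per query.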

To learn more about the vector search function in Databricks, read the Databricks documentation.


Properties

Model Serving Endpoint = drop-down

Select an existing Databricks endpoint that will allow you to access the vector search index you wish to use.


Index = drop-down

Select an existing Databricks vector search index.


Query Column = drop-down

Select the column of the input that contains the questions. The component operates on a single input column only. If you have multiple question columns in the table, you'll need to perform additional transformations on your data to reduce them to a single column before querying.

The other columns in the input table (that is, those not selected here) are also retrieved and displayed in the output.
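As a rough sketch of the reshaping described above, the following Python turns a table with several question columns into one row per question with a single query column. In a pipeline this would normally be done with upstream transformation components; the code only shows the intended shape of the data. The column names `ticket_id`, `question_a`, `question_b`, and `question` are hypothetical.

```python
def to_single_question_column(rows, question_columns, target="question"):
    """Emit one output row per non-empty question, keeping the other columns."""
    reshaped = []
    for row in rows:
        # Columns that are not question columns are carried through unchanged.
        passthrough = {k: v for k, v in row.items() if k not in question_columns}
        for col in question_columns:
            text = row.get(col)
            if text:  # skip missing or empty questions
                reshaped.append({**passthrough, target: text})
    return reshaped


# Hypothetical input table with two question columns.
rows = [
    {"ticket_id": 1,
     "question_a": "How do I reset my password?",
     "question_b": "Where are my invoices?"},
    {"ticket_id": 2,
     "question_a": "Can I change my email?",
     "question_b": None},
]

single_column_rows = to_single_question_column(rows, ["question_a", "question_b"])
for row in single_column_rows:
    print(row)
```

Each resulting row carries exactly one question, so the `question` column could then be selected as the Query Column.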


Top K = string

The number of results to return from the vector database query. Enter a value in the range 1-100. For example, 5 will return the top five best-fitting answers to the query.
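Because Top K is entered as a string but must represent a whole number from 1 to 100, it can help to check the value before it reaches the component, for example when it comes from a variable. The sketch below is one way to do that; the function names `parse_top_k` and `is_valid_top_k` are hypothetical, and how the component itself handles an invalid value is not specified here.

```python
def parse_top_k(value, minimum=1, maximum=100):
    """Convert a Top K string to an int, rejecting values outside the range."""
    try:
        top_k = int(value.strip())
    except ValueError:
        raise ValueError(f"Top K must be a whole number, got {value!r}") from None
    if not minimum <= top_k <= maximum:
        raise ValueError(
            f"Top K must be between {minimum} and {maximum}, got {top_k}"
        )
    return top_k


def is_valid_top_k(value):
    """Return True if value parses as a whole number in the range 1-100."""
    try:
        parse_top_k(value)
    except ValueError:
        return False
    return True


for candidate in ["5", "100", "0", "101", "five"]:
    print(candidate, is_valid_top_k(candidate))
```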

