
AI Similarity

AI Similarity is a transformation component that uses the Databricks ai_similarity() function to invoke a generative AI model, which compares two strings and computes a semantic similarity score. The function uses a chat model serving endpoint made available by Databricks Foundation Model APIs.

The output is a float value representing the semantic similarity between the two input strings. The score is relative and should only be used for ranking. A score of 1 indicates that the two texts are equal.
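Because the score is relative, a natural way to consume it is to order rows by score rather than compare a score against a fixed cut-off. The following is a minimal sketch of that idea in Python. The rows and score values are made up purely for illustration and are not real output of ai_similarity().

```python
# Hypothetical rows: (base text, comparison text, similarity score).
# The score values are invented for illustration only.
scored_pairs = [
    ("wireless mouse", "bluetooth keyboard", 0.41),
    ("wireless mouse", "wireless mouse", 1.0),
    ("wireless mouse", "cordless mouse", 0.87),
]

# Rank the pairs from most to least similar instead of applying a threshold.
ranked = sorted(scored_pairs, key=lambda row: row[2], reverse=True)

for base, comparison, score in ranked:
    print(f"{score:.2f}  {base!r} vs {comparison!r}")
```

The identical pair, with a score of 1, sorts first. The other pairs are only meaningful relative to each other.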


Make sure you have read and understood the Requirements set out by Databricks before using this component.


Name = string

A human-readable name for the component.

Columns = column editor

  • Base Column: The base column.
  • Comparison Column: The column to compare against your base column.

Include Input Columns = boolean

  • Yes: Includes all of your input columns along with the new semantic similarity scores column. This also includes input columns that were not selected in Columns.
  • No: Only includes the new semantic similarity scores column.
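To make the effect of Include Input Columns concrete, here is a rough Python sketch that builds the kind of Databricks SQL statement the two settings might correspond to. The SQL the component actually generates is not documented here, so treat this as an assumption. The helper name build_similarity_query, the similarity_score alias, and the example table and column names are all hypothetical. Only ai_similarity() itself comes from Databricks.

```python
def build_similarity_query(table, base_column, comparison_column,
                           include_input_columns=True,
                           score_alias="similarity_score"):
    """Return a SELECT statement that calls ai_similarity() on two columns.

    Hypothetical helper for illustration; not the component's real SQL.
    """
    score_expr = (
        f"ai_similarity(`{base_column}`, `{comparison_column}`) "
        f"AS `{score_alias}`"
    )
    if include_input_columns:
        # Yes: keep every input column (selected or not) and add the score.
        select_list = f"*, {score_expr}"
    else:
        # No: return only the new score column.
        select_list = score_expr
    return f"SELECT {select_list} FROM {table}"


# Example usage with hypothetical table and column names.
print(build_similarity_query("products", "title", "description"))
print(build_similarity_query("products", "title", "description",
                             include_input_columns=False))
```

With Yes, every column of the input table passes through and the score column is added. With No, the result has a single column.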
