AI Summarize

Public preview

AI Summarize is a transformation component that uses the Databricks ai_summarize() function to generate a summary of a given input text with generative AI. The function calls a chat model serving endpoint that Databricks makes available through the Databricks Foundation Model APIs.

Every row of the selected input column is summarized. If the input value is NULL, the result is NULL.

Note

Make sure you have read and understood the Requirements set out by Databricks before using this component.


Properties

Name = string

A human-readable name for the component.


Columns = column editor

  • Input Column: The drop-down lists each column in the input stream. Choose the column to summarize. All input columns are available to select, but only text columns will produce meaningful summaries.
  • Alias: The name used for the output column that corresponds to this input column. Aliases must be unique.
  • Max number of words in summary: Optionally, specify the maximum number of words to allow in the summary text. If no number is specified, the default value is 50. If set to 0, there is no maximum word limit.

You can select multiple columns to summarize. You can also select the same column multiple times to generate alternative summaries, for example with different numbers of max words. In this case, the Alias allows you to differentiate the output columns.
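
As a rough illustration of how the Columns settings relate to ai_summarize(), the sketch below turns each row of the column editor into a SQL expression of the form ai_summarize(column, max_words) AS alias. It is not the SQL the component actually generates: the SummaryColumn class and the summary_expressions helper are hypothetical names introduced here, and backtick quoting of identifiers is an assumption. The default of 50 words and the meaning of 0 are taken from the property description above.

```python
from dataclasses import dataclass
from typing import List, Optional

DEFAULT_MAX_WORDS = 50  # default stated in the property description


@dataclass
class SummaryColumn:
    """One row of the column editor (hypothetical representation)."""
    input_column: str
    alias: str
    max_words: Optional[int] = None  # None means "no number specified"


def summary_expressions(columns: List[SummaryColumn]) -> List[str]:
    """Build one ai_summarize(...) AS alias expression per configured row."""
    aliases = [c.alias for c in columns]
    if len(set(aliases)) != len(aliases):
        raise ValueError("Aliases must be unique")

    expressions = []
    for c in columns:
        # An unspecified limit falls back to 50; 0 is passed through
        # unchanged and means "no maximum word limit".
        limit = DEFAULT_MAX_WORDS if c.max_words is None else c.max_words
        expressions.append(
            f"ai_summarize(`{c.input_column}`, {limit}) AS `{c.alias}`"
        )
    return expressions


# The same input column summarized three ways, told apart by alias.
example = summary_expressions([
    SummaryColumn("review_text", "summary_short", 20),
    SummaryColumn("review_text", "summary_default"),
    SummaryColumn("review_text", "summary_unlimited", 0),
])
```

Selecting the same input column several times is only workable because each row carries its own alias, which is why the sketch rejects duplicate aliases before building any expression.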


Include Input Columns = boolean

  • Yes: Outputs both your source input columns and the new summary columns. This includes input columns that are not selected in Columns.
  • No: Only outputs the new summary columns.
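
The effect of Include Input Columns can be pictured as the difference between two projections. The sketch below is an assumption about the shape of the resulting query, not the component's actual output: build_select, the table name, and the summary expression are hypothetical and used only for illustration.

```python
from typing import List


def build_select(table: str, summary_exprs: List[str],
                 include_input_columns: bool) -> str:
    """Assemble a SELECT that optionally keeps every input column."""
    # "Yes" keeps all input columns (SELECT *) alongside the summaries;
    # "No" projects only the new summary columns.
    projected = (["*"] if include_input_columns else []) + list(summary_exprs)
    return f"SELECT {', '.join(projected)} FROM {table}"


exprs = ["ai_summarize(`review_text`, 50) AS `review_summary`"]

with_inputs = build_select("reviews", exprs, include_input_columns=True)
summaries_only = build_select("reviews", exprs, include_input_columns=False)
```

With Yes, every column of the input stream passes through, whether or not it was chosen for summarizing; with No, downstream components see only the aliased summary columns.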
