AI Classify

Public preview

Editions

Production use of this feature is available for specific editions only. Contact our sales team for more information.

The AI Classify transformation component uses the Databricks ai_classify() function to classify input text with generative AI, according to labels you provide. This function uses a chat model serving endpoint made available by Databricks Foundation Model APIs.

A label is a brief, descriptive string, such as "urgent" or "not urgent", that you provide when you configure the component. The component's input is a column of text data, and its output is a new column named classification_<input-column-name> that contains the most appropriate label from the list you provided.
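As a rough illustration of the input and output described above, the following sketch builds a Databricks SQL query that calls ai_classify() on one column and names the result classification_<input-column-name>. The helper names, the table name support_tickets, and the column name message are hypothetical, and the SQL that the component actually generates may differ.

```python
def quote_label(label: str) -> str:
    """Render a label as a SQL string literal using backslash escapes."""
    escaped = label.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_classify_query(table: str, column: str, labels: list[str]) -> str:
    """Build a SELECT that classifies `column` with ai_classify().

    Hypothetical helper: `table` and `column` are assumed to be trusted
    identifiers, not user input.
    """
    label_array = "ARRAY(" + ", ".join(quote_label(label) for label in labels) + ")"
    output_column = f"classification_{column}"
    return (
        f"SELECT ai_classify(`{column}`, {label_array}) AS `{output_column}` "
        f"FROM {table}"
    )


# Example with made-up table and column names.
query = build_classify_query("support_tickets", "message", ["urgent", "not urgent"])
print(query)
```

Running the resulting query requires a Databricks environment where ai_classify() is available; the sketch only shows how the pieces fit together.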

Note

Make sure you have read and understood the Requirements set out by Databricks before using this component.

Use case

AI Classify gives you a flexible, intelligent way to automate categorization tasks for unstructured data such as free text, descriptions, emails, or support tickets. Typical uses include:

  • Automatically assign incoming support tickets to categories like "Billing", "Technical Issue", or "Account Access", so that each ticket can be routed appropriately.
  • Categorize customer feedback such as survey comments or NPS responses into themes like "Price", "Features", "Usability", or "Customer Service", to enable structured insights from large volumes of qualitative feedback.

Properties

Name = string

A human-readable name for the component.


Column = drop-down

Select an input column that contains the text to be classified.

Note

The component will operate on only a single column of the input.


Classification Labels = column editor

Enter the list of labels that will be used to classify the input text. You must provide at least two labels and no more than 20.

Enter one label per row in the Classification Labels dialog. Click + to add a new row.

You can choose to populate the list of labels dynamically using a grid variable. Tick the Use Grid Variable checkbox at the bottom of the Classification Labels dialog.
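If you build the label list programmatically, for example before loading it into a grid variable, it can help to check the label count first. The following is a minimal sketch of that check, assuming only the limits stated above (at least two labels, no more than 20). The function name validate_labels is hypothetical and is not part of the component.

```python
MIN_LABELS = 2
MAX_LABELS = 20


def validate_labels(labels: list[str]) -> list[str]:
    """Return the labels unchanged if the count is within the allowed range."""
    if not MIN_LABELS <= len(labels) <= MAX_LABELS:
        raise ValueError(
            f"Expected between {MIN_LABELS} and {MAX_LABELS} labels, "
            f"got {len(labels)}"
        )
    return labels


# Three labels is within the allowed range, so the list is returned as-is.
labels = validate_labels(["Billing", "Technical Issue", "Account Access"])
```

A list with a single label, or with more than 20, raises ValueError instead of being passed on.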


Include Input Columns = boolean

  • Yes: Outputs all of your source input columns, including those not selected in Column, along with the classification label column.
  • No: Outputs only the classification label column.
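The effect of this setting on the output can be sketched as follows. This is an illustration only: the function name output_columns and the sample column names are hypothetical, and the column order shown (input columns first, classification column last) is an assumption.

```python
def output_columns(
    input_columns: list[str], column: str, include_input_columns: bool
) -> list[str]:
    """List the columns expected in the output for a given setting."""
    classification = f"classification_{column}"
    if include_input_columns:
        # Yes: every input column is kept, selected or not.
        return [*input_columns, classification]
    # No: only the new classification column is returned.
    return [classification]


source = ["ticket_id", "message", "created_at"]
with_inputs = output_columns(source, "message", include_input_columns=True)
without_inputs = output_columns(source, "message", include_input_columns=False)
```

Here with_inputs holds the three source columns followed by classification_message, while without_inputs holds classification_message alone.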