ML Classification

Public preview

Editions

Production use of this feature is available for specific editions only. Contact our sales team for more information.

The ML Classification transformation component uses the Snowflake Classification machine learning (ML) function to sort data into different classes using patterns detected in training data. The component supports both binary classification (two classes) and multi-class classification (more than two classes).

The component requires that you already have a trained classification model in your warehouse. To train one, you create a classification model object in Snowflake and pass it a reference to the training data; the model is then fitted to that data. You then use the resulting schema-level classification model object to classify new data points and to assess the model's accuracy through its evaluation APIs.
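As a rough illustration of the training step described above, the following Python sketch composes the kind of Snowflake SQL used to create a classification model and to query its evaluation metrics. The database, schema, model, table, and column names are hypothetical, and the statement shapes follow Snowflake's classification documentation as best understood here; check that documentation before running anything similar.

```python
def create_model_sql(model: str, training_table: str, target_column: str) -> str:
    """Compose a statement that creates and fits a classification model."""
    return (
        f"CREATE OR REPLACE SNOWFLAKE.ML.CLASSIFICATION {model}(\n"
        f"    INPUT_DATA => SYSTEM$REFERENCE('TABLE', '{training_table}'),\n"
        f"    TARGET_COLNAME => '{target_column}'\n"
        ");"
    )


def evaluation_sql(model: str) -> str:
    """Compose a call to one of the model's evaluation methods."""
    return f"CALL {model}!SHOW_EVALUATION_METRICS();"


# Hypothetical object names, for illustration only.
MODEL = "MY_DB.MY_SCHEMA.CHURN_MODEL"
train_stmt = create_model_sql(MODEL, "MY_DB.MY_SCHEMA.CHURN_TRAINING", "CHURNED")
eval_stmt = evaluation_sql(MODEL)

print(train_stmt)
print(eval_stmt)
```

Once a statement like the first one has been run in Snowflake, the model should be selectable in the component's Model property.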

Note

Read Snowflake's documentation for a full explanation of classification models, and for details about limitations, costs, preparation, and more.

Use case

Common use cases of classification include customer churn prediction, credit card fraud detection, and spam detection, as described in Snowflake's documentation.


Properties

Database = drop-down

The Snowflake database that the classification model resides in. The special value [Environment Default] uses the database defined in the environment. Read Databases, Tables and Views - Overview to learn more.


Schema = drop-down

The Snowflake schema that the classification model resides in. The special value [Environment Default] uses the schema defined in the environment. Read Database, Schema, and Share DDL to learn more.


Model = drop-down

The model to use. Read the Snowflake documentation to learn more about creating and using models.


On Error = drop-down

How the component handles rows that cause an error during prediction.

  • ABORT: Abort the entire prediction operation if any rows result in an error.
  • SKIP: Skip any rows that result in an error. For each skipped row, the error is shown in place of the prediction result.
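The difference between the two options can be pictured with a small Python simulation. This is only a toy model of the behaviour described above, not the component's or Snowflake's implementation: `toy_predict` is a made-up stand-in for the model, and the shape of the error entry is an assumption.

```python
def classify_rows(rows, predict, on_error="ABORT"):
    """Apply predict to each row, handling failures according to on_error."""
    outputs = []
    for row in rows:
        try:
            outputs.append(predict(row))
        except ValueError as exc:
            if on_error == "ABORT":
                # One failing row stops the whole operation.
                raise RuntimeError(f"prediction aborted: {exc}") from exc
            # SKIP: record the error in place of a result and carry on.
            outputs.append({"error": str(exc)})
    return outputs


def toy_predict(row):
    """Hypothetical stand-in for a trained model."""
    if row.get("spend") is None:
        raise ValueError("missing feature 'spend'")
    return {"class": "true" if row["spend"] > 100 else "false"}


rows = [{"spend": 250}, {"spend": None}, {"spend": 40}]
skipped = classify_rows(rows, toy_predict, on_error="SKIP")
print(skipped)
```

With SKIP, the second row carries an error entry while the other two rows are still classified; with ABORT, the same input raises an exception and no results are returned.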

Model Output Column Name = string

The name of the column where the model's output will be stored.


Class Column Name = string

The name of the column where the predicted class will appear. The component extracts the predicted class from the model output, so you don't need to parse the output yourself.

For example, if each row will be classified as either "true" or "false", this column lists the "true" or "false" value for each classified row.
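To make the relationship between the two output columns concrete, here is a minimal Python sketch of pulling the predicted class out of a model output value. It assumes the model output is a JSON object with `class`, `probability`, and `logs` keys, which is the shape Snowflake's documentation shows for classification predictions; the column names `PREDICTION` and `PREDICTED_CLASS` and the sample values are hypothetical.

```python
import json


def add_class_column(rows, output_col, class_col):
    """Return copies of rows with the predicted class pulled out of the model output."""
    result = []
    for row in rows:
        output = json.loads(row[output_col])
        # Keep the original model output and add the extracted class beside it.
        result.append({**row, class_col: output["class"]})
    return result


# Hypothetical model output values, for illustration only.
rows = [
    {"PREDICTION": '{"class": "true", "logs": null, "probability": {"false": 0.03, "true": 0.97}}'},
    {"PREDICTION": '{"class": "false", "logs": null, "probability": {"false": 0.88, "true": 0.12}}'},
]
classified = add_class_column(rows, "PREDICTION", "PREDICTED_CLASS")
print([row["PREDICTED_CLASS"] for row in classified])  # prints ['true', 'false']
```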


Include Input Columns = boolean

  • Yes: The output includes all columns from the input data, plus the columns named in Model Output Column Name and Class Column Name.
  • No: The output includes only the columns named in Model Output Column Name and Class Column Name.
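The properties above map fairly naturally onto a Snowflake prediction query. The Python sketch below composes one such query to show how the settings might fit together. It is not the SQL the component actually generates, which is not documented here; all object and column names are hypothetical, and the `PREDICT` call and `ON_ERROR` option follow Snowflake's classification documentation as best understood here.

```python
def predict_sql(source_table, model, output_col, class_col,
                on_error="ABORT", include_input=True):
    """Compose a prediction query from component-style settings."""
    if on_error not in ("ABORT", "SKIP"):
        raise ValueError("on_error must be 'ABORT' or 'SKIP'")
    config = "{'ON_ERROR': '" + on_error + "'}"
    # Inner query: run the model on every input row.
    inner = (
        f"SELECT *, {model}!PREDICT(\n"
        f"        INPUT_DATA => OBJECT_CONSTRUCT(*),\n"
        f"        CONFIG_OBJECT => {config}\n"
        f"    ) AS {output_col}\n"
        f"    FROM {source_table}"
    )
    # Outer query: keep or drop the input columns, then extract the class.
    columns = "*" if include_input else output_col
    return (
        f"SELECT {columns}, {output_col}:class::STRING AS {class_col}\n"
        f"FROM (\n    {inner}\n)"
    )


# Hypothetical names, for illustration only.
sql = predict_sql(
    source_table="CUSTOMERS",
    model="MY_DB.MY_SCHEMA.CHURN_MODEL",
    output_col="PREDICTION",
    class_col="PREDICTED_CLASS",
    on_error="SKIP",
    include_input=False,
)
print(sql)
```

Setting `include_input=True` changes the outer select list to `*`, so the input columns appear alongside the two output columns.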