Skip to content

Split field

The Split Field transformation component lets you split the contents of a selected column at the position of a specified delimiter, and output the separate parts to new columns. The delimiter can be any valid character or sequence of characters. For example, an input field of ABC-DEF split on the delimiter - will output the values ABC and DEF to two new columns.

This component is equivalent to using the SQL SPLIT_PART function.

This component works on columns with numeric or date data, as well as on columns of text data, with numbers and dates being split as if they were strings. For example, given a numeric column with the value 1234, splitting this column on a delimiter of 2 will produce output values of 1 and 34. Be aware of the regional date format when splitting dates, as it may give unexpected results. Sampling the input dataset to see how the date looks as a string may be helpful here. An alternative way of splitting dates would be to use a DATE_PART function in the Calculator component.

The output of this operation will always be string data, but you can subsequently use the Convert Type component to cast the data type appropriately.

Use case

This component is especially useful when dealing with semi-structured or loosely formatted data output by a source system. Some common uses of this are:

  • Splitting CSV or other delimited strings into columns. For example, splitting an address string on commas into separate columns for city, state, and ZIP code.
  • Splitting personal names into separate columns for first name and last name, by splitting the full name at the space character.
  • Parsing data that contains rows with predictable patterns, for example, log data that shows a date, time, and message.
  • Splitting a date into its component parts (day, month, year), for example, to allow trend analysis on a monthly basis.

Properties

Name = string

A human-readable name for the component.


Input Column = drop-down

The input field to split.


Delimiter = string

The delimiter to split on. The delimiter can be a string of several characters. For example, in a string of values enclosed in quotation marks and delimited by commas, you could split on ",", which would remove the quotes that would otherwise be included if you split only on commas.


Output Columns = column editor

Define the new columns that the split values will be put into.

  • Position: The index number of the split field fragment that will go into this column. This index starts at 1. For example, when splitting an input of ABC-DEF on the delimiter -, ABC is in position 1 and DEF is in position 2.
  • Output Column Name: The output column is the new column that will hold the extracted data from the specified position. The column will be a string data type, but you can use the Convert Type component later to cast the data type appropriately if required.

Include Input Column = boolean

Select whether to keep the original field in the output or not. If you have extracted all the parts of the field, you may not need to also keep the original.