Flattening nested arrays
Overview
In this article we will briefly cover how to unpack nested arrays from imported data using the Flatten Variant component. Nested arrays typically come from attempting to load externally held data in JSON form into a table.
Example
We begin by loading a JSON whose records include arrays and attempt to better represent the data in those arrays by flattening the arrays to make make new columns from the values within.
The JSON as a whole takes the form of a single array where each entry is a single person's records. If we attach a Flatten Variant component, similar to the previous example, we can flatten the array and output it as columns. First we map two values from the JSON to columns so that the output will consist solely of those two columns. There is a single row for each entry top-level entry in the JSON. Note each name is matched with a single credit amount.
In addition to the 'name' and 'balance' values, another value also exists in the JSON named 'friends'. This value 'friends' is actually an array containing a list of friends and their details. If we add this array as a single value and output it in the same way as we have done with 'name' and 'balance' above, we get an extra column in the output with this data. From the sample, we can see that this new column holds array data.
Unfortunately, this means we get an entire array's worth of data in a single value which is very difficult to read and almost impossible to analyze without further transformation. To fix this, we can flatten the 'friends' array in the same way we have flattened the original JSON. Much like our flattened JSON yielded the 'data' column as a source, flattening the 'friends' array will also create a new column. In the 'Columns Flatten' property, we select the 'data' column and choose the 'friends' property. We then Alias that second array as 'friends_flatten'.
We now have 2 columns to choose properties from; 'data' and 'friends_flatten'. Moving back to the 'Column Mapping' property, we can see that a new column is now available to pull values from. We select 'friends_flatten' and extact values for name and balance. Thus we have just created 2 new output columns, one for the names of a person's friends and then one more for each friend's balance.
We can check a sample from the Flatten Variant property to ensure the expansion has worked correctly.
But what if we have yet more nested arrays? What if our 'friends_flatten' column also contains arrays of its own?
We can solve this can repeating the above instructions; an array can be flattened so long as it is available inside a column we have previously defined. Since we defined the 'data' column from our JSON, we were able to find the 'friends' array and make our 'friends_flatten' column. Similarly, now we have 'friends_flatten' we can find the nested 'friends_of_friends' array and make yet another column, 'fof'.
Each time we define a new column through the 'Column Flattens' property, it becomes a data source that will appear in the dropdown list in the 'Column Mapping' property. We can now use the 'fof' source to add more output columns showing the names and balance held in the friends_of_friends array.
And finally, sampling this data will confirm that we are now flattening this nested array. No matter how deeply nested, an array can always be dug out and flattened by making each array its own column and using that as a data source to take the next array.
The flattened data can now be easily manipulated by other transformation components.