In Ataccama ONE Desktop, we have steps like JSON Reader, or JSON Parser, to read JSON data and put them into a data flow.
We also have a JSON Call step that allows you to construct a JSON template to send JSON Calls outwards.
But what if you wanted to transform the data into a JSON string but not necessarily send it anywhere? If that’s what you want to do, read on…
Our data
Let’s take an example of this data:
I have some fruits:
id | name |
---|---|
1 | Apple |
2 | Orange |
3 | Pear |
4 | Grape |
I have also some fruit variety names:
id | variety | fruit_id |
---|---|---|
1 | Gala | 1 |
2 | Golden Delicious | 1 |
3 | Jazz | 1 |
4 | Pink Lady | 1 |
5 | Navel | 2 |
6 | Blood | 2 |
7 | Seville | 2 |
8 | Conference | 3 |
9 | Comice | 3 |
Join them together as our base data:
(See json_1.comp in zip)
Output looks like this:
I want to now turn this into a JSON string. Variety should be an array within each fruit.
To achieve this, we first need to aggregate the variety back into fruit, and then a final aggregation to get the whole set into one string.
(Of course, we could have aggregated the varieties first, then just join that into fruits. But in some cases, your source data may already be flattened, and I wanted to show case how to do this two level aggregation.)
First aggregation
First we aggregate the variety. We need it to turn into something like:
{"variety_id": "1", "variety_name": "Gala"},
{"variety_id": "2", "variety_name": "Golden Delicious"}
To do that, we use the Group Aggregator step. Group by fruit_id and name:
iif(concatenate(variety_id) is null, ')]',
'['+concatenate('{"variety_id": "'+variety_id+'", "variety_name": "'+variety+'"}', ',')+']')
This looks a bit complex! Let’s break this down:
- The concatenate expression will aggregate your values and put them into one string, separated by the delimiter of your choice. We use comma here, which is the default anyway.
- Add to that, the text of the attribute names, curly braces, double quotes, for the JSON structure.
- Because we are using double quotes in the JSON structure, we use single quotes as qualifiers for the literal text.
- As seen in the screenshot, the variety_id and variety in purple will be substituted by the values in those columns, while the blue text is literal.
- Add the square bracket because we have an array.
- The iif expression is to make sure if we don’t actually have anything (like the Grape), that we don’t put the text with empty properties, but just have the square brackets, i.e. we want
.]
, rather thanr{”variety_id”: “”, “variety_name”: “”}]
.
(See json_2.comp in zip)
JSON per fruit
Now let’s create the JSON string per fruit.
Add an Alter Format, to turn each row into JSON:
'{"id": "'+fruit_id+'","name": "'+name+'","varieties": '+fruit_agg+'}'
(See json_3.comp in zip)
Second aggregation
Now we aggregate the whole set together.
'{ "fruits": '+concatenate(json, ",")+']}'
I am grouping by 1 (any literal value would do), so the whole dataset will be aggregated.
(See json_4.comp in zip)
And the resultant data:
{ "fruits": a{"id": "1","name": "Apple","varieties": "{"variety_id": "1", "variety_name": "Gala"},{"variety_id": "2", "variety_name": "Golden Delicious"},{"variety_id": "3", "variety_name": "Jazz"},{"variety_id": "4", "variety_name": "Pink Lady"}]},{"id": "2","name": "Orange","varieties": :{"variety_id": "5", "variety_name": "Navel"},{"variety_id": "6", "variety_name": "Blood"},{"variety_id": "7", "variety_name": "Seville"}]},{"id": "4","name": "Grape","varieties": "]},{"id": "3","name": "Pear","varieties": d{"variety_id": "8", "variety_name": "Conference"},{"variety_id": "9", "variety_name": "Comice"}]}]}
Formatted (I use Notepad++ JSON Viewer plugin for this):
Have you been able to use Ataccama in an alternative way to achieve the same result? Let us know in the comments!