
How can we handle CSV output from an API response?

I tried using a JSON call and a SOAP call, but it didn't work. Is there any way to handle CSV output from an API in ONE Desktop?

Hi @mahesh Ar,

Ataccama ONE Desktop does not have a built-in step for handling CSV APIs.

I can suggest two ways.

The first is to use the HTTP Get step in a workflow to save the result of an API call as a text file, and then process it in a plan using the Text File Reader step.

The second is to use the Web Lookup step. The Web Lookup step will obtain the CSV data and wrap it in an XML object, which you can then parse using the Splitter and Tokenizer steps. The result should look similar to this:

<?xml version='1.0' encoding='UTF-8'?>
<html>
<head>
<meta name="generator" content="HTML Tidy for Java (vers. 2009-12-01), see jtidy.sourceforge.net"/>
<title/>
</head>
<body>src_full_name;pur_full_name;pur_first_name_orig;pur_last_name_orig;std_first_name;std_last_name;std_titles_prefix;pat_full_name;score_full_name;exp_full_name;pur_full_name_DQ_status;pur_full_name_status John Smith;John Smith;John;Smith;John;Smith;;F! L!;0;;NAME_OK;OK</body>
</html>
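To make the parsing side concrete, here is a minimal Python sketch (not an ONE Desktop step) of pulling the CSV text back out of such a wrapper. The wrapped string and its contents are made up for illustration; it assumes a semicolon-delimited CSV in the `<body>` element whose line breaks have collapsed to single spaces, which only works when the fields themselves contain no spaces — the quote-based row splitting discussed further down this thread handles the general case.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical example of the envelope the Web Lookup step produces:
# the raw CSV text ends up inside the <body> tag.
wrapped = (
    "<html><head><title/></head>"
    "<body>id;name;type 1;Apple;fruit 2;Carrot;vegetable</body></html>"
)

body = ET.fromstring(wrapped).find("body").text

# Assumption: line breaks inside <body> collapsed to spaces, so a single
# space separates rows here (fine only because no field contains a space).
rows = [list(csv.reader(io.StringIO(line), delimiter=";"))[0]
        for line in body.split(" ")]
print(rows)
```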

 


I’d recommend going with the first approach if it fits your requirements, as it is more straightforward and easier to debug.


Hi @AKislyakov ,

Thanks for answering the questions above.

 

I am trying to implement the first approach and am looking for the HTTP Get step.

I don't see an HTTP Get step in ONE Desktop. Could you confirm that the step name is correct?

Thanks,

Mahesh Arigela


The step name is HTTP Get. It is a workflow step and should be available in the workflow (.ewf) context.

 


Hi ​@AKislyakov ,

 

I am running into a use case where my only option is to call an API that returns a CSV response, and we need to put it into a VCI, so I would like to avoid using a .ewf.

If we just do a pure HTTP GET, it does work. Can you explain a bit more about how to make the step convert the incoming CSV into XML?

Thanks!


Hi ​@may_kwok,

You don’t need to do anything specific; the step will automatically wrap the CSV in XML tags, as in my example. Parsing the result, though, might be a little challenging.


After a day of tinkering, I’ve got something.

 

 

Fruit.csv

  • A CSV file with some made-up data. As the business has progressed, it now includes vegetables in addition to fruits....

 

Api.plan

  • A plan that outputs the above CSV.

 

Api.online

  • An online file that allows ONE Desktop to serve this plan as an API. To start it, click this button:

 

  • Should look like this once started:

 

 

Test.wstest

  • Allows you to run a simple HTTP call. This is what it looks like after I run it, so I know what the raw response will be:

 

(You can see the header tabs for sending calls with authorisation headers, etc.)

 

Get.comp

  • The actual logic: it calls the API, splits the data into rows, splits the rows into columns, and extracts the data. This solution does not care if the metadata changes; it will simply find the columns you tell it to find. If a column is there it will be parsed; if not, it will just be null. Extra columns are ignored.
  • The Web Lookup step configuration is very simple:
     

     

  • I make the assumption that every occurrence of " " (double quote, space, double quote) marks the end of a row.

     

 

The orig output file on the bottom left looks like this:

The feed column is the entire feed repeated throughout the rows.

The split feed column is the feed split into rows.

The attr column splits the rows even further down into individual attributes.
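As a rough Python sketch of that two-level split (data and names made up), using the assumption above that every quote-space-quote sequence marks a row boundary and that semicolons separate columns:

```python
# Hypothetical feed: every CSV row is wrapped in double quotes, so the
# sequence quote-space-quote marks the boundary between rows.
feed = '"id;name;type" "1;Apple;fruit" "2;Carrot;vegetable"'

split_feed = feed.split('" "')          # rows (outer quotes still attached)

table = []
for row_num, row in enumerate(split_feed, start=1):
    for attr_pos, attr in enumerate(row.split(";"), start=1):
        table.append({"row": row_num, "attr_pos": attr_pos, "attr": attr})

# Note: the first attr of row 1 still starts with '"' and the last attr of
# the final row still ends with '"' -- the quote clean-up comes later.
print(table[0])
```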

 

  • Get col name:
    We know that row 1 of the set is the header, so we populate the value from row 1 into the same position on every row.

The Instance attributes output does not aggregate, so we populate the “best” attribute, ordered by “row” and grouped by attr_pos.

 

 

 

 

 

(For those well versed in SQL window functions, this is similar to a rank() over (partition by attr_pos order by row), then picking the value of attr that has rank 1.)
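For those who prefer code to SQL, a minimal Python sketch of that same trick on made-up data: within each attr_pos group, take the attr from the lowest row number (the header row) and copy it onto every row at that position.

```python
# Unpivoted row/position/value records (hypothetical data).
rows = [
    {"row": 1, "attr_pos": 1, "attr": "id"},
    {"row": 1, "attr_pos": 2, "attr": "name"},
    {"row": 2, "attr_pos": 1, "attr": "1"},
    {"row": 2, "attr_pos": 2, "attr": "Apple"},
]

# Equivalent of rank() over (partition by attr_pos order by row), rank 1:
# walk the records in row order and keep the first value seen per position.
col_names = {}
for r in sorted(rows, key=lambda r: r["row"]):
    col_names.setdefault(r["attr_pos"], r["attr"])

# Propagate the header value onto every row at the same position.
for r in rows:
    r["col_name"] = col_names[r["attr_pos"]]

print(col_names)   # {1: 'id', 2: 'name'}
```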

 

  • Find the last column position (so we can get rid of a legitimate trailing double quote):

Group by 1, because I want to check the entire table as a single group.

 

Get the max attribute position:

 

 

  • Get rid of the leading and trailing double quotes:

 

(If it is the first column, remove the first character; if it is the last column, remove the last character.)

 

  • Get rid of the header row by filtering out row=1.
  • Pivot it back into the shape we want:
    • Group by rd2. Rd2 is the record descriptor we used when unpivoting from rows into row/column records; we use it to get back to row granularity.

 

 

I’ve used the same config for all 3 columns, so I’m just showing 1 for brevity:

 

 

 

  • The step is configured to recognise id, name and type. The column “colour” is not required and is not mentioned in the step.

 

Out file looks like this:

 

 

Out 2 (which will be the actual data I need):

 

 

I hope this helps someone!

 


I’ve just noticed that I was over-ambitious when removing the last double quote on each row and removed 1 extra character, but hopefully this gives the idea….

 

Changing the column assigner step that fixes the quotes to the following should do the trick:

case(attr_pos='1' and left(data_col_name,1)='"', substr(data_col_name,1),
case(attr_pos=last_col_pos and right(data_col_name,1)='"', left(data_col_name,length(data_col_name)-1),data_col_name))

and

case(attr_pos='1' and left(attr,1)='"', substr(attr,1),
case(attr_pos=last_col_pos and right(attr,1)='"', left(attr,length(attr)-1),attr))
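To make the corrected logic concrete, here is a hedged Python rendering of those expressions (function name and semantics assumed: strip a leading quote on the first column, else strip a trailing quote on the last column, mirroring the nested case()):

```python
def strip_outer_quotes(value, attr_pos, last_col_pos):
    """Hypothetical equivalent of the corrected case() expressions above."""
    # First column: drop a leading double quote if present.
    if attr_pos == 1 and value.startswith('"'):
        return value[1:]
    # Otherwise, last column: drop a trailing double quote if present.
    elif attr_pos == last_col_pos and value.endswith('"'):
        return value[:-1]
    return value

print(strip_outer_quotes('"id', 1, 3))      # id
print(strip_outer_quotes('fruit"', 3, 3))   # fruit
print(strip_outer_quotes('Apple', 2, 3))    # Apple
```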

 


Thank you ​@may_kwok for the elaborate solution! ​@mahesh Ar does this answer by ​@may_kwok help? Let us know please 🙋‍♀️

