
How can we handle CSV output from an API response?

I tried using a JSON call and a SOAP call, but it didn't work. Is there any way to handle CSV output from an API in ONE Desktop?

Hi @mahesh Ar,

Ataccama ONE Desktop does not have a built-in step for handling CSV APIs.

I can suggest two ways.

The first is to use the HTTP Get step in a workflow to save the result of an API call as a text file, and then process it in a plan using the Text File Reader step.

The second is to use the Web Lookup step. The Web Lookup step will obtain the CSV data and wrap it in an XML object, which you can then parse using the Splitter and Tokenizer steps. The result should look similar to this:

<?xml version='1.0' encoding='UTF-8'?>
<html>
<head>
<meta name="generator" content="HTML Tidy for Java (vers. 2009-12-01), see jtidy.sourceforge.net"/>
<title/>
</head>
<body>src_full_name;pur_full_name;pur_first_name_orig;pur_last_name_orig;std_first_name;std_last_name;std_titles_prefix;pat_full_name;score_full_name;exp_full_name;pur_full_name_DQ_status;pur_full_name_status John Smith;John Smith;John;Smith;John;Smith;;F! L!;0;;NAME_OK;OK</body>
</html>
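To make the parsing side concrete, here is a minimal Python sketch (not an ONE Desktop step) of pulling the CSV text back out of such a wrapper. The wrapped string and its contents are made up for illustration; it assumes a semicolon-delimited CSV in the `<body>` element whose line breaks have collapsed to single spaces, which only works when the fields themselves contain no spaces — the quote-based row splitting discussed further down this thread handles the general case.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical example of the envelope the Web Lookup step produces:
# the raw CSV text ends up inside the <body> tag.
wrapped = (
    "<html><head><title/></head>"
    "<body>id;name;type 1;Apple;fruit 2;Carrot;vegetable</body></html>"
)

body = ET.fromstring(wrapped).find("body").text

# Assumption: line breaks inside <body> collapsed to spaces, so a single
# space separates rows here (fine only because no field contains a space).
rows = [list(csv.reader(io.StringIO(line), delimiter=";"))[0]
        for line in body.split(" ")]
print(rows)
```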

 


I’d recommend going with the first approach if it fits your requirements, as it is more straightforward and easier to debug.


Hi @AKislyakov ,

Thanks for answering the questions above.

 

I am trying to implement the first approach and am looking for the HTTP Get step.

I don't see an HTTP Get step in ONE Desktop. Could you confirm that the step name is correct?

Thanks,

Mahesh Arigela


The step name is HTTP Get. It is a workflow step and should be available in the workflow (.ewf) context.

 


Hi ​@AKislyakov ,

 

I am running into a use case where my only option is to call an API that returns a CSV response, and we need to put it into a VCI, so I would like to avoid using a .ewf.

If we just do a pure HTTP GET, it does work. Can you explain a bit more about how to make the step convert the incoming CSV into XML?

Thanks!


Hi ​@may_kwok,

You don’t need to do anything specific; the step will automatically wrap the CSV in XML tags, as in my example. Parsing the result, though, might be a little challenging.


After a day of tinkering, I’ve got something.

 

 

Fruit.csv

  • A CSV file with some made-up data. As the business has progressed, it now includes vegetables in addition to fruits....

 

Api.plan

  • A plan that outputs the above CSV.

 

Api.online

  • An online file that allows ONE Desktop to serve this plan as an API. To start it, click this button:

 

  • Should look like this once started:

 

 

Test.wstest

  • Allows you to run a simple HTTP call. This is what it looks like after I run it, so I know what the raw response will be:

 

(You can see the header tabs for sending calls with authorisation headers, etc.)

 

Get.comp

  • The actual logic: it calls the API, splits the data into rows, splits the rows into columns, and extracts the data. This solution does not care if the metadata changes; it will simply find the columns you tell it to find. If a column is there it will be parsed; if not, it will just be null. Extra columns are ignored.
  • The Web Lookup step configuration is very simple:
     

     

  • I make the assumption that every occurrence of " " (double quote, space, double quote) marks the end of a row.

     

 

The orig output file on the bottom left looks like this:

The feed column is the entire feed repeated throughout the rows.

The split feed column is the feed split into rows.

The attr column splits the rows even further down into individual attributes.
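As a rough Python sketch of that two-level split (data and names made up), using the assumption above that every quote-space-quote sequence marks a row boundary and that semicolons separate columns:

```python
# Hypothetical feed: every CSV row is wrapped in double quotes, so the
# sequence quote-space-quote marks the boundary between rows.
feed = '"id;name;type" "1;Apple;fruit" "2;Carrot;vegetable"'

split_feed = feed.split('" "')          # rows (outer quotes still attached)

table = []
for row_num, row in enumerate(split_feed, start=1):
    for attr_pos, attr in enumerate(row.split(";"), start=1):
        table.append({"row": row_num, "attr_pos": attr_pos, "attr": attr})

# Note: the first attr of row 1 still starts with '"' and the last attr of
# the final row still ends with '"' -- the quote clean-up comes later.
print(table[0])
```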

 

  • Get col name:
    We know that row 1 of the set is the header, so we populate the value from row 1 into the same position on every row.

The Instance attributes output does not aggregate, so we populate the “best” attribute, ordered by “row” and grouped by attr_pos.

 

 

 

 

 

(For those well versed in SQL window functions, this is similar to a rank() over (partition by attr_pos order by row), then picking the value of attr that has rank 1.)
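For those who prefer code to SQL, a minimal Python sketch of that same trick on made-up data: within each attr_pos group, take the attr from the lowest row number (the header row) and copy it onto every row at that position.

```python
# Unpivoted row/position/value records (hypothetical data).
rows = [
    {"row": 1, "attr_pos": 1, "attr": "id"},
    {"row": 1, "attr_pos": 2, "attr": "name"},
    {"row": 2, "attr_pos": 1, "attr": "1"},
    {"row": 2, "attr_pos": 2, "attr": "Apple"},
]

# Equivalent of rank() over (partition by attr_pos order by row), rank 1:
# walk the records in row order and keep the first value seen per position.
col_names = {}
for r in sorted(rows, key=lambda r: r["row"]):
    col_names.setdefault(r["attr_pos"], r["attr"])

# Propagate the header value onto every row at the same position.
for r in rows:
    r["col_name"] = col_names[r["attr_pos"]]

print(col_names)   # {1: 'id', 2: 'name'}
```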

 

  • Find the last column position (so we can get rid of a legitimate trailing double quote):

Group by 1, because I want to check the entire table as a single group.

 

Get the max attribute position:

 

 

  • Get rid of the leading and trailing double quotes:

 

(If it is the first column, remove the first character; if it is the last column, remove the last character.)

 

  • Get rid of the header row by filtering out row=1.
  • Pivot it back into the shape we want:
    • Group by rd2. Rd2 is the record descriptor we used when unpivoting from rows into row/column records; we use it to get back to row granularity.

 

 

I’ve used the same config for all 3 columns, so I’m just showing 1 for brevity:

 

 

 

  • The step is configured to recognise id, name and type. The column “colour” is not required and is not mentioned in the step.

 

Out file looks like this:

 

 

Out 2 (which will be the actual data I need):

 

 

I hope this helps someone!

 


I’ve just noticed that I was over-ambitious when removing the last double quote on each row and removed 1 extra character, but hopefully this gives the idea….

 

Changing the column assigner step that fixes the quotes to the following should do the trick:

case(attr_pos='1' and left(data_col_name,1)='"', substr(data_col_name,1),
case(attr_pos=last_col_pos and right(data_col_name,1)='"', left(data_col_name,length(data_col_name)-1),data_col_name))

and

case(attr_pos='1' and left(attr,1)='"', substr(attr,1),
case(attr_pos=last_col_pos and right(attr,1)='"', left(attr,length(attr)-1),attr))
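To make the corrected logic concrete, here is a hedged Python rendering of those expressions (function name and semantics assumed: strip a leading quote on the first column, else strip a trailing quote on the last column, mirroring the nested case()):

```python
def strip_outer_quotes(value, attr_pos, last_col_pos):
    """Hypothetical equivalent of the corrected case() expressions above."""
    # First column: drop a leading double quote if present.
    if attr_pos == 1 and value.startswith('"'):
        return value[1:]
    # Otherwise, last column: drop a trailing double quote if present.
    elif attr_pos == last_col_pos and value.endswith('"'):
        return value[:-1]
    return value

print(strip_outer_quotes('"id', 1, 3))      # id
print(strip_outer_quotes('fruit"', 3, 3))   # fruit
print(strip_outer_quotes('Apple', 2, 3))    # Apple
```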

 


Thank you ​@may_kwok for the elaborate solution! ​@mahesh Ar does this answer by ​@may_kwok help? Let us know please 🙋‍♀️

