Using GraphQL queries in .NET to extract DQ results and load into my own database

  • 26 March 2024
  • 3 replies
  • 28 views


Hello:

I would like to use GraphQL queries to extract the results of a monitoring project and load them into my own database table for reporting and other purposes. I need to split the invalid records so that each output record covers exactly one failed rule, even when a single primary-key record has multiple DQ issues (failed rules).

Can someone share sample code showing how you accomplished this, if you have done something similar?
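For context, a minimal .NET sketch of the kind of extraction being asked about might look like the following. This is illustrative only: the endpoint path, query shape, and field names are hypothetical placeholders rather than the real schema, and, as the reply below explains, the GraphQL API is unversioned and may change between releases.

```csharp
// Illustrative sketch only: the endpoint path, query shape, and field names are
// hypothetical placeholders, not the real (unversioned) GraphQL schema.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class DqExtractor
{
    static async Task Main()
    {
        using var client = new HttpClient();
        client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue(
            "Bearer", Environment.GetEnvironmentVariable("ONE_API_TOKEN"));

        // Hypothetical query; check the schema shipped with your own release.
        var payload = JsonSerializer.Serialize(new
        {
            query = "query { monitoringProjectResults(projectId: \"example\") { recordId invalidRules } }"
        });

        var response = await client.PostAsync(
            "https://your-one-instance.example.com/graphql",
            new StringContent(payload, Encoding.UTF8, "application/json"));
        response.EnsureSuccessStatusCode();

        // Parse with JsonDocument, then split/load as discussed in the replies below.
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}
```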

 


3 replies

OGordon100 1 month ago

Hi Prasad.

I would not recommend calling the GraphQL API directly to access monitoring project results, because the GraphQL APIs are unversioned and subject to change between releases, which increases the maintenance burden for your pipeline.

Instead, I would point you to a Post Processing Plan for each dataset in the monitoring project (on v14 or earlier this is done in ONE Desktop; from v15 onwards it can be done through the Web App). When you generate the post processing plan, it produces a template that returns the source dataset plus the DQ results (both pass and fail).

 

You can then use a Filter step to keep only the rows that have failed (the condition will be along the lines of `invalid_rules IS NOT NULL`).

Because this dataset contains one row per row of source data, all rule failures for a given row are combined into a single string (e.g. `failed_rule_1;failed_rule_56;failed_rule_78`). You can therefore use a Splitter step, splitting on `;`, to explode the multiple rule failures into one line per failed rule.
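For anyone prototyping this filter-and-split logic outside the plan, a minimal C# sketch of the equivalent transformation might look like this. It assumes a row shape with a semicolon-delimited `InvalidRules` string, as described above; the type and field names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One source row with its concatenated rule failures, as described above.
// Field names here are illustrative placeholders.
record SourceRow(string PrimaryKey, string? InvalidRules);

class Exploder
{
    // Filter out passing rows, then split the semicolon-delimited failure list
    // so each output pair is (primary key, one failed rule).
    // TrimEntries requires .NET 5 or later.
    static IEnumerable<(string Key, string Rule)> Explode(IEnumerable<SourceRow> rows) =>
        rows.Where(r => !string.IsNullOrEmpty(r.InvalidRules))
            .SelectMany(r => r.InvalidRules!
                .Split(';', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
                .Select(rule => (r.PrimaryKey, rule)));

    static void Main()
    {
        var rows = new[]
        {
            new SourceRow("1001", "failed_rule_1;failed_rule_56;failed_rule_78"),
            new SourceRow("1002", null), // passed all rules; filtered out
        };

        foreach (var (key, rule) in Explode(rows))
            Console.WriteLine($"{key} -> {rule}"); // three lines for key 1001
    }
}
```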

Finally, you can use a JDBC Writer step to write the result to your database.
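If you instead load the exploded rows from your own .NET code (per the original question) rather than through the JDBC Writer step, a minimal ADO.NET sketch might look like the following. The table and column names are hypothetical placeholders for your own reporting schema.

```csharp
using System.Collections.Generic;
using Microsoft.Data.SqlClient; // NuGet package: Microsoft.Data.SqlClient

static class FailureLoader
{
    // Inserts one (primary key, failed rule) pair per row; the table and
    // column names are hypothetical placeholders for your own schema.
    public static void WriteFailures(
        string connectionString, IEnumerable<(string Key, string Rule)> failures)
    {
        using var conn = new SqlConnection(connectionString);
        conn.Open();
        foreach (var (key, rule) in failures)
        {
            using var cmd = new SqlCommand(
                "INSERT INTO dq_rule_failures (record_key, failed_rule) VALUES (@key, @rule)",
                conn);
            cmd.Parameters.AddWithValue("@key", key);
            cmd.Parameters.AddWithValue("@rule", rule);
            cmd.ExecuteNonQuery();
        }
    }
}
```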

 

Hope this helps!

-Oliver

Prasad

Thank you so much, Oliver. I appreciate the response.

Prasad

Oliver:

  1. Post Processing Plans won't work if I have enabled Snowflake Pushdown, and given the size of the database tables (among other reasons), we will NOT be able to disable that pushdown.
  2. Also, Post Processing Plans are project-specific, and I would like to extract the DQ results of all projects every day through one plan.

Have you tried the Metadata Reader to extract DQ results? Do you have a template you could share?
