
Importing catalog item descriptions and attribute descriptions from dbt docs


Prasad Rani
Data Pioneer

We are using dbt for our data pipelines, and for the tables and fields there are descriptions maintained in dbt that we would like to export (as JSON) and use to update the Ataccama Data Catalog.

Is there a known, trusted, supported way to automate this?

I already have a desktop plan I created that can update catalog objects and attributes based on an Excel input file. I may be able to modify that if that's the only option. I am reaching out to the community to see if there is a preferred way to get descriptions and documentation from dbt into Ataccama.

April 4, 2024


ivan.kozlov
Ataccamer

Hi Prasad,
You should be able to update the descriptions for catalog items and attributes using the Metadata Writer step.
First, you would need to read the IDs and names of all existing CIs (catalog items) and attributes using the Metadata Reader step. Then you can join that platform metadata with another data stream from dbt containing the CI and attribute names and the relevant descriptions, and finally update the existing CIs and attributes using the Metadata Writer step.
Below are examples of the configuration of the Metadata Reader/Writer steps:
[screenshot: Metadata Reader and Metadata Writer step configuration]

And here's a high-level example of how the plan logic might look:
[screenshot: example plan layout]

Updating CI descriptions is a rather straightforward task, but in the case of attributes you need to make sure that you reference the right parent catalogItemId for each attribute id.
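
Just to make that join explicit, here is a minimal sketch in plain Python (outside the platform, with illustrative field names rather than the actual Metadata Reader/Writer columns) of matching attributes to dbt descriptions by catalog item name plus attribute name while keeping the parent catalogItemId:

```python
# Minimal sketch of the name-based join; the field names below are
# illustrative, not the actual Metadata Reader/Writer column names.

# Attribute records as read from the platform (one row per attribute).
platform_attributes = [
    {"attributeId": "a-101", "attributeName": "customer_id",
     "catalogItemId": "ci-1", "catalogItemName": "dim_customer"},
    {"attributeId": "a-102", "attributeName": "email",
     "catalogItemId": "ci-1", "catalogItemName": "dim_customer"},
]

# Flattened descriptions coming from dbt, keyed by (table, column).
dbt_descriptions = {
    ("dim_customer", "customer_id"): "Surrogate key of the customer.",
    ("dim_customer", "email"): "Primary e-mail address of the customer.",
}

# Join on (catalog item name, attribute name) and carry the parent
# catalogItemId through, so each updated attribute keeps the right parent.
updates = []
for attr in platform_attributes:
    key = (attr["catalogItemName"], attr["attributeName"])
    if key in dbt_descriptions:
        updates.append({
            "attributeId": attr["attributeId"],
            "catalogItemId": attr["catalogItemId"],  # correct parent preserved
            "description": dbt_descriptions[key],
        })

for row in updates:
    print(row)
```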

To parse the JSON data coming from dbt, you should be able to use something like the Json Reader or Json Parser step in the IDE.
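
If you'd rather flatten the dbt output before it enters the plan, a small script can turn the manifest.json produced by `dbt docs generate` into a simple table/column/description file that the plan can then join by name. This is only a sketch and assumes the standard manifest layout (models under "nodes", each with a "description" and a "columns" map); the file paths are placeholders:

```python
import csv
import json

# Sketch: flatten dbt's manifest.json (output of `dbt docs generate`)
# into one row per table-level and column-level description.
# Assumes the standard layout: models under "nodes", each node with
# "name", "description" and a "columns" mapping. Paths are placeholders.
with open("target/manifest.json", encoding="utf-8") as f:
    manifest = json.load(f)

rows = []
for node in manifest.get("nodes", {}).values():
    if node.get("resource_type") != "model":
        continue
    table = node.get("name", "")
    # Table-level description -> catalog item description.
    rows.append({"table": table, "column": "",
                 "description": node.get("description", "")})
    # Column-level descriptions -> attribute descriptions.
    for col in node.get("columns", {}).values():
        rows.append({"table": table, "column": col.get("name", ""),
                     "description": col.get("description", "")})

with open("dbt_descriptions.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["table", "column", "description"])
    writer.writeheader()
    writer.writerows(rows)
```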

I hope this will be helpful.

Ivan


Cansu
Community Manager
April 5, 2024

Hi @Prasad Rani, I'm closing this thread for now. If you have any follow-up questions, please feel free to share them here or create a new post 🙋‍♀️


bobparry
Data Voyager
March 10, 2025

@ivan.kozlov thanks for your instructions above. Our metadata has a similar structure to Prasad's, and I was able to get your approach working in my environment.
A question for you though: I created a plan which reads metadata from an XML file and successfully writes it to Ataccama ONE, using the logic you provided of joining on catalogueItemId and locationId. However, the plan only reads 1 XML file and writes to 1 catalogue item. I have 1000+ XMLs relating to 1000+ catalogue items. Is there a way to iterate this workflow for every XML (or dbt file)? They all have exactly the same formatting, so the plan I have created will work for any XML; I just can't find a way to provide multiple XMLs as input.


ivan.kozlov
Ataccamer

Hi @bobparry,
I apologize for the delayed response on this topic.

There are two main ways to read multiple files at once or iterate over them.
1) To read multiple files from the same directory, you can set the input file location to /PATH/TO/FILES/*.xml. The reader step will then read all files that match that mask (all XMLs in the directory) and the plan will process the input from all of them. I should mention that this only works if all the files have the same structure.
2) You can wrap the plan in a workflow (or rather two levels of workflows).
The top-level workflow triggers the iteration over the files in the defined directory using the Iterate task and then triggers the lower-level workflow, which should contain the Run DQC task that runs the actual component processing the data. In this case you'll need to pass the input file name/location as a parameter from the top-level workflow to the bottom-level workflow and on to the component. This approach is certainly more complex but adds more options in terms of flow control.
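
Purely to illustrate the iteration pattern of option 2 (this is not Ataccama workflow configuration, just a plain-Python analogue), the outer loop walks the directory and hands each file path to the processing step as a parameter; the command below is a placeholder:

```python
from pathlib import Path
import subprocess

# Conceptual analogue of the two-level workflow: iterate over the input
# files and run the processing once per file, passing the file path as a
# parameter. "./run_plan.sh" is a placeholder, not the real invocation of
# the lower-level workflow / Run DQC task.
input_dir = Path("/PATH/TO/FILES")

for xml_file in sorted(input_dir.glob("*.xml")):
    print(f"Processing {xml_file} ...")
    subprocess.run(["./run_plan.sh", f"--input={xml_file}"], check=False)
```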

You should be able to find the Workflow Tutorials project in your ONE Desktop application, where you can see examples of the Iterate task in action.

I hope this helps.

Ivan



