Best Practice

Export metadata model into Excel

2 years ago
March 20, 2023
0 replies
321 views

anna.spakova
Ataccamer
155 replies

Hello everyone, Anna here! I work as a projects consultant in Ataccama and today I put together a guide for you on how to export a metadata model definition from Ataccama ONE.

Ataccama ONE allows to export metadata model definition from the platform into a JSON file. However, the JSON file is not an ideal format for business users. For that reason, parsing of the JSON into more user-friendly format, for instance Excel, is desired. The current JSON parser step in Ataccama DQC, however, is not able to parse this JSON structure.

This article introduces a Python script that was developed to parse the JSON structure of the metadata export and provide the metadata model in Excel and CSV formats. It is supported for v13 of Ataccama ONE.

As a first step, the metadata model has to be exported from Ataccama. Next step is to run the Python script on top of the exported metadata model to obtain the CSV and Excel format. Chapter JSON parser in Python describes in detail the Python script as well as the information available in the output. You can skip this part if you are not interested in the details. At the end of the article you can find out how to test the script on your own.

Export metadata model

The export is available under this endpoint: https://<one-address>/mmd/export. If this endpoint is throwing an error, make sure the following property is configured in the MMM backend application.properties:

ataccama.one.mmm.core.export-mmd=true

The MMM backend module needs to be restarted if you modified the properties.

Also, it may happen that the endpoint is not exposed as public URL (due to nginx), which needs to be configured. In this case contact out support engineers if you are not sure how to configure it.

JSON parser in Python

This chapter describes in detail the python script so that you can be modified if needed.

Libraries

Following Python libraries are needed:

pandas
simplejson
csv

Input

The parser script expects the metadata model json5 file in an /input folder. The name and the location can be changed on line #41:

The metadata model is handled as a dictionary, that contains other dictionaries. To work easily with the dictionaries, two functions are defined on the beginning of the script:

Functions

get_dic(dic, key)

This function takes a dictionary and returns another, inner dictionary:

get_val(dic,key)

This function returns the value from a dictionary for a given key. A primitive value is expected.

store(…)

Each record of the model is stored as an array and storing is done using another function, which at the same time translates some Boolean properties (e.g. reference) into string values for the property type:

Reason is, that if a property is primitive (e.g. String), the value is provided under a “type” key in the “type” node:

If it’s an object, the “type” node has no “type” key, instead there is ”true” for one of the remaining properties, e.g. “array”:

The function above will translate these values into string names of the object data types as we know them from the application, e.g. “Embedded Object”.

Parsing the dictionaries

The data from the file is loaded using json.load() function. The nodes part (main part of the model, that contains the actual model definition) is then extracted into a dictionary using the get_dict() function described above.

Metadata export file:

Script:

Parsing of the nodes is done in two for-cycles. First one loads other dictionaries that corresponds to the actual entities (e.g. metadata, term) using the key “node_name”. From these dictionaries, a string value for node “extends” is extracted (indicates from which entity it inherits) if exists.

Properties of the entities are loaded as another dictionary “props” and parsed in the second for-cycle.

From this properties dictionary, another dictionary is created for each property by the name key (“prop”). From this dictionary, we can get the type of the property again as dictionary (prop_type) with Boolean values corresponding to array, reference etc. See the picture capturing the nested dictionaries:

All the properties as well as name of the entity are then sent to the store() function, see above.

Writing to output

All rows created in the store() function are written into a CSV file using the csv.writer and writerow() function. Pandas library is then used to transfer this CSV file into Excel file. In this section you may change the location of the output files, which is set to output folder (relative to the python script location).

Example output:

Limitations

Current implementation is not parsing constrains, traits or indexes or plugins.
Arbitrary reference type is ignored.
Computed content is not parsed.
Delegated scalar properties are not parsed properly (missing property type).

How to Try it Out

Attached to this article is the python script as well as an example input and ouput files.

Make sure you have python installed: https://www.python.org/downloads/. Then you have to install the required packages. You can refer to this guide:

https://packaging.python.org/en/latest/tutorials/installing-packages/

You can run the code from command line, simply use:

$ python <path_to_the_script>/export_mmd_model.py

Make sure you have the expected input and output folders ready with the metadata model input file (see the section Input and Writing to output) OR that you modified the code to fit your needs.

If python command is not recognized, you have the following options:

Add the python executable path to your Window's PATH variable. https://stackoverflow.com/questions/7054424/python-not-recognized-as-a-command
Download Pycharm (Python IDE) and run it from there. https://www.jetbrains.com/pycharm/download/#section=windows

Did this topic help you find an answer to your question?

Be the first to reply!

Reply

Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

Cookie settings

We use 3 different kinds of cookies. You can choose which cookies you want to accept. We need basic cookies to make this site work, therefore these are the minimum you can select. Learn more about our cookies.

Basic
Functional

Normal
Functional + analytics

Complete
Functional + analytics + social media + embedded videos + marketing

Export metadata model

JSON parser in Python

Libraries

Input

Functions

Parsing the dictionaries

Writing to output

Limitations

How to Try it Out

Reply

Related topics

I don't understand RCPackageTypeicon

2 Errors that I don't understand. They may be related???icon

URGENT / login and setAttributes functions are sometimes not doing anythingicon

fetch products without offeringsicon

Purchases not restoring automatically on Android when calling configure with another app user idicon

Most Liked this week

Sign up

Login to the Ataccama Community

Scanning file for viruses.

This file cannot be downloaded

Cookie policy

Cookie settings