
Hi All,

We are periodically cataloguing our Databricks environment and have encountered a scenario where a table in the underlying system has been deleted, however the metadata still remains present in the catalog after a re-scan.

I understand this is the expected behaviour of Ataccama’s existing functionality, however it is not ideal for end users, as they may be viewing metadata for an object that no longer exists. Ideally we would want to tag the catalog item to indicate it has been deleted, or soft delete the catalog item in some way and hide it from data consumers to avoid any potential confusion.

I am interested to understand whether others have encountered this scenario before and what mechanisms (if any) they were able to implement to help with this, either with native Ataccama functionality or with a custom workaround.

Kind regards,

Cristian


Hi Cristian (@cmagnano),

I have not encountered (or noticed) this before, but I was wondering about the following. We recently upgraded to version 14.5, which includes new functionality called Data Observability. It can detect schema changes such as deleted tables. If such a situation occurs, you can then remove the obsolete catalog item yourself.

Or, maybe even better, add a property to the catalog item entity in the metadata model that you can use to indicate the soft delete, as you suggest. Then you still have the old metadata available.
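To illustrate the soft-delete idea, here is a minimal sketch of how such a flag could behave once added to the metadata model. The record and property names (`CatalogItem`, `soft_deleted`) are purely illustrative and not part of Ataccama's actual metadata model:

```python
# Hypothetical sketch: a boolean soft-delete property on a catalog item,
# plus a filter that hides flagged items from data consumers while the
# old metadata stays available for stewards.
from dataclasses import dataclass

@dataclass
class CatalogItem:
    name: str
    soft_deleted: bool = False  # the extra metadata-model property

def visible_items(items):
    """What a data consumer would see: soft-deleted items are hidden."""
    return [i for i in items if not i.soft_deleted]

items = [
    CatalogItem("sales.orders"),
    CatalogItem("sales.legacy_tmp", soft_deleted=True),  # dropped in Databricks
]
print([i.name for i in visible_items(items)])  # ['sales.orders']
```

The flagged item is still present in the list, so its metadata is retained for auditing; only the consumer-facing view filters it out.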

Kind regards,

Albert


Thanks for the response, Albert.

I will try configuring the Data Observability module in one of our environments to detect schema changes and see if that gives us a list of tables/attributes deleted that can then be actioned.
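In case it helps others, the detection step can also be approximated outside Data Observability by diffing two exported lists. This is a generic sketch, not an Ataccama API: it assumes you can export the catalog item names on one side and the live table list from Databricks (e.g. via SHOW TABLES) on the other:

```python
# Hypothetical sketch: find catalog items whose source table no longer
# exists, by comparing a catalog export against the live Databricks schema.
# Both inputs are plain lists of fully qualified table names, assumed to
# have been exported separately from each system.

def find_orphaned_items(catalog_tables, live_tables):
    """Return catalog entries with no matching table in the source system."""
    live = set(live_tables)
    return sorted(t for t in catalog_tables if t not in live)

# Example with made-up table names:
catalog = ["sales.orders", "sales.customers", "hr.employees"]
live = ["sales.orders", "hr.employees"]  # sales.customers was dropped

print(find_orphaned_items(catalog, live))  # ['sales.customers']
```

The resulting list would then be the candidates to tag or soft delete in the catalog.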

