Skip to main content
Solved

Best Practices for Data Quality Rules based on joins


Forum|alt.badge.img

Hello!
Wondering if anyone has experience with creating data quality rules for values in joined tables.

For example, we have 2 tables (Table A and B), related to each other through a shared attribute. I want a data quality rule where the value of a column in table A must be lower than the value of a column in Table B. My only idea for this is to create a SQL catalog item joining the two tables, and creating a dq rule on the SQL catalog item. This works, but has flaws, I can’t have this data quality rule contribute to the overall data quality of Table A, nor will users who find this table in the catalog know of all the identified data quality issues. 

Does anyone have a better way of doing this, or is there a built-in tool I’m not familiar with that anyone has used?
Thanks!

 

Best answer by OGordon100

Hello ​@bobparry!

This is a very good question. Unfortunately due to product limitations, you are exactly correct - if you need a rule involving multiple tables you must create a new Catalog Item (either with SQL, ONE Desktop Components, or the Transformation Editor in v15+), and attach your rule to it. It then has no impact on your original table’s Data Quality.

We are, however, actively tracking this exact use case, as it is a very common one. Could you please create an idea for this on the community? Thank you!

View original

OGordon100
Ataccamer
Forum|alt.badge.img

Hello ​@bobparry!

This is a very good question. Unfortunately due to product limitations, you are exactly correct - if you need a rule involving multiple tables you must create a new Catalog Item (either with SQL, ONE Desktop Components, or the Transformation Editor in v15+), and attach your rule to it. It then has no impact on your original table’s Data Quality.

We are, however, actively tracking this exact use case, as it is a very common one. Could you please create an idea for this on the community? Thank you!


Lisa Kovalskaia
Ataccamer
Forum|alt.badge.img+3

@bobparry Hi! To add to what ​@OGordon100 wrote, for the time being you might make use of catalog item relationships to help users discover SQL CIs related to source CIs. With a bit of training users can get used to checking those relationships whenever they are browsing the catalog. Of course if you want to build some automations based on DQ scores, you'd need some logic to translate DQ issues from the SQL CI to the source CI - that's something you could potentially achieve with the metadata model customizations and an automation workflow that would export DQ results from an SQL CI asset and import it to the source CI asset. Let me know if you are interested and have questions.

That said, please do raise a feature idea here on the community too! Hearing from our customers helps our product team tremendously to prioritize and refine their plans. Thank you!


Forum|alt.badge.img
  • Data Voyager
  • May 7, 2025

Thanks Gordon and Lisa, I’ve created the idea here: Enable Data Quality Rules based on Joins | Community

@Lisa Kovalskaia, can you elaborate on what you mean about the automations based on DQ scores? Do you mean having a metadata field, and populating that with the Data Quality score from the SQL catalog item, and then have users look at that field rather than the Overall score? 

To your point about the item relationships, is there any way to automate the creation of those? Especially for SQL Catalog items it feels like the Source catalog items they are related to should automatically appear in the Relationships pane, given the environment knows exactly how they are related. It’s a great idea to encourage users to check the relationships in cases like this, it would be great if it was a little easier to have those relationships populated. 


Reply


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings