Hi community!
 

We hope you’ve found some useful insights in the first part of our Working with Data Quality Rules article. Check it out below if you haven’t seen it already or if you need a refresher!

In this post, we will focus on duplicating rules, scoring records, creating rules from mask, pattern, and frequency analyses, creating lookup files in rules, and importing and exporting rules. Let's dive in!
 

Duplicating Rules

Duplicating rules can save you time and effort, especially when you need several similar rules, for example rules that differ only in the lookup file they use. Follow these steps to duplicate a rule:

  1. Navigate to the rule you want to duplicate.
  2. Expand the "More Options" menu and select "Duplicate."

     

  3. A copy of the rule will open in a new tab with the settings already populated.

     

  4. Edit the information as required and click "Save."

Power users can also perform this action via the API.

 

Scoring Records

Scoring records helps you assess the severity of data quality issues. The score is a numeric measure of how invalid each record is according to the rule: the more severe the issue, the higher the score. Here's how you can assign scores in Ataccama's ONE Webapp:

  1. Use the "Rule Logic" section to assign scores.
  2. Define the score based on the severity of the issue. For example, a non-critical value missing could have a score of 100, while a critical value missing could have a score of 10000.
  3. After setting the scores, publish the rule.

You can also assign scores via the validation component in ONE Desktop, although this option is available only to power users.
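To make the idea of severity scores concrete, here is a minimal Python sketch. It is not Ataccama's rule logic or expression language; the function names, the attribute lists, and the choice to sum scores across issues are all assumptions made for illustration. Only the values 100 and 10000 come from the example above.

```python
from typing import Optional

# Example severity scores taken from the text above.
NON_CRITICAL_MISSING = 100
CRITICAL_MISSING = 10_000


def is_missing(value: Optional[str]) -> bool:
    """Treat None and whitespace-only strings as missing (an assumption)."""
    return value is None or value.strip() == ""


def score_record(record: dict, critical: list, non_critical: list) -> int:
    """Return a severity score for one record; 0 means no issue was found.

    Summing the scores of all missing attributes is an illustrative choice,
    not necessarily how ONE combines scores.
    """
    score = 0
    for name in critical:
        if is_missing(record.get(name)):
            score += CRITICAL_MISSING
    for name in non_critical:
        if is_missing(record.get(name)):
            score += NON_CRITICAL_MISSING
    return score
```

With scores spaced this far apart, a single critical issue always outweighs any realistic number of non-critical ones, which keeps the most serious records at the top when you sort by score.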

Viewing Individual Scores

To view the individual score of each record, you need to export the results of the monitoring project. Follow these steps:

  1. Navigate to the "Export" tab of a monitoring project.

     

  2. Select the post-processing result to export in CSV format and click the download icon. The report file will contain all the records with their scores in the last column.

To limit the export to invalid records only, power users need to configure the export plan in ONE Desktop.
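If you only need a quick look at the invalid records and don't want to change the export plan, you could also filter the downloaded CSV yourself. The sketch below is a hypothetical helper, not part of any Ataccama tooling. It assumes the file is comma-separated, has a header row, keeps the score in the last column (as described above), and uses a score of 0 for valid records.

```python
import csv
import io


def invalid_rows(csv_text: str) -> list:
    """Return the data rows whose last-column score is greater than zero."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row (assumed to be present)
    rows = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        try:
            score = float(row[-1])
        except ValueError:
            continue  # ignore rows whose last column is not numeric
        if score > 0:
            rows.append(row)
    return rows
```

To use it on a real export, read the file contents first, for example with `open(path, newline="", encoding="utf-8").read()`, and adjust the delimiter if your export uses a different one.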

 

Creating Rules from Mask, Patterns, and Frequency Analyses

To create rules from mask, patterns, and frequency analyses, follow these steps:

  1. In the "Data Catalog" section, select the desired catalog item.
  2. Open the profiling results of the attribute you want to create a rule for.
  3. In the Mask Analysis widget, select the more options icon and choose "Use results in rule."

     

  4. A sidebar will open with the option to create a new rule.

     

  5. Add multiple masks, frequencies, and patterns to the rule.
  6. Fill in the information for the rule, specify its logic, and publish it.

     

For example, you can create a rule that marks a value as invalid when it doesn't match any of the masks you selected.
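The logic behind a mask-based rule can be sketched in a few lines of Python. This is an illustration only, not how ONE computes masks: the symbols used here (`D` for a digit, `L` for a letter, every other character kept as-is) and the function names are assumptions, and the masks shown in your profiling results may use a different notation.

```python
def to_mask(value: str) -> str:
    """Reduce a value to its shape: digits become D, letters become L."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("D")
        elif ch.isalpha():
            out.append("L")
        else:
            out.append(ch)  # keep punctuation and spaces unchanged
    return "".join(out)


def is_invalid(value: str, allowed_masks: set) -> bool:
    """A value is invalid when its mask is not among the accepted masks."""
    return to_mask(value) not in allowed_masks
```

For instance, with `{"LL-DDDD"}` as the accepted masks, a value such as `AB-1234` would pass while `A-12` would be flagged.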

 

Creating Lookup Files in Rules

To create lookup files in rules, follow these steps:

  1. In the "Data Quality > Rules" section, select "Create."
  2. Provide the necessary information such as name, description, owner, and steward.
  3. Save the rule as a draft.
  4. Define the rule logic by adding input attributes and selecting the rule type.
  5. Specify the rule output, condition, expected result, and explanation.
  6. During the rule logic definition, choose the option "is from catalog item" to create a lookup item.

     

  7. Select the catalog item and attribute for the lookup.

     

  8. Fill in the details for lookup creation, including handling duplicate values and matching options.
  9. Confirm the lookup creation, and the lookup build will start automatically.
  10. Once the lookup file is created, publish the rule.

Now you have successfully created a rule with a lookup file.
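Conceptually, a lookup built from a catalog item attribute behaves like a set of reference values that the rule checks each input against. The Python sketch below illustrates that idea under stated assumptions: the function names and the `ignore_case` and `trim` options are hypothetical stand-ins for the duplicate-handling and matching options mentioned in step 8, not the actual settings offered in ONE.

```python
def normalize(value: str, ignore_case: bool = True, trim: bool = True) -> str:
    """Apply the (assumed) matching options to a single value."""
    if trim:
        value = value.strip()
    if ignore_case:
        value = value.lower()
    return value


def build_lookup(values, ignore_case: bool = True, trim: bool = True) -> frozenset:
    """Build a lookup from attribute values; duplicates collapse into one entry."""
    return frozenset(normalize(v, ignore_case, trim) for v in values)


def is_in_lookup(value: str, lookup: frozenset,
                 ignore_case: bool = True, trim: bool = True) -> bool:
    """Check a value against the lookup using the same matching options."""
    return normalize(value, ignore_case, trim) in lookup
```

The important design point is that the same normalization is applied when building the lookup and when checking a value; otherwise values that should match can be reported as invalid.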

 

Import and Export

When working with rules, it's important to understand the import and export process. Here are some key points to keep in mind:

  • Exported rules include all associated entities and can be imported into a new instance.
  • Users who are owners or stewards of the exported rules must already exist in the target instance.
  • Lookup files used in the rules are not exported. Ensure that the same lookup files exist in the target system to maintain rule references.
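Before importing, it can help to check the last two points up front. The sketch below is a hypothetical pre-import check in Python. The rule dictionaries and their `owner`, `steward`, and `lookups` keys are an assumed structure for illustration, not the actual export format, and you would need to collect the lists of users and lookup files in the target instance yourself.

```python
def missing_prerequisites(rules: list, target_users: set, target_lookups: set) -> dict:
    """List owners/stewards and lookup files that the target instance lacks."""
    missing_users = set()
    missing_lookups = set()
    for rule in rules:
        # Owners and stewards must already exist in the target instance.
        for user in (rule.get("owner"), rule.get("steward")):
            if user and user not in target_users:
                missing_users.add(user)
        # Lookup files are not exported, so they must exist in the target too.
        for lookup in rule.get("lookups", []):
            if lookup not in target_lookups:
                missing_lookups.add(lookup)
    return {"users": sorted(missing_users), "lookups": sorted(missing_lookups)}
```

If both lists in the result are empty, the prerequisites described above are met for the rules you checked.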

For advanced use cases, such as comparing data across catalog items from different sources, power users can leverage ONE Desktop.
 

If you have any questions or best practices to share with the community, let us know in the comments below 👇
