Happy Tuesday Community!
We started this week by sharing best practices on Monitoring Projects, and today we’re continuing with how to configure them. If you missed the first post, you can find it here:
Let's explore the best practices for configuring Monitoring Projects to ensure robust data quality checks and anomaly detection.
Select Catalog Items
- Go to the Configuration & Results tab.
- To add catalog items:
  - For a new project, select Add catalog items.
  - To add more items to an existing project, select +Add under Items to Monitor.
- To remove an item, use the more options icon and select Delete.
- The listed catalog items will display a summary of:
  - Structure: Aggregated results for structure checks on attributes.
  - Anomalies: Detected anomalies on attributes.
  - DQ Checks: Results of data quality checks assigned to attributes.
  - Overall: Aggregated quality based on assigned DQ checks contributing to overall quality.
  - Individual DQ dimensions, listed with their abbreviations.
For more detailed results or to edit checks, click on a catalog item to access the attribute view. Proceed to configure each item as required.
DQ Dimensions
By default, DQ dimensions contributing to overall quality match global settings (see Configuring Data Quality Dimensions). To customize the configuration:
- Navigate to the Overview tab.
- In the Overall quality contribution widget, select desired dimensions to contribute to overall data quality results for the project.
To revert to default settings, select Revert to default.
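To make the fallback behaviour concrete, here is a minimal Python sketch. It is not product code: the function names, the example dimension names, and the rule that a record passes overall only when it passes every contributing dimension are all assumptions made for illustration.

```python
# Illustrative only: hypothetical names, not actual product settings.
GLOBAL_CONTRIBUTING = {"Validity", "Completeness"}


def contributing_dimensions(project_override=None):
    """Return the dimensions that count toward overall quality."""
    # No override (or a reverted one) means the global settings apply.
    if project_override is None:
        return set(GLOBAL_CONTRIBUTING)
    return set(project_override)


def overall_passed(record_results, dimensions):
    """Assumed rule: a record passes overall only if it passes every
    contributing dimension. Dimensions with no result are ignored.

    record_results maps dimension name -> True/False.
    """
    return all(record_results[d] for d in dimensions if d in record_results)


results = {"Validity": True, "Completeness": True, "Accuracy": False}
default_dims = contributing_dimensions()               # global settings
custom_dims = contributing_dimensions({"Accuracy"})    # project override
print(overall_passed(results, default_dims))
print(overall_passed(results, custom_dims))
```

The point of the sketch is that the same record can pass or fail overall depending on which dimensions the project counts, so it is worth agreeing on the selection before results are shared.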
Structure, DQ Checks, and Anomaly Detection
After selecting catalog items, configure each item individually to add checks:
- Structure Checks: Alert for missing columns or data type changes.
- Anomaly Detection: AI-powered alerts for potential anomalies.
- DQ Checks: Assign Data Quality Evaluation rules (see Working with Rules).
For more information on each check, refer to the links provided in the text.
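As a rough picture of what a structure check looks for, the Python sketch below compares an expected set of columns and types with what is actually observed. The function name, the dictionary-based schema, and the sample column names are hypothetical; the platform's own implementation is not shown here.

```python
def structure_check(expected, observed):
    """Compare an expected schema with the observed one.

    Both arguments map column name -> data type name.
    Returns (missing_columns, type_changes).
    """
    # Columns that were expected but no longer exist.
    missing = sorted(col for col in expected if col not in observed)
    # Columns that still exist but whose data type differs.
    changed = {
        col: (expected[col], observed[col])
        for col in expected
        if col in observed and expected[col] != observed[col]
    }
    return missing, changed


expected = {"id": "integer", "email": "string", "signup_date": "date"}
observed = {"id": "string", "email": "string"}
missing, changed = structure_check(expected, observed)
print(missing)  # ['signup_date']
print(changed)  # {'id': ('integer', 'string')}
```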
Assign Checks
- Use the plus icons to enable or add Structure Checks, Anomaly Detection, and DQ Checks.
  - Structure Checks can be enabled on an attribute-by-attribute basis by selecting +Make Mandatory.
  - Anomaly Detection can be enabled on an attribute-by-attribute basis by selecting +Enable Detection.
- For DQ Checks, choose either:
  - Assign custom DQ checks using the plus icon under Applied DQ Checks and select rules from the list.
  - Use the rule suggestions option to review and accept/reject suggested rules for the catalog item.
To remove checks, hover over the applied check in the table and click on the cross icon. Remember to publish changes after implementation.
Additional Configuration Options
Filter by Attributes
Use attributes as filters to view DQ results for specific data sources. Enable filters for attributes and publish the changes to the project.
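Conceptually, filtering by an attribute means slicing the same DQ results by that attribute's values. The Python sketch below shows the idea on made-up data; the record layout, the `source_system` attribute, and the function name are assumptions for illustration, not how the web app stores results.

```python
from collections import defaultdict


def pass_rate_by_filter(records, filter_attribute):
    """Group record-level results by a filter attribute and return pass rates."""
    totals = defaultdict(lambda: [0, 0])  # attribute value -> [passed, total]
    for rec in records:
        bucket = totals[rec[filter_attribute]]
        bucket[0] += 1 if rec["passed"] else 0
        bucket[1] += 1
    return {value: passed / total for value, (passed, total) in totals.items()}


records = [
    {"source_system": "CRM", "passed": True},
    {"source_system": "CRM", "passed": False},
    {"source_system": "ERP", "passed": True},
]
rates = pass_rate_by_filter(records, "source_system")
print(rates)  # {'CRM': 0.5, 'ERP': 1.0}
```

Attributes with a small, stable set of values (such as a source system or region) tend to make the most useful filters in this kind of breakdown.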
Update Rule References
Be aware of any edited rules in the project. Admins can enable or disable notifications for all users through the core:reference trait. Confirm the required reference strategy before configuring the project.
Configure Parallel Processing
Global and project-specific parallelization settings allow for efficient processing. Configure them through property settings or the web app's advanced settings.
By following these best practices, you can ensure an effective and accurate Monitoring Project configuration for maintaining high data quality.
Any questions? Thoughts? Share them in the comments.