Reasons for deletion
There are several reasons why you might want to delete data from your MDM.
- Legal reasons, e.g. GDPR
- An incorrect load operation was executed
- The data is simply not relevant anymore
- Performance issues while keeping redundant data
- etc.
Truncate tables?
People often ask whether a plain TRUNCATE operation would work. The simple answer is NO. It might work only in isolated edge cases. A direct truncate bypasses the MDM hub, so you typically need to consider many other things, and at least some of them must be resolved manually:
- uncommitted transactions
- related tables/records
  - x... tables
  - historical table(s)
    - directly deleted records are not removed from the history and must be removed there separately
  - master table(s)
  - matching repository and matching proposals repository (if matching is enabled)
- copyColumns (is any data copied from this entity to a related one?)
- export operations (truncated data will not be exported)
- event handlers (truncated data will not be published)
- auditing
  - a direct deletion is not logged and ignores any permissions
All of the above needs to be considered and taken care of. For that reason, the TRUNCATE operation (like any other direct DB operation) is NOT RECOMMENDED.
The only edge case might be an isolated instance entity: one without any relationships, with matching disabled, and with no related master entity. In that case, the truncate operation might be the fastest way to delete all data in the table, provided that
- the MDM server is not running
- there are no uncommitted transactions
- the x Tables are truncated as well
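The conditions above can be written down as a simple pre-flight checklist. The following is only a minimal sketch: the `TruncateContext` class, its field names, and the `truncate_blockers` function are hypothetical and not part of any MDM API. The values have to be collected by hand (from the model, the server status, and the database).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TruncateContext:
    """Facts about the entity and environment, collected manually (hypothetical model)."""
    has_relationships: bool
    matching_enabled: bool
    has_master_entity: bool
    server_running: bool
    uncommitted_transactions: int


def truncate_blockers(ctx: TruncateContext) -> List[str]:
    """Return the reasons why a direct TRUNCATE is unsafe.

    An empty list means the isolated edge case described above applies.
    """
    blockers = []
    if ctx.has_relationships:
        blockers.append("entity has relationships to other entities")
    if ctx.matching_enabled:
        blockers.append("matching is enabled for the entity")
    if ctx.has_master_entity:
        blockers.append("a related master entity exists")
    if ctx.server_running:
        blockers.append("the MDM server is running")
    if ctx.uncommitted_transactions > 0:
        blockers.append("there are uncommitted transactions")
    return blockers


if __name__ == "__main__":
    ctx = TruncateContext(
        has_relationships=False,
        matching_enabled=False,
        has_master_entity=False,
        server_running=True,
        uncommitted_transactions=0,
    )
    for reason in truncate_blockers(ctx):
        print("Do not truncate:", reason)
```

Even when the list of blockers is empty, the x Tables still have to be truncated together with the entity table; the sketch does not check that.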
What is the best approach?
If you want to physically delete specific records, you can use the ProcessPurge service. It allows remote applications to completely purge records from the hub. The service purges records of a given entity by ID, so you need to provide the list of IDs.
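Because the service works on a list of IDs for one entity, it can help to prepare that list in manageable batches first. The sketch below only shows the preparation step. The function name, the batch size, and the dictionary shape (`entity`, `ids`) are assumptions made for illustration; they are not the actual ProcessPurge request format, and sending the request is deliberately left out.

```python
from typing import Dict, Iterable, List


def build_purge_batches(entity: str, ids: Iterable[int], batch_size: int = 500) -> List[Dict]:
    """Group record IDs of one entity into batches.

    Each returned dict is a placeholder for one purge call; the real
    request format must be taken from the service definition.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    unique_ids = sorted(set(ids))  # drop duplicate IDs, keep a stable order
    return [
        {"entity": entity, "ids": unique_ids[i:i + batch_size]}
        for i in range(0, len(unique_ids), batch_size)
    ]


if __name__ == "__main__":
    # "party" is a hypothetical entity name.
    for batch in build_purge_batches("party", [101, 102, 102, 250], batch_size=2):
        print(batch)
```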
If you want to perform a bulk delete or delete multiple or whole entities, it’s better to create a dedicated load operation with the source deletion strategy set to DELETE.
Records deleted this way are no longer available to any MDM hub processes and are not visible in MDA, but they still exist in the repository. To remove a deleted record from the repository, either run the ProcessPurge service immediately or remove it by using the housekeeping Purge API.
The dedicated load operation can be a FULL or a DELTA load, depending on how generic it should be and what input records you can provide. For example, if you want to delete whole entities, you can use a full load operation and send an empty file as the input. Don't forget to list the affected entities in the processed entities definition of that load.
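Creating the empty input file for such a full load can be scripted. This is a minimal sketch under assumptions: the file name, the CSV format, and the column names are hypothetical, and whether the input needs a header line depends on how the load operation's reader is configured, so the header is optional here.

```python
import csv
from pathlib import Path
from typing import Optional, Sequence


def write_empty_input(path: Path, header: Optional[Sequence[str]] = None) -> Path:
    """Create an input file that contains no data rows.

    If a header is given, it is written as the only line of the file.
    """
    with path.open("w", newline="", encoding="utf-8") as f:
        if header:
            csv.writer(f).writerow(header)
    return path


if __name__ == "__main__":
    # Hypothetical file and column names for a "party" entity.
    created = write_empty_input(Path("party_full_load.csv"), header=["source_id", "name"])
    print("Created", created)
```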