Skip to main content

Hi,

Have gotten some good answers here. So here is a case. It is a requirements to merge data.

This involves steps:

  • Compare to sets of data - say Target and Source.
  • Any record not in Target but in Source will be added (insert).
  • Any record in Target but not in Source will be flagged ‘deleted’ or deleted (delete)
  • Any record in both sets will be updated - say a timestamp will be updated (update)

In SQL it is known as MERGE.

One way would be an full outer join and then use filters, with three branches in the plan, but I hope I miss an easier way ;)

 

Hi Marnix,

If the keys/values match exactly, depending on use case, you can use an outer join in a Join step over your source and target tables, and set a column to flag rows with the required INSERT, UPDATE, DELETE operation, based on whether there are matching values in in_left.key and in_right.key or just one or the other.  You could then use Extract Filter and/or Filter steps to get lists of records for each operation.


Note that if you are referring to doing this on a database, there is also a SQL Execute step which you could use in a plan to manage these CRUD operations.

 

If the values do not match exactly, there are the fuzzy matching functions of the ONE Desktop available, such as distance, and levenshtein to help find matches in joined data.

 

I hope this helps!


Reply


ataccama
arrows
Lead your team  forward  OCT 24 / 9AM ET
×