Well, this highly depends on the proportion of the data that changes between executions. If it is low, I suggest the first approach; if it is something like 10% or higher, then definitely the second. If it varies a lot, then again the second approach will give you much more stable execution times.
Another option I might suggest is the Traversing Plan Publisher, which can fetch related records in an optimized way.
@AKislyakov
Thank you for the recommendation. I want to follow the first approach, but what can I do to optimize the initial execution? It will extract the full table, which could take days, and I still want to keep the delta watermark for subsequent executions.
Regards,
Alejandro
Hi @alexAguilarMx,
This highly depends on the data volumes. If you have a low to moderate number of records in the full_master entity (let's say below 10M, though the number is arbitrary), then waiting for the plan to finish might be the better option, because the time it takes to develop and test a workaround might be longer than just waiting for the non-optimized plan to finish.
Otherwise, you can develop two export operations: one using the join step approach for the initial execution, and a second using the MDM Read approach for ongoing delta operations.
Then you can use the first one for the initial load and supply the referenceTransactionId parameter to the second one to skip the changes already covered by the initial export. If you go with this approach, I strongly suggest disabling all incoming integrations (batch loads, streaming, online services) for the whole period of the initial export and the first run of the delta one.
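To illustrate the idea (not Ataccama's actual API — the `Record`, `export_full`, and `export_delta` names below are hypothetical), the full-load-then-delta pattern boils down to remembering the highest transaction id seen so far and only exporting records whose transaction id is greater than that watermark:

```python
# Generic sketch of the full-load / delta-watermark pattern.
# All names here are illustrative; the real system tracks the
# watermark via the referenceTransactionId parameter.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Record:
    id: int
    transaction_id: int  # monotonically increasing change marker
    payload: str

def export_full(table: List[Record]) -> Tuple[List[Record], int]:
    """Initial load: export everything, remember the highest transaction id."""
    watermark = max((r.transaction_id for r in table), default=0)
    return list(table), watermark

def export_delta(table: List[Record], reference_transaction_id: int) -> Tuple[List[Record], int]:
    """Ongoing load: export only records changed after the watermark."""
    changed = [r for r in table if r.transaction_id > reference_transaction_id]
    new_watermark = max((r.transaction_id for r in changed),
                        default=reference_transaction_id)
    return changed, new_watermark

table = [Record(1, 100, "a"), Record(2, 101, "b")]
full, wm = export_full(table)          # exports both records, watermark = 101
table.append(Record(3, 102, "c"))      # a change arrives after the initial load
delta, wm = export_delta(table, wm)    # exports only record 3, watermark = 102
```

This also shows why disabling incoming integrations during the initial export matters: if records keep changing while the full load runs, the watermark captured at the end may not cleanly separate "already exported" from "changed since".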
Hi @AKislyakov
That really helps. One last question about it: how can I obtain the referenceTransactionId value for the second operation?
Regards,
Alejandro
Hey, @AKislyakov
I found the referenceTransactionId in the __export_reg table.
Regards,
Alejandro