Hi,I am wondering to know if is good idea:use MDC Reader step for delta export operations where we need to complement data from entities of other layer oradd another Data Sources with full mode of this other entity, then use a Join step to get the information required. Both are feasible, not sure what is better talking about batch processing, performance, etc.

Does it good idea use MDC read inside export operations?

Hi,

I am wondering to know if is good idea:

use MDC Reader step for delta export operations where we need to complement data from entities of other layer

or

add another Data Sources with full mode of this other entity, then use a Join step to get the information required.

Both are feasible, not sure what is better talking about batch processing, performance, etc.

Page 1 / 1

Well this highly depends on the proportion of the data that is changed between executions. If it is low, I suggest the first approach, if it is something like 10% or higher, then definitely the second. If it varies a lot, then again the second approach will give you much more stable execution time.

Another approach I might suggest is Traversing Plan Publisher which is able to fetch related records in an optimized way.

@AKislyakov

Thank you for the recommendation, I want to follow the first approach but, what can I do to optimize the initial execution, it will extract the information of the full table, it could take days, but I want to keep the delta watermark for next executions.

Regards,

Alejandro

Hi @alexAguilarMx,

This highly depends on the data volumes. If you have a low to moderate amount of records in the full_master entity (let’s say below 10M, but again the number is arbitrary), then waiting for the plan to finish might be a better option. Because the time it takes to develop and test a workaround might be longer then just waiting for non-optimized plan to finish.

Otherwise you can develop two export operations. One using join step approach for the initial execution and the second using MDM Read approach for ongoing delta operations.
Then you can use first one for the initial load and supply referenceTransactionId parameter to the second one to skip the changes. If you are going with this approach I strictly suggest to disable all incoming integration (batch loads, streaming, online services) for the whole period of the initial export and the first run of a delta one.

Hi @AKislyakov

It really helps, a last question about it, how can I obtain the referenceTransactionId value for the second one?

Regards,

Alejandro

Hey, @AKislyakov

I found the referenceTransactionId in __export_reg table.

Regards,

Alejandro

Sign up

Login to the Ataccama Community

Scanning file for viruses.

This file cannot be downloaded