Solved

Multi-load operation :: Why is data acquisition slow?


alexAguilarMx
Data Pioneer

Hi everyone,

Why could data acquisition in a multi-load operation be so slow?

Checking the database, the query that retrieves the data takes less than 10 minutes. Do you have any guidance?

Best answer by AKislyakov

Hi @alexAguilarMx,

The Data Acquisition step (Load Plan Execution) not only executes the query on the source system side but also fetches all the data to the MDM Server. Therefore, if you have a large dataset, transferring all the data can be quite time-consuming.

It's hard to suggest anything without knowledge of the solution, but here are some basic checks you can perform to narrow down the problem. As I understand, in the load plan, you fetch data from a database, probably using the JDBC Reader step.

  1. Check if you have any heavy plan logic in the Load Plan. You can check this with a test plan: keep just the JDBC Reader part, connect it to a Trash step, and launch it using a workflow. This way you'll get pure execute-and-fetch time.
  2. Check how long it takes to fetch all the data from the database. Usually, when you execute a query in a database development tool, it fetches only the first one or two hundred rows, and fetching all rows can take much longer. You can check this in multiple ways:
    1. Create a temporary table and store the result there. This will give you pure query execution time on the database side, without data transfer overhead.
    2. Export the result to a text file. This will force the database tool to fetch and transfer all rows.
  3. Especially for PaaS deployments, you can run the Load Plan on a local One Desktop to check if there is a significant difference in the execution time. If there is, it might indicate some networking problem between the PaaS environment and the source system.
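The execute-vs-fetch distinction behind point 2 can be shown with a small sketch. The original plan uses a JDBC Reader, but as an illustration the same effect appears with Python's built-in sqlite3 driver (the `party` table and row count here are made up): the `execute` call can return almost immediately, while materializing every row is where the time goes.

```python
import sqlite3
import time

# In-memory database standing in for the real source system.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE party (id INTEGER, name TEXT)")
cur.executemany(
    "INSERT INTO party VALUES (?, ?)",
    ((i, f"name_{i}") for i in range(100_000)),
)
conn.commit()

t0 = time.perf_counter()
cur.execute("SELECT * FROM party")   # query execution only, rows not yet transferred
t_execute = time.perf_counter() - t0

t0 = time.perf_counter()
rows = cur.fetchall()                # transferring every row to the client
t_fetch = time.perf_counter() - t0

print(f"execute: {t_execute:.4f}s, fetch of {len(rows)} rows: {t_fetch:.4f}s")
```

This is why a query that looks fast in a development tool (which shows only the first page of results) can still make the Load Plan slow: the plan has to pull the full result set across the network.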

2 replies

AKislyakov
  • Ataccamer
  • 146 replies
  • Answer
  • August 21, 2023


alexAguilarMx
  • Author
  • Data Pioneer
  • 9 replies
  • August 21, 2023

Hey!

You are right! I found another query fetching more data than expected.

Thank you!

