Happy Monday, community!
Today we are going to talk about Remote Plans. Let’s start with the basics: what are remote plan executions?
Remote plan execution lets you run any ONE plan on the Ataccama ONE Platform. Plans are typically developed and tested on small data sets or in dedicated environments. This helps you validate the plans and assess how well they perform so that you can make the necessary adjustments before applying them to your actual data. Once the plan works as expected, you can run it remotely on production data in ONE.
Jobs are first sent to the Data Processing Manager (DPM) module, which forwards the job information to a suitable Data Processing Engine (DPE). Actual data processing takes place only in DPE instances.
How to configure Remote Plan Execution
Remote plan execution leverages existing controls and concepts of ONE Desktop. Once you connect to the Ataccama ONE Platform, you can choose between the following launch options:
- Local: A suitable DPE is selected automatically depending on the plan. The plan is first preprocessed in order to detect any steps that require using ONE Spark DPE.
- Spark: All jobs are sent only to ONE Spark DPE.
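The difference between the two launch options can be sketched in a few lines of Python. This is only an illustration of the routing rule described above, not Ataccama code: the `Plan` class, the `pick_engine` function, and the step names are all hypothetical, and the real preprocessing done by the platform is not documented here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class LaunchOption(Enum):
    LOCAL = "Local"
    SPARK = "Spark"


@dataclass
class Plan:
    name: str
    steps: List[str]


# Hypothetical step names that would need Spark; not a real list of ONE steps.
STEPS_REQUIRING_SPARK = {"example_spark_step"}


def pick_engine(plan: Plan, option: LaunchOption) -> str:
    """Return which kind of DPE would receive the job under the given option."""
    if option is LaunchOption.SPARK:
        # Spark: every job goes to ONE Spark DPE, whatever the plan contains.
        return "ONE Spark DPE"
    # Local: inspect the plan first and look for steps that require Spark.
    needs_spark = any(step in STEPS_REQUIRING_SPARK for step in plan.steps)
    return "ONE Spark DPE" if needs_spark else "DPE"


if __name__ == "__main__":
    plan = Plan("example_plan", ["example_reader", "example_writer"])
    print(pick_engine(plan, LaunchOption.LOCAL))  # prints "DPE"
```

With Local, the same plan would be routed to ONE Spark DPE as soon as one of its steps appears in the Spark-only set; with Spark, the plan content does not matter.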
To create a new environment for the Ataccama ONE Platform, follow the instructions described in Environments, section Adding a New Environment. In this case, the new environment must be configured to use your Ataccama ONE Platform server, with the Launch type set to ONE Platform launch.
Drivers used for remote execution
For security reasons, database drivers are not transferred to DPE when remotely executing plans.
Make sure the file name of the database driver is the same in both your DPE instance and ONE Desktop.
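If you have access to both driver folders (or to copies of them), a small script can spot file name mismatches before you run a plan. The following is a minimal sketch, assuming that drivers are stored as `.jar` files and that both folders are readable from one machine; the function names and folder paths are placeholders, not locations defined by Ataccama.

```python
from pathlib import Path
from typing import List, Set


def driver_names(directory: str) -> Set[str]:
    """Collect the file names of all .jar files directly inside a folder."""
    return {p.name for p in Path(directory).glob("*.jar")}


def missing_on_dpe(desktop_dir: str, dpe_dir: str) -> List[str]:
    """List driver file names present in ONE Desktop but absent on the DPE side."""
    return sorted(driver_names(desktop_dir) - driver_names(dpe_dir))


if __name__ == "__main__":
    # Placeholder paths: replace with your own driver folders.
    for name in missing_on_dpe("/path/to/desktop/drivers", "/path/to/dpe/drivers"):
        print(f"Not found on DPE: {name}")
```

The comparison is by exact file name, so a driver that exists on both sides under different version suffixes is reported as missing, which matches the requirement above.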
Remote Execution on ONE Spark DPE
If you want all plans to be executed on the Spark engine, set the run configuration as follows.
- In ONE Desktop, select the environment that you want to use from the toolbar. The environment must be configured for Ataccama ONE Platform.
- Open your plan and in the toolbar, select Run configuration.
- In Data Processing Parameters, select SPARK and provide the name of your cluster.
- Select Run to execute your plan on ONE Spark DPE. You can track the run’s progress and outcome on the Run Results tab.
Remote Execution Direct Communication Configuration
When you run remote execution jobs on a secured cloud server, you might experience performance issues. This section describes a workaround to avoid them.
Issues without Workaround
Without the workaround, you might experience:
- A long delay between the job finishing successfully and the execution being reported as SUCCESS.
- A warning in the console: [WARN] Status tracking issue.
- You can open these warnings in Run Results and see the exception or error in the details.
Workaround
The workaround assumes that you already have a ONE Platform launch environment (env01) whose Ataccama ONE Platform server has a public address.
Steps
- Navigate to File Explorer.
- Click Environments.
- Select New Environment.
- Create the new ONE Platform launch environment (env02).
- Navigate to the new environment (env02).
- Right-click Servers.
- Select New Server.
- Select Ataccama ONE Platform as Implementation.
- Use the following entries:
  - Web URL: Same as the public server used in env01. (Optional)
  - User: Same as the user for the public server used in env01.
  - Password: Same as the password for the public server used in env01.
  - Authentication: Basic.
  - GraphQL URL: http://mmm-be-svc:8021/graphql.
  - ONE Metadata Server Port: 1. (Optional)
  - ONE Data Processing Port: 1. (Optional)
- Click Finish.
- Always use this new environment (env02) for remote execution.
Using the Environment Set
The original environment (env01) should still be selected when doing anything but remote executions.
The new environment (env02) should be used only for remote executions.
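The server entries for env02 and the rule about when to use each environment can be summarized in a short Python sketch. This is not an Ataccama configuration format: the `ServerEntry` class, both functions, and the task label are made up for illustration, and the env01 values passed in are placeholders. Only the field values taken from the entries above (Basic authentication, the GraphQL URL, and the two ports set to 1) come from the instructions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerEntry:
    web_url: str
    user: str
    password: str
    authentication: str
    graphql_url: str
    metadata_server_port: int
    data_processing_port: int


def make_env02_server(env01_web_url: str, env01_user: str, env01_password: str) -> ServerEntry:
    """Build the env02 server entry by reusing the env01 public server credentials."""
    return ServerEntry(
        web_url=env01_web_url,            # same as the public server in env01
        user=env01_user,                  # same user as in env01
        password=env01_password,          # same password as in env01
        authentication="Basic",
        graphql_url="http://mmm-be-svc:8021/graphql",
        metadata_server_port=1,
        data_processing_port=1,
    )


def environment_for(task: str) -> str:
    """Pick env02 only for remote executions; env01 for everything else."""
    # "remote_execution" is a hypothetical label used only in this sketch.
    return "env02" if task == "remote_execution" else "env01"


if __name__ == "__main__":
    # Placeholder values standing in for your real env01 settings.
    server = make_env02_server("https://one.example.com", "my_user", "my_password")
    print(server.graphql_url)
    print(environment_for("remote_execution"))
```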
Uncheck Validate Plans
Always uncheck the Validate plans before plans execution option in Run Configurations. You need to uncheck this checkbox separately for each plan.