
Hi,

 

I need to extract metadata using One Desktop and store it in Parquet file format in ADLS. Are there any recommendations on how this can be achieved in One Desktop?

Hi Sumisha,

I’m aware of some issues that can appear when the Parquet Writer step writes directly to an ADLS container. However, it would help if you could clarify which error messages you see and which version of the product you’re using. Depending on the specific problem, there might be a fix available.

In the past we had to resolve a similar issue, and the working approach was to write the file to a local directory and then use a workflow with Operate On File to copy or move the file from the local directory to ADLS. The workflow step configuration might look like this. You’ll need to have the ADLS storage added to your runtime config so you can then reference it using the “resource://$resourcename” notation, as in the example below.
 

I hope this helps.
Ivan


Thanks Ivan, that was really helpful. To start with, I am trying to extract metadata from the Catalog and save it in Parquet format as a plan’s output. But it gives me this error: Internal error occurred during run of the plan: Illegal char <:> at index 4: file:/c:…..

Do you know whether this is the right way to save the result in Parquet format to a local folder?


Could you please post a screenshot of your Parquet Writer step?
Based on the error message, it looks like there might be an error in the output file path.

Here’s an example of a Parquet Writer configuration with a pathvar:
 

And here’s another with a relative file path:
 

 


This is my output step


Also, can you share what details are required to add ADLS storage to the runtime config?


Hi @sumisha, unfortunately the error doesn't look familiar, and the writer step configuration looks correct. I’d suggest creating a support request with all the details so our team can investigate.

As for the ADLS2 connection, you can find a connection example below:
    <config class="com.ataccama.dqc.azure.config.AzureGen2Contributor">
        <azureGen2Connections>
            <azureGen2Connection clientId="$azure_client_id" authenticateUser="false" clientKey="$encrypted_client_secret" containerName="$container_name" name="$connection_name" storageAccount="$storage_account" authTokenEndpoint="$auth_endpoint_link"/>
        </azureGen2Connections>
    </config>

$azure_client_id, $encrypted_client_secret, $container_name, $connection_name, $storage_account and $auth_endpoint_link will have to be replaced with the values relevant to your target ADLS2 container.

$connection_name is what you will use in your file path using resource://$connection_name/… notation.
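
For example, here is a sketch of the same entry with a hypothetical connection name, adls_export, filled in. The name is made up for illustration, and the remaining $ placeholders still need your own values.

```xml
<config class="com.ataccama.dqc.azure.config.AzureGen2Contributor">
    <azureGen2Connections>
        <!-- name="adls_export" is a hypothetical connection name; replace the $ placeholders with your own values -->
        <azureGen2Connection name="adls_export" clientId="$azure_client_id" authenticateUser="false" clientKey="$encrypted_client_secret" containerName="$container_name" storageAccount="$storage_account" authTokenEndpoint="$auth_endpoint_link"/>
    </azureGen2Connections>
</config>
```

With that name, a file in the container would be referenced as resource://adls_export/metadata/catalog.parquet (the folder and file name here are hypothetical too).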


Thanks Ivan. Appreciate your support here.


Hello, I am trying to connect to ADLS in an Azure subscription from One Desktop. What would be the authentication token endpoint?

 

