Tutorial

How to count the number of source systems per master id?

  • 12 January 2022
  • 0 replies
  • 97 views

Userlevel 5
Badge +9

If we had the following Instance Data containing the source system and master id groups for each of the instance records and we would like to calculate the count of source per master id.

 

Step by step guide.

  1. Create our input data as a text file.

  1. Create a new plan and add the following steps to the plan;

    • 1x Text File Reader

    • 1x Representative Creator

    • 2x Text File Writer

  2. Connect the steps as follows

  1. Configure the Text File Reader to read from the input data we created in step 1. and add 2 shadow columns. These shadow columns could alternatively be added via an Alter Format step. The columns will hold our count per source system (A | B)

 

  1. Double click on the Representative Creator step and click on the Group By tab.
    select the master_id as the column to group by.
     

     

  2. Add a Rule to the Representative Creator step by clicking on the Add button or double clicking on the asterisk.

 

  1. Double click on the newly created Attribute rule and then on the Representative Attributes tab. Representative Attributes are the columns you want to output in your grouped by master id summary.
    We‘ll add the master_id and two new columns (that we previously created in step 4). We’ll use a countif expression to count the number of record containing the particular source system.
     
  2. Click on the Instance Attributes tab and we’ll repeat the count_A and count_B columns and expressions here too. This will add these two columns to the instance records (our input data set).
  3. Save and run the plan, once the plan has finished running open both the output text files.

    In the out text file, we observe the input instance records with the two new columns. The values in the columns provide a summary of the count per system on each record.

 

In the out_merge text file, we observe a summary grouped by the master id. here we can see that master_id 2, has 2 records from source system A and 1 record from source system B.

 


0 replies

Be the first to reply!

Reply