

JOIN Options (PARTITION LEFT and PARTITION RIGHT)

Comments and questions related to the Enterprise Control Language

Mon Mar 04, 2019 8:06 am

Hi Team,
Can anyone explain the functionality of PARTITION LEFT and PARTITION RIGHT with an example?


Thanks & Regards,

Manikandan N
Daniel_mani
 
Posts: 11
Joined: Wed Jan 23, 2019 2:15 pm

Mon Mar 04, 2019 2:42 pm

Manikandan N,

Here's the example: If you're joining by "lastname, firstname" then in order for the JOIN to work, all the "Tom Jones" records from both datasets have to be on the same node together. That means JOIN moves data around the nodes as it needs to in order to accomplish the task.

PARTITION LEFT (the default behavior) says that the distribution of the data from both datasets is determined by the LEFT dataset, while PARTITION RIGHT says that the distribution of the data from both datasets is determined by the RIGHT dataset.

So, if you're JOINing a 10 billion record dataset to a 20 million record dataset, then the most even distribution of all data from both datasets would be determined by the larger dataset. Generally, you would make that one the LEFT dataset and go with the default partitioning, but if you have some particular need for that larger file to be the RIGHT dataset, then you should specify PARTITION RIGHT on the JOIN.
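
As a rough illustration of the case where the larger file has to be the RIGHT dataset, here is a minimal sketch. The record layouts, field names, and inline data are all hypothetical stand-ins (real datasets of the sizes discussed above would come from logical files); only the JOIN call and its PARTITION RIGHT option are the point.

```ecl
// Hypothetical layouts and inline data, for illustration only.
PersonRec := RECORD
  STRING20  lastname;
  STRING20  firstname;
  UNSIGNED4 personid;
END;

OrderRec := RECORD
  STRING20  lastname;
  STRING20  firstname;
  UNSIGNED4 orderid;
END;

// Stand-in for the smaller dataset.
People := DATASET([{'Jones', 'Tom', 1},
                   {'Smith', 'Ann', 2}], PersonRec);

// Stand-in for the much larger dataset.
Orders := DATASET([{'Jones', 'Tom', 1001},
                   {'Jones', 'Tom', 1002},
                   {'Smith', 'Ann', 1003}], OrderRec);

MatchRec := RECORD
  STRING20  lastname;
  STRING20  firstname;
  UNSIGNED4 personid;
  UNSIGNED4 orderid;
END;

// Combine one matching pair of records into a result record.
MatchRec MakeMatch(PersonRec L, OrderRec R) := TRANSFORM
  SELF.orderid := R.orderid;
  SELF := L;  // lastname, firstname, personid come from the LEFT record
END;

// The larger dataset is on the RIGHT here, so PARTITION RIGHT asks JOIN
// to base the distribution of both datasets on Orders instead of People.
Matched := JOIN(People, Orders,
                LEFT.lastname = RIGHT.lastname AND
                LEFT.firstname = RIGHT.firstname,
                MakeMatch(LEFT, RIGHT),
                PARTITION RIGHT);

OUTPUT(Matched);
```

If nothing forces that argument order, swapping the datasets so that Orders is LEFT (and swapping the TRANSFORM parameter types to match) lets you drop the option and rely on the default PARTITION LEFT.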

HTH,

Richard
rtaylor
Community Advisory Board Member
 
Posts: 1593
Joined: Wed Oct 26, 2011 7:40 pm

Tue Mar 05, 2019 12:54 pm

Thanks for the clarification, Richard.

Regards,
Manikandan N
Daniel_mani
 
Posts: 11
Joined: Wed Jan 23, 2019 2:15 pm

