Avoid unnecessary distributions.

Thu Jan 03, 2019 12:06 pm

If you know that a logical file you are working with is already distributed, it’s annoying to have to DISTRIBUTE it unconditionally every time a workunit runs, just to ensure the distribution aligns with the size of the cluster the workunit is running on. Obviously, a workunit running on a cluster of a given size has to redistribute any file that was built on a differently sized cluster, or that is not distributed at all.
The trick is to detect when a re-distribution is NOT necessary.

I have found the following test works for our business:

If <Number of Nodes in the cluster the Workunit is running on> = <Number of Nodes file was built on>
    Use file as is (i.e. no need for re-distribution)
Else
    Re-distribute file

Pseudo ECL:
IMPORT STD;

Somefilename := '~folder::file';
d := DATASET(Somefilename,layout,THOR);
n_nodes := STD.System.Thorlib.Nodes();

// Use the file as is when its part count matches the cluster size; otherwise re-distribute.
distributed_d := IF((INTEGER)NOTHOR(STD.File.GetLogicalFileAttribute(Somefilename,'numparts')) = n_nodes
                    ,d
                    ,DISTRIBUTE(d,HASH32(id)));

Note that if the filename is known at compile time, the expression is evaluated at compile time. Check the workunit's Graph for confirmation.

One can also compare the ‘Queue’ name of the running workunit with the ‘Queue’ the file was built on, though this is more work because STD.File.GetLogicalFileAttribute has no option to return the ‘Queue’.
It’s still possible, because GetLogicalFileAttribute does have an option to return the WUid of the workunit that built the file. This WUid can be passed to the ESP service WsWorkunits/WUInfo, which returns the ‘Queue’ in the (confusingly named) <Workunit><Cluster> tag. That value can then be compared, in the same manner as above, with the return from STD.System.Job.Target().
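
A rough sketch of that queue comparison might look like the following. It is untested and rests on several assumptions: the ESP address in espUrl is a placeholder, the record layouts are cut down to the single field needed on each side, and the XPATH values are inferred from the <Workunit><Cluster> tag mentioned above, so check them against the WUInfo response on your own system.

```ecl
IMPORT STD;

Somefilename := '~folder::file';

// Placeholder ESP address (hypothetical); replace with your own ESP host and port.
espUrl := 'http://127.0.0.1:8010/WsWorkunits';

// WUid of the workunit that built the file.
builtByWuid := NOTHOR(STD.File.GetLogicalFileAttribute(Somefilename, 'workunit'));

// Minimal request layout: only the WUid is sent (field name is an assumption).
WUInfoRequest := RECORD
    STRING Wuid {XPATH('Wuid')} := builtByWuid;
END;

// Minimal response layout: only the <Cluster> tag is read.
WUInfoResponse := RECORD
    STRING Cluster {XPATH('Cluster')};
END;

// Assumed response path: <WUInfoResponse><Workunit><Cluster>.
wuInfo := SOAPCALL(espUrl,
                   'WUInfo',
                   WUInfoRequest,
                   DATASET(WUInfoResponse),
                   XPATH('WUInfoResponse/Workunit'));

builtOnQueue := wuInfo[1].Cluster;

// TRUE when the file was built on the same 'Queue' this workunit is running on.
sameQueue := builtOnQueue = STD.System.Job.Target();

OUTPUT(sameQueue, NAMED('SameQueue'));
```

sameQueue could then take the place of the numparts test in the IF above. If your ESP requires authentication, the SOAPCALL will need credentials adding as well.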
