Avoid unnecessary distributions.

Thu Jan 03, 2019 12:06 pm

If you know that a logical file you are working with is already distributed, it’s annoying to have to DISTRIBUTE it unconditionally every time a workunit runs, just to ensure the distribution matches the size of the cluster the workunit is running on. Obviously, a workunit running on a cluster of a given size has to redistribute any file that was built on a differently sized cluster, or that is not distributed at all.
The trick is to detect when a redistribution is NOT necessary.

I have found the following test works for our business:

If <Number of Nodes in the cluster the Workunit is running on> = <Number of Nodes file was built on>
THEN
Use file as is (i.e. no need for re-distribution)
ELSE
Re-distribute file
END

Pseudo ECL:
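IMPORT STD;
// Placeholder layout: substitute the real record structure of the file.
layout := RECORD
    UNSIGNED8 id;
END;
Somefilename := '~folder::file';
d := DATASET(Somefilename,layout,THOR);
n_nodes := STD.System.Thorlib.Nodes();   // nodes in the cluster running this workunit

// Same part count as nodes: assert the existing distribution; otherwise redistribute.
distributed_d := IF((INTEGER)NOTHOR(STD.File.GetLogicalFileAttribute(Somefilename,'numparts')) = n_nodes
                    ,DISTRIBUTED(d,HASH32(id))
                    ,DISTRIBUTE (d,HASH32(id)));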

Note that if the filename is known at compile time, the expression is evaluated at compile time. See the workunit Graph for confirmation.
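
To see what the test actually compares on your own system, a minimal sketch like the following outputs both values side by side. The file name is the same placeholder used above, and the NAMED result labels are made up for illustration.

```ecl
IMPORT STD;

fname := '~folder::file';   // placeholder logical file name, as in the post

// Node count of the cluster this workunit is running on.
cluster_nodes := STD.System.Thorlib.Nodes();

// Part count stored in the file's metadata, i.e. the node count it was written with.
file_parts := (INTEGER)NOTHOR(STD.File.GetLogicalFileAttribute(fname, 'numparts'));

OUTPUT(cluster_nodes, NAMED('cluster_nodes'));
OUTPUT(file_parts,    NAMED('file_parts'));
OUTPUT(file_parts = cluster_nodes, NAMED('can_skip_distribute'));
```

Equal counts only show that the file was written on a cluster of the same size. DISTRIBUTED(d,HASH32(id)) still relies on the starting premise that you know the file was written distributed by that same expression.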

One can also compare the ‘Queue’ of the running workunit with the ‘Queue’ the file was built on, though this is more work, as STD.File.GetLogicalFileAttribute has no option to return the ‘Queue’.
It is still possible, because GetLogicalFileAttribute can return the WUid of the workunit that built the file. This WUid can be passed to the ESP service WsWorkunits/WUInfo, which returns the ‘Queue’ in the (confusingly named) <Workunit><Cluster> tag. That value can then be compared, in the same manner as above, with the return from STD.System.Job.Target().
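
A rough, untested sketch of that queue comparison is shown below. It rests on several assumptions: that the WUid is exposed through the 'workunit' attribute name, that the ESP address (a placeholder here) is reachable without credentials, and that the request and response record layouts, which are invented for this example, match what WUInfo expects and returns on your version.

```ecl
IMPORT STD;

fname := '~folder::file';   // placeholder logical file name, as in the post

// WUid of the workunit that built the file (attribute name 'workunit' is an assumption).
built_by := NOTHOR(STD.File.GetLogicalFileAttribute(fname, 'workunit'));

// Hypothetical ESP address: replace with your own ESP host and port.
esp_url := 'http://127.0.0.1:8010/WsWorkunits';

// Illustrative request layout: only the WUid is sent.
WUInfoRequest := RECORD
    STRING Wuid {XPATH('Wuid')} := built_by;
END;

// Illustrative response layout: only the <Workunit><Cluster> tag is read.
WUInfoResponse := RECORD
    STRING Cluster {XPATH('Workunit/Cluster')};
END;

wu_info := SOAPCALL(esp_url, 'WUInfo', WUInfoRequest, WUInfoResponse,
                    XPATH('WUInfoResponse'));

// TRUE when the file was built on the queue this workunit is running on.
same_queue := wu_info.Cluster = STD.System.Job.Target();
OUTPUT(same_queue, NAMED('same_queue'));
```

The resulting same_queue value could then stand in for the node-count comparison in the IF above. A secured ESP would also need credentials supplied to SOAPCALL, which this sketch leaves out.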

Yours

Allan