Join Logic

The JOIN operation follows this logic:

1. Record distribution/sorting to get match candidates on the same nodes.

The PARTITION LEFT, PARTITION RIGHT, LOOKUP, ALL, NOSORT, KEYED, HASH, and LOCAL options indicate how this happens. These options are mutually exclusive; only one may be specified, and PARTITION LEFT is the default. SKEW and THRESHOLD may modify the requested behaviour. LOOKUP also has the additional effect of deduping the rightrecset by the joincondition.

2. Record matching.

The joincondition, LIMIT, and ATMOST determine how this is done.

An implicit limit of 10000 is added when there is no LIMIT specified AND the following is true:

There is no ATMOST limit specified AND it is not a LEFT ONLY JOIN AND (there is either no KEEP limit specified OR the JOIN has a postfilter).

3. Determine what matches to pass to transform.

The jointype determines this.

4. Generate output records through the TRANSFORM function.

The implicit or explicit transform parameter determines this.

5. Filter output records with SKIP.

If the transform for a record pair results in a SKIP, then the output record is not counted towards any KEEP option totals.

6. Limit output records with KEEP.

Any output records for a given leftrecset record over and above the permitted KEEP value are discarded. In a FULL OUTER join, rightrecset records that match no record are treated as if they all matched different default leftrecset records (that is, the KEEP counter is reset for each one).