DISTRIBUTION

DISTRIBUTION(recordset [, fields ] [, NAMED( name ) ] [, UNORDERED | ORDERED( bool ) ] [, STABLE | UNSTABLE ] [, PARALLEL [ ( numthreads ) ] ] [, ALGORITHM( name ) ] )

recordset    The set of records on which to run statistics.
fields       Optional. A comma-delimited list of fields in the recordset to which to limit the action. If omitted, all fields are included.
NAMED        Optional. Specifies the result name that appears in the workunit.
name         A string constant containing the result label. This must be a valid label (See Definition Name Rules)
UNORDERED    Optional. Specifies the output record order is not significant.
ORDERED      Specifies the significance of the output record order.
bool         When False, specifies the output record order is not significant. When True, specifies the default output record order.
STABLE       Optional. Specifies the input record order is significant.
UNSTABLE     Optional. Specifies the input record order is not significant.
PARALLEL     Optional. Try to evaluate this activity in parallel.
numthreads   Optional. Try to evaluate this activity using numthreads threads.
ALGORITHM    Optional. Override the algorithm used for this activity.
name         The algorithm to use for this activity. Must be from the list of supported algorithms for the SORT function's STABLE and UNSTABLE options.

The DISTRIBUTION action produces a crosstab report in XML format that lists, for each field in the recordset, the number of distinct values and the number of records containing each value.

When there is an excessively large number of distinct values, it returns an estimate in this form:

<XML>
  <Field name="seqnum" estimate="4000000"/>
</XML>

The DECIMAL data type is not supported by this action. You can use a REAL data type instead.
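
The REAL workaround can be sketched as follows. This is a minimal illustration, not a prescribed method: the record layout, the field names (item, price), and the sample values are hypothetical, and the DECIMAL field is recast with TABLE before the DISTRIBUTION action runs. Note that REAL is a floating-point type, so values that differ only beyond its precision may be grouped together.

```ecl
// Hypothetical input whose price field is a DECIMAL,
// which DISTRIBUTION does not support.
priceRec := RECORD
  STRING10   item;
  DECIMAL8_2 price;
END;

prices := DATASET([
  {'apple', 1.25},
  {'pear',  1.25},
  {'plum',  0.80}], priceRec);

// Recast the DECIMAL field as REAL8 in a new layout
// (assumed workaround, following the note above).
realRec := RECORD
  STRING10 item  := prices.item;
  REAL8    price := (REAL8)prices.price;
END;

pricesAsReal := TABLE(prices, realRec);

// Run the statistics on the recast field only.
DISTRIBUTION(pricesAsReal, price);
```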

Example:

SomeFile := DATASET([{'C','G'},{'C','C'},{'A','X'},{'B','G'}],
     {STRING1 Value1,STRING1 Value2});
DISTRIBUTION(SomeFile);
/* The result comes back looking like this:
<XML>
<Field name="Value1" distinct="3">
 <Value count="1">A</Value>
 <Value count="1">B</Value>
 <Value count="2">C</Value>
</Field>
<Field name="Value2" distinct="3">
 <Value count="1">C</Value>
 <Value count="2">G</Value>
 <Value count="1">X</Value>
</Field>
</XML>
*/

//******************************************
namesRecord := RECORD
  STRING20 surname;
  STRING10 forename;
  INTEGER2 age;
END;

namesTable := DATASET([
  {'Halligan','Kevin',31},
  {'Halligan','Liz',30},
  {'Salter','Abi',10},
  {'X','Z',5}], namesRecord);

DISTRIBUTION(namesTable, surname, forename, NAMED('Stats'));
/* The result comes back looking like this:
<XML>
<Field name="surname" distinct="3">
 <Value count="2">Halligan</Value>
 <Value count="1">X</Value>
 <Value count="1">Salter</Value>
</Field>
<Field name="forename" distinct="4">
 <Value count="1">Abi</Value>
 <Value count="1">Kevin</Value>
 <Value count="1">Liz</Value>
 <Value count="1">Z</Value>
</Field>
</XML>
*/

//Post-processing the result with PARSE:
x := DATASET(ROW(TRANSFORM({STRING line},
       SELF.line := WORKUNIT('Stats', STRING))));
res := RECORD
  STRING Fieldname := XMLTEXT('@name');
  STRING Cnt := XMLTEXT('@distinct');
END;

out := PARSE(x, line, res, XML('XML/Field'));
out;