CORRELATION


CORRELATION ( recset, valuex, valuey [ , expression ] [, KEYED ] [, UNORDERED | ORDERED( bool ) ] [, STABLE | UNSTABLE ] [, PARALLEL [ ( numthreads ) ] ] [, ALGORITHM( name ) ] )

recset	The set of records to process. This may be the name of a dataset or a record set derived from some filter condition, or any expression that results in a derived record set. This also may be the GROUP keyword to indicate operating on the elements in each group, when used in a RECORD structure to generate crosstab statistics.
valuex	A numeric field or expression.
valuey	A numeric field or expression.
expression	Optional. A logical expression indicating which records to include in the calculation. Valid only when the recset parameter is the keyword GROUP.
KEYED	Optional. Specifies the activity is part of an index read operation, which allows the optimizer to generate optimal code for the operation.
UNORDERED	Optional. Specifies the output record order is not significant.
ORDERED	Specifies the significance of the output record order.
bool	When False, specifies the output record order is not significant. When True, specifies the default output record order.
STABLE	Optional. Specifies the input record order is significant.
UNSTABLE	Optional. Specifies the input record order is not significant.
PARALLEL	Optional. Try to evaluate this activity in parallel.
numthreads	Optional. Try to evaluate this activity using numthreads threads.
ALGORITHM	Optional. Override the algorithm used for this activity.
name	The algorithm to use for this activity. Must be from the list of supported algorithms for the SORT function's STABLE and UNSTABLE options.
Return:	CORRELATION returns a single REAL value.

The CORRELATION function returns Pearson's product-moment correlation coefficient between valuex and valuey.
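For reference, the coefficient can be written out in terms of the variance and covariance. The sketch below follows the identities that the example on this page checks against zero; the symbol n (the record count, c in the example) and the notation are introduced here for illustration and are not part of the ECL syntax.

```latex
r_{xy} = \frac{\operatorname{cov}(x, y)}{\sqrt{\operatorname{var}(x)\,\operatorname{var}(y)}},
\qquad
\operatorname{var}(x) = \frac{1}{n}\left(\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right),
\qquad
\operatorname{cov}(x, y) = \frac{1}{n}\left(\sum xy - \frac{\sum x \sum y}{n}\right)
```

Here var corresponds to VARIANCE, cov to COVARIANCE, and the sums to the SUM calls in the example.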

Example:

pointRec := { REAL x, REAL y };
analyze(ds) := MACRO
#uniquename(stats)
// Single-row crosstab: raw sums alongside the built-in statistics
%stats% := TABLE(ds, { c     := COUNT(GROUP),
    sx    := SUM(GROUP, x),
    sy    := SUM(GROUP, y),
    sxx   := SUM(GROUP, x * x),
    sxy   := SUM(GROUP, x * y),
    syy   := SUM(GROUP, y * y),
    varx  := VARIANCE(GROUP, x),
    vary  := VARIANCE(GROUP, y),
    varxy := COVARIANCE(GROUP, x, y),
    rc    := CORRELATION(GROUP, x, y) });
OUTPUT(%stats%);
// Each of the following should be zero (up to floating-point rounding)
OUTPUT(%stats%, { varx - (sxx-sx*sx/c)/c,
   vary - (syy-sy*sy/c)/c,
   varxy - (sxy-sx*sy/c)/c,
   rc - (varxy/SQRT(varx*vary)) });
OUTPUT(%stats%, { 'bestFit: y=' +
   (STRING)((sy-sx*varxy/varx)/c) +
   ' + ' +
   (STRING)(varxy/varx)+'x' });
ENDMACRO;
ds1 := DATASET([{1,1},{2,2},{3,3},{4,4},{5,5},{6,6}], pointRec);
ds2 := DATASET([ {1.93896e+009, 2.04482e+009},
   {1.77971e+009, 8.54858e+008},
   {2.96181e+009, 1.24848e+009},
   {2.7744e+009,  1.26357e+009},
   {1.14416e+009, 4.3429e+008},
   {3.38728e+009, 1.30238e+009},
   {3.19538e+009, 1.71177e+009} ], pointRec);
ds3 := DATASET([ {1, 1.00039},
   {2, 2.07702},
   {3, 2.86158},
   {4, 3.87114},
   {5, 5.12417},
   {6, 6.20283} ], pointRec);
analyze(ds1);
analyze(ds2);
analyze(ds3);
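
The example above uses CORRELATION only with the GROUP keyword inside a TABLE. As a minimal sketch of the other form described for recset, the following calls the function directly on a dataset and on a filtered record set. The names pts, rAll, and rHigh are hypothetical, chosen for illustration, and the data repeats the points of ds1.

```ecl
pointRec := { REAL x, REAL y };

// Hypothetical sample data: the same perfectly linear points as ds1
pts := DATASET([{1,1},{2,2},{3,3},{4,4},{5,5},{6,6}], pointRec);

// Correlation over every record in the dataset
rAll := CORRELATION(pts, pts.x, pts.y);

// Correlation over a filtered record set (only records where x > 2)
rHigh := CORRELATION(pts(x > 2), pts.x, pts.y);

OUTPUT(rAll);
OUTPUT(rHigh);
```

Because the points lie exactly on a line with positive slope, both results should be 1, or very close to it after floating-point rounding.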
