CORRELATION( recset, valuex, valuey [ , expression ] [, KEYED ] [, UNORDERED | ORDERED( bool ) ] [, STABLE | UNSTABLE ] [, PARALLEL [ ( numthreads ) ] ] [, ALGORITHM( name ) ] )
recset | The set of records to process. This may be the name of a dataset, a record set derived from some filter condition, or any expression that results in a derived record set. It may also be the GROUP keyword, to indicate operating on the elements in each group when used in a RECORD structure to generate crosstab statistics. |
valuex | A numeric field or expression. |
valuey | A numeric field or expression. |
expression | Optional. A logical expression indicating which records to include in the calculation. Valid only when the recset parameter is the keyword GROUP. |
KEYED | Optional. Specifies the activity is part of an index read operation, which allows the optimizer to generate optimal code for the operation. |
UNORDERED | Optional. Specifies the output record order is not significant. |
ORDERED | Specifies the significance of the output record order. |
bool | When False, specifies the output record order is not significant. When True, specifies the default output record order. |
STABLE | Optional. Specifies the input record order is significant. |
UNSTABLE | Optional. Specifies the input record order is not significant. |
PARALLEL | Optional. Try to evaluate this activity in parallel. |
numthreads | Optional. Try to evaluate this activity using numthreads threads. |
ALGORITHM | Optional. Override the algorithm used for this activity. |
name | The algorithm to use for this activity. Must be from the list of supported algorithms for the SORT function's STABLE and UNSTABLE options. |
Return: | CORRELATION returns a single REAL value. |
The CORRELATION function returns Pearson's product-moment correlation coefficient between valuex and valuey.
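For reference, the coefficient can be written in terms of the variance and covariance. The sketch below is inferred from the consistency checks in the Example (the expressions that "should be zero"), not from a statement of the implementation. The symbols n, r, var and cov are notation introduced here, with n corresponding to COUNT(GROUP).

```latex
% Population (divide-by-n) forms, matching the checks in the Example:
%   varx  - (sxx - sx*sx/c)/c,  varxy - (sxy - sx*sy/c)/c,
%   rc    - varxy/SQRT(varx*vary)
\[
\operatorname{var}(x) = \frac{1}{n}\left(\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right),
\qquad
\operatorname{cov}(x,y) = \frac{1}{n}\left(\sum xy - \frac{\sum x \sum y}{n}\right)
\]
\[
r = \frac{\operatorname{cov}(x,y)}{\sqrt{\operatorname{var}(x)\,\operatorname{var}(y)}}
\]
```

Under this definition r lies between -1 and 1, and perfectly linear data with positive slope (such as ds1 in the Example) gives a value of 1, up to floating-point rounding.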
Example:
pointRec := { REAL x, REAL y };

analyse(ds) := MACRO
  #uniquename(stats)
  %stats% := TABLE(ds, { c := COUNT(GROUP),
                         sx := SUM(GROUP, x),
                         sy := SUM(GROUP, y),
                         sxx := SUM(GROUP, x * x),
                         sxy := SUM(GROUP, x * y),
                         syy := SUM(GROUP, y * y),
                         varx := VARIANCE(GROUP, x);
                         vary := VARIANCE(GROUP, y);
                         varxy := COVARIANCE(GROUP, x, y);
                         rc := CORRELATION(GROUP, x, y) });
  OUTPUT(%stats%);

  // Following should be zero
  OUTPUT(%stats%, { varx - (sxx-sx*sx/c)/c,
                    vary - (syy-sy*sy/c)/c,
                    varxy - (sxy-sx*sy/c)/c,
                    rc - (varxy/SQRT(varx*vary)) });

  OUTPUT(%stats%, { 'bestFit: y=' + (STRING)((sy-sx*varxy/varx)/c) + ' + ' + (STRING)(varxy/varx)+'x' });
ENDMACRO;

ds1 := DATASET([{1,1},{2,2},{3,3},{4,4},{5,5},{6,6}], pointRec);
ds2 := DATASET([ {1.93896e+009, 2.04482e+009},
                 {1.77971e+009, 8.54858e+008},
                 {2.96181e+009, 1.24848e+009},
                 {2.7744e+009, 1.26357e+009},
                 {1.14416e+009, 4.3429e+008},
                 {3.38728e+009, 1.30238e+009},
                 {3.19538e+009, 1.71177e+009} ], pointRec);
ds3 := DATASET([ {1, 1.00039},
                 {2, 2.07702},
                 {3, 2.86158},
                 {4, 3.87114},
                 {5, 5.12417},
                 {6, 6.20283} ], pointRec);

analyse(ds1);
analyse(ds2);
analyse(ds3);
See Also: VARIANCE, COVARIANCE