#OPTION

#OPTION( option, value );

option: A case-sensitive string constant containing the name of the option to set.
value: The value to set the option to. This may be any type of value, depending on what the option expects.

The #OPTION statement is typically a compiler directive giving hints to the code generator as to how best to generate the executable code for a workunit. This statement may be used outside an XML scope and does not require a previous call to the LOADXML function to instantiate an XML scope.
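
As a minimal sketch of the syntax above, the following sets one numeric and one Boolean option at the top of a query. The option names come from the tables later in this section; the values (3600 seconds, TRUE) are chosen purely for illustration and are not recommendations.

```ecl
// Illustrative values only: limit the job to one hour of run time.
#OPTION('maxRunTime', 3600);

// Boolean options take TRUE or FALSE.
#OPTION('freezePersists', TRUE);

OUTPUT('Options set for this workunit');
```

Note that the option name is passed as a quoted string constant, while the value is passed in whatever type the option expects.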

Definition of Terms

These definitions are "internal-only" terms used in the option definitions that follow.

DFA: Deterministic Finite-state Automaton.
Fold: To turn a complex expression into a simpler equivalent one. For example, the expression "1+1" can be replaced with "2" without altering the result.
Spill: Writing intermediate result sets to disk so that memory is available for subsequent steps.
Funnel: The + (append file) operator between datasets can be visualized as pouring all the records into a funnel and getting a single stream of records out of the bottom; hence the term "funnel."
TopN: An internally generated activity used in place of CHOOSEN(SORT(xx), n) where n is small, as it can be computed much more efficiently than sorting the entire record set then discarding all but the first n.
Activity: An ECL operator that takes one or more datasets as inputs.
Graph: All the Activities in a query.
Subgraph: A collection of Activities that can all be active at the same time in Thor.
Peephole: A method of code optimization that looks at a small amount of the unoptimized code at a time, in order to combine operations into more efficient ones.

Available options

The following options are generally useful:

maxRunTime (Default: none): Sets the maximum number of seconds a job runs before it times out.
freezePersists (Default: false): If true, does not calculate or recalculate PERSISTs.
expirePersists (Default: true): If true, PERSISTs expire after the specified period. The period is set in the Sasha configuration setting (PersistExpiryDefault) or by using #OPTION('defaultPersistExpiry', n), where n is the number of days.
defaultPersistExpiry (Default: none): If set, PERSISTs expire after the number of days specified (overriding the Sasha PersistExpiryDefault setting).
multiplePersistInstances (Default: true): If true, multiple PERSISTs are the default.
defaultNumPersistInstances (Default: none): Specifies the default number of PERSISTs. A value of -1 specifies that all copies should be kept until they expire or are manually deleted.
check (Default: true): If true, checks for potential overflows of records.
expandRepeatAnyAsDfa (Default: true): If true, expands ANY* in a DFA.
forceFakeThor (Default: false): If true, forces code to use hthor.
forceGenerate (Default: false): If true, forces the .SO to be generated even if it's not worth it.
globalFold (Default: true): If true, performs a global constant fold before generating.
globalOptimize (Default: false): If true, performs a global optimize.
groupAllDistribute (Default: false): If true, GROUP,ALL generates a DISTRIBUTE instead of a global SORT.
maximizeLexer (Default: false): If true, maximizes the amount of work done in the lexer.
maxLength (Default: 4096): Specifies the maximum length of a record.
minimizeSpillSize (Default: false): If true, and a spill is filtered/deduped etc. when read, reduces the spill file size by splitting, filtering, and then writing.
optimizeGraph (Default: true): If true, optimizes expressions in a graph before generation.
orderDiskFunnel (Default: true): If true, and all inputs to a funnel are disk reads, pulls them in order.
parseDfaComplexity (Default: 2000): Maximum complexity of expression to convert to a DFA.
pickBestEngine (Default: true): If true, uses hthor if it is more efficient than Thor.
diskReadsAreSimple (Default: true): If true, modifies the behavior of the pickBestEngine option so disk read operations are regarded the same as index read operations when deciding whether Thor is needed. The benefit is that simple jobs can run on hthor, reading/filtering data remotely using dafilesrv.
targetClusterType (Values: hthor | Thor | roxie): The type of cluster to generate code for.
topnLimit (Default: 10000): Maximum number of records to do topN on.
outputLimit (Default: 10): Sets the maximum size (in MB) of a result stored in the workunit.
sortIndexPayload (Default: true): Specifies whether payload fields are sorted.
workflow (Default: true): Enables or disables workflow services.
foldStored (Default: false): Specifies that all the stored variables are replaced with their default values, or values overridden by #STORED. This can significantly reduce the size of the graph generated.
skipFileFormatCrcCheck (Default: false): Specifies that the CRC check on indices produces a warning and not an error.
allowedClusters (Default: none): Specifies the comma-delimited list of cluster names (as a string constant) where the workunit may execute. This allows the job to be switched between clusters, manually or automatically, if the workunit is blocked on its assigned cluster and another valid cluster is available for use.
AllowAutoQueueSwitch (Default: false): If true, specifies that the workunit is automatically re-assigned to execute on another available cluster listed in allowedClusters when blocked on its assigned cluster.
performWorkflowCse (Default: false): If true, specifies that the code generator automatically detects opportunities for Common Sub-expression Elimination that may be "buried" within multiple PERSISTed attributes. If false, notifications of these opportunities are displayed to the programmer as suggestions for the use of the INDEPENDENT Workflow Service.
defaultSkewError (Default: none): A value between 0.0 and 1.0 that determines the amount of skew needed to generate a skew error. This value is ignored if the ECL has provided a SKEW attribute.
defaultSkewWarning (Default: none): A value between 0.0 and 1.0 that determines the amount of skew needed to generate a skew warning. If set higher than defaultSkewError, the value is ignored.
overrideSkewError (Default: none): If set to a value between 0.0 and 1.0, it overrides any ECL SKEW(nn) attribute values in the current job.
defaultSkewThreshold (Default: 1GB): The size of the dataset (in bytes) local to a single node needed before skew errors/warnings are generated if no THRESHOLD(nn) was supplied in ECL.
overrideSkewThreshold (Default: none): The size of the dataset (in bytes) local to a single node needed before skew errors/warnings are generated. Overrides any ECL THRESHOLD(nn) attribute values in the current job.
applyInstantEclTransformations (Default: false): Limits non-file outputs with a CHOOSEN.
applyInstantEclTransformationsLimit (Default: 100): Number of records to limit to.
divideByZero (Default: zero): 'zero' evaluates to 0, the default behavior. 'fail' causes the job to fail and report a division by zero error. 'nan' (currently supported only for real numbers) creates a quiet NaN, which will propagate through any real expressions it is used in. You can use NOT ISVALID(x) to test if the value is a NaN. Integer and decimal division by zero continue to return 0.
outputLimitMb (Default: 10 [MB]): Limit of output to a workunit in MB.
hthorMemoryLimit (Default: 300 [MB]): Overrides the memory usage limit set in ECL Agent's defaultMemoryLimitMB configuration option (for hThor only).
maxCsvRowSizeMb (Default: 10 [MB]): Upper limit of a CSV line read in MB.
compressInternalSpills (Default: true): Compresses internal spills (e.g., spills created by lookahead or sort gathering).
hdCompressorType (Default: 'FLZ'): Distribute compressor to use.
hdCompressorOptions (Default: ''): Distribute compressor options (e.g., AES key).
splitterSpill (Default: -1): Integer value indicating whether to force splitters to spill: 1 = force spill, 0 = force in memory, -1 = adhere to helper setting.
loopMaxEmpty (Default: 1000): Maximum number of iterations that LOOP can cycle through without results before reporting an error.
smallSortThreshold (Default: 0, disabled): If the estimated size is below this threshold in bytes, a minisort approach should be used.
sort_max_deviance (Default: 10 [MB]): Maximum variance (in bytes) allowed during sort partitioning.
joinHelperThreads (Default: same as number of cores): Number of threads to use in the threaded variety of join helper.
bindCores (Default: 0): For Roxie queries. If non-zero, binds the query to only use the specified number of cores. This overrides the value set for coresPerQuery in Roxie configuration.
translateDFSlayouts (Default: 0): Specifies that file layouts should be looked up at compile time. See File Layout Resolution at Compile Time in the Programmer's Guide for more details.
timeLimit: For Roxie queries. Maximum run time (in ms) for a query.
generateGlobalId (Default: false): For Roxie queries. When true, generates a unique GlobalId if one is not provided.
analyzeWorkunit: Overrides the setting in ECL Agent to analyze workunits after ECL queries are executed (Thor only). This allows a workunit to be further analyzed to identify and display any potential issues. These possible issues display in ECL Watch's "Warnings & Errors" area. The global setting defaults to TRUE, but can be changed using Configuration Manager.
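
The following sketch shows how a few of the generally useful options might be combined in one job. All values are illustrative assumptions, not recommended settings, and the STORED names (numerator, denominator) are hypothetical. The skew warning is deliberately set lower than the skew error, since a warning value higher than defaultSkewError is ignored. Applying ISVALID to a scalar definition follows the divideByZero description above.

```ecl
// Illustrative: expire PERSISTs after 7 days, overriding the Sasha default.
#OPTION('defaultPersistExpiry', 7);

// Illustrative skew settings: warn at 0.3, fail at 0.6.
#OPTION('defaultSkewWarning', 0.3);
#OPTION('defaultSkewError', 0.6);

// Make real division by zero produce a quiet NaN instead of 0.
#OPTION('divideByZero', 'nan');

// STORED values (hypothetical names) keep the division from being
// reduced to a constant before the job runs.
REAL8 numerator   := 1.0 : STORED('numerator');
REAL8 denominator := 0.0 : STORED('denominator');

ratio := numerator / denominator;

// NOT ISVALID(x) tests whether a real value is a NaN.
OUTPUT(NOT ISVALID(ratio), NAMED('ratioIsNaN'));
```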

The following options are all about generating Logical graphs in a workunit.

Logical graphs are stored in the workunit and viewed in ECL Watch. They include information about which attribute/line number/column the symbols are defined in. Exported attributes are represented by <module>.<attribute> in the header of the activity. Non-exported (local) attributes are represented as <module>.<exported-attribute>::<non-exported-name>

generateLogicalGraph (Default: false): If true, generates a Logical graph in addition to all the workunit graphs.
generateLogicalGraphOnly (Default: false): If true, generates only the Logical graph for the workunit.
logicalGraphExpandPersist (Default: true): If true, expands PERSISTed attributes.
logicalGraphExpandStored (Default: false): If true, expands STORED attributes.
logicalGraphIncludeName (Default: true): If true, generates attribute names in the header of the activity boxes.
logicalGraphIncludeModule (Default: true): If true, generates module.attribute names in the header of the activity boxes.
logicalGraphDisplayJavadoc (Default: true): If true, generates the Javadoc-style comments embedded in the ECL in place of the standard text that would be generated (see http://java.sun.com/j2se/javadoc/writingdoccomments/). Javadoc-style comments on RECORD structures or scalar attributes will not generate, as they have no graph Activity box directly associated.
logicalGraphDisplayJavadocParameters (Default: false): If true, generates information about parameters in any Javadoc-style comments.
noteRecordSizeInGraph (Default: true): Adds estimates of record sizes to the graph.
showActivitySizeInGraph (Default: false): Shows estimates of generated C++ size in the graph.
showMetaInGraph (Default: false): Adds distribution/sort orders to the graph.
showRecordCountInGraph (Default: true): Shows estimates of record counts in the graph.

The following options are useful for debugging:

debugNlp (Default: false): If true, outputs debug information about the NLP processing to the .cpp file.
resourceMaxMemory (Default: 400M): Maximum amount of memory a subgraph can use.
resourceMaxSockets (Default: 2000): Maximum number of sockets a subgraph can use.
resourceMaxActivities (Default: 200): Maximum number of activities a subgraph can contain.
unlimitedResources (Default: false): If true, assumes lots of resources when resourcing the graphs.
traceRowXML (Default: false): If true, turns on tracing in ECL Watch graphs. This should only be used with small datasets for debugging purposes.
_Probe (Default: false): If true, displays all result rows from intermediate result sets in the graph in ECL Watch when used in conjunction with the traceRowXML option. This should only be used with small datasets for debugging purposes.
debugQuery (Default: false): If true, compiles the query using debug settings.
optimizeLevel (Default: 3 for roxie, else 0): Sets the C++ compiler optimization level (optimizations can cause the compiler to take a lot longer).
checkAsserts (Default: true): If true, enables ASSERT checking.
soapTraceLevel (Default: 1): The level of detail in reporting SOAPCALL or HTTPCALL information (set to 0 for none, 1 for normal, 2 to 8 for more detail).
traceEnabled (Default: false): Enables tracing to log files when TRACE actions are present. See TRACE.
traceLimit (Default: 10): Overrides the default KEEP setting for a TRACE statement to indicate how many TRACE statements to write to the log file. See TRACE.
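
As a hedged sketch of how traceEnabled and traceLimit might be used together with a TRACE activity, the example below enables tracing and caps the number of traced rows. The record layout, the inline data, and the trace name 'peopleTrace' are hypothetical; consult the TRACE documentation for the authoritative syntax and behavior.

```ecl
// Turn on tracing to the log files for any TRACE activities in this job.
#OPTION('traceEnabled', TRUE);

// Illustrative limit on how many rows each TRACE writes to the log.
#OPTION('traceLimit', 5);

nameRec := RECORD
  STRING20 lname;
  STRING20 fname;
END;

people := DATASET([{'PORTLY', 'STUART'},
                   {'PORTLY', 'STACIE'}], nameRec);

// TRACE passes the rows through and logs them when tracing is enabled.
traced := TRACE(people, NAMED('peopleTrace'));

OUTPUT(traced);
```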

The following options are for advanced code generation use:

These options should be left alone unless you REALLY know what you are doing. Typically they are used internally by our developers to enable/disable features that are still in development. Occasionally the technical support staff will suggest that you change one of these settings to work around a problem that you encounter, but otherwise the default settings are recommended in all cases.

filteredReadSpillThreshold (Default: 2): Filtered disk reads are spilled if they will be duplicated more than N times.
foldConstantCast (Default: true): If true, (cast)value is folded at generate time.
foldFilter (Default: true): If true, filters are constant folded.
foldAssign (Default: true): If true, TRANSFORMs are constant folded.
foldSQL (Default: true): If true, SQL is constant folded.
optimizeDiskRead (Default: true): If true, includes project and filter in the transform for a disk read.
optimizeSQL (Default: false): If true, optimizes SQL.
optimizeThorCounts (Default: true): If true, converts COUNT(diskfile) into an optimized version.
peephole (Default: true): If true, peephole optimizes memcpy/memsets, etc.
spotCSE (Default: true): If true, looks for common sub-expressions in TRANSFORMs/filters.
spotTopN (Default: true): If true, converts CHOOSEN(SORT()) into a topN activity.
spotLocalMerge (Default: false): If true, and a JOIN is local with both sides sorted, generates a light-weight merge.
countIndex (Default: false): If true, converts COUNT(index) into an optimized version (also requires optimizeThorCounts).
allowThroughSpill (Default: true): If true, allows through spills.
optimizeBoolReturn (Default: true): If true, improves code when returning BOOLEAN from a function.
optimizeSubString (Default: true): If true, doesn't allocate memory when doing a substring.
thorKeys (Default: true): If true, allows INDEX operations in Thor.
regexVersion (Default: 0): If set to 1, specifies use of the previous regular expression implementation, which may be faster but also may exceed stack limits.
compileOptions (Default: none): Specifies override compiler options (such as /Zm1000 to double the compiler heap size to work around a heap overflow error).
linkOptions (Default: none): Specifies override linker options.
optimizeProjects (Default: true): If false, disables automatic field projection/distribution optimization.
notifyOptimizedProjects (Default: 0): If set to 1, reports optimizations to named attributes. If set to 2, reports all optimizations.
optimizeProjectsPreservePersists (Default: false): If true, disables automatic field projection/distribution optimization around reading PERSISTed files. If a PERSISTed file is read on a different size cluster than it was created on, optimizing the projected fields can mean that the distribution/sort order cannot be recreated.
aggressiveOptimizeProjects (Default: false): If true, enables attempted minimization of network traffic for sorts/distributes. This option doesn't usually result in significant benefits, but may do so in some specific cases.
percolateConstants (Default: true): If false, disables attempted aggressive constant value optimizations.
exportDependencies (Default: false): Generates information about inter-definition dependencies.
maxCompileThreads (Default: 4 for eclccserver and 1 for eclcc): Number of compiler instances used to compile the C++.
reportCppWarnings (Default: false): Reports warnings from C++ compilation.
saveCppTempFiles (Default: false): Retains the generated C++ files.
spanMultipleCpp (Default: true): Generates a workunit in multiple C++ files.
activitiesPerCpp (Default: 500 for Linux or 800 for Windows): Number of activities in each C++ file (requires spanMultipleCpp).
obfuscateOutput (Default: false): If true, details are removed from the generated workunit, including ECL code, estimates of record size, and number of records.

The following options are for the workunit analyzer:

analyzeWorkunit (Default: true): If set to FALSE, disables analysis of the workunit.
analyzer_minInterestingTime (Default: 1000): Analyze activities that exceed this minimum time to execute (milliseconds).
analyzer_minInterestingCost (Default: 30000): Report issues where the time penalty exceeds this value (milliseconds).
analyzer_skewThreshold (Default: 20): Report skew-related issues that exceed this threshold.
analyzer_minRowsPerNode (Default: 1000): Ignore activities that have this average number of rows per node.
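
A minimal sketch of tuning the workunit analyzer for a single job follows. The threshold values are assumptions chosen only to show the form of the statements; they are not tuning advice.

```ecl
// Keep analysis on for this workunit (this is already the default).
#OPTION('analyzeWorkunit', TRUE);

// Illustrative: only analyze activities that take longer than 5 seconds.
#OPTION('analyzer_minInterestingTime', 5000);

// Illustrative: only report skew issues that exceed a threshold of 30.
#OPTION('analyzer_skewThreshold', 30);

OUTPUT('Analyzer options set');
```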

Example:

  #OPTION('traceRowXML', TRUE);
  #OPTION('_Probe', TRUE);
  
  my_rec := RECORD
    STRING20 lname;
    STRING20 fname;
    STRING2 age;
  END;
  
  d := DATASET([{ 'PORTLY', 'STUART' , '39'},
              { 'PORTLY', 'STACIE' , '36'},
              { 'PORTLY', 'DARA' , ' 1'},
              { 'PORTLY', 'GARRETT', ' 4'}], my_rec);
  
  OUTPUT(d(d.age > ' 1'), {lname, fname, age} );
  
  //************************************
  //This example demonstrates Logical Graphs and
  // Javadoc-style comment blocks
  #OPTION('generateLogicalGraphOnly',TRUE);
  #OPTION('logicalGraphDisplayJavadocParameters',TRUE);
  
  /**
  * Defines a record that contains information about a person
  */
  namesRecord :=
       RECORD
  string20    surname;
  string10    forename;
  integer2    age := 25;
       END;
  
  /**
  Defines a table that can be used to read the information from the file
  and then do something with it.
  */
  namesTable := DATASET('x',namesRecord,FLAT);
  
  
  /**
       Allows the name table to be filtered.
  
       @param ages The ages that are allowed to be processed.
       @param badForename Forename to avoid.
  
       @return the filtered dataset.
  */
  namesTable filtered(SET OF INTEGER2 ages, STRING badForename) :=
       namesTable(age in ages, forename != badForename);
  
  OUTPUT(filtered([10,20,33], ''));