Record Structures and Files

Record Structures and Files
Prev		Next

RECORD Structure

attr := RECORD [ ( baserec ) ] [, MAXLENGTH( length ) ] [, LOCALE( locale ) ] [, PACKED ]

fields ;

[ IFBLOCK( condition )

fields ;

END; ]

[ => payload ]

END;

attr	The name of the RECORD structure for later use in other definitions.
baserec	Optional. The name of a RECORD structure from which to inherit all fields. Any RECORD structure that inherits the baserecfields in this manner becomes compatible with any TRANSFORM function defined to take a parameter of baserec type (the extra fields will, of course, be lost).
MAXLENGTH	Optional. This option is used to create indexes that are backward compatible for platform versions prior to 3.0. Specifies the maximum number of characters allowed in the RECORD structure or field. MAXLENGTH on the RECORD structure overrides any MAXLENGTH on a field definition, which overrides any MAXLENGTH specified in the TYPE structure if the datatype names an alien data type. This option defines the maximum size of variable-length records. If omitted, fixed size records use the minimum size required and variable length records produce a warning. The default maximum size of a record containing variable-length fields is 4096 bytes (this may be overridden by using #OPTION(maxLength,####) to change the default). The maximum record size should be set as conservatively as possible, and is better set on a per-field basis (see the Field Modifiers section below).
length	An integer constant specifying the maximum number of characters allowed.
LOCALE	Optional. Specifies the Unicode locale for any UNICODE fields.
locale	A string constant containing a valid locale code, as specified in ISO standards 639 and 3166.
PACKED	Optional. Specifies the order of the fields may be changed to improve efficiency (such as moving variable-length fields after the fixed-length fields)..
fields	Field declarations. See below for the appropriate syntaxes.
IFBLOCK	Optional. A block of fields that receive "live" data only if the condition is met. The IFBLOCK must be terminated by an END. This is used to define variable-length records. If the condition expression references fields in the RECORD preceding the IFBLOCK, those references must use SELF. prepended to the fieldname to disambiguate the reference. This is only for use in RECORD structures that define files on disk.
condition	A logical expression that defines when the fields within the IFBLOCK receive "live" data. If the expression is not true, the fields receive their declared default values. If there's no default value, the fields receive blanks or zeros.
=>	Optional. The delimiter between the list of key fields and the payload when the RECORD structure is used by the DICTIONARY declaration. Typically, this is an inline structure using curly braces ( { } ) instead of RECORD and END.
payload	The list of non-keyed fields in the DICTIONARY.

Record layouts are definitions whose expression is a RECORD structure terminated by the END keyword. The attr name creates a user-defined value type that can be used in built-in functions and TRANSFORM function definitions. The delimiter between field definitions in a RECORD structure can be either the semi-colon (;) or a comma (,).

IFBLOCK and alien data types (TYPE) should only be used when accessing external data files. It is much more efficient to use the native types for general processing. In particular, some optimizations to project and filter files remotely are not supported on IFBLOCKs and alien datatypes.

In-line Record Definitions

Curly braces ({}) are lexical equivalents to the keywords RECORD and END that can be used anywhere RECORD and END are appropriate. Either form (RECORD/END or {}) can be used to create "on-the-fly" record formats within those functions that require record structures (OUTPUT, TABLE, DATASET etc.), instead of defining the record as a separate definition.

Field Definitions

All field declarations in a RECORD Structure must use one of the following syntaxes:

	datatype identifier [ {modifier} ] [ := defaultvalue] ;
	identifier := defaultvalue ;
	defaultvalue ;
	sourcefield ;
	recstruct [ identifier ] ;
	sourcedataset ;
	childdataset identifier [ { modifier } ];

datatype	The value type of the data field. This may be a child dataset (see DATASET). If omitted, the value type is the result type of the defaultvalue expression.
identifier	The name of the field. If omitted, the defaultvalue expression defines a column with no name that may not be referenced in subsequent ECL.
defaultvalue	Optional. An expression defining the source of the data (for operations that require a data source, such as TABLE and PARSE). This may be a constant, expression, or definition providing the value.
modifier	Optional. One of the keywords listed in the Field Modifierssection below.
sourcefield	A previously defined data field, which implicitly provides the datatype, identifier, and defaultvalue for the new field--inherited from the sourcefield.
recstruct	A previously defined RECORD structure. See the Field Inheritancesection below.
sourcedataset	A previously defined DATASET or derived recordset definition. See the Field Inheritancesection below.
childdataset	A child dataset declaration (see DATASET and DICTIONARY discussions), which implicitly defines all the fields of the child at their already defined datatype, identifier, and defaultvalue (if present in the child dataset's RECORD structure).

Field definitions must always define the datatype and identifier of each field, either implicitly or explicitly. If the RECORD structure will be used by TABLE, PARSE, ROW, or any other function that creates an output recordset, then the defaultvalue must also be implicitly or explicitly defined for each field. In the case where a field is defined in terms of a field in a dataset already in scope, you may name the identifier with a name already in use in the dataset already in scope as long as you explicitly define the datatype.

Field Inheritance

Field definitions may be inherited from a previously defined RECORD structure or DATASET. When a recstruct (a RECORD Structure) is specified from which to inherit the fields, the new fields are implicitly defined using the datatype and identifier of all the existing field definitions in the recstruct. When a sourcedataset (a previously defined DATASET or recordset definition) is specified to inherit the fields, the new fields are implicitly defined using the datatype, identifier, and defaultvalue of all the fields (making it usable by operations that require a data source, such as TABLE and PARSE). Either of these forms may optionally have its own identifier to allow reference to the entire set of inherited fields as a single entity.

You may also use logical operators (AND, OR, and NOT) to include/exclude certain fields from the inheritance, as described here:

R1 AND R2	Intersection	All fields declared in both R1 and R2
R1 OR R2	Union	All fields declared in either R1 or R2
R1 AND NOT R2	Difference	All fields in R1 that are not in R2
R1 AND NOT F1	Exception	All fields in R1 except the specified field (F1)
R1 AND NOT [F1, F2]	Exception	All fields in R1 except those in listed in the brackets (F1andF2)

The minus sign (-) is a synonym for AND NOT, so R1-R2 is equivalent to R1 AND NOT R2.

It is an error if the records contain the same field names whose value types don't match, or if you end up with no fields (such as: A-A). You must ensure that any MAXLENGTH/MAXCOUNT is specified correctly on each field in both RECORD Structures.

Example:

R1 := {STRING1 F1,STRING1 F2,STRING1 F3,STRING1 F4,STRING1 F5};
R2 := {STRING1 F4,STRING1 F5,STRING1 F6};
R3 := {R1 AND R2}; //Intersection - fields F4 and F5  only
R4 := {R1 OR R2}; //Union - all fields F1 - F6
R5 := {R1 AND NOT R2}; //Difference - fields F1 - F3
R6 := {R1 AND NOT F1}; //Exception - fields F2 - F5
R7 := {R1 AND NOT [F1,F2]}; //Exception - fields F3 - F5

//the following two RECORD structures are equivalent:
C := RECORD,MAXLENGTH(x)
  R1 OR R2;
END;
         
D := RECORD, MAXLENGTH(x)
  R1;
  R2 AND NOT R1;
END;

Field Modifiers

The following list of field modifiers are available for use on field definitions:

	{ MAXLENGTH( length ) }
	{ MAXCOUNT( records ) }
	{ XPATH( 'tag' ) }
	{ XMLDEFAULT( 'value' ) }
	{ DEFAULT( value ) }
	{ VIRTUAL ( fileposition ) }
	{ VIRTUAL( localfileposition ) }
	{ VIRTUAL( logicalfilename ) }
	{ BLOB }

{ MAXLENGTH(length ) }	Specifies the maximum number of characters allowed in the field (see MAXLENGTH option above).
{ MAXCOUNT(records ) }	Specifies the maximum number of records allowed in a child DATASET field (similar to MAXLENGTH above).
{ XPATH('tag') }	Specifies the XML or JSON tag that contains the data, in a RECORD structure that defines XML or JSON data. This overrides the default tag name (the lowercase field identifier). See the XPATH Support section below for details.
{ XMLDEFAULT('value') }	Specifies a default XML value for the field. The value must be constant.
{ DEFAULT( value ) }	Specifies a default value for the field. The value must be constant. This value will be used: 1. When a DICTIONARY lookup returns no match. 2. When an out-of-range record is fetched using ds[n] (as in ds[5] when ds contains only 4 records). 3. In the default records passed to TRANSFORM functions in non-INNER JOINS where there is no corresponding row. 4. When defaulting field values in a TRANSFORM using SELF = [ ].
{ VIRTUAL( fileposition ) }	Specifies the field is a VIRTUAL field containing the relative byte position of the record within the entire file (the record pointer). This must be an UNSIGNED8 field and must be the last field, because it only truly exists when the file is loaded into memory from disk (hence, the "virtual").
{ VIRTUAL( localfileposition ) }	Specifies the local byte position within a part of the distributed file on a single node: the first bit is set, the next 15 bits specify the part number, and the last 48 bits specify the relative byte position within the part. This must be an UNSIGNED8 field and must be the last field, because it only truly exists when the file is loaded into memory from disk (hence, the "virtual").
{ VIRTUAL( logicalfilename ) }	Specifies the logical file name of the distributed file. This must be a STRING field. If reading from a superfile, the value is the current logical file within the superfile.
{ BLOB }	Specifies the field is stored separately from the leaf node entry in the INDEX. This is applicable specifically to fields in the payload of an INDEX to allow more than 32K of data per index entry. The BLOB data is stored within the index file, but not with the rest of the record. Accessing the BLOB data requires an additional seek.

XPATH Support

XPATH support is a limited subset of the full XPATH specification, basically expressed as:

node[qualifier] / node[qualifier] ...

node	Can contain wildcards.
qualifier	Can be a node or attribute, or a simple single expression of equality, inequality, or numeric or alphanumeric comparisons, or node index values. No functions or inline arithmetic, etc. are supported. String comparison is indicated when the right hand side of the expression is quoted.

These operators are valid for comparisons:

<, <=, >, >=, =, !=

An example of a supported xpath:

/a/*/c*/*d/e[@attr]/f[child]/g[@attr="x"]/h[child>="5"]/i[@x!="2"]/j

You can emulate AND conditions like this:

/a/b[@x="1"][@y="2"]

Also, there is a non-standard XPATH convention for extracting the text of a match using empty angle brackets (<>):

R := RECORD
STRING blah{xpath('a/b<>')};
//contains all of b, including any child definitions and values
END;

An XPATH for a value cannot be ambiguous. If the element occurs multiple times, you must use the ordinal operation (for example, /foo[1]/bar) to explicit select the first occurrence.

For XML or JSON DATASETs reading and processing results of the SOAPCALL function, the following XPATH syntax is specifically supported:

1) For simple scalar value fields, if there is an XPATH specified then it is used, otherwise the lower case identifier of the field is used.

STRING name;                  //matches: <name>Kevin</name>
STRING Fname{xpath('Fname')}; //matches: <Fname>Kevin</Fname>

2) For a field whose type is a RECORD structure, the specified XPATH is prefixed to all the fields it contains, otherwise the lower case identifier of the field followed by '/' is prefixed onto the fields it contains. Note that an XPATH of '' (empty single quotes) will prefix nothing.

NameRec := RECORD
  STRING Fname{xpath('Fname')}; //matches: <Fname>Kevin</Fname>
  STRING Mname{xpath('Mname')}; //matches: <Mname>Alfonso</Mname>
  STRING Lname{xpath('Lname')}; //matches: <Lname>Jones</Lname>
END;
          
PersonRec := RECORD
  STRING Uid{xpath('Person[@UID]')};
  NameRec Name{xpath('Name')};
    /*matches: <Name>
              <Fname>Kevin</Fname>
              <Mname>Alfonso</Mname>
              <Lname>Jones</Lname>
              </Name> */
END;

3) For a child DATASET field, the specified XPATH can have one of two formats: "Container/Repeated" or "/Repeated." Each "/Repeated" tag within the optional Container is iterated to provide the values. If no XPATH is specified, then the default value for the Container is the lower case field name, and the default value for Repeated is "Row." For example, this demonstrates "Container/Repeated":

DATASET(PeopleNames) People{xpath('people/name')};
          /*matches: <people>
                        <name>Gavin</name>
                        <name>Ricardo</name>
                    </people> */

This demonstrates "/Repeated":

DATASET(Names) Names{xpath('/name')};
          /*matches: <name>Gavin</name>
                     <name>Ricardo</name> */

"Container" and "Repeated" may also contain xpath filters, like this:

DATASET(doctorRec) doctors{xpath('person[@job=\'doctor\']')};
          /*matches: <person job='doctor'>
                       <FName>Kevin</FName>
                       <LName>Richards</LName>
                     </person> */

4) For a SET OF type field, an xpath on a set field can have one of three formats: "Repeated", "Container/Repeated" or "Container/Repeated/@attr". They are processed in a similar way to datasets, except for the following. If Container is specified, then the XML reading checks for a tag "Container/All", and if present the set contains all possible values. The third form allows you to read XML attribute values.

SET OF STRING people;
    //matches: <people><All/></people>
    //or: <people><Item>Kevin</Item><Item>Richard</Item></people>
         
SET OF STRING Npeople{xpath('Name')};
    //matches: <Name>Kevin</Name><Name>Richard</Name>
SET OF STRING Xpeople{xpath('/Name/@id')};
    //matches: <Name id='Kevin'/><Name id='Richard'/>

For writing XML or JSON files using OUTPUT, the rules are similar with the following exceptions:

For scalar fields, simple tag names and XML/JSON attributes are supported.
For SET fields, <All> will only be generated if the container name is specified.
xpath filters are not supported.
The "Container/Repeated/@attr" form for a SET is not supported.

Example:

For DATASET or the result type of a TRANSFORM function, you need only specify the value type and name of each field in the layout:

R1 := RECORD
  UNSIGNED1 F1; //only value type and name required
  UNSIGNED4 F2;
  STRING100 F3;
END;
          
D1 := DATASET('~LR::SomeFile',R1,THOR);

For "vertical slice" TABLE, you need to specify the value type, name, and data source for each field in the layout:

R2 := RECORD
  UNSIGNED1 F1 := D1.F1; //value type, name, data source all explicit
  D1.F2; //value type, name, data source all implicit
END;

T1 := TABLE(D1,R2);

For "crosstab report" TABLE:

R3 := RECORD
  D1.F1;            //"group by" fields must come first 
  UNSIGNED4 GrpCount := COUNT(GROUP); 
                   //value type, column name, and aggregate
  GrpSum := SUM(GROUP,D1.F2); //no value type -- defaults to INTEGER
  MAX(GROUP,D1.F2); //no column name in output
END;

T2 := TABLE(D1,R3,F1);

Form1 := RECORD
    Person.per_last_name; //field name is per_last_name - size
                         //is as declared in the person dataset
    STRING25 LocalID := Person.per_first_name;
                        //the name of this field is LocalID and it
                        //gets its data from Person.per_first_name
    INTEGER8 COUNT(Trades); //this field is unnamed in the output file
    BOOLEAN HasBogey := FALSE; 
                        //HasBogey defaults to false
    REAL4    Valu8024;
          //value from the Valu8024 definition
END;
Form2 := RECORD
     Trades; //include all fields from the Trades dataset at their
            // already-defined names, types and sizes
     UNSIGNED8 fpos {VIRTUAL(fileposition)};
           //contains the relative byte position within the file
END;

Form3 := {Trades,UNSIGNED8 local_fpos {VIRTUAL(localfileposition)}};
          //use of {} instead of RECORD/END
          //"Trades" includes all fields from the dataset at their
          // already-defined names, types and sizes
          //local_fpos is the relative byte position in each part

Form4 := RECORD, MAXLENGTH(10000)
     STRING VarStringName1{MAXLENGTH(5000)};
          //this field is variable size to a 5000 byte maximum
 
     STRING VarStringName2{MAXLENGTH(4000)};
          //this field is variable size to a 4000 byte maximum

     IFBLOCK(MyCondition = TRUE) //following fields receive values
          //only if MyCondition = TRUE

     BOOLEAN HasLife := TRUE;
          //defaults to true unless MyCondition = FALSE

     INTEGER8 COUNT(Inquiries); 
          //this field is zero if MyCondition = FALSE, even
          //if there are inquiries to count

      END;
END;

in-line record structures, demonstrating same field name use

ds := DATASET('d', { STRING s; }, THOR);
t := TABLE(ds, { STRING60 s := ds.s; });
    // new "s" field is OK with value type explicitly defined

"Child dataset" RECORD structures

ChildRec := RECORD
    UNSIGNED4 person_id;
    STRING20 per_surname;
    STRING20 per_forename;
END;
ParentRecord := RECORD
    UNSIGNED8 id;
    STRING20 address;
    STRING20 CSZ;
    STRING10 postcode;
    UNSIGNED2 numKids;
    DATASET(ChildRec) children{MAXCOUNT(100)};
END;

an example using {XPATH('tag')}

R := RECORD
     STRING10 fname;
     STRING12 lname;
     SET OF STRING1 MySet{XPATH('Set/Element')}; //define set tags
END;
B := DATASET([{'Fred','Bell',['A','B']},
            {'George','Blanda',['C','D']},
            {'Sam','',['E','F'] } ], R);
         
OUTPUT(B,,'~LR::test.xml', XML,OVERWRITE);

/* this example produces XML output that looks like this:
<Dataset>
<Row><fname>Fred </fname><lname>Bell</lname>
 <Set><Element>A</Element><Element>B</Element></Set></Row>
<Row><fname>George</fname><lname>Blanda </lname>
 <Set><Element>C</Element><Element>D</Element></Set></Row>
<Row><fname>Sam </fname><lname> </lname>
<Set><Element>E</Element><Element>F</Element></Set></Row>
</Dataset>
*/

another XML example with a 1-field child dataset

cr := RECORD,MAXLENGTH(1024)
  STRING phoneEx{XPATH('')};
END;
r := RECORD,MAXLENGTH(4096)
  STRING id{XPATH('COMP-ID')};cr := RECORD,MAXLENGTH(1024)
  STRING phoneEx{XPATH('')};
END;
r := RECORD,MAXLENGTH(4096)
  STRING id{XPATH('COMP-ID')};
  STRING phone{XPATH('PHONE-NUMBER')};
  DATASET(cr) Fred{XPATH('PHONE-NUMBER-EXP')};
END;
         
DS := DATASET([{'1002','1352,9493',['1352','9493']},
            {'1003','4846,4582,0779',['4846','4582','0779']}],r);

OUTPUT(ds,,'~LR::XMLtest2',
      XML('RECORD',
          HEADING('<?xml version="1.0" encoding="UTF-8"?><RECORDS>',
                  '</RECORDS>')),OVERWRITE);
 
/* this example produces XML output that looks like  this:
<?xml version="1.0" encoding="UTF-8"?>
   <RECORDS>
      <RECORD>
         <COMP-ID>1002</COMP-ID>
          <PHONE-NUMBER>1352,9493</PHONE-NUMBER>
          <PHONE-NUMBER-EXP>1352</PHONE-NUMBER-EXP>
          <PHONE-NUMBER-EXP>9493</PHONE-NUMBER-EXP>
       </RECORD>
       <RECORD>
         <COMP-ID>1003</COMP-ID>
          <PHONE-NUMBER>4846,4582,0779</PHONE-NUMBER>
          <PHONE-NUMBER-EXP>4846</PHONE-NUMBER-EXP>
          <PHONE-NUMBER-EXP>4582</PHONE-NUMBER-EXP>
          <PHONE-NUMBER-EXP>0779</PHONE-NUMBER-EXP>
       </RECORD>
     </RECORDS>
 */

XPATH can also be used to define a JSON file

/* a JSON  file called "mybooks.json" contains this data:
[
  {
    "id" : "978-0641723445",
    "name" : "The Lightning Thief",
    "author" : "Rick Riordan"
  }
,
  {
    "id" : "978-1423103349",
    "name" : "The Sea of Monsters",
    "author" : "Rick Riordan"
  }
]
*/

BookRec := RECORD
  STRING ID {XPATH('id')}; //data from id tag -- renames field to uppercase
  STRING title {XPATH('name')}; //data from name tag, renaming the field
  STRING author; //data from author tag, tag name is lowercase and matches field name  
END;

books := DATASET('~LR::mybooks.json',BookRec,JSON('/'));
OUTPUT(books);

Prev		Next
Casting Rules	Home	DATASET