



Get a const set of field names from a layout

Comments and questions related to the Enterprise Control Language

Fri Mar 25, 2022 7:23 pm

Say we have a layout like this:
Code: Select all
MyLayout := RECORD
  INTEGER a;
  INTEGER B;
  STRING C;
  STRING D;
  INTEGER e;
  //plus another 1000 attributes
END;


I would like to get a constant set of field names, same as if I had hard-coded them like this:

Code: Select all
EXPORT SET OF STRING keys := [
  'a',
  'B',
  'C',
  'D',
  'e'
];


Problem 1: I need the field names to preserve case.

I could add an explicit xpath to every field, like this:

Code: Select all
MyLayout := RECORD
   INTEGER a {xpath('a')};
   INTEGER B {xpath('B')};
   STRING C {xpath('C')};
   STRING D {xpath('D')};
   INTEGER e {xpath('e')};
END; 


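Problem 2: Converting an empty row to XML, as in the line below, drops empty STRING fields, and I need to keep all fields.
Code: Select all
xml_txt := (STRING)TOXML(ROW([], MyLayout));

I could work around that by giving the STRING fields a default value: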
Code: Select all
MyLayout := RECORD
   INTEGER a {xpath('a')};
   INTEGER B {xpath('B')};
   STRING C {xpath('C'), DEFAULT( '-' )};
   STRING D {xpath('D'), DEFAULT( '-' )};
   INTEGER e {xpath('e')};
END; 


Problem 3: If I parse the XML like this:

Code: Select all
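IMPORT STD;

attributes_l := RECORD
  STRING name;
END;

// Replace each <tag>value</tag> element with "tag," to get a comma-separated list of names
attributes := REGEXREPLACE('<(\\w+)>([a-z\\-0-9]+)?</\\w+>', xml_txt, '$1,');
list := DATASET(STD.Str.SplitWords(attributes, ','), attributes_l);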


I get the correct result, but it's not a constant set anymore, so I can't use it for compile-time code generation like this:

Code: Select all
updates := //get external data
ROW(TRANSFORM(MyLayout,
  #DECLARE (cnt)
  #DECLARE (len)
  #DECLARE (attribute)

  #SET (cnt, 1)
  #SET (len, COUNT(list))

  #LOOP
    #SET (attribute, list[%cnt%])
    #IF (%cnt% <= %len%)
      #SET (cnt, %cnt% + 1)
      SELF.%attribute% := LEFT.%attribute% + updates.%attribute%;
    #ELSE
      #BREAK
    #END
  #END
SELF := LEFT;
));


I just want to avoid hard-coding the field names multiple times in one file, or subsets of them in different files, especially when there are over a thousand fields.
katzda
 

Mon Mar 28, 2022 2:55 pm

Hi,
You can export the layout of the structure into XML, which you can then loop through in a MACRO, generating any ECL you want. That constructed ECL is then fed into the compiler's token stream.
I have a MACRO below that takes a record structure as its input and constructs a field list.
e.g.
Code: Select all
R := RECORD
    STRING fl1;
    INTEGER itm2;
END;
MAC_makeFieldListFromLayout(R);

Generates
Code: Select all
{fl1,itm2}


The MACRO itself:
Code: Select all
EXPORT MAC_makeFieldListFromLayout(lay) := MACRO
    #UNIQUENAME(attrib)
    #SET(attrib,'')
    #UNIQUENAME(sep)
    #SET(sep,'{')
    #UNIQUENAME(out)
    #EXPORTXML(out, lay)
    #FOR (out)
      #FOR (Field)
        #APPEND(attrib,%'sep'%+%'{@label}'%)
        #SET(sep,',')
      #END
    #END
    %'attrib'%+'}'
ENDMACRO;

You can change the #APPEND to construct whatever ECL suits your case.
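As a minimal, uncompiled sketch of such a change: the variant below wraps each label in single quotes and returns a set literal instead of a {…} field list. The name MAC_makeFieldSetFromLayout and the use of FUNCTIONMACRO with RETURN are assumptions made for illustration; the loop itself mirrors the macro above. Because it reads the same @label attribute, it inherits whatever case the exported XML reports.

```ecl
// Hypothetical variant of MAC_makeFieldListFromLayout (illustration only).
// Builds the text of a quoted, comma-separated list at compile time and
// returns it inside set brackets.
MAC_makeFieldSetFromLayout(lay) := FUNCTIONMACRO
    #UNIQUENAME(attrib)
    #SET(attrib, '')
    #UNIQUENAME(sep)
    #SET(sep, '')
    #UNIQUENAME(out)
    #EXPORTXML(out, lay)
    #FOR (out)
      #FOR (Field)
        #APPEND(attrib, %'sep'% + '\'' + %'{@label}'% + '\'')
        #SET(sep, ',')
      #END
    #END
    RETURN [%attrib%];
ENDMACRO;

R := RECORD
    STRING fl1;
    INTEGER itm2;
END;

SET OF STRING fieldNames := MAC_makeFieldSetFromLayout(R);
OUTPUT(fieldNames);
```

Since the list is assembled entirely by the template language, the intent is that the returned set is a compile-time constant, which is what #SET and #LOOP need.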
Yours
Allan
Allan
 

Mon Mar 28, 2022 3:00 pm

Ah, I've just read your note on preserving case. The exported XML does not preserve case. This is a right pain that I have brought up with the core team before. Curiously, case used to be preserved: the example in the ECL reference manual (last time I looked) showed the original case.
Allan
 

Mon Mar 28, 2022 4:00 pm

katzda,

You said:
"I would like to get a constant set of field names, same as if I had hard-coded them"
You can accomplish that like this:
Code: Select all
//Tag names can contain letters, digits, hyphens, underscores, and periods
// and the name ends with either a space, a slash, or an angle bracket:
PATTERN Tagname := PATTERN('[-_.A-Za-z0-9]')+;
PATTERN NameEnd := PATTERN('[ />]')+;
PATTERN Find  := '<' Tagname NameEnd;


ds := dataset([ {'<Row><Name><Fname>Fred</Fname><Lname>Jones</Lname></Name>'},
                {'<Address CSZ="Anytown, FL 12345">223 Main Street</Address>'},
                {'<EmptyTag/><More stuff="and nonsense"/></Row>'},
                {'<Row><Name><Fname>John</Fname><Lname>Smith</Lname></Name>'},
                {'<Address CSZ="Anyville, GA 54321">145 High Street</Address>'},
                {'<EmptyTag/><More stuff="and nonsense"/></Row>'}],
                        {STRING60 line});
                        
P := PARSE(ds,line,Find,{STRING Tag := MATCHTEXT(Tagname)},FIRST);
TagNames := DEDUP(SORT(P,Tag));                        
SetTags := SET(TagNames,Tag);
SetTags;

This example produces a list of all the unique tag names in the XML and preserves the case of those names.

If the tag order is important, you can do it like this (after the PARSE):
Code: Select all
TagNames := PROJECT(P,
                    TRANSFORM({UNSIGNED C,STRING Tag},
                              SELF.C := COUNTER,
                              SELF.Tag := LEFT.Tag));
UniqueTags := SORT(DEDUP(SORT(Tagnames,Tag,C),tag),C);               
      
SetTags := SET(UniqueTags,Tag);

Let me know if there's a next step to your problem that you'd like some help with.

HTH

Richard
rtaylor
Community Advisory Board Member

Tue Mar 29, 2022 12:59 pm

Thank you all! I appreciate your responses, and you definitely gave me some useful ideas. Below are my partial solutions. I imagine I'll be able to use macros v1 and v3 where appropriate, as both have pros and cons. I'm not sure what benefit the v2 solution could have over v1; maybe performance?

There is one last task that I still need to work on, but I'm posting this now to keep you updated and give more information. I'm certainly in a much better position to solve the second task myself than I was before.

Code: Select all
IMPORT STD;

GetFields_v1(layout) := FUNCTIONMACRO
  /* N.B:
    + Preserves case
    + Preserves order
    - Generates a NON-constant expression
    - Won't preserve STRING fields if the layout hasn't defined a default value with {DEFAULT}, e.g {DEFAULT('-')}
  */
  xml_txt := (STRING)toxml(ROW([], layout));

  STRING attributes := (STRING)REGEXREPLACE('<(\\w+)>([a-z\\-0-9]+)?</\\w+>', xml_txt, '$1,');
  list := STD.Str.SplitWords(attributes, ',');
  RETURN list;
ENDMACRO;

GetFields_v2(layout) := FUNCTIONMACRO
  /* N.B:
    + Preserves case
    - Doesn't preserve order
    - Result is NOT constant
    - Won't preserve STRING fields if the layout hasn't defined a default value with {DEFAULT}, e.g {DEFAULT('-')}
  */
  xml_txt := (STRING)toxml(ROW([], layout));

  PATTERN Tagname := PATTERN('[-_.A-Za-z0-9]')+;
  PATTERN NameEnd := PATTERN('[ />]')+;
  PATTERN Find  := '<' Tagname NameEnd;

  // Unbounded STRING: STRING60 would truncate XML longer than 60 characters
  ds := dataset([{xml_txt}],{STRING line});
                         
  P := PARSE(ds,line,Find,{STRING Tag := MATCHTEXT(Tagname)},FIRST);
  TagNames := DEDUP(SORT(P,Tag));                       
  SetTags := SET(TagNames,Tag);
  RETURN SetTags;
ENDMACRO;

GetFields_v3(layout) := FUNCTIONMACRO
  /*N.B
   - Won't preserve case
   - Won't preserve STRING fields if the layout hasn't defined a default value with {DEFAULT}, e.g {DEFAULT('-')}
   + Generated SET IS constant!
  */
  rec := ROW([], layout);

  #UNIQUENAME(out)
  #UNIQUENAME(sep)
 
  #SET(sep,'')
 
  #EXPORTXML(out, rec)
  RETURN [ 
  #FOR (out)
    #FOR (Field)
      #IF(%'sep'% = '')
        #SET(sep,',')
      #ELSE
        %sep%
      #END
      %'{@label}'%
    #END
  #END
  ];
ENDMACRO;

/* TASK 1: Get fields from Attributes_l as a SET and update all fields which have the same name
           Assume Updates_l will have the same fields as Attributes_l and more

           Assume all fields are integers for now.
           Case sensitivity doesn't matter in this case
*/
Attributes_l := RECORD
   INTEGER a {xpath('a')};
   INTEGER B {xpath('B')};
   INTEGER E {xpath('E')};
END;
Updates_l := RECORD 
   INTEGER a {xpath('a')};
   INTEGER B {xpath('B')};
   INTEGER E {xpath('E')};
   INTEGER f {xpath('f')};
END;

attributes := ROW({0,1,2}, Attributes_l);
updates := ROW({1,2,3,4}, Updates_l);

/* TESTING */
fieldNames_v1 := GetFields_v1(Attributes_l);
fieldNames_v2 := GetFields_v2(Attributes_l);
fieldNames_v3 := GetFields_v3(Attributes_l);
OUTPUT(fieldNames_v1, NAMED('fieldNames_v1'));   
OUTPUT(fieldNames_v2, NAMED('fieldNames_v2')); 
OUTPUT(fieldNames_v3, NAMED('fieldNames_v3')); 

updatedAttributes := ROW(TRANSFORM(Attributes_l,

  fieldNames := GetFields_v3(Attributes_l);
  #DECLARE (cnt)
  #DECLARE (len)
  #DECLARE (field)

  #SET (cnt, 1)
  #SET (len, COUNT(fieldNames))

  #LOOP
    #SET (field, fieldNames[%cnt%])
    #IF (%cnt% <= %len%)
      #SET (cnt, %cnt% + 1)
      SELF.%field% := attributes.%field% + updates.%field%;
    #ELSE
      #BREAK
    #END
  #END
  SELF := attributes;
));

OUTPUT(updatedAttributes, NAMED('updatedAttributes'));

/* TASK 2: Take the attributes instance of Attributes_l and convert into name/value pairs dataset.
           Assume Attributes_l will contain STRING fields
           We have to preserve case
*/
katzda01
 