

Constant expression expected -- Macro

Comments and questions related to the Enterprise Control Language

Tue Jul 25, 2017 6:45 pm

Hi,

I am trying to split a text into multiple parts based on a condition.

The condition I am trying to achieve is described below. I tried to use the Template Language to choose which block of code to execute, but I get the "Constant expression expected" error, because #IF requires a constant expression and cannot use a value computed at run time.

Please suggest how I can improve this, or any other way to achieve the same result.

IF length(IncomingString) <= 1400 THEN
Call the word wrap with 70 and put the results into Statement1.
Statement2 is left empty.
IF length(IncomingString) > 1400 THEN
Call the word wrap code first with 1400; that would split the text into two halves.
Then call it with 70 on the first half and put the results into Statement1.
Then call it with 70 on the second half and put the results into Statement2.

Here is the sample code I am using:

Code:
import std , python;

SET OF STRING70 splitString(STRING p , Integer num_parts ) := EMBED(Python)
   import textwrap
   return textwrap.wrap(p, num_parts)
ENDEMBED;

statements := 'Enterprise Control Language (ECL) has been designed specifically for huge data projects using the LexisNexis High Performance Computer Cluster (HPCC). ECL’s extreme scalability comes from a design that allows you to leverage every query you create for re-use in subsequent queries as needed. To do this, ECL takes a Dictionary approach to building queries wherein each ECL definition defines an expression. Each previous Definition can then be used in succeeding ECL definitions—the language extends itself as you use it.';

ds := DATASET( [ { statements } ] , { STRING statement_text } );

AStatement_Info := RECORD
      STRING70 STATEMENT_TEXT;
END;

res := PROJECT( ds , TRANSFORM (
                  {
                   DATASET(AStatement_Info) Statement1 {MAXCOUNT(20)};
                   DATASET(AStatement_Info) Statement2 {MAXCOUNT(20)};
                  },
                  len_consumer_statements := length(LEFT.statement_text);
                  #IF( len_consumer_statements <= 1400)
                    temp := splitString( LEFT.statement_text , 70);
                    temp_ds := DATASET(temp,{ STRING70 STATEMENT_TEXT });
                    SELF.Statement1 := temp_ds( STATEMENT_TEXT <> '');
                    SELF := [];
                  #ELSE
                    temp1 := splitString( LEFT.statement_text , 1400);
                    temp2 := splitString( temp1[1] , 70);
                    temp3 := splitString( temp1[2] , 70);
                    temp_ds_1 := DATASET(temp2,{ STRING70 STATEMENT_TEXT });
                    temp_ds_2 := DATASET(temp3,{ STRING70 STATEMENT_TEXT });
                    SELF.Statement1 := temp_ds_1( STATEMENT_TEXT <> '');
                    SELF.Statement2 := temp_ds_2( STATEMENT_TEXT <> '');
                  #END
            ));
                  
output(res);
                       
ksviswa
 

Wed Jul 26, 2017 7:19 pm

ksviswa,

Here's how I would do it (note that I had to change a couple of characters in your string):
Code:
statements := 'Enterprise Control Language (ECL) has been designed specifically for huge data projects using the LexisNexis High Performance Computer Cluster (HPCC). ECL\'s extreme scalability comes from a design that allows you to leverage every query you create for re-use in subsequent queries as needed. To do this, ECL takes a Dictionary approach to building queries wherein each ECL definition defines an expression. Each previous Definition can then be used in succeeding ECL definitions-the language extends itself as you use it.';

ds := DATASET( [ { statements } ] , { STRING statement_text } );

AStatement_Info := RECORD
  STRING70 STATEMENT_TEXT;
END;

{DATASET(AStatement_Info) Statement1 {MAXCOUNT(20)}} XF(ds L) := TRANSFORM
  NumParts := ROUNDUP(LENGTH(TRIM(L.statement_text))/70);
  SELF.Statement1 := DATASET(NumParts,
                             TRANSFORM(AStatement_Info,
                                       StartPt := ((COUNTER-1)*70)+1;
                                       SELF.STATEMENT_TEXT :=
                                                 L.statement_text[StartPt.. ]));
END;                                                         

res := PROJECT(ds,XF(LEFT));
output(res);
This just splits it into 70-byte chunks, ignoring whether that split words or not, but I think that's what your code would also have done. No need for either Python or Template Language code.
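
To make the difference between the two splitting styles concrete, here is a small Python sketch (Python because the thread already embeds it). It is only an illustration: the helper names `fixed_chunks` and `wrapped_chunks`, the sample sentence, and the width of 16 are made up for the example and do not come from the code above.

```python
import textwrap

def fixed_chunks(text, width=70):
    """Cut text every `width` characters, ignoring word boundaries."""
    return [text[i:i + width] for i in range(0, len(text), width)]

def wrapped_chunks(text, width=70):
    """Break text only at whitespace so that no line exceeds `width`."""
    return textwrap.wrap(text, width)

# Hypothetical sample text and a narrow width so the effect is easy to see.
sample = "alpha beta gamma delta epsilon zeta eta theta iota kappa lambda mu"
fixed = fixed_chunks(sample, 16)
wrapped = wrapped_chunks(sample, 16)

print(fixed)    # pieces of exactly 16 characters (the last may be shorter)
print(wrapped)  # pieces of at most 16 characters, made of whole words
```

Fixed-width slicing always gives a predictable number of parts, while word-aware wrapping can give more parts because lines are often shorter than the limit.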

HTH,

Richard
rtaylor
Community Advisory Board Member
 

Wed Jul 26, 2017 8:20 pm

Thanks, Richard.

I did not have any problem with splitting the data; we could use either your approach or the Python one. The Python one does proper word wrapping rather than cutting every 70 characters, so a word is never split across lines.

Where I am stuck is executing different blocks of code based on the length of the statement.
Statement1 and Statement2 are populated based on the condition given below.

IF length(IncomingString) <= 1400 THEN
Call the word wrap with 70 and put the results into Statement1.
Statement2 is left empty.
IF length(IncomingString) > 1400 THEN
Call the word wrap code first with 1400; that would split the text into two halves.
Then call it with 70 on the first half and put the results into Statement1.
Then call it with 70 on the second half and put the results into Statement2.
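
For reference, the branching described above can be sketched in plain Python, where the length test is an ordinary run-time condition. This is only a sketch of the intent, not ECL: the function name `split_statement` and its parameter names are hypothetical, and sending every piece after the first 1400-character half to Statement2 is an assumption added here for texts longer than two halves.

```python
import textwrap

def split_statement(text, line_width=70, half_width=1400):
    """Return (statement1, statement2) as lists of word-wrapped lines."""
    if len(text) <= half_width:
        # Short text: everything goes to statement1, statement2 stays empty.
        return textwrap.wrap(text, line_width), []
    # Long text: first wrap at half_width, then wrap each piece at line_width.
    halves = textwrap.wrap(text, half_width)
    statement1 = textwrap.wrap(halves[0], line_width)
    # Assumption: any pieces after the first all go to statement2.
    statement2 = [line
                  for half in halves[1:]
                  for line in textwrap.wrap(half, line_width)]
    return statement1, statement2

short_text = "hello world " * 10             # 120 characters
long_text = " ".join(["word"] * 400)         # 1999 characters
short_s1, short_s2 = split_statement(short_text)
long_s1, long_s2 = split_statement(long_text)
print(len(short_s1), len(short_s2), len(long_s1), len(long_s2))
```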
ksviswa
 

Fri Jul 28, 2017 1:36 pm

ksviswa,

Then I would do it this way (keeping the Python but eliminating the Template Language):
Code:
import std , python;

SET OF STRING70 splitString(STRING p , Integer num_parts ) := EMBED(Python)
   import textwrap
   return textwrap.wrap(p, num_parts)
ENDEMBED;

statements := 'Enterprise Control Language (ECL) has been designed specifically for huge data projects using the LexisNexis High Performance Computer Cluster (HPCC). ECL’s extreme scalability comes from a design that allows you to leverage every query you create for re-use in subsequent queries as needed. To do this, ECL takes a Dictionary approach to building queries wherein each ECL definition defines an expression. Each previous Definition can then be used in succeeding ECL definitions—the language extends itself as you use it.';

ds := DATASET( [ { statements } ] , { STRING statement_text } );

AStatement_Info := RECORD
      STRING70 STATEMENT_TEXT;
END;

res := PROJECT(ds,TRANSFORM({DATASET(AStatement_Info) Statement1 {MAXCOUNT(20)};
                             DATASET(AStatement_Info) Statement2 {MAXCOUNT(20)};},
                             settxt := splitString( LEFT.statement_text , 70);
                             temp_ds := DATASET(settxt,{ STRING70 STATEMENT_TEXT })
                                      ( STATEMENT_TEXT <> '');
                             NumParts := COUNT(temp_ds);
                             SELF.Statement1 := IF(NumParts < 21,
                                                   temp_ds,temp_ds[1..20]);
                             SELF.Statement2 := IF(NumParts > 20,
                                                   temp_ds[21..],[]);
            ));
                 
output(res);
This method also avoids a problem that your split-at-1400 idea might have had: it determines the number of parts AFTER the Python code has done the splitting, instead of assuming that 1400 characters always wrap into exactly 20 parts. Because the Python code never splits a word, each part can be shorter than 70 characters. So if there were 1399 characters and Python left more than 2 spaces unused at the end of even one part, you would end up with 21 parts instead of 20, and that last part would never make it into your Statement2.
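
The part-count issue is easy to reproduce in Python alone. The sketch below uses a deliberately artificial text (30 words of 40 characters each, chosen only for this illustration) that is well under 1400 characters yet wraps into more than 20 lines, and then distributes the lines by count in the same spirit as the ECL above.

```python
import textwrap

# Artificial input: two 40-character words never fit on one 70-character
# line, so every word ends up on its own line.
word = "x" * 40
text = " ".join([word] * 30)

parts = textwrap.wrap(text, 70)

# Distribute by the actual number of parts, not by the text length.
statement1 = parts[:20]
statement2 = parts[20:]

print(len(text), len(parts), len(statement1), len(statement2))
```

Here a test on the text length alone would have sent all the lines to Statement1 and overflowed its MAXCOUNT(20), while counting the parts after wrapping places the extra lines in Statement2.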

Of course, you would eliminate all these problems by just allowing a larger MAXCOUNT on your child dataset and putting all the parts into one instead of two. Is there a particular reason for NOT doing that?

Try it and see how it works.

HTH,

Richard
rtaylor
Community Advisory Board Member
 

