Check if two record structures match

Comments and questions related to the Enterprise Control Language

Tue Oct 08, 2019 11:33 pm

I'm putting together a macro to check whether the record structures of two given datasets match. If I have the dataset, I pass it in so I can compare the XML of the record structures. If I don't have a dataset defined and was only given a filename, I attempt to look up the record structure. If the file name does not exist, it throws a warning and shows as a mismatch, and I'm good with that.

My question is: how do I get around file names that are not constant (they are only resolved at runtime) when I don't have a dataset definition? I need this to work without having the dataset handy, without being able to create it (because I don't know the layout), and with the file name built in exactly this manner. I'm trying to integrate this into existing common code that everyone uses, without any code changes to the builds.

Thanks for your ideas. I'm so close I can taste it; I just need a way to interrogate the file structure by name at runtime :-)

PS. I tried GetColumnMapping and GetLogicalFileAttribute first.

Example: macro defined below

Code: Select all
#CONSTANT('myfileprefix','~thor::tmsn');

prefix := '~thor' : stored('myfileprefix');


Test1 := dataset([{1,'one'},{2,'two'}],{integer id , string desc});
Test2 := dataset([{2,'two'},{3,'three'}],{integer id , string desc});

filename := prefix + '::testfile';
//run this in a first workunit to create the test file, then comment it out
// output(Test2,,filename,thor);

//run the code below a second time, after the test file has been created. You can just syntax check it; it will say
//Error:    LOOKUP attribute requires a constant filename MAC_Check_Rec_Struct_Match.ecl
#IF(MAC_Check_Rec_Struct_Match(Test1,filename))
output('They Match',named('match'));
#ELSE
output('NO LUCK',named('NOPE'));
#END


//Pass in two file names and this code will tell you if the record structure is identical.
//Two datasets will also work, as RECORDOF will use the dataset and extract the known structure.
//Newly added functionality in 6.4:
//https://hpccsystems.com/blog/file-layout-resolution-compile-time
EXPORT MAC_Check_Rec_Struct_Match(file1,file2) := functionmacro
import std;

   #uniquename(typ1);
   #uniquename(typ2);
   #uniquename(r);
   #uniquename(r2);
   #uniquename(out);
   #uniquename(out2);

   //check if the parms are datasets or strings
   %typ1%  := STD.Str.ToLowerCase(#GETDATATYPE(file1)[..6])  = 'string';
   %typ2%  := STD.Str.ToLowerCase(#GETDATATYPE(file2)[..6])  = 'string';
   #IF(%typ1%)
     %r% := RECORDOF(file1,LOOKUP); //if string look up record def assuming file exists
   #ELSE
     %r% := RECORDOF(file1); //assume it is a dataset and has been loaded into memory
   #END
   
   #IF(%typ2%)
     %r2% := RECORDOF(file2,LOOKUP); //if string look up record def assuming file exists
   #ELSE
     %r2% := RECORDOF(file2); //assume it is a dataset
   #END
   #EXPORT(out, %r%);
   #EXPORT(out2, %r2%);

   return %'out'% = %'out2'%;

endmacro;


Tim N
newportm
 
Posts: 15
Joined: Tue Nov 15, 2016 2:48 pm

Wed Oct 09, 2019 2:04 pm

Here is an option: I can pull the data from the DFU with a SOAPCALL at runtime and parse out the record structure. One caveat is that this does not return any information if the file is a superfile.
Code: Select all
DFUInfoRequest := RECORD, MAXLENGTH(100)
      STRING  Name              {XPATH('Name'               )} := filename;
      STRING  Cluster           {XPATH('Cluster'            )} := cluster;
      STRING  UpdateDescription {XPATH('UpdateDescription'  )} := '0';
      STRING  FileName          {XPATH('FileName'           )} := '';
      STRING  FileDesc          {XPATH('FileDesc'           )} := '';
END;
   
DFUInfoOutRecord := RECORD, MAXLENGTH(100000)
      STRING Ecl                {XPATH('FileDetail/Ecl'              )};   
END;

esp            := pesp + ':8010';
results := SOAPCALL('http://' + esp + '/WsDfu'
                     ,'DFUInfo'
                     ,DFUInfoRequest
                     ,DATASET(DFUInfoOutRecord)
                     ,XPATH('DFUInfoResponse')
                     );
   
results;
newportm
 

Wed Oct 09, 2019 7:07 pm

Tim,

Not a simple problem, but I managed to find a fairly simple way to do it! :)

First I wrote a FUNCTIONMACRO that uses #EXPORTXML to get the structure information from any declared DATASET (inline or on disk) and used Template Language to format the result exactly the same as the GetLogicalFileAttribute function's return result:
Code: Select all
GetStructTxt(ds) := FUNCTIONMACRO
  #DECLARE(Ctr);
  #SET(Ctr,0);
  #DECLARE(OutString);
  #SET(OutString,'{ ');
  #EXPORTXML(Fred,ds);
  #FOR (Fred)
    #FOR (Field)
      #IF(%Ctr%=0)
         #APPEND(OutString,%'{@ecltype}'% + ' ' + %'{@name}'% )
         #SET(Ctr,1);
      #ELSE   
         #APPEND(OutString,', ' + %'{@ecltype}'% + ' ' + %'{@name}'% )
      #END
    #END
  #END
  #APPEND(OutString,' };\n'); //add \n to duplicate GetLogicalFileAttribute() return
  RETURN %'OutString'%;
ENDMACRO; 

Now you can use the GetLogicalFileAttribute function to get the structure when you only have the filename. The "trick" to this function that I learned through hard effort is that it appends a newline character to the end of its return result, so I had to make sure the FUNCTIONMACRO duplicated that format exactly to allow a simple string compare between the two results.
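
As a hedged aside on the trailing-newline point: a minimal sketch of normalizing both strings before comparing them, so that a trailing newline, a doubled space, or keyword case does not cause a false mismatch. NormalizeLayoutText and the two sample strings are hypothetical, made up for illustration; only the STD.Str calls come from the standard library.

```ecl
IMPORT Std;

// Hypothetical helper (not part of the original posts): turn newlines into
// spaces, lower-case the text, then collapse repeated spaces and trim the ends.
NormalizeLayoutText(STRING s) :=
    STD.Str.CleanSpaces(STD.Str.ToLowerCase(STD.Str.FindReplace(s, '\n', ' ')));

// Made-up sample strings; the first mimics the trailing newline described above.
fromAttribute := '{ integer8 id, string desc };\n';
fromMacro     := '{ INTEGER8 id,  string desc };';

OUTPUT(fromAttribute = fromMacro, NAMED('raw_compare'));
OUTPUT(NormalizeLayoutText(fromAttribute) = NormalizeLayoutText(fromMacro),
       NAMED('normalized_compare'));
```

This only smooths over whitespace and case. It does not reconcile layout text that is shaped differently, so it is a complement to matching the format exactly, not a replacement for it.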

Then you can compare any two dataset structures, like this:
Code: Select all
#CONSTANT('myfileprefix','~thor::test::RT');
prefix := '~thor' : stored('myfileprefix');
filename  := prefix + '::testfile';

Test1 := dataset([{1,'one'},{2,'two'}],{integer id , string desc});
Test2 := dataset([{2,'two'},{3,'three'}],{integer id , string desc}); //disk file
Test3 := dataset([{1,'one'},{2,'two'}],{UNSIGNED id , string10 desc});

IMPORT Std;

recstruct1 := GetStructTxt(Test1);                             
recstruct2 := STD.File.GetLogicalFileAttribute(filename,'ECL');
recstruct3 := GetStructTxt(Test3);                             

OUTPUT(recstruct1,NAMED('recstruct1_raw'));
OUTPUT(recstruct2,NAMED('recstruct2_raw'));
OUTPUT(recstruct3,NAMED('recstruct3_raw'));
OUTPUT(recstruct1 = recstruct2,NAMED('Compare_1_2')); 
OUTPUT(recstruct1 = recstruct3,NAMED('Compare_1_3')); 
OUTPUT(recstruct2 = recstruct3,NAMED('Compare_2_3')); 
Thanks for the interesting problem.

HTH,

Richard
rtaylor
Community Advisory Board Member
 
Posts: 1488
Joined: Wed Oct 26, 2011 7:40 pm

Wed Oct 09, 2019 11:41 pm

Richard,

I really appreciate you taking the time to put this together. I agree it has been a fun thing to work on; I don't get to do much code problem solving these days. Your solution is pretty slick and gets around the one issue I was running into. A note that the macro GetStructTxt only works for inline dataset definitions or data that has been transformed/referenced in some way other than an output. If I instead use a dataset defined like this

Code: Select all
DS := Dataset('~thor::base::test', TestFolder.layouts.sampLayout,thor);


(assuming I am going to pass the dataset around and work with it later), GetStructTxt returns { } as the layout.

In other news, here is the same thing with the file name built as in my example above:

Code: Select all
#CONSTANT('myfileprefix','~thor::tmsn');
prefix := '~thor' : stored('myfileprefix');
filename := prefix + '::testfile';

Test1 := dataset(filename,TestFolder.layouts.sampLayout,thor);


Calling GetStructTxt(Test1); makes the compiler create a local workunit (L20191009-123456) that reports as completed, but the job is never actually submitted.

Doing an OUTPUT to read the file in a SEQUENTIAL does not change the behavior. If I instead take an altering action on the dataset, say a PROJECT or SORT, the layout format actually changes for a file with a child dataset (or 50).

I have simplified the items in the layout for this post.

Result of NOTHOR(STD.File.GetLogicalFileAttribute(file2,'ECL')):
Code: Select all
coverage_info := RECORD
   string4 child1;
  END;

finance_company_info := RECORD
   string15 child2;
  END;

RECORD
  string6 rec1;
  string20 rec2;
  DATASET(coverage_info) coverages{maxcount(18)};
  DATASET(finance_company_info) finance_info{maxcount(4)};
END;


Result of GetStructTxt:
Code: Select all
{ string6 rec1, string20 rec2,  table of <unnamed> coverages, string4 child1,  coverages, table of <unnamed> finance_info, string15 child2, finance_info };

I guess I can write another wrapper to convert all layouts with child datasets into the {} format.
newportm
 

Thu Oct 10, 2019 1:19 pm

Tim,
You wrote: "A note that the macro GetStructTxt only works for inline dataset definitions or data that has been transformed/referenced in some way other than an output."

This simple solution makes it work for me:
Code: Select all
IMPORT TrainingYourName;

ds1 := TrainingYourName.File_Persons_Slim.file[1..2];
ds2 := TrainingYourName.Accounts[1..2];

recstruct1a:= (STRING)GetStructTxt(ds1);                             
recstruct2a:= (STRING)GetStructTxt(ds2);                             
OUTPUT(recstruct1a,NAMED('recstruct1a_rawEXPORT'));//OUTPUT file
OUTPUT(recstruct2a,NAMED('recstruct2a_rawEXPORT'));//sprayed file
You just need to make the dataset you pass a subset (like the first 2 recs, as I did here) and then it works correctly.

The problem is that, sometime in the last 20 years, the #EXPORT and #EXPORTXML format was expanded to include file information. Unfortunately, that info was added as a set of enclosing tags (whose info is only in XML attributes) instead of a simple self-contained tag. The tag name is also different for each file type, so I would need to write several separate versions to handle this. Here's what it looks like:
Code: Select all
<Data>
<CsvTable exported="false" name="csv^class::rt::intro::accounts">
  <Field ecltype="unsigned8"
         label="personid"
         name="personid"
         position="0"
         rawtype="524545"
         size="8"
         type="unsigned"/>
</CsvTable>
</Data>

and ...

<Data>
<FlatTable exported="false" name="flat^class::rt::intro::persons" recordLength="155">
  <Field ecltype="unsigned8"
         label="id"
         name="id"
         position="0"
         rawtype="524545"
         size="8"
         type="unsigned"/>
</FlatTable>
</Data>
The fact that GetLogicalFileAttribute returns totally different text for nested child datasets means separate code to match that. :(

I'll see what I can do with that, but I'm traveling now so ... :)

HTH,

Richard
rtaylor
Community Advisory Board Member

Thu Oct 10, 2019 2:06 pm

Tim,

OK, upon further exploration, it appears that GetLogicalFileAttribute returns whatever text appears in the ECL tab in ECL Watch for that logical file.

You wrote: "Here is an option: I can pull the data from the DFU with a SOAPCALL at runtime and parse out the record structure. One caveat is that this does not return any information if the file is a superfile."

And that appears to be the same thing that GetLogicalFileAttribute is doing.

For small/simple files, that record structure is expressed as
Code: Select all
{ unsigned4 recid, string10 homephone };

For larger record structures (including nested Child Datasets) that takes the form:
Code: Select all
RECORD
  unsigned4 recid;
  string10 homephone;
  string10 cellphone;
  string20 fname;
  string20 mname;
  string20 lname;
  string10 new_homephone;
  string10 new_cellphone;
  string20 new_fname;
  string20 new_mname;
  string20 new_lname;
END;
So I'll have to reconsider how to duplicate that structure.

Otherwise, could you simply default to using GetLogicalFileAttribute on both sides of your comparison? That would make it much simpler. :)
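
A minimal sketch of that suggestion, assuming both sides are logical files that already exist on the cluster. SameLayoutText and both file names are hypothetical, chosen only for illustration; GetLogicalFileAttribute is called the same way as in the example earlier in this thread.

```ecl
IMPORT Std;

// Hypothetical helper: compare the ECL text stored with two logical files.
// Both names may be built at runtime; no DATASET definition is needed.
SameLayoutText(STRING file1, STRING file2) := FUNCTION
    ecl1 := STD.File.GetLogicalFileAttribute(file1, 'ECL');
    ecl2 := STD.File.GetLogicalFileAttribute(file2, 'ECL');
    RETURN ecl1 = ecl2;
END;

// Assumed file names, for illustration only.
OUTPUT(SameLayoutText('~thor::tmsn::testfile', '~thor::tmsn::testfile2'),
       NAMED('same_layout'));
```

Because both strings come from the same source, they share the same formatting, so a plain string compare should be enough. The superfile caveat mentioned earlier in the thread still applies.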

HTH,

Richard
rtaylor
Community Advisory Board Member

Mon Oct 14, 2019 8:33 pm

Code: Select all
EXPORT get_ThorFile_Info(STRING filename,STRING pcluster = '',STRING pesp = _Control.ThisEnvironment.ESP_IPAddress) := FUNCTION

DFUInfoRequest := RECORD, MAXLENGTH(100)
      STRING  Name              {XPATH('Name'               )} := filename;
      STRING  Cluster           {XPATH('Cluster'            )} := pcluster;
      STRING  UpdateDescription {XPATH('UpdateDescription'  )} := '0';
      STRING  FileName          {XPATH('FileName'           )} := '';
      STRING  FileDesc          {XPATH('FileDesc'           )} := '';
END;
   
DFUInfoOutRecord := RECORD, MAXLENGTH(100000)
      STRING exception_code     {XPATH('Exceptions/Exception/Code'   )};
      STRING exception_source   {XPATH('Exceptions/Exception/Source' )};
      STRING exception_msg      {XPATH('Exceptions/Exception/Message')};
      STRING Name               {XPATH('FileDetail/Name'             )};
      STRING Filename           {XPATH('FileDetail/Filename'         )};
      STRING Description        {XPATH('FileDetail/Description'      )};
      STRING Dir                {XPATH('FileDetail/Dir'              )};
      STRING PathMask           {XPATH('FileDetail/PathMask'         )};
      STRING Filesize           {XPATH('FileDetail/Filesize'         )};
      STRING ActualSize         {XPATH('FileDetail/ActualSize'       )};
      STRING RecordSize         {XPATH('FileDetail/RecordSize'       )};
      STRING RecordCount        {XPATH('FileDetail/RecordCount'      )};
      STRING Wuid               {XPATH('FileDetail/Wuid'             )};
      STRING Owner              {XPATH('FileDetail/Owner'            )};
      STRING Cluster            {XPATH('FileDetail/Cluster'          )};
      STRING JobName            {XPATH('FileDetail/JobName'          )};
      STRING Persistent         {XPATH('FileDetail/Persistent'       )};
      STRING Format             {XPATH('FileDetail/Format'           )};
      STRING MaxRecordSize      {XPATH('FileDetail/MaxRecordSize'    )};
      STRING CsvSeparate        {XPATH('FileDetail/CsvSeparate'      )};
      STRING CsvQuote           {XPATH('FileDetail/CsvQuote'         )};
      STRING CsvTerminate       {XPATH('FileDetail/CsvTerminate'     )};
      STRING CsvEscape          {XPATH('FileDetail/CsvEscape'        )};
      STRING Modified           {XPATH('FileDetail/Modified'         )};
      STRING Ecl                {XPATH('FileDetail/Ecl'              )};
      STRING Eclxml             {XPATH('FileDetail/Ecl/Format/xml'   )};
      STRING isSuper            {XPATH('FileDetail/isSuperfile'      )};
      STRING subfiles           {XPATH('FileDetail/subfiles/Item'    )};
END;

esp            := pesp + ':8010';

results := SOAPCALL('http://' + esp + '/WsDfu'
                              ,'DFUInfo'
                              ,DFUInfoRequest
                              ,DATASET(DFUInfoOutRecord)
                              ,XPATH('DFUInfoResponse')
                              );

RETURN results;
END;


Code: Select all
rec := IF((boolean)(get_ThorFile_Info(file1)[1].isSuper) ,
             get_ThorFile_Info('~' +
                 get_ThorFile_Info(file1)[1].subfiles)[1].ecl,
             get_ThorFile_Info(file1)[1].ecl);
                   
   rec;


Yes, I already went down that road. However, the result of the SOAP call is identical to NOTHOR(STD.File.GetLogicalFileAttribute(file1,'ECL'));

That said, if we go out to http://esp.net:8010/WsDfu/ we can access DFUDefFile.
The only problem is that the SOAP call returns what looks like a hash of the blob, unlike the button in the GUI.

Code: Select all
DFUDefFileRequest := RECORD, MAXLENGTH(100)
   STRING  Name    {XPATH('Name'   )} := filename;
   STRING  Format  {XPATH('Format' )} := 'xml';
END;

DFUDefFileRecord := RECORD, MAXLENGTH(100000)
   STRING defFile  {XPATH('defFile')};
END;

results := SOAPCALL('thor_esp.net/WsDfu'
                    ,'DFUDefFile'
                    ,DFUDefFileRequest
                    ,DATASET(DFUDefFileRecord)
                    ,XPATH('DFUDefFileResponse')
                    );
results;


The XML embedded in the returned value is identical to that of RECORDOF(dataset), so it would let me do a like-for-like comparison with no other checks. But I can't access the actual result of the DFUDefFile call :-(
newportm
 

Tue Oct 15, 2019 1:34 pm

Tim,

OK, then I suggest it's time for you to submit a feature request in JIRA. You can ask for an option to enable you to easily get the same return result from either your SOAPCALL or the GetLogicalFileAttribute function call (or both).

HTH,

Richard
rtaylor
Community Advisory Board Member

Mon Oct 21, 2019 4:27 pm

Hi Everyone,

I noticed this post and realised that a series of YouTube videos I've done, which discuss manipulating and analyzing dataset structures at compile time, may be of help.

https://www.youtube.com/playlist?list=PLONd-6DN_sz3QTzE5s_qbOSDJ8V-IEXUM

Yours
Allan
Allan
 
Posts: 392
Joined: Sat Oct 01, 2011 7:26 pm

Wed Nov 13, 2019 10:08 pm

Tim,

If all you really need is access to the result of "DFUDefFile", the value is Base64 encoded so all you have to do is decode via a standard API.

It's easy enough to do in most languages.

In ECL you can call "STD.Str.DecodeBase64(value)".
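
A minimal sketch of that decode step, assuming the value comes back as a Base64 string. The sample text and the definition names are made up for illustration; in practice the encoded value would be the defFile field returned by the DFUDefFile SOAPCALL.

```ecl
IMPORT Std;

// Stand-in for the value returned by the service (made-up sample, an assumption).
sampleXml := '<Field ecltype="unsigned8" name="id"/>';
encoded   := STD.Str.EncodeBase64((DATA)sampleXml);

// DecodeBase64 returns DATA; cast it back to STRING to get readable text.
decoded := (STRING)STD.Str.DecodeBase64(encoded);

OUTPUT(encoded, NAMED('encoded'));
OUTPUT(decoded, NAMED('decoded'));
OUTPUT(decoded = sampleXml, NAMED('round_trip_ok'));
```

The round trip through EncodeBase64 is only there to make the sketch self-contained; with a real response only the DecodeBase64 line and the cast would be needed.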

Regards,
Tony
anthony.fishbeck
 
Posts: 57
Joined: Wed Jan 30, 2013 10:18 pm
