Working with Multi-Layout Fixed Length File

Mon Jul 25, 2011 12:29 pm

I have sprayed a fixed-length file. Although every line has the same length, the lines do not all share the same layout. Say I have 5 layouts (L1, L2, L3, L4 and L5). A transaction may consist of 1 to n lines in these layouts (say L1, L2, L2, L3, L3, L4, L5). Another transaction may look like L1, L2, L3, L4, L4, L4, L4, L5, L5.

My dilemma is that I cannot have a common layout to read all the lines. However, I can parse the 3 character to determine the Layout that the line is following.

Is there an ECL design pattern that I can use to tackle this efficiently?
vfpeter
 
Posts: 3
Joined: Mon Jul 25, 2011 12:20 pm

Mon Jul 25, 2011 2:22 pm

Ahh ... the old COBOL copybook ... looks nostalgic.

This is one of the first 'nasty' things ECL had to support. If you look in the Language Reference under RECORD structure, there is a capability called IFBLOCK. It essentially lets you declare a section of a fixed-length record that exists only when an expression based on prior fields is true.

It is a while since I have actually used the feature myself - but if you have any questions please ask - I can blow out the cobwebs ...

David
dabayliss
Community Advisory Board Member
 
Posts: 109
Joined: Fri Apr 29, 2011 1:35 pm

Mon Jul 25, 2011 2:57 pm

Thanks for the quick response, David. I will try this out.

Peter.
vfpeter
 
Posts: 3
Joined: Mon Jul 25, 2011 12:20 pm

Mon Jul 25, 2011 3:26 pm

I created this file in a text editor:

1AAAAABBBBBCCCCCDDDDD
2AABBCCDDEEFFGGHHIIJJ
1FFFFFGGGGGHHHHHIIIII
2FFGGHHIIJJKKLLMMNNOO

Then I uploaded it to the VM and sprayed it as a fixed-length file with a record length of 23 (21 data characters plus the CRLF added by the text editor). This code reads the file and splits out the separate structures perfectly (you'll need to look at the result through the ECL Watch page, not the Results tab of the ECL IDE):

Code:
MultiRec := RECORD
  STRING1 RecType;                // first character selects the layout
  // Layout for type '1' lines: four 5-character fields
  IFBLOCK(SELF.RecType = '1')
    STRING5 F1_1;
    STRING5 F2_1;
    STRING5 F3_1;
    STRING5 F4_1;
  END;
  // Layout for type '2' lines: ten 2-character fields
  IFBLOCK(SELF.RecType = '2')
    STRING2 F1_2;
    STRING2 F2_2;
    STRING2 F3_2;
    STRING2 F4_2;
    STRING2 F5_2;
    STRING2 F6_2;
    STRING2 F7_2;
    STRING2 F8_2;
    STRING2 F9_2;
    STRING2 F10_2;
  END;
  STRING2 CRLF;                   // line terminator added by the text editor
END;

ds := DATASET('~TEST::MULTILAYOUT::InputData', MultiRec, FLAT);

OUTPUT(ds);


HTH,

Richard
richard.taylor@lexisnexis.com
 
Posts: 11
Joined: Wed Jun 15, 2011 6:00 pm

Mon Jul 25, 2011 3:39 pm

Once you've read your data in, it is worth converting it (e.g., using PROJECT) to a single format that doesn't contain IFBLOCKs, since the system tends to process rows more efficiently if they don't contain conditional fields.
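
A minimal sketch of that conversion, assuming the two-layout test file and MultiRec definition from Richard's post above. FlatRec, ToFlat and flatDs are hypothetical names introduced only for this illustration, and one wide record is just one possible reading of "a single format"; splitting into one dataset per record type would be another.

```ecl
// MultiRec and the file name are repeated from the earlier post so the
// sketch stands on its own.
MultiRec := RECORD
  STRING1 RecType;
  IFBLOCK(SELF.RecType = '1')
    STRING5 F1_1;
    STRING5 F2_1;
    STRING5 F3_1;
    STRING5 F4_1;
  END;
  IFBLOCK(SELF.RecType = '2')
    STRING2 F1_2;
    STRING2 F2_2;
    STRING2 F3_2;
    STRING2 F4_2;
    STRING2 F5_2;
    STRING2 F6_2;
    STRING2 F7_2;
    STRING2 F8_2;
    STRING2 F9_2;
    STRING2 F10_2;
  END;
  STRING2 CRLF;
END;

ds := DATASET('~TEST::MULTILAYOUT::InputData', MultiRec, FLAT);

// Hypothetical target layout: the same fields, but with no IFBLOCKs, so
// every field is always physically present in every row.
FlatRec := RECORD
  STRING1 RecType;
  STRING5 F1_1;
  STRING5 F2_1;
  STRING5 F3_1;
  STRING5 F4_1;
  STRING2 F1_2;
  STRING2 F2_2;
  STRING2 F3_2;
  STRING2 F4_2;
  STRING2 F5_2;
  STRING2 F6_2;
  STRING2 F7_2;
  STRING2 F8_2;
  STRING2 F9_2;
  STRING2 F10_2;
END;

// Fields are matched by name; the CRLF terminator is dropped because
// FlatRec has no field of that name. Fields belonging to the layout that
// does not apply to a given row are expected to come through blank.
FlatRec ToFlat(MultiRec L) := TRANSFORM
  SELF := L;
END;

flatDs := PROJECT(ds, ToFlat(LEFT));

OUTPUT(flatDs);
```

The trade-off in this sketch is wider rows in exchange for a fixed layout, so downstream code can still filter on RecType without touching conditional fields.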
ghalliday
Community Advisory Board Member
 
Posts: 178
Joined: Wed May 18, 2011 9:48 am

Mon Jul 25, 2011 8:05 pm

You guys rock!

Richard, thanks for the example.

I made some minor changes as I already had the layout defined previously.

Code:
MultiRec := RECORD
  STRING3 RecSeq;
  STRING1 RecType;
  // Each IFBLOCK reuses a previously defined layout, minus its
  // RecordSequence and RecordType fields.
  IFBLOCK(SELF.RecType = 'A')
    MyLayoutA AND NOT [RecordSequence, RecordType] NewRec_A;
  END;
  IFBLOCK(SELF.RecType = 'B')
    MyLayoutB AND NOT [RecordSequence, RecordType] NewRec_B;
  END;
  IFBLOCK(SELF.RecType = 'C')
    MyLayoutC AND NOT [RecordSequence, RecordType] NewRec_C;
  END;
  IFBLOCK(SELF.RecType = 'D')
    MyLayoutD AND NOT [RecordSequence, RecordType] NewRec_D;
  END;
  IFBLOCK(SELF.RecType = 'E')
    MyLayoutE AND NOT [RecordSequence, RecordType] NewRec_E;
  END;
  IFBLOCK(SELF.RecType = 'F')
    MyLayoutF AND NOT [RecordSequence, RecordType] NewRec_F;
  END;
  STRING1 EOL;
END;

ds1 := DATASET('~TEST::MULTILAYOUT::TestInputData', MultiRec, THOR);
OUTPUT(ds1, NAMED('FinalRead'));


ghalliday wrote: Once you've read your data in, it is worth converting it (e.g., using PROJECT) to a single format that doesn't contain IFBLOCKs, since the system tends to process rows more efficiently if they don't contain conditional fields.

That is exactly what I plan to do, after I read the data.

Thanks for all the help.
Peter
vfpeter
 
Posts: 3
Joined: Mon Jul 25, 2011 12:20 pm

Wed Feb 21, 2018 6:57 pm

ghalliday wrote: Once you've read your data in, it is worth converting it (e.g., using PROJECT) to a single format that doesn't contain IFBLOCKs, since the system tends to process rows more efficiently if they don't contain conditional fields.

Severely belated follow-up question, if I may...
Which of the following would be (generally) more efficient?
Code:
// A) Contains child datasets only conditionally
lPar := RECORD
  INTEGER id;
  BOOLEAN hasChld1;
  BOOLEAN hasChld2;
  IFBLOCK(SELF.hasChld1)
    DATASET(lChd1) ds1;
  END;
  IFBLOCK(SELF.hasChld2)
    DATASET(lChd2) ds2;
  END;
END;

// B) Always contains (possibly empty) child datasets
lPar := RECORD
  INTEGER id;
  DATASET(lChd1) ds1;
  DATASET(lChd2) ds2;
END;


Your comment above, Gavin, seems to indicate that B) would be more efficient?
Thanks.
jwilt
 
Posts: 50
Joined: Wed Feb 27, 2013 7:46 pm

Wed Feb 21, 2018 8:07 pm

Possibly another alternative:
Code:
// C) Contains child datasets with conditional maxcounts
lPar := RECORD
  INTEGER id;
  BOOLEAN hasChld1;
  BOOLEAN hasChld2;
  DATASET(lChd1) ds1  {maxcount(if(self.hasChld1, 10, 0))};
  DATASET(lChd2) ds2  {maxcount(if(self.hasChld2, 20, 0))};
END;

Any benefit here?
jwilt
 
Posts: 50
Joined: Wed Feb 27, 2013 7:46 pm

Thu Feb 22, 2018 2:17 pm

Jim,

Personally, I would go with your B solution.

It doesn't hurt anything to have a nested child dataset that contains no records. Nested child datasets inherently make the record variable-length, and my understanding is that an empty child dataset takes up no room in the record.
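
As a rough illustration of option B, here is a small sketch with inline data. The child layouts (lChd1 with a name field, lChd2 with an amount field) and the sample values are assumptions made up for this example; only the shape of lPar comes from the question above.

```ecl
// Hypothetical child layouts; the thread does not define lChd1 or lChd2.
lChd1 := RECORD
  STRING10 name;
END;

lChd2 := RECORD
  UNSIGNED4 amount;
END;

// Option B from the question: both child datasets are always declared
// and may simply be empty.
lPar := RECORD
  INTEGER id;
  DATASET(lChd1) ds1;
  DATASET(lChd2) ds2;
END;

// Made-up sample rows: each parent has one populated and one empty child.
parents := DATASET([
  {1, DATASET([{'first'}, {'second'}], lChd1), DATASET([], lChd2)},
  {2, DATASET([], lChd1), DATASET([{100}], lChd2)}
], lPar);

// EXISTS recovers the information that the hasChld1 / hasChld2 flags
// carried in option A, without storing the flags in the record.
OUTPUT(parents, {id,
                 BOOLEAN hasChld1 := EXISTS(ds1),
                 BOOLEAN hasChld2 := EXISTS(ds2)});
```

With this shape the presence flags become derived values rather than stored fields, which is one reason option B can stay simpler than the IFBLOCK version.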

HTH,

Richard
rtaylor
Community Advisory Board Member
 
Posts: 1370
Joined: Wed Oct 26, 2011 7:40 pm

