Tips and Tricks for ECL — Part 1 — Bit Fiddling

I get fairly frequent emails asking, “How can I … (do something) in ECL?” I’ve saved the example code from many of these questions, so now I’ll begin sharing them through this blog.

If anybody has this type of question they’d like me to answer (in the blog), please just send me an email to richard.taylor@lexisnexisrisk.com and I will do my best to come up with an ECL example demonstrating the solution.

All the code examples in this series will be stored in the “BlogCode” directory of my repository. So you will see IMPORT BlogCode used frequently.

We’ll start now with several questions I’ve gotten about “bit fiddling” in ECL.

A Couple of Useful Functions

The first thing to know about “bit fiddling” in ECL is that the bitwise operators work only on integer data types. The following examples are therefore all designed to work with an UNSIGNED4 bitmap, although you can certainly use them with smaller bitmaps if you need to.

We’ll start with a couple of functions I wrote that are generally useful when you’re working with bits:

EXPORT Bit(UNSIGNED4 Bitmap,UNSIGNED1 WhichBit) := MODULE
   SHARED Mask := (UNSIGNED4)POWER(2,WhichBit-1);
   EXPORT BOOLEAN   IsOn := (Bitmap & Mask) <> 0;
   EXPORT UNSIGNED4 Flip := Bitmap ^ Mask;
END;

I created a MODULE structure to contain these, since they are related functions and both take the same two parameters. They also both use the same expression, so I was able to simplify the code with a SHARED Mask definition used by both EXPORT functions.

For the SHARED Mask definition, the POWER function raises the base value (2) to the power given by WhichBit-1. This makes use of the mathematical facts that any base raised to the power of zero is 1, any base raised to the power of one is the base itself, and each higher power of 2 doubles the previous value. The Mask (as a decimal value) will therefore always be 1, or 2, or 4, or 8, or 16, etc., so that the only bit turned on in the resulting UNSIGNED4 Mask is the one bit in the ordinal position specified by the WhichBit parameter.

The IsOn function returns a BOOLEAN indicating whether the specified bit is a 1 (on) or 0 (off). It uses the Bitwise AND operator (&) with the passed-in Bitmap and the Mask, then compares the result to zero to determine whether the specified bit is on. An empty bitmap (zero) therefore always returns FALSE.

The Flip function returns an UNSIGNED4 as the new bitmap after it has flipped the value of the specified bit. If the passed-in bitmap has a 1 (on) in that bit it will change it to 0 (off), and vice versa. It uses the Bitwise eXclusive OR operator (^) with the passed-in Bitmap and the Mask to flip just the value of the specified bit, leaving all other bits as they were.

Here’s an example of how these are used. This can be run in any builder window:

IMPORT BlogCode;
UNSIGNED4 MyBitmap := 4; //or in binary -- 00000000000000000000000000000100b
  
BlogCode.Bit(MyBitmap,1).IsOn; //returns FALSE
BlogCode.Bit(MyBitmap,2).IsOn;//returns FALSE 
BlogCode.Bit(MyBitmap,3).IsOn; //returns TRUE 
NewBitMap := BlogCode.Bit(MyBitmap,2).Flip;//turn on second bit
//making this -- 00000000000000000000000000000110b 
BlogCode.Bit(NewBitMap,1).IsOn;  //returns FALSE 
BlogCode.Bit(NewBitMap,2).IsOn;  //returns TRUE 
BlogCode.Bit(NewBitMap,3).IsOn;  //returns TRUE
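
If you want to check the bit arithmetic outside of ECL, here is a minimal Python sketch of the same mask logic. It is only an illustration: the names mask, is_on and flip are my own stand-ins for the ECL definitions, and the bit positions are 1-based to match WhichBit.

```python
# Python sketch of the mask logic (not ECL); all names are illustrative only.

def mask(which_bit: int) -> int:
    """Return an integer with only the 1-based bit `which_bit` set."""
    return 2 ** (which_bit - 1)   # integer power, mirrors POWER(2,WhichBit-1)

def is_on(bitmap: int, which_bit: int) -> bool:
    """True when the chosen bit is 1."""
    return (bitmap & mask(which_bit)) != 0

def flip(bitmap: int, which_bit: int) -> int:
    """Toggle the chosen bit and leave every other bit alone."""
    return bitmap ^ mask(which_bit)

my_bitmap = 4                             # binary 100
print(is_on(my_bitmap, 3))                # True
new_bitmap = flip(my_bitmap, 2)           # binary 110
print(new_bitmap, is_on(new_bitmap, 2))   # 6 True
```

Flipping the same bit a second time returns the original value, which is a handy sanity check for any toggle function.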

ADDENDUM

When David Bayliss reviewed the above code, he said that I had “just inserted a floating point operation into low-level bit twiddling. This is HORRIBLE.” So he suggested an alternative method that just uses the bitshift left operator (<<) along with the Bitwise AND (&) and Bitwise eXclusive OR (^) operators, like this:

EXPORT Bit(UNSIGNED4 Bitmap,UNSIGNED1 WhichBit) := MODULE
    EXPORT BOOLEAN   IsOn := (Bitmap & ((UNSIGNED4)1 << (WhichBit-1))) <> 0;
    EXPORT UNSIGNED4 Flip := Bitmap ^ ((UNSIGNED4)1 << (WhichBit-1));
END;

This method removes the need for the POWER function and keeps the whole calculation in integer arithmetic. The extra parentheses make the order of evaluation explicit. Note that Flip has to use the eXclusive OR operator: a Bitwise OR (|) could only turn the bit on, never off.
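
As a quick cross-check in Python (not ECL, and with variable names of my own choosing), the sketch below confirms that shifting 1 left by WhichBit-1 gives the same mask as the power calculation for all 32 positions. It also shows why a toggle needs XOR: OR can only ever turn a bit on.

```python
# Shift and power agree for every 1-based position in a 32-bit bitmap.
masks_match = all((1 << (n - 1)) == 2 ** (n - 1) for n in range(1, 33))

bitmap = 0b0110                 # second and third bits are on
second_bit = 1 << (2 - 1)       # mask for the second bit

toggled = bitmap ^ second_bit   # XOR turns the already-set bit off
or_ed = bitmap | second_bit     # OR leaves it on

print(masks_match, bin(toggled), bin(or_ed))   # True 0b100 0b110
```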

So What Can You Do With Bitmaps?

Someone emailed me with this request:

“Thought you might have a trick up your sleeve to attack this type of encoding. Basically an integer value is used to represent a list of values where each bit represents a separate item.

“In the example below, each bit set represents an index into a string to get yet another code.

// Valid codes  " VURQLKGFEDCABIJOWY",
// Position      1234567890123456789
// Number                  position              Code
//   32 = 2^5,              5  + 1                L
// 4096 = 2^12              12 + 1                A
//  512 = 2^9               9  + 1                E
//  544 = 2^5 + 2^9                               LE

“How do we write ECL code to translate a number to a code?”

So here’s the first code I wrote to handle this specific problem:

IMPORT $;

EXPORT Bit2Code(UNSIGNED4 B, STRING32 Codes) := FUNCTION
   Code32 := IF($.Bit(B,32).IsOn,Codes[32],'');
   Code31 := IF($.Bit(B,31).IsOn,Codes[31],'');
   Code30 := IF($.Bit(B,30).IsOn,Codes[30],'');
   Code29 := IF($.Bit(B,29).IsOn,Codes[29],'');
   Code28 := IF($.Bit(B,28).IsOn,Codes[28],'');
   Code27 := IF($.Bit(B,27).IsOn,Codes[27],'');
   Code26 := IF($.Bit(B,26).IsOn,Codes[26],'');
   Code25 := IF($.Bit(B,25).IsOn,Codes[25],'');
   Code24 := IF($.Bit(B,24).IsOn,Codes[24],'');
   Code23 := IF($.Bit(B,23).IsOn,Codes[23],'');
   Code22 := IF($.Bit(B,22).IsOn,Codes[22],'');
   Code21 := IF($.Bit(B,21).IsOn,Codes[21],'');
   Code20 := IF($.Bit(B,20).IsOn,Codes[20],'');
   Code19 := IF($.Bit(B,19).IsOn,Codes[19],'');
   Code18 := IF($.Bit(B,18).IsOn,Codes[18],'');
   Code17 := IF($.Bit(B,17).IsOn,Codes[17],'');
   Code16 := IF($.Bit(B,16).IsOn,Codes[16],'');
   Code15 := IF($.Bit(B,15).IsOn,Codes[15],'');
   Code14 := IF($.Bit(B,14).IsOn,Codes[14],'');
   Code13 := IF($.Bit(B,13).IsOn,Codes[13],'');
   Code12 := IF($.Bit(B,12).IsOn,Codes[12],'');
   Code11 := IF($.Bit(B,11).IsOn,Codes[11],'');
   Code10 := IF($.Bit(B,10).IsOn,Codes[10],'');
   Code09 := IF($.Bit(B,09).IsOn,Codes[09],'');
   Code08 := IF($.Bit(B,08).IsOn,Codes[08],'');
   Code07 := IF($.Bit(B,07).IsOn,Codes[07],'');
   Code06 := IF($.Bit(B,06).IsOn,Codes[06],'');
   Code05 := IF($.Bit(B,05).IsOn,Codes[05],'');
   Code04 := IF($.Bit(B,04).IsOn,Codes[04],'');
   Code03 := IF($.Bit(B,03).IsOn,Codes[03],'');
   Code02 := IF($.Bit(B,02).IsOn,Codes[02],'');
   Code01 := IF($.Bit(B,01).IsOn,Codes[01],'');
   RETURN TRIM(Code01 + Code02 + Code03 + Code04 + Code05
             + Code06 + Code07 + Code08 + Code09 + Code10
             + Code11 + Code12 + Code13 + Code14 + Code15
             + Code16 + Code17 + Code18 + Code19 + Code20
             + Code21 + Code22 + Code23 + Code24 + Code25
             + Code26 + Code27 + Code28 + Code29 + Code30
             + Code31 + Code32,ALL);
END;

This function takes two parameters: an UNSIGNED4 bitmap, and a STRING32 containing the valid codes for each position. This is a simple “brute force” approach that will test each bit in the bitmap and either take the code value for that position or a blank string. The TRIM function is using the ALL option to remove all spaces from the concatenated result string.

You can test the function in a builder window like this:

IMPORT BlogCode;

ValidCodes := ' VURQLKGFEDCABIJOWY';
output(BlogCode.Bit2Code(32,ValidCodes),named('Bit2Code_32'));		  //L
output(BlogCode.Bit2Code(4096,ValidCodes),named('Bit2Code_4096'));	//A
output(BlogCode.Bit2Code(512,ValidCodes),named('Bit2Code_512'));		//E
output(BlogCode.Bit2Code(544,ValidCodes),named('Bit2Code_544'));		//LE
output(BlogCode.Bit2Code(10000000000000000b,ValidCodes),named('Bit2Code_10000000000000000b'));	//O
output(BlogCode.Bit2Code(10000000010000000b,ValidCodes),named('Bit2Code_10000000010000000b'));	//GO
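
To double-check the expected results in the comments above, here is a small Python sketch of the same decoding idea. It is not ECL, the function name bit2code is my own, and it uses a loop where the ECL version spells out all 32 tests.

```python
# Python sketch (not ECL): collect the code character for every bit that is on.

def bit2code(bitmap: int, codes: str) -> str:
    padded = codes.ljust(32)    # pad to 32 characters, like STRING32
    picked = [padded[i] for i in range(32) if bitmap & (1 << i)]
    return "".join(picked).replace(" ", "")   # same effect as TRIM(...,ALL)

valid_codes = " VURQLKGFEDCABIJOWY"
print(bit2code(32, valid_codes))                    # L
print(bit2code(4096, valid_codes))                  # A
print(bit2code(544, valid_codes))                   # LE
print(bit2code(0b10000000010000000, valid_codes))   # GO
```

Bit 1 maps to the leading space in the code string, so a bitmap of 1 decodes to an empty result.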

Once I had solved his specific problem, I decided to expand the solution to be a more generic tool. Instead of using the bitmap to indicate a set of single letter codes, why not have it indicate a set of strings of any length (up to 4K, in this example)? That’s what this next version does:

IMPORT $;

EXPORT Bit2String := MODULE
   EXPORT Layout := RECORD
      STRING Txt{MAXLENGTH(4096)};
   END;

   EXPORT Func(UNSIGNED4 B, DATASET(Layout) Codes) := FUNCTION

      br := ROW({''},Layout);

      Code32 := IF($.Bit(B,32).IsOn,Codes[32],br);
      Code31 := IF($.Bit(B,31).IsOn,Codes[31],br);
      Code30 := IF($.Bit(B,30).IsOn,Codes[30],br);
      Code29 := IF($.Bit(B,29).IsOn,Codes[29],br);
      Code28 := IF($.Bit(B,28).IsOn,Codes[28],br);
      Code27 := IF($.Bit(B,27).IsOn,Codes[27],br);
      Code26 := IF($.Bit(B,26).IsOn,Codes[26],br);
      Code25 := IF($.Bit(B,25).IsOn,Codes[25],br);
      Code24 := IF($.Bit(B,24).IsOn,Codes[24],br);
      Code23 := IF($.Bit(B,23).IsOn,Codes[23],br);
      Code22 := IF($.Bit(B,22).IsOn,Codes[22],br);
      Code21 := IF($.Bit(B,21).IsOn,Codes[21],br);
      Code20 := IF($.Bit(B,20).IsOn,Codes[20],br);
      Code19 := IF($.Bit(B,19).IsOn,Codes[19],br);
      Code18 := IF($.Bit(B,18).IsOn,Codes[18],br);
      Code17 := IF($.Bit(B,17).IsOn,Codes[17],br);
      Code16 := IF($.Bit(B,16).IsOn,Codes[16],br);
      Code15 := IF($.Bit(B,15).IsOn,Codes[15],br);
      Code14 := IF($.Bit(B,14).IsOn,Codes[14],br);
      Code13 := IF($.Bit(B,13).IsOn,Codes[13],br);
      Code12 := IF($.Bit(B,12).IsOn,Codes[12],br);
      Code11 := IF($.Bit(B,11).IsOn,Codes[11],br);
      Code10 := IF($.Bit(B,10).IsOn,Codes[10],br);
      Code09 := IF($.Bit(B,09).IsOn,Codes[09],br);
      Code08 := IF($.Bit(B,08).IsOn,Codes[08],br);
      Code07 := IF($.Bit(B,07).IsOn,Codes[07],br);
      Code06 := IF($.Bit(B,06).IsOn,Codes[06],br);
      Code05 := IF($.Bit(B,05).IsOn,Codes[05],br);
      Code04 := IF($.Bit(B,04).IsOn,Codes[04],br);
      Code03 := IF($.Bit(B,03).IsOn,Codes[03],br);
      Code02 := IF($.Bit(B,02).IsOn,Codes[02],br);
      Code01 := IF($.Bit(B,01).IsOn,Codes[01],br);
      RETURN (Code01 + Code02 + Code03 + Code04 + Code05
            + Code06 + Code07 + Code08 + Code09 + Code10
            + Code11 + Code12 + Code13 + Code14 + Code15
            + Code16 + Code17 + Code18 + Code19 + Code20
            + Code21 + Code22 + Code23 + Code24 + Code25
            + Code26 + Code27 + Code28 + Code29 + Code30
            + Code31 + Code32)(Txt <> '');
   END;
END;

This code uses the same “brute force” approach, but instead of building characters for a string, it defines the records that will go into the result record set. That means that instead of using the TRIM function to get rid of blank spaces, we simply append all the result records into a record set that we filter to eliminate the blank records.

Testing this version is similar to the previous, but we pass the dataset of text values as the second parameter. Multiple bits turned on result in multiple records in the result set.

IMPORT BlogCode;
SetStrings := ['','The Voice','American Idol','The X Factor','Law & Order',
               'Lost','Glee','The Daily Show','Revenge','Hart of Dixie',
                'Walking Dead','True Blood','Sopranos','Game of Thrones',
                'Downton Abbey','Poirot','Rizzoli & Isles','Suits','Swamp People',
                'Pawn Stars','Firefly','AC 360','Fox & Friends','Hardball',
                'Mike & Molly','60 Minutes','The Ellen Show','Elementary','Sherlock',
                'Here Comes Honey Boo Boo','Doctor Who'];
ds := DATASET(SetStrings,BlogCode.Bit2String.Layout);

OUTPUT(BlogCode.Bit2String.Func(110011b,ds));
OUTPUT(BlogCode.Bit2String.Func(64,ds));
OUTPUT(BlogCode.Bit2String.Func(64928,ds));
OUTPUT(BlogCode.Bit2String.Func(0100000000000000000000000000000b,ds));
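
If you’d like to predict which records come back before running the ECL, this Python sketch (not ECL; the name bit2strings is hypothetical, and only the first seven strings from the set above are used to keep it short) applies the same rule: keep the entry for every bit that is on, then drop the blanks.

```python
# Python sketch (not ECL): one output entry per bit that is on, blanks removed.

def bit2strings(bitmap: int, strings: list) -> list:
    return [s for i, s in enumerate(strings[:32])
            if bitmap & (1 << i) and s != ""]

# First seven entries of SetStrings, enough for the two bitmaps below.
shows = ["", "The Voice", "American Idol", "The X Factor",
         "Law & Order", "Lost", "Glee"]

print(bit2strings(0b110011, shows))   # ['The Voice', 'Law & Order', 'Lost']
print(bit2strings(64, shows))         # ['Glee']
```

The first bit of 110011b points at the blank first entry, which is why only three records come back for four bits.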

Bitmapped Dates

I was teaching an ECL class one day in Alpharetta, and we were having a discussion of the various ways you can store dates in ECL. Joe Keim came up with this solution.

This function will squeeze a standard YYYYMMDD 8-byte date string into three bytes of storage in an UNSIGNED3 bitmap. This format has the added advantage that it will still be perfectly sortable, just as a YYYYMMDD string would be.

EXPORT UNSIGNED3 Date2Bits(STRING8 YYYYMMDD) := FUNCTION
   YY := (UNSIGNED3)YYYYMMDD[1..4] << 9;
   MM := (UNSIGNED3)YYYYMMDD[5..6] << 5;
   DD := (UNSIGNED3)YYYYMMDD[7..8];
   RETURN YY | MM | DD;
END;

This Date2Bits function takes the first four characters of the string (YYYY) and casts that value into an UNSIGNED3 that is then shifted left 9 bits. The next two characters of the string (MM) are also cast into an UNSIGNED3 that is then shifted left 5 bits. The last two characters of the string (DD) are simply cast into an UNSIGNED3.

Those three UNSIGNED3 values are then ORed together using the Bitwise OR operator (|) to create the resulting bit map, where:

YYYY is in the first 15 bits
MM is in the next 4 bits
DD is in the last 5 bits

You can then use this Bits2Date function to reconstruct the YYYYMMDD string from the bitmap:

EXPORT STRING8 Bits2Date(UNSIGNED3 d) := FUNCTION
   STRING4 YY := INTFORMAT((d & 111111111111111000000000b) >> 9,4,0);
   STRING2 MM := INTFORMAT((d & 000000000000000111100000b) >> 5,2,1);
   STRING2 DD := INTFORMAT((d & 000000000000000000011111b),2,1);
   RETURN YY + MM + DD;
END;

This function uses the Bitwise AND operator (&) with a binary mask to isolate the year bits, and the result is then shifted right 9 bits to produce the integer value for the YYYY. The INTFORMAT function then right-justifies that integer value into a 4-byte string with leading blanks. The MM and DD values are treated the same way (MM is shifted right 5 bits and DD needs no shift), except their STRING2 results are formatted with leading zeroes. The final YYYYMMDD string is a simple concatenation of the three.

You can test these functions with code like this in a builder window:

IMPORT BlogCode;
d1 := '19500827';

UNSIGNED3 date1 := BlogCode.Date2Bits(d1);
date2 := BlogCode.Bits2Date(date1);

ds2 := DATASET([{'InDate',d1},
                 {'OutBitVal',Date1},
                 {'Re-Date',Date2}], 
               {string10 txt,string10 val});

OUTPUT(ds2,NAMED('Bits2Date_' + d1));
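
To see the numbers involved, here is a Python sketch of the same packing and unpacking (not ECL; date2bits and bits2date are my own lowercase stand-ins for the two ECL functions). It also checks the claim that the packed values sort in the same order as the YYYYMMDD strings.

```python
# Python sketch (not ECL): pack YYYYMMDD into 24 bits and unpack it again.

def date2bits(yyyymmdd: str) -> int:
    yyyy = int(yyyymmdd[0:4]) << 9    # year occupies the top 15 bits
    mm = int(yyyymmdd[4:6]) << 5      # month occupies the next 4 bits
    dd = int(yyyymmdd[6:8])           # day occupies the low 5 bits
    return yyyy | mm | dd

def bits2date(d: int) -> str:
    yyyy = (d & 0b111111111111111000000000) >> 9
    mm = (d & 0b000000000000000111100000) >> 5
    dd = d & 0b000000000000000000011111
    # Year is blank-padded to 4, month and day are zero-padded to 2.
    return f"{yyyy:4d}{mm:02d}{dd:02d}"

packed = date2bits("19500827")
print(packed)              # 998683
print(bits2date(packed))   # 19500827

dates = ["20240229", "19500827", "19991231", "20000101"]
print(sorted(dates) == sorted(dates, key=date2bits))   # True
```

The packed value always fits in three bytes because the largest possible result, for 99991231, is still below 2 to the power of 24.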

Viewing the Bitmap

Sometimes you just want to see what the bitmap actually looks like. You can, of course, just output the integer value, but a bitmap is best viewed in binary representation. So when Peter Vennel needed to create a string representation of a bitmap, here’s my code to do that:

IMPORT $;

EXPORT STRING32 Bit2Str(UNSIGNED4 Bin) := FUNCTION

   STRING1 Bin2Str(UNSIGNED1 Bit) := IF($.Bit(Bin,Bit).IsOn,'1','0');

    RETURN Bin2Str(32) + Bin2Str(31) + Bin2Str(30) + Bin2Str(29) +
           Bin2Str(28) + Bin2Str(27) + Bin2Str(26) + Bin2Str(25) +
           Bin2Str(24) + Bin2Str(23) + Bin2Str(22) + Bin2Str(21) +
           Bin2Str(20) + Bin2Str(19) + Bin2Str(18) + Bin2Str(17) +
           Bin2Str(16) + Bin2Str(15) + Bin2Str(14) + Bin2Str(13) +
           Bin2Str(12) + Bin2Str(11) + Bin2Str(10) + Bin2Str(9) + 
           Bin2Str(8)  + Bin2Str(7)  + Bin2Str(6)  + Bin2Str(5) +
           Bin2Str(4)  + Bin2Str(3)  + Bin2Str(2)  + Bin2Str(1);
  END;

This simple function just relies on the previously defined Bit.IsOn function to generate either a “1” or “0” character in each bit position. The result is just the concatenation of all 32 characters, showing you exactly what the bitmap looks like.

Test with this code in a builder window and you’ll see the result:

IMPORT BlogCode;
val := 532860;

OUTPUT(BlogCode.Bit2Str(6));   //00000000000000000000000000000110
OUTPUT(BlogCode.Bit2Str(val)); //00000000000010000010000101111100
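
The two expected strings above can be confirmed with a short Python sketch (not ECL; bit2str is my own name for it). It walks the bit positions from 32 down to 1, just as the ECL concatenation does.

```python
# Python sketch (not ECL): build the 32-character binary string bit by bit.

def bit2str(value: int) -> str:
    return "".join("1" if value & (1 << (pos - 1)) else "0"
                   for pos in range(32, 0, -1))

print(bit2str(6))        # 00000000000000000000000000000110
print(bit2str(532860))   # 00000000000010000010000101111100
```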

Convert to Hexadecimal

I got an email from Jarvis Robinson, who needed a way to display text and the corresponding Hexadecimal values side by side. So I wrote this String2HexString function to do that:

EXPORT String2HexString(STRING DataIn) := FUNCTION

   STRING2 Str2Hex(STRING1 StrIn) := FUNCTION
      //CHOOSE is 1-based, so a value of 0 falls through to the final default '0'
      STRING1 HexVal(UNSIGNED1 val) :=
         CHOOSE(val,'1','2','3','4','5','6','7','8',
                    '9','A','B','C','D','E','F','0');
      UNSIGNED1 Char1 := (>UNSIGNED1<)StrIn >> 4;          //high nibble
      UNSIGNED1 Char2 := ((>UNSIGNED1<)StrIn & 00001111b); //low nibble
      RETURN HexVal(Char1) + HexVal(Char2);
   END;

   Rec := {STRING Hex{MAXLENGTH(1024)}};
   //one record per input character, each holding that character's two hex digits
   ds := DATASET(LENGTH(TRIM(DataIn)),
                 TRANSFORM(Rec,
                           SELF.Hex := Str2Hex(DataIn[COUNTER])));
   HexOut := ROLLUP(ds,TRUE,TRANSFORM(Rec,SELF.Hex := LEFT.Hex + RIGHT.Hex));
   RETURN DATASET([{DataIn,HexOut[1].Hex}],
                  {STRING Txt{MAXLENGTH(1024)},
                   STRING Hex{MAXLENGTH(1024)}});
END;

This function starts with the Str2Hex function nested within, which will produce two hexadecimal characters for a single string character. It has its own HexVal function that returns the hex character for a given value. The Char1 definition uses the (>UNSIGNED1<) type transfer shorthand syntax to treat the StrIn character as an UNSIGNED1 value, so that the Bitshift right operator (>>) will work (all bitwise operations can only be done on integer types). This simply moves the left 4 bits to the righthand nibble, making the UNSIGNED1 a value from zero through 15. The Char2 definition also uses the (>UNSIGNED1<) type transfer shorthand syntax to treat the StrIn character as an UNSIGNED1 value, so that the Bitwise AND operator (&) will work. This simply removes the left 4 bits, again making the UNSIGNED1 a value from zero through 15. The return value concatenates the two hex characters into the STRING2 result.

The key to this function being able to work with any size string (up to 512 bytes, as written, but you can modify that if needed) is the DATASET definition. This form of DATASET creates a new dataset with as many records as there are characters in the DataIn parameter. More importantly, the inline TRANSFORM populates each record with the hexadecimal equivalent of each character in DataIn.

That leaves only the ROLLUP to composite all those records into a single one with all the Hex values concatenated into a single string. The inline form of DATASET then allows a nicely formatted one-record result to present both the input text and its hex equivalent as the result.

Testing the code this way will show its usefulness:

IMPORT BlogCode;
STRING dat := 'efgh\tBCD'; //x'6566676809424344'

OUTPUT(BlogCode.String2HexString(dat));

Note that the input string contains a tab (\t) character, which can be difficult to see in text mode. But looking at the result in hex mode, you can clearly see the 09 hex representation of the tab.
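
The nibble split at the heart of Str2Hex can be tried out in Python as well. This is only a sketch under my own assumptions (not ECL): str2hex and HEX_DIGITS are hypothetical names, the input is treated as single-byte characters, and it skips the TRIM and the dataset handling that the ECL function does.

```python
# Python sketch (not ECL): two hex digits per byte via shift and mask.

HEX_DIGITS = "0123456789ABCDEF"

def str2hex(text: str) -> str:
    out = []
    for byte in text.encode("latin-1"):   # assumes single-byte characters
        high = byte >> 4                  # left 4 bits moved to the right nibble
        low = byte & 0b00001111           # left 4 bits removed
        out.append(HEX_DIGITS[high] + HEX_DIGITS[low])
    return "".join(out)

print(str2hex("efgh\tBCD"))   # 6566676809424344
```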

That’s enough for this article. We’ll continue with more useful code examples in subsequent articles.
