Difference between unicode and varunicode

Sun Dec 20, 2020 6:09 pm

Hi Team,

Can you share the details of the difference between UNICODE and VARUNICODE,
and between STRING and STRING[n], with an example?

Thanks in advance,
Manikandan N.

Mon Dec 21, 2020 4:06 pm

Hello Manikandan,

From a string length perspective, the main difference between the UNICODE and VARUNICODE value types in ECL is that the latter uses a null character to mark its end (i.e., it is null-terminated, like a C string), whereas the former does not.

Since the UNICODE value type in ECL is a UTF-16 encoded Unicode character string, in practical terms a VARUNICODE value needs two extra bytes (for the null character at the end of the string) to store the same text as a UNICODE value. See the example below:

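// MyRec is assumed here: field types inferred from the sizes noted in the comments below
MyRec  := {UNICODE1 F1, VARUNICODE1 F2};
MyData := DATASET([{'A','A'}],MyRec);

SIZEOF(MyData.F1); //result is 2 (16-bit code unit per char)
SIZEOF(MyData.F2); //result is 4 (16-bit code unit per char plus a 16-bit code unit for the null terminator)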

As for the comparison between the STRING and STRING[n] value types in ECL, again from a string length perspective: with STRING[n] you fix the length of the string at 'n' bytes (shorter values are padded with trailing spaces), whereas with plain STRING you leave the length variable, sized to whatever is needed to contain the passed value (the actual length is then stored as a leading 4-byte integer giving the number of characters in the string, similar to a Pascal string).
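
Since the question asked for an example here as well, the following is a minimal sketch of the STRING versus STRING[n] behaviour described above. The definition names FixedStr and VarStr are made up for illustration; the value types and the built-in LENGTH and TRIM functions are standard ECL.

```ecl
// Hypothetical definitions, for illustration only
STRING10 FixedStr := 'ABC';  // fixed length: stored as 'ABC' plus 7 trailing spaces
STRING   VarStr   := 'ABC';  // variable length: sized to fit the value

OUTPUT(LENGTH(FixedStr));        // 10: the padding counts toward the length
OUTPUT(LENGTH(VarStr));          // 3
OUTPUT(LENGTH(TRIM(FixedStr)));  // 3: TRIM removes the trailing spaces
```

If the fixed-length value is compared or concatenated later, remember that the trailing spaces travel with it unless you TRIM them.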

In practical terms, and depending on the characteristics of your data, choosing the right one of these two value types can affect both the storage footprint and the memory usage of your data manipulations. See a more detailed discussion focused on this type of decision here:
