Thu Oct 21, 2021 6:06 pm
Login Register Lost Password? Contact Us


Fully Qualified String Field Definitions

Questions around writing code and queries

Mon Dec 19, 2016 4:24 pm Change Time Zone

I have a question regarding STRING field definitions.

Am I better off to fully qualify my STRING fields or allow them to be variable length?

For example I am working with a data file called BuildFax which contains multiple string data elements which can be up to 1000 characters in length.

When I define the ECL fields as STRING1000 the strings are padded and difficult to view in ECL Watch.

If I define the ECL fields simply as STRING, the string fields are adjusted to the lengthof the field value and much easier to read in ECL Watch.

With regards to my question, does either option affect the size of my dataset in memory or on disk?

What is the best practice I should follow?

Thanks
Chuck
chuck.beam
 
Posts: 20
Joined: Mon Dec 21, 2015 2:59 pm

Tue Jan 03, 2017 4:02 pm Change Time Zone

Chuck,
Am I better off to fully qualify my STRING fields or allow them to be variable length?
The standard answer to this question is:
  • IF you know the string is always going to contain n number of characters (like a US state code or zipcode field)
  • OR the string will always contain 1 to n characters where n is a small number and the average length of the actual data approaches the max (like most street address fields)
  • THEN you should define that field as a STRINGn.
  • ELSE IF n is a large number and the average length of the data is small compared to the maximum
  • THEN variable-length STRING would be best.
With regards to my question, does either option affect the size of my dataset in memory or on disk?
Yes, both affect the storage and memory size:
  • Fixed-length fields are always stored at their defined length
  • Variable-length STRING fields are stored with a leading 4-byte integer indicating the actual number of characters following that instance (like a Pascal string)
Therefore, if you define a string field that always contains 2 characters as a STRING2 it occupies two bytes of storage, but define it as a STRING and it will occupy six.

HTH,

Richard
rtaylor
Community Advisory Board Member
Community Advisory Board Member
 
Posts: 1600
Joined: Wed Oct 26, 2011 7:40 pm

Tue Jan 03, 2017 4:08 pm Change Time Zone

Thanks Richard,

Based on your comments, I should use the unqualified STRING for my current code.

The fields are defined in the raw data as STRING255, but they rarely contains any data at all (most records are blank).

This would require the least amount of space on disk and in memory, correct?

Chuck
chuck.beam
 
Posts: 20
Joined: Mon Dec 21, 2015 2:59 pm

Tue Jan 03, 2017 4:13 pm Change Time Zone

Chuck,

Yes, absolutely. If the string field is sparsely populated but possibly long when it does have data, then STRING is the way I would go.

HTH,

Richard
rtaylor
Community Advisory Board Member
Community Advisory Board Member
 
Posts: 1600
Joined: Wed Oct 26, 2011 7:40 pm


Return to Programming

Who is online

Users browsing this forum: No registered users and 1 guest

cron