The Value of Precise Semantics

There is a movement that says there should be one, and only one, way to perform any given task. This concept, famously championed by Python, has some merit. It allows for simple and easily understandable language syntax and it tends to mean that short stretches of code have an obvious semantic.

The flip side is readily appreciated if one imagines that we suddenly imposed a rule that everyone had to write using 4th Grade English: the simplicity becomes unwieldy verbosity. In such a language one typically finds that common sequences of code become memorized and typed idiomatically into the editor. Thus the language does have higher-level structures; they are just unique to the individual programmer and unknown to the compiler.

ECL has a rather different approach. ECL is designed to allow the programmer’s intent to be specified as accurately, concisely and simply as possible, in that order. In order to be as generically applicable as possible it has a good range of entirely general functions that could be used, ‘Python-like’, to do everything. But it also has some very specific and tailored functions designed to do exactly what you want as efficiently as possible, both in terms of execution and expression.

To make a theoretical discussion concrete, let us consider the problem of the conditional expression: an expression whose value depends upon one or more Boolean expressions. The general solution for this is the IF statement. Thus:

IF ( a > 10, IF ( a > 15, 2, 1 ), IF ( a < 0, IF ( a < -10, -2, -1 ), 0 ));


The IF statement is perfectly general purpose; any flow of control can be handled by nesting IF statements appropriately. If every IF has a nested IF down to some depth then you have a binary decision tree; if you only partially fill one or more branches then you can have any shape of tree you wish. In my example I always tested a value ‘a’, but you can actually use a different variable or combination of variables at each level. This is roughly equivalent to nested IF/ELSE statements in C++.

Suppose, however, that you have a very distinct flow of control: a cascade. You really want to try conditions one after another until you find the first one that is true. You can, of course, use IF; but if you want to make a glaring, obvious statement that this is a cascade then you can. Enter the MAP statement:

MAP ( a > 15 => 2, a > 10 => 1, a < -10 => -2, a < 0 => -1, 0);

Now a quick challenge: how many of you saw what I was doing with the MAP straight away? Had the IF really conveyed that to you? I believe that the improvement in visual clarity given by MAP easily outweighs the need to learn a second syntax. Like the IF statement, the MAP allows each Boolean expression to be arbitrary. MAP corresponds to an if/elsif chain in languages such as Perl.
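For anyone who wants to check that the nested IF and the MAP really do give the same answers, here is a minimal sketch. The function names BandIf and BandMap, the Samples dataset and its values are mine, invented for illustration; the bodies are the IF and MAP expressions shown above.

```ecl
// Hypothetical wrappers around the two expressions from the text
INTEGER BandIf(INTEGER a) := IF(a > 10,
                                IF(a > 15, 2, 1),
                                IF(a < 0, IF(a < -10, -2, -1), 0));

INTEGER BandMap(INTEGER a) := MAP(a > 15  => 2,
                                  a > 10  => 1,
                                  a < -10 => -2,
                                  a < 0   => -1,
                                  0);

// Illustrative sample values, one per branch of the cascade
Samples := DATASET([{-20}, {-5}, {0}, {12}, {20}], {INTEGER a});

// One row per sample with both results side by side;
// the two computed columns should agree on every row
OUTPUT(TABLE(Samples, {Samples.a,
                       INTEGER viaIf  := BandIf(Samples.a),
                       INTEGER viaMap := BandMap(Samples.a)}));
```

The order of the MAP conditions matters: a > 15 has to be tested before a > 10, because the first true condition wins.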
But what about the case where the cascade follows a distinct pattern; specifically testing a single variable for a value? You can use MAP or IF; but you can also make it a little clearer what you are doing using CASE:

CASE(MyString, 'FRED' => 1, 'EDITH' => 2, 'JOHN' => 3, 0);

Again a new syntax to be learned, but once learned the semantic “check a particular value against a list of them” screams from the page. This is a piece of code very often seen when encoding a string as a numeric to save space. I actually have a macro to generate case statements from data; available at http://www.dabhand.org/ECL/BuildCase.htm
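As a rough illustration of that string-to-numeric encoding, the CASE can be wrapped in a function and applied to a column. This is only a sketch: NameCode, the People dataset and the field names are hypothetical, and the one-byte UNSIGNED1 result type is my assumption about what "for space" might look like in practice.

```ecl
// Hypothetical helper: the names and codes are those used in the CASE above
UNSIGNED1 NameCode(STRING s) := CASE(s,
                                     'FRED'  => 1,
                                     'EDITH' => 2,
                                     'JOHN'  => 3,
                                     0);   // any other string encodes as 0

// Illustrative input; MARY is deliberately not in the list
People := DATASET([{'FRED'}, {'JOHN'}, {'MARY'}], {STRING person});

// Pair each name with its numeric code: FRED gives 1, JOHN gives 3, MARY gives 0
OUTPUT(TABLE(People, {People.person,
                      UNSIGNED1 numericCode := NameCode(People.person)}));
```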
There is one even more specific case: where you have a numeric from 1->n and you want to pick the item from a list. In that case one uses CHOOSE:

CHOOSE(MyInt, 'FRED', 'EDITH', 'JOHN', 'UNKNOWN')
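A short sketch of how this behaves, including what happens when the number falls outside the list. CodeName is a hypothetical wrapper of mine; the list of values is the one in the CHOOSE above, whose last entry acts as the fallback.

```ecl
// Hypothetical helper wrapping the CHOOSE from the text
STRING CodeName(UNSIGNED1 n) := CHOOSE(n, 'FRED', 'EDITH', 'JOHN', 'UNKNOWN');

OUTPUT(CodeName(1));  // FRED
OUTPUT(CodeName(3));  // JOHN
OUTPUT(CodeName(0));  // UNKNOWN: 0 is outside 1..3, so the final value is returned
OUTPUT(CodeName(7));  // UNKNOWN
```

Read this way, CHOOSE is the natural inverse of the CASE encoding shown earlier: CASE turns the name into a small number, CHOOSE turns the number back into the name.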

There are some even more specialist functions such as WHICH, REJECTED and CHOOSESETS that I might cover at a later date. However, for this blog I really just want to invite you to look at the different syntaxes and ask yourself a couple of questions: 1) how long would it take to learn the four syntaxes rather than just one? 2) doesn’t the code read rather better when you have the right tool for the job?

Assuming, or at least hoping, I have convinced you, it still leaves the question: “which should I use?” My rule of thumb is very simple: always use the most precise function you can without having to ‘bend’ anything. The reasoning is twofold: firstly it allows someone ‘glancing’ at the code to readily see what you are doing, and secondly it gives the compiler the strongest ‘heads up’ and therefore the highest chance of generating good code.
