The Value of Precise Semantics

There is a movement that says there should be one, and only one, way to perform any given task. This concept, famously championed by Python, has some merit. It allows for simple and easily understandable language syntax and it tends to mean that short stretches of code have an obvious semantic.

The flip side is readily appreciated if one imagines a rule that everyone had to write using 4th Grade English; the simplicity would suddenly become unwieldy verbosity. In such a language one typically finds that common sequences of code become memorized and typed idiomatically into the editor; thus the language does have higher-level structures, but they are unique to the individual programmer and unknown to the compiler.

ECL has a rather different approach. ECL is designed to allow the programmer’s intent to be specified as accurately, concisely and simply as possible, in that order. To be as generically applicable as possible it has a good range of entirely general functions that could be used, ‘Python-like’, to do everything. But it also has some very specific and tailored functions designed to do exactly what you want as efficiently as possible, both in terms of execution and expression.

To make a theoretical discussion concrete, let us consider the problem of the conditional expression: an expression whose value depends upon one or more Boolean expressions. The general solution for this is the IF statement. Thus:

IF ( a > 10, IF ( a > 15, 2, 1 ), IF ( a < 0, IF ( a < -10, -2, -1 ), 0 ));

The IF statement is perfectly general purpose; any flow of control can be handled by nesting the IF statements appropriately. If every IF has a nested IF down to some depth then you have a binary decision tree; if you only partially fill one or more branches then you can have any shape of tree you wish. In my example I always tested a value ‘a’, but you can actually use a different variable or combination of variables at each level. This is roughly equivalent to nested IF/ELSE statements in C++.
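As a minimal sketch of that last point, the following tests a different variable at each level and fills in only one branch with a nested IF. The names IsMember, Age and Discount, and the values assigned to them, are assumptions made up for illustration.

```ecl
// Hypothetical inputs, named here only for illustration.
BOOLEAN IsMember := TRUE;
INTEGER Age := 70;

// The outer IF tests one variable; the inner IF tests another.
// Only the TRUE branch holds a nested IF, giving a lopsided tree.
Discount := IF(IsMember, IF(Age > 65, 20, 10), 0);

OUTPUT(Discount);
```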

Suppose, however, that you have a very distinct flow of control: a cascade. You really want to try conditions one after another until you find the first one that is true. You can, of course, use IF; but if you want to make a glaring, obvious statement that this is a cascade then you can: enter the MAP statement.

MAP ( a > 15 => 2, a > 10 => 1, a < -10 => -2, a < 0 => -1, 0);

Now a quick challenge: how many of you saw what I was doing with the MAP straight away? Had the IF really conveyed that to you? I believe that the improvement in visual clarity given by MAP easily outweighs the need to learn a second syntax. Like the IF statement, the MAP allows each Boolean expression to be arbitrary. MAP corresponds to elsif in languages such as Perl.
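For readers who want to convince themselves that the two forms agree, here is a small sketch that defines both and compares them. The value given to a and the names ViaIf and ViaMap are assumptions for illustration only.

```ecl
// Hypothetical input value, chosen only for illustration.
INTEGER a := 12;

// Nested IF form from the text.
ViaIf := IF(a > 10,
            IF(a > 15, 2, 1),
            IF(a < 0, IF(a < -10, -2, -1), 0));

// Cascade form: conditions are tried in order and the first true one wins.
ViaMap := MAP(a > 15  => 2,
              a > 10  => 1,
              a < -10 => -2,
              a < 0   => -1,
              0);

// Compares the two results for the chosen value of a.
OUTPUT(ViaIf = ViaMap);
```

Note that the order of the MAP conditions matters: a > 15 has to come before a > 10, because the first true condition decides the result.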

But what about the case where the cascade follows a distinct pattern; specifically testing a single variable for a value? You can use MAP or IF; but you can also make it a little clearer what you are doing using CASE:

CASE(MyString, 'FRED' => 1, 'EDITH' => 2, 'JOHN' => 3, 0);

Again, a new syntax to be learned, but once learned the semantic “check a particular value against a list of them” screams from the page. This is a piece of code very often seen when encoding a string as a number to save space. I actually have a macro to generate case statements from data; available at http://www.dabhand.org/ECL/BuildCase.htm

There is one even more specific case: where you have a number from 1 to n and you want to pick the corresponding item from a list. In that case one uses CHOOSE:

CHOOSE(MyInt, 'FRED', 'EDITH', 'JOHN', 'UNKNOWN');
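CASE and CHOOSE pair up naturally: one encodes a string as a number and the other turns the number back into a string. The sketch below shows that round trip; the input value and the names NameCode and NameBack are assumptions for illustration.

```ecl
// Hypothetical input; NameCode and NameBack are illustrative names only.
STRING MyString := 'EDITH';

// Encode: check one value against a list of candidates, 0 if none match.
NameCode := CASE(MyString, 'FRED' => 1, 'EDITH' => 2, 'JOHN' => 3, 0);

// Decode: pick the item at position NameCode; the last parameter is the
// fallback used when the number is outside the range of listed items.
NameBack := CHOOSE(NameCode, 'FRED', 'EDITH', 'JOHN', 'UNKNOWN');

OUTPUT(NameCode);
OUTPUT(NameBack);
```

A string that is not in the list encodes to 0, which CHOOSE then maps to the fallback, so unknown names survive the round trip as 'UNKNOWN'.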

There are some even more specialist functions such as WHICH, REJECTED and CHOOSESETS that I might cover at a later date. However, for this blog I really just want to invite you to look at the different syntaxes and ask yourself a couple of questions: 1) how long would it take to learn the four syntaxes rather than just one? 2) doesn’t the code read rather better when you have the right tool for the job?

Assuming, or at least hoping, I have convinced you, it still leaves the question: “which should I use?” My rule of thumb is very simple: always use the most precise function you can without having to ‘bend’ anything. The reasoning is twofold: firstly, it allows someone ‘glancing’ at the code to readily see what you are doing, and secondly, it gives the compiler the strongest ‘heads up’ and therefore the highest chance of generating good code.