To get the latest release candidate of our open source, high performance computing solution, go to the HPCC Systems Downloads page, select Candidate and the operating system you require, then click Apply. New to HPCC Systems? Find out more about how it works, our open source community and how to get started.
This is the first in a series of blogs covering the major performance improvements, usability improvements and new features ready for you to use in the HPCC Systems 6.2.x series. This blog focuses on some of the more significant performance improvements, which continue work started in HPCC Systems 6.0.0, particularly on multi-core support.
Parallel child query execution in Thor
In 6.0.0, we implemented the ability to execute child queries in parallel on Roxie. In 6.2.0, we have extended this to Thor, which can now also execute child queries on multiple threads. This is particularly beneficial when child queries are used together with parallel activity execution (‘stranding’, discussed in my previous blog). Without it, child queries can force serial operation despite the parallel strands, because each strand blocks the others while it executes its child query.
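As a rough illustration of the idea (not the Thor implementation), the following Python sketch runs a stand-in child query once per parent row, first one at a time and then on a pool of worker threads. The names `child_query`, `run_serial` and `run_parallel` are hypothetical and exist only for this example. In CPython, CPU-bound work like this will not actually run faster on threads, so the sketch shows the structure of the change rather than its speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def child_query(parent_row):
    """Stand-in for a child query: derive a result from one parent row."""
    return sum(range(parent_row["n"]))


def run_serial(parent_rows):
    # One child query at a time: each parent row waits for the previous one.
    return [child_query(row) for row in parent_rows]


def run_parallel(parent_rows, max_workers=4):
    # Child queries are handed to worker threads, so a slow one does not
    # hold up the others. map() returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(child_query, parent_rows))


parent_rows = [{"n": n} for n in (10, 1000, 5, 200)]
serial_results = run_serial(parent_rows)
parallel_results = run_parallel(parent_rows)
```

Both functions produce the same results in the same order; only the scheduling of the child queries differs.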
Memory Manager improvements
Many recent releases have included memory manager optimizations. Taken individually, changes in this area may seem small, but several of them made at the same time can add up to quite a significant overall performance boost. Some are related to each other, while others build on improvements already made, for example by removing bottlenecks. There are a number of memory manager improvements in 6.2.0. The one I’d like to draw your attention to is the implementation of block allocators for rows. The performance gained from this change is particularly significant when using parallel activity execution. It is yet another change that enables queries to run faster on machines with processors that have lots of cores. More improvements are coming in this area in 6.4.0, when queries using a PARALLEL PROJECT will run significantly faster.
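The post does not describe how the block allocators are implemented, so the following Python sketch is only a generic illustration of why allocating rows in blocks can help on many cores. It assumes a shared heap guarded by a lock and a per-thread allocator that reserves a whole block of row slots in one locked operation. `SharedHeap`, `BlockRowAllocator` and `count_lock_acquisitions` are hypothetical names, not HPCC Systems classes.

```python
import threading


class SharedHeap:
    """Toy shared heap: every reservation takes the same lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = 0
        self.lock_acquisitions = 0

    def reserve(self, count):
        # Reserve `count` consecutive row slots in one locked operation.
        with self._lock:
            self.lock_acquisitions += 1
            start = self._next_id
            self._next_id += count
            return start


class BlockRowAllocator:
    """Per-thread allocator that takes rows from the heap a block at a time."""

    def __init__(self, heap, block_size):
        self._heap = heap
        self._block_size = block_size
        self._next = 0
        self._end = 0  # empty block: the first allocation triggers a refill

    def allocate_row(self):
        if self._next == self._end:
            # Local block exhausted: go back to the shared heap once.
            self._next = self._heap.reserve(self._block_size)
            self._end = self._next + self._block_size
        row_id = self._next
        self._next += 1
        return row_id


def count_lock_acquisitions(rows, block_size):
    """Allocate `rows` rows and report how often the shared lock was taken."""
    heap = SharedHeap()
    allocator = BlockRowAllocator(heap, block_size)
    ids = [allocator.allocate_row() for _ in range(rows)]
    return heap.lock_acquisitions, ids


per_row_locks, _ = count_lock_acquisitions(1000, block_size=1)
per_block_locks, _ = count_lock_acquisitions(1000, block_size=100)
```

With a block size of 1 the shared lock is taken once per row. With a block size of 100 it is taken once per hundred rows, which is the kind of contention reduction that matters most when many strands allocate rows at the same time.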
Lookup joins in child queries
If a lookup join is used in a child query and the right-hand side (RHS) of the join is independent of the parent record, it is unnecessary (and very inefficient) to calculate the RHS and build the lookup table every time the child query is executed. In 6.2 it is possible to provide a HINT to the Thor engine that the RHS does not need to be re-evaluated every time, which can result in very significant performance improvements. A future version will extend the code generator to spot such cases automatically, without requiring the HINT to be used. An example of the type of code that can benefit from such a HINT is shown in the JIRA issue.
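To make the cost difference concrete, here is a minimal Python sketch of the general technique: build the lookup table once and reuse it across child query executions, rather than rebuilding it each time. It does not show the ECL HINT itself (see the JIRA issue for that), and `build_lookup`, `ChildQuery` and the `reuse_rhs` flag are hypothetical names used only for illustration.

```python
def build_lookup(rhs_rows):
    """Build the lookup table for the right-hand side, keyed on the join field."""
    table = {}
    for row in rhs_rows:
        table.setdefault(row["key"], []).append(row)
    return table


class ChildQuery:
    """Toy child query that lookup-joins its input against a fixed RHS."""

    def __init__(self, rhs_rows, reuse_rhs):
        self._rhs_rows = rhs_rows
        self._reuse_rhs = reuse_rhs
        self._cached = None
        self.builds = 0  # how many times the lookup table was built

    def _lookup_table(self):
        if self._reuse_rhs and self._cached is not None:
            return self._cached
        self.builds += 1
        table = build_lookup(self._rhs_rows)
        if self._reuse_rhs:
            self._cached = table
        return table

    def execute(self, left_rows):
        # Inner lookup join: emit one result per matching RHS row.
        table = self._lookup_table()
        return [
            (left["key"], left["value"], right["name"])
            for left in left_rows
            for right in table.get(left["key"], [])
        ]


rhs = [{"key": 1, "name": "a"}, {"key": 2, "name": "b"}]
# One list of left-hand rows per parent record, i.e. per child query execution.
executions = [
    [{"key": 1, "value": 10}],
    [{"key": 2, "value": 20}],
    [{"key": 3, "value": 30}],
]

rebuild_every_time = ChildQuery(rhs, reuse_rhs=False)
build_once = ChildQuery(rhs, reuse_rhs=True)
rebuilt_results = [rebuild_every_time.execute(rows) for rows in executions]
reused_results = [build_once.execute(rows) for rows in executions]
```

Both variants return identical join results, but the first builds the lookup table once per execution while the second builds it once in total. Reuse is only safe because the RHS does not depend on the parent record.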
New ECL Keyword LIKELY/UNLIKELY
The new LIKELY keyword (and its UNLIKELY counterpart) can be used to give hints to the code generator about whether a condition is going to filter most of the records out, or just a few. The code generator can use this to inform resourcing decisions, for example, whether it’s better to spill or to duplicate the code. In 6.2.0, the effect on generated code is fairly minimal, but future improvements in the code generator may be able to make much more use of this information, so it’s worth starting to use the keywords now to get the maximum benefit. This feature is particularly relevant to generated code such as SALT and KEL, which know exactly what the likelihoods are for many of the filters they generate.
The next blog in this series will highlight the recent improvements to embedded language support featuring projecting fields into EMBED, constant folding and some details about the new Couchbase plugin.
- The full list of JIRA issues resolved as fixed in 6.2.0 is available using this filter.
- The complete list of the HPCC Systems Open Source Project roadmap items resolved as fixed in 6.2.0 is available in this JIRA filter.
- Read about the major features and enhancements made in the parent release 6.0.0 in the following blog posts: feature highlights part 1 and feature highlights part 2.
- View the list of roadmap items completed in 6.0.0.