HPCC Systems 8.0.0 – Cross Platform Highlights
While the main focus of HPCC Systems 8.0.0 Gold is the Cloud Native platform, there are many new features and improvements that users of both the Cloud Native and Bare Metal platforms will be pleased to see implemented. Some of them follow on from changes made in recent minor releases, so you might also like to catch up on the cross platform features and enhancements covered in our blogs about those releases.
More details about our Cloud Native platform are available on this wiki page, which provides a number of resources to help you get started, including blog posts, Helm file links and quick start videos. Specific details of new features and enhancements now available on the Cloud Native platform are available in our HPCC Systems 8.0.0 – Cloud Native Platform Highlights blog.
In this blog, I’m going to direct you to the main cross platform highlights, although there may be additional fixes that haven’t made it onto my list! I always recommend taking a look at the Release Notes and Red Book for additional information and to see the complete list of changes made.
As usual, improvements made in HPCC Systems 8.0.0 cover a wide range of areas:
- Performance and usability including user interfaces such as ECL Watch, the ECL Playground and VS Code
- Security
- ECL Language
- Dynamic ESDL
- Connectors and Plugins
In addition, a selection of significant fixes gets a mention alongside the main features and enhancements.
Performance and Usability Improvements
The platform development team is always on the lookout for opportunities to make our platform even faster and easier to use. Changes included here focus on making your queries run faster, or generally improving the way components operate.
Performance
In the world of ROXIE, there are some changes that have a positive impact on general usage. ROXIE now recovers better from dropped UDP packets, which makes it much more resilient to potential network problems. ROXIE also now handles oversized continuation data much more efficiently, which is great news, particularly for those using STEPPED index reads.
Also in ROXIE, an issue which could cause CPU starvation or other difficult to predict effects has been fixed in HPCC Systems 8.0.0. In this case, ROXIE was not operating as intended. A workaround is available for earlier versions of HPCC Systems, which involves setting the prestartAgentThreads option to true in the ROXIE configuration. More details are available in the JIRA issue "Roxie worker threads were incorrectly given elevated priority if started on demand".
These additional changes will also have a substantial impact on ROXIE performance:
- Improve the efficiency of the index LRU cache
- Improve cache behaviour for concurrent page requests
- Reduce the overhead of the index node cache
In addition to the ROXIE performance enhancements mentioned above, the memory required to compile complex queries has been reduced, which may help compile times when memory consumption is close to the limits.
Usability
In terms of usability, the following deserve a special mention:
- Original casing of ECL attribute names is now preserved in the dependencies.xml file
- It is now easier to locate an activity that is failing due to a SOAPCALL made with a malformed URL
- Copying data between Bare Metal and Cloud Native environments. The Despray and Copy functions now generate a description of the sprayed file, ready for importing. This adds to previous work completed in this area, including allowing logical files to be desprayed to an Azure blob target and the ability to import a description of an exported file
- DFU Workunits now indicate the type of spray used by specifying delimited, JSON, XML or fixed
- Fixed: invalid XML in the WsDFUXRef response
The following issue is of special note, because the fix may potentially change the way existing queries behave:
UI Usability Improvements – ECL Watch and the ECL Playground
There are two new UI improvements that are now available for use in ECL Watch that I particularly want to mention. Both were suggested by users and although they may seem like small changes, they will certainly enhance your use of ECL Watch in the areas concerned.
- The Workunit Details page now shows all SOAPCALLs used within a query. This makes it very easy to answer the ‘which gateways is this query using?’ question
- ECL Watch now displays workunit skew for Thor jobs. A link to node IP addresses has been added to the Thor Disk Usage page and a skew summary is now shown on the Output tab of the Workunit Details page
In the ECL Playground, a Publish option has been added. The interpretation of the ECL Playground Queue has been improved to accurately distinguish between submitted jobs which need to be run on the Thor cluster and queries which are deployed to ROXIE using the Publish command (not Run).
Security
As always, I’d like to draw your attention to some security improvements in a number of different areas.
HPCC Systems now supports HashiCorp Vault, which is a third-party secrets management service. We have also implemented encryption in transit on ROXIE, by adding an option to encrypt all traffic within the ROXIE cluster.
We have also implemented security improvements for VS Code and ECL Watch. The ws_codesign service has been extended to provide a new verify service so that the VS Code plugin can verify signatures. ECL Watch is also now excluded from search engine results.
ECL Language
A number of changes have been made to the ECL SendEmail Standard Library function.
- High priority messages are now supported. If highPriority is set to true the message is sent with high priority. The default setting is false (normal priority)
- Cc and Bcc recipients are now supported
- Email recipient limitations have been lifted. Previously, the To, Cc and Bcc recipient lists were restricted to 1000 characters
- There are now two new SetExpiryDays options in the Standard File Library which can be used to set or clear a file's expiry using fileservices
Documentation about the SendEmail Standard Library function is available in the HPCC Systems Standard Library Reference Guide.
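As a minimal sketch of how the new options might be used together, the call below sends a high priority message with Cc and Bcc recipients. The addresses, the mail server name and the port are hypothetical, and the positional parameter order (to, subject, body, mail server, port, sender, cc, bcc, highPriority) is my assumption, so please check it against the Standard Library Reference Guide before relying on it.

```ecl
IMPORT STD;

// All addresses and the SMTP server below are placeholders for illustration.
// The parameter order is assumed, not confirmed by this blog.
STD.System.Email.SendEmail('ops@example.com',               // to
                           'Nightly build failed',          // subject
                           'See the workunit for details.', // body
                           'smtp.example.com',              // mail server (hypothetical)
                           25,                              // port (hypothetical)
                           'hpcc@example.com',              // sender
                           'lead@example.com',              // cc
                           'audit@example.com',             // bcc
                           TRUE);                           // highPriority
```

Leaving highPriority at its default of false sends the message with normal priority.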
The following language feature, already supported in ROXIE, is now also supported in Thor. Documentation about how this function works is available in the ECL Language Reference Guide.
Dynamic ESDL
Features of particular note for this area in HPCC Systems 8.0.0 include enhancements to the ESDL integration script, which can now make MySQL calls using the same MySQL plugin as ECL. You can take advantage of the following newly supported features:
- Reading from and writing to MySQL databases
- Creating services and methods written purely in ESDL integration scripts, with no backend ROXIE, web service, Java or C++ required
- Calling one or more ROXIE queries or other web services from within scripts
- Changing the target and source locations on the fly
- Building structured output
- Dynamically tokenizing strings, with tokens iterated or accessed by position using standard script syntax
- Building structured variables
- Dynamically adding namespaces to target content
The dynamic log manager configuration is now available in ESDL bindings, and WSDL namespace generation has also been improved.
Connectors and Plugins
Our integrated Spark plugin (find out more here) has been upgraded to version 3.1.1.
Our Java API (HPCC4J) project provides a set of Java based libraries and tools which facilitate interaction between HPCC Systems Web Services and C++ based tools. The following improvements have been made to this project in HPCC Systems 8.0.0:
- Support for Code Signing/verifying via WsCodesign
- XRef support via WsDFUXRef
- Top Level Key support
- HPCC Systems File random access support (via DAFILESRV)
- DFSClient: Ability to resume a file read that was suspended or that failed
- Server side filtering support in the Spark Connector
- Avro and Parquet file support is now available
Download, Release Notes, Documentation and Training
HPCC Systems Bare Metal platform users can download the latest version along with the Release Notes and any supporting documentation required in the usual way, from the HPCC Systems Website.
Keep up to date with known issues and workarounds using the HPCC Systems Red Book.
Visit our Training Center for online courses in Learning ECL, Machine Learning, HPCC Systems Management and Administration.
Next Steps
Future development work focuses on the following areas:
- Providing significant performance enhancements, specifically looking at the performance of some compiles under heavy usage
- Security features and enhancements
- Improvements to ECL Watch using the React framework
HPCC Systems is Celebrating 10 Years as an Open Source Big Data Analytics Platform
Join us as we mark this anniversary event with users, colleagues, ambassadors and collaborators via a series of video podcasts. It’s great to reflect on how we got to where we are today with the stories shared in this series and look forward to what may lie ahead in the future. View the full list of podcasts on our 10 Year Anniversary Podcast Series Wiki.
Featured Podcast
Join Flavio Villanustre (VP Technology and CISO, LexisNexis Risk Solutions Group), Krishna Turlapathi (Director, Software Engineering) and Ken Rowland (Consulting Software Engineer) as they discuss different metrics comparing our Bare Metal and Cloud Native Platforms, including ROXIE performance.