HPCC Systems 7.4.0 is now available for download
We are very pleased to announce that HPCC Systems 7.4.0 is now available for download.
View the release notes for the full list of changes. The supporting documentation is available on our website and we recommend that you browse the RedBook for additional information about specific items of note. Highlights of this release include:
Security
- DALI Whitelisting - This new feature prevents unauthorized DALI clients from communicating with DALI. If an unauthorized client tries to connect to DALI, it is rejected with an "Access denied" type error.
All components defined in the local environment (i.e. the environment that DALI belongs to) are automatically whitelisted, so the feature should have no visible effect if no clients outside the environment access DALI. DALI clients that are not defined in the environment need to be explicitly whitelisted. Entries can be added or removed, and the environment updated in DALI, without restarting the environment. This feature is switched on by default.
Spark users
- We now support Spark 2.4, which includes UDF aggregation through pandas. This extends Spark's default aggregation functions and therefore its analytics capabilities.
- Implemented DataSource API version 1 - Support for DataSource API v1 allows Spark-HPCC Systems users to use a more familiar API that is unified between Scala & Python. The new DataSource API also increases performance significantly for Python reading and writing.
- There is a new converter that translates Spark filters into HPCC Systems filters.
- We now support a query limit in DFSCLIENT, which allows users to fetch a small sample of data to look at, rather than having to wait until the entire dataset is available.
- We have added support for reading XML and CSV files, removing the previous restriction to flat files.
Embedded Java users
- ACTIVITY flag improvements. When using the ACTIVITY flag on embedded code, it is sometimes useful within that code to get information about the number of slaves, channels, etc., and the slave that this particular machine is running on. This information was not available in Java embed code prior to this improvement.
ECL Standard Library Features
- The DataPatterns bundle, which provides some basic data profiling and research tools, is now included in the standard library.
- New Std.File functions allow users to choose which dfuserver to use for spraying. This is useful, for example, where multiple dfuservers are available and one is busy, or where one is likely to be faster for the source and target locations.
- New EditDistanceWithinMax function. This returns the edit distance up to a specified maximum. It complements the existing related functions, which return either the edit distance with no maximum or a true/false result indicating whether the edit distance is within a maximum.
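To illustrate the idea behind an edit distance that is bounded by a maximum, here is a minimal Python sketch. It is not the ECL implementation: the function name, its parameters and the choice to return max_distance + 1 once the bound is exceeded are all assumptions made for this example. The point it shows is that the calculation can stop early as soon as the distance is known to exceed the maximum.

```python
def edit_distance_within_max(left: str, right: str, max_distance: int) -> int:
    """Levenshtein distance with an upper bound.

    Returns the true distance when it is <= max_distance, otherwise
    max_distance + 1 (a sentinel chosen for this sketch only).
    """
    over = max_distance + 1

    # The difference in length is a lower bound on the edit distance.
    if abs(len(left) - len(right)) > max_distance:
        return over

    # previous[j] holds the distance between left[:i-1] and right[:j].
    previous = list(range(len(right) + 1))
    for i, left_char in enumerate(left, start=1):
        current = [i]
        for j, right_char in enumerate(right, start=1):
            cost = 0 if left_char == right_char else 1
            current.append(min(
                previous[j] + 1,         # delete from left
                current[j - 1] + 1,      # insert into left
                previous[j - 1] + cost,  # substitute (or match)
            ))
        # Row minimums never decrease, so once a whole row is over the
        # bound the final distance must be over it as well.
        if min(current) > max_distance:
            return over
        previous = current

    return min(previous[-1], over)
```

With a generous maximum the function behaves like a plain edit distance. With a small maximum it gives up early, which is where the saving comes from when comparing many strings.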
ECL Language New Features
ECL IDE
- A Chrome engine has now been added into the ECL IDE to display the embedded ECL Watch HTML pages.
ECL Watch
- Improvements to Target Clusters and Cluster Processes, including column sorting.
- Graphs now pop out to enable you to see more of your graph (instead of re-rendering in a new browser tab).
We are working towards providing feedback to users regarding workunits that might not be performing optimally, by spotting some common errors/symptoms that we see on slow workunits. These new features are examples of this approach:
- New analyzeWorkunit #option and configuration option - After ECL queries are executed (Thor jobs only), the workunit may be analysed to identify any potential issues, which are reported in ECL Watch’s Warnings and Errors section.
- Analyse workunit stats for distribute output row skews - In this case we are able to spot that the output of a DISTRIBUTE was heavily skewed, making the job significantly slower than it would otherwise have been. Sometimes it isn’t possible to avoid skew in such cases, but it can often indicate a poor choice of join condition.
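As a rough illustration of what a skew check on DISTRIBUTE output involves, the Python sketch below measures how far the busiest node sits above the average row count. It is not the analyser's actual rule: the function names, the skew formula and the 0.5 threshold are assumptions made for this example only.

```python
from statistics import mean


def max_skew(rows_per_node):
    """Relative amount by which the busiest node exceeds the average.

    0.0 means perfectly even; 1.0 means the busiest node holds twice
    the average number of rows. (Formula assumed for this sketch.)
    """
    average = mean(rows_per_node)
    if average == 0:
        return 0.0
    return (max(rows_per_node) - average) / average


def is_skewed(rows_per_node, threshold=0.5):
    """Flag a distribution whose skew exceeds a hypothetical threshold."""
    return max_skew(rows_per_node) > threshold


# Hypothetical row counts per node after a DISTRIBUTE.
even = [100, 100, 100, 100]
uneven = [700, 100, 100, 100]

print(is_skewed(even))    # evenly spread rows are not flagged
print(is_skewed(uneven))  # one node holds most of the rows
```

In the uneven case one node has to process seven times as many rows as each of the others, so the whole job waits on that node. That is the kind of symptom this workunit analysis is intended to surface.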
Documentation
- Brazilian Portuguese versions of the documentation have been added to the sources and will be made available via the HPCC Systems website in the coming weeks.
Have questions or need a tip? Post in our Developer forum.
Found an issue or have a feature request? Raise a JIRA issue.
Want to contribute? Walk through the process and take a look at the notes for developers included in the readme in our GitHub repository.
Subscribe to our Newsletter.
Read our blog.
Join us at one of our monthly Tech Talk webcasts.
- lchapman