Getting up and running with HPCC Systems

Note: This blog has been updated to reflect the most current versions of the ECL IDE and ECL Watch, as well as the current play cluster information.

If you are a first-time user who is curious to learn what HPCC Systems can do, or have been using our bare metal version for some time and would like to test drive our Cloud Native Platform, why not take a look using our play cluster? This blog includes a number of useful takeaways, providing you with a step-by-step guide to using the ECL IDE or VS Code as well as ECL Watch.

The following provides a high-level overview, designed to help new users get started with HPCC Systems and ECL (Enterprise Control Language). If you want to jump right in and set up your own HPCC Systems cloud native cluster, use our Cloud Native Platform wiki page, where you will find everything you need to get set up.

Future blogs in this series are in progress, focusing on:

  • Importing and working with new data
  • Setting up your own local cluster via Docker Desktop and Kubernetes
  • Comparing the ECL IDE with the ECL Extension for Visual Studio Code

What is HPCC Systems

In case you have not heard of HPCC Systems or ECL, here is a little bit of background.

Development on HPCC Systems began in late 1999. The aim was to provide a robust and easy-to-use end-to-end data lake management platform, providing everything you need from data ingestion and data processing right the way through to data delivery to the end user. A more complete overview of the technology can be found here.

In 2011, HPCC Systems became an open-source platform and is now available for anyone to use for their own data analytics solutions (see us celebrating 10 years of open source in 2021 here). While it can provide you with everything you need for your end-to-end solutions, HPCC Systems is also flexible enough that you can choose to use it only to ingest and prepare your data for use elsewhere.

HPCC Systems can take massive amounts of data; break it apart; sort it; use the pertinent data to make accurate, significant connections; and then deliver the cleaned data to clients for their use. Our data libraries house petabytes of information, primed for reference at a moment’s notice by major insurance and legal entities running information checks. Our customers run HPCC Systems in data warehouses across the country that house the data and our bare metal clusters. However, relying only on on-site facilities has some vulnerabilities.

Cloud Native

Protecting massive amounts of sensitive data from possible outages is a priority for every major company. One way to do that is to shift from using only bare metal clusters to a cloud native or hybrid platform. This allows the data to always be accessed, queried, and delivered to customers with very little delay. The cloud native platform also provides a cost-effective solution for businesses that would otherwise not be able to build a data center. In addition, if the data needs of a smaller business suddenly grow beyond its capacity, a cloud native platform can scale to meet those needs almost instantaneously.

Going to a cloud native platform reduces downtime if your data center experiences an outage. It also means that, depending on the amount of data you are processing, you may see substantial cost savings from not having a physical data center to manage. Data can be accessed from across the network with built-in failsafes, so you will always be able to get the information you need.

WHY ECL?

If you would like to fully ingest the history of ECL, please read through the White Paper, which explores how ECL works, how it tackles data problems, and why it is an easier programming language to use than other solutions.

“Enterprise Control Language (ECL) is the query and control language developed to manage the HPCC (High Performance Cluster Computing) and truly differentiates it from other technologies in its ability to easily and efficiently provide flexible data analysis on a massive scale.” -David Bayliss (SVP and Chief Data Scientist) 

ECL is a robust, high-level analytical coding language that performs complex operations while offering the user a friendly interface to work with. If you want to learn more about how to use ECL, visit our online ECL training courses, which provide resources for everyone from beginners to advanced users.

Since ECL was created, code has primarily been written using the ECL IDE (Integrated Development Environment). However, in the past few years, alternatives such as Eclipse and Visual Studio Code have surfaced. The ECL IDE is still the in-house code builder originally designed for ECL.

An ECL Extension has been made available for Visual Studio Code, which is a code editor that supports a variety of languages.

Only one of these applications is required to work with ECL code and the play cluster, but each has pros and cons, which will be covered in the last part of this series. As a new user, I would recommend learning both. Each has its own advantages, and at this point neither is superior to the other.

Let’s first look at installing the ECL IDE and connecting it to a sample cluster, followed by the same process using the ECL Extension for VS Code.

THE ECL IDE

HPCC Systems was designed for use with the average computer in mind. You can use literally any modern computer and a stable internet connection to play with our product. When you are ready to dip your toe in the water, visit the Download page on the HPCC Systems website.

Note: The ECL IDE runs only on the Windows operating system. If you are using a Mac, you will need to download Visual Studio Code and install the ECL Extension. There is a more in-depth look at VS Code with the ECL Extension below.

Download the ECL IDE and Client Tools. The Client Tools portion contains the essential compiler and the engine that generates the workunits. More information on using the ECL IDE and Client Tools is available here.

Once you are on the Download page, scroll down until you see the download steps and follow the instructions on the page.

Once you have the IDE installed, you will need to open it and connect to a cluster.

Note: The first time you open the ECL IDE, the Preferences window will open automatically because the IDE needs to be configured. Once you have an established connection to a cluster, the IDE will remember your choices for future use.

Setting up the Server ID in Preferences

Note: If the Preferences window doesn’t open automatically, it can be opened from the bottom left corner of the Configuration window.

The HPCC Systems play cluster is available 24/7 for users to try out the platform and is designed to introduce new people to how the software works. It is strictly a sample of HPCC Systems. The play cluster’s return times are not a good indicator of how powerful HPCC Systems and ECL are at full strength.

To access the play cluster using the ECL IDE, enter the following information in the fields as seen below:

play.HPCCSystems.com

Note: Make sure the Server field does not include the http:// prefix, which will not work. Sometimes when using copy/paste, the field auto-populates with http://play.HPCCSystems:18010.

In the Server field enter play.HPCCSystems.com

Then click on SSL and Advanced to enable those settings. Highlight the Attribute Server field and delete its contents.

After the Attribute Server field has been cleared, click OK. You do not need to click Apply; it is done automatically when you click OK.

Now that you have finished setting up the preferences, all that is left is to complete the Configuration window.

A few things to point out if you are new to using the ECL IDE:

  • Leave Configuration set to default for now. If in the future you need to connect with other clusters you can add different configurations and easily switch between them.
  • The Login ID you choose needs to correspond with the work you do in ECL Watch. Typically, your first name followed by your last initial will be sufficient. If someone else working on the cluster happens to have the same name as you, I recommend using something more distinctive.
  • A password is not necessary, so just enter the username you wish to work with. Click OK only after you have entered the server information under the Preferences button. (This only needs to be done the first time.)
  • The Error Log window, located at the bottom right, can be closed once you are logged in by clicking the “x” in the top right-hand corner. The Syntax Errors window displays any issues found in your code.

Now you are officially connected to the play cluster and can begin writing ECL and exploring the capabilities of HPCC Systems.

Using the ECL Extension for VS Code

For those who wish to use VS Code to explore ECL, download it here and follow the setup wizard for the basic install.

Note: You will still need to download the Client Tools from the HPCC Systems website for the ECL Extension to work.

Once you have VS Code open, you’ll need to add the ECL Extension by going to View on the top toolbar and selecting Extensions, or by using the Show Extensions shortcut: Ctrl + Shift + X (Windows) or ⇧⌘X (Mac).

Then type ECL in the search bar and select ECL Extension.

Once you’ve installed the ECL Extension, you need to connect VS Code to the play cluster.

If it is not already on your screen, locate the launch.json configuration area and follow the arrows below.

Visual Studio Code – Play Cluster configuration

  • "name": "Play"
  • "protocol": "https" (the "s" needs to be added for it to connect properly)
  • "serverAddress": "play.hpccsystems.com"
  • "port": 18010
  • "user": The Login ID you choose needs to correspond with the work you do in ECL Watch. Typically, your first name followed by your last initial will be sufficient. If someone else working on the cluster happens to have the same name as you, I recommend using something more distinctive.
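
As a rough sketch, the settings listed above might sit together in launch.json as follows. Only name, protocol, serverAddress, port, and user come from the list; the "version" line, the "type" and "request" values, and the jsmith login are assumptions added for illustration. If the file the extension generated for you already has different values for those entries, keep them.

```jsonc
{
  // Assumption: the standard VS Code launch.json wrapper.
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Play",
      // Assumption: type/request values as generated by the ECL Extension;
      // leave these as they appear in your own file.
      "type": "ecl",
      "request": "launch",
      // "https" rather than "http", as noted in the list above.
      "protocol": "https",
      "serverAddress": "play.hpccsystems.com",
      "port": 18010,
      // Hypothetical login ID: first name initial plus last name, or
      // whatever ID you use in ECL Watch.
      "user": "jsmith"
    }
  ]
}
```

Note that the port is a bare number rather than a quoted string, and that the keys and values use straight double quotes; curly quotes pasted from a web page will not parse.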

Here are the links for the Visual Studio Code hotkeys for Windows and Mac.

ECL Watch

ECL Watch is used to monitor what is happening with the clusters, although it is much more than that. ECL Watch is a web-based query execution, monitoring, and file management tool, which includes an interface for file imports and exports.

ECL Watch was recently updated, and any build prior to 8.6.4 will look quite different. There are some major differences between the legacy version and the new, modern version. A Transition Guide is available on the wiki to help walk you through the changes, and the main user documentation for ECL Watch is available here.

You are now officially connected to the play cluster! You can access ECL Watch here and explore the interface.

Alternatively, if you are a PC user and want to run the HPCC Systems Platform, the simplest and most natural environment may be a Hyper-V virtual machine. Hyper-V is standard on many versions of Windows and tends to work better than most add-on virtualization systems. Please read Running HPCC Systems Platform on Microsoft Hyper-V, written by Lili Xu (Software Engineer III, LexisNexis Risk Solutions Group), for the full instructions.

Stay tuned for part 2 in this series, where we will use these tools in greater depth to sort freshly imported data sourced from the web.

George S Foreman joined the LexisNexis Risk Solutions team in early 2022 as a Product Development Technical Writer. An experienced communicator, George has held diverse positions in both B2C and B2B industries with a focus on writing Standard Operating Procedures, process improvement documentation, and marketing collateral. George brings a deep electrical and mechanical background, which enables him to bring a new perspective to the suite of documentation and training resources available for HPCC Systems and ECL users.

When not writing about his favorite big data platform, George can usually be found enjoying the company of his family and four children in the Florida sunshine.