Using your favorite language or data source with HPCC Systems
This blog has been updated to include plugins and connectors that are now supported by HPCC Systems. For those of you who have been here before, updates are clearly marked so you can jump right in.
The full list of supported languages, plugins and connectors is also available on our wiki, including links to other information you might find useful, such as using the ECL Language and using TensorFlow with HPCC Systems. Also see the list of available ECL bundles, which let you extend your use of HPCC Systems in other ways, for example with machine learning algorithms and visualisations.
Just because HPCC Systems can be used with our ECL programming language does not mean ECL is the only language you can use to query your data. You can embed a number of different languages within your ECL code. You can also process data on an HPCC Systems cluster from a variety of different sources, using the plugins and connectors we provide specifically to help you bridge the gap.
HPCC Systems already supports the embedding of Java, R, C++, Cassandra, Python, SQL and SQLite in your ECL code. There is a separate blog you can read about the use of the EMBED feature, which provides all the details you need to know for each of these languages. You can also read what Richard Chapman (VP, Research and Development and HPCC Systems platform lead architect) has to say about embedding C++ efficiently. Check out the sources and readmes for each of these embeddable languages on GitHub.
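As a rough illustration of what embedding looks like, the sketch below defines an ECL function whose body is written in Python. It assumes the Python 3 embed plugin is installed on your cluster; the function name add_one and the sample value are made up for this example and do not come from the HPCC Systems documentation.

```ecl
// Assumption: the Python 3 embed plugin is installed on the target cluster.
IMPORT Python3 AS Python;

// Hypothetical example function: the body between EMBED and ENDEMBED is
// ordinary Python, and ECL handles passing the argument and return value.
INTEGER add_one(INTEGER val) := EMBED(Python)
  return val + 1
ENDEMBED;

// Call the embedded function like any other ECL function.
OUTPUT(add_one(41));
```

The same pattern applies to the other embeddable languages: the ECL signature stays the same and only the imported plugin and the embedded body change. See the readme for each plugin for the details that are specific to that language.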
We also supply plugins for Redis and Memcached, where values can be set and retrieved by key simply by making calls to functions in the plugin. We support the Redis publish/subscribe option and more details about the usage of this plugin can be found in the Redis readme. There is also a readme for Memcached providing details about this plugin and how to get started using it.
We are continuing to extend our reach in this area and now support the following, which are supplied as plugins to the HPCC Systems platform:
- Spark – Our Spark integration lets you use various languages (Java/Scala) to interact with HPCC Systems data and process that data in the Spark environment. We also support writing Spark-based datasets to HPCC Systems. More information is available in the blog Leveraging the Spark-HPCC Ecosystem, and the plugin is available for download on our website.
- Kafka – We support the streaming of data to HPCC Systems using Apache Kafka via a Spring Framework (http://spring.io) based HTTP REST server. More information is available in our hpcc-streaming-kafka GitHub repository.
- Couchbase – HPCC Systems provides an embedded ECL Couchbase plugin which can be downloaded from our website. Supporting documentation and information about dependencies is available in our libcouchbase GitHub repository.
Add-ons and connectors
Here is a list of add-ons and connectors we provide to help you process data from other sources on HPCC Systems:
- WsSQL – Provides an SQL interface to HPCC Systems.
- ECL Data Integration Plugins for Pentaho – Make big data development as easy as drag and drop.
- R Integration – Quickly integrate to the HPCC Systems platform by writing ECL queries using R. Find out which features are currently available.
- JDBC driver – Connect to HPCC Systems via your favorite JDBC client and access your data without writing a single line of ECL.
- Web data connector for Tableau – Tableau Desktop users can now access local copies of HPCC Systems data. Learn more by reading this blog post about this connector.
- DFSClient (Distributed File System Client) – Distributed data ingestion and extraction library which uses internal HPCC Systems binaries to efficiently read and write data remotely in parallel. It supports generic and custom dataset creation and translation through the IRecordBuilder and IRecordAccessor interfaces. This add-on allows the efficient reading of HPCC Systems data, and the writing of external data into HPCC Systems, from Java environments. Examples of how to use this add-on can be found in the HPCC Systems HPCC4J Project Guide and the sources are available in our hpcc4j GitHub repository.
- WSClient (Web Service Client), more commonly known as JAPI – This API standardizes and facilitates interaction with HPCC Systems web-based services (ESP Web services), providing a mechanism for actuating HPCC Systems web service methods. Find out more about this add-on, which provides Java users with access to all our ECL Watch services.
So although HPCC Systems is an end-to-end solution, it doesn’t have to be used that way if it doesn’t suit you. If you want to continue using your favorite language or datastore but need the processing power of a high performance computing cluster, go for it! The HPCC Systems platform is all about usability and interoperability. The embedded language capability and the variety of connectors it supports are a reflection of this.
We’re chipping away at a list of other sources we’d like to support. But if we don’t provide the embedded language, plugin or connector you need, why not contribute the code to provide it yourself? Take a look at the sources to see how others have added theirs. There is even a template for the supporting documentation.
- Download the latest version of HPCC Systems (selecting Windows as the operating system to download ECL IDE) and the supporting documentation.
- Find out more about ECL (Enterprise Control Language) and take an online training course.
- Find out more about how HPCC Systems works, watch some video tutorials and case studies.
- Listen to some of our users talk about how they have used HPCC Systems to deliver results, fast.
- Find out how to contribute to our open source project.