
Complete data lake management
Why use HPCC Systems?
Because it’s better at bigger data. Our comprehensive, dedicated data lake platform makes combining different types of data easier and faster than competing platforms, even when that data is stored in massive, mixed-schema data lakes, and it scales quickly as your data needs grow. It’s also open source, free to use, and easy to learn. With HPCC Systems you can acquire, enrich, deliver, and curate information faster, saving time and money now and in the future.
- Open source data lake platform
- Batch, real-time and streaming data ingestion
- Built-in data enhancement and Machine Learning APIs
- Scalable to many petabytes of data
- Runs on commodity hardware and in the cloud
- Increased responsiveness to customers and stakeholders

Free. Fast. Open source.
The HPCC Systems stack consists of a full suite of components that cater to every aspect of your data workflow.
Case studies
Organizations have used HPCC Systems in demanding production environments for more than a decade, making it the most proven solution of its type. Learn how innovators are using the HPCC Systems platform in these detailed case studies.
Why these organizations use HPCC Systems
Complete capabilities for every aspect of your workflow
Our tools make managing your data easy.
Tombolo catalogs all the data assets in your data lake, including relationships.
ECL Cloud IDE makes learning ECL easy by providing rich integration between data and code without needing to install client software.
Our powerful data engines execute automatically in a high-performance, parallel workstream.
Thor, our data refinery engine, lets you take control of data transformation. Thor can easily profile, clean, enhance, transform, and analyze mixed-schema data.
ROXIE is an index-based search engine that performs real-time queries through a variety of interfaces including SOAP, XML, REST, and SQL.
Support for tools such as Couchbase, MySQL, Kafka, and MariaDB enables real-time data ingestion for live-stream IoT workloads.
The platform also leverages TensorFlow to perform CPU-based neural network learning, giving you the simplicity of ECL and the power of TensorFlow.
Built-in libraries include scripts for information extraction, profiling, cleaning, normalization, and analytics.
Our Machine Learning library works efficiently in a parallel distributed environment, so you can execute Machine Learning algorithms without moving data to a new platform.
Seamless integration makes it easy to deliver the flexibility your clients need.
Our platform offers robust connectivity options and integrates with a number of third-party solutions to make data lake management as easy and seamless as possible.
ECL provides a fast, powerful coding experience from ingestion to information delivery.
ECL is an easy-to-learn, advanced, and flexible declarative language that was initially developed for complex data scenarios more than 20 years ago and has been tested and refined continuously ever since.
Ready to dive in?
HPCC Systems is free and open source, so you can test and implement it without making a big investment. Visit our Get Started page to explore the power of HPCC Systems. Whether you want to try out our ECL playground or download the full program, we've made it easy for you to get started using HPCC Systems in less than an hour.
To continue reading about the platform, check out the About page or the HPCC Systems User Guide. If you’re looking for more technical instruction, visit our Training page to find online or in-person classes.