White Papers

Our collection of white papers provides in-depth analysis and discussion of the components of the HPCC Systems platform and how they stack up against other marketplace solutions.

How it Works

How it Works: End-to-end Data Lake Management

Ryan Evans – September 01, 2020

Increase responsiveness and scalability while reducing costs associated with extremely large, unstructured datasets.

How it Works: Introduction to HPCC Systems

Ryan Evans – September 01, 2020

This white paper provides an introduction to the HPCC Systems platform, which solves large data processing problems.

How it Works: ECL an Overview

Ryan Evans – September 01, 2020

Learn about Enterprise Control Language (ECL), the data-oriented programming language that sets HPCC Systems apart from other big data solutions.

How it Works: Models for Big Data

Ryan Evans – September 01, 2020

This paper explores data models used for big data processing and shows why the preferred technology is one that can move flexibly between models.

Data Management

Data Management: Data Lake Curation and Governance with Tombolo

Ryan Evans – September 01, 2020

Automate curation and governance operations to consistently and reliably curate huge amounts of new inbound data and ensure the continuous availability of the Data Lake.

Examples

Examples: Math & the Multi-Component Keys

Ryan Evans – September 01, 2020

This paper explains the math behind the phenomenal performance of the multi-component key within the HPCC Systems platform and ECL.

Examples: Data Intensive Computing Solutions

Ryan Evans – September 01, 2020

Data-intensive computing is a new computing paradigm that enables big data applications previously thought to be impractical or infeasible.

Examples: Cyber Security Analytics

Ryan Evans – September 01, 2020

This paper shows how HPCC Systems technology can counter cyber threats that seek to exploit the data deluge that’s swamping enterprises and governments.

Advanced Topics

Advanced Topics: HPCC Systems FUSE

Ryan Evans – September 01, 2020

The FUSE driver for HPCC Systems enables you to use Filesystem in User Space (FUSE) technology to interact with files on an HPCC Systems cluster.

Advanced Topics: Social Network Analytics & Fraud

Ryan Evans – September 01, 2020

Learn how social network analytics are helping to thwart organized health care fraud.

Advanced Topics: Thinking Declaratively

Ryan Evans – September 01, 2020

This paper helps developers to think declaratively and shows the power of ECL, the declarative programming language designed to solve big data problems.

Advanced Topics: Using Juju Charm to Configure and Deploy HPCC Systems® on Amazon Web Services (AWS)

Ryan Evans – September 01, 2020

This paper describes how to use a Juju Charm to configure and deploy HPCC Systems on Amazon Web Services (AWS).

Advanced Topics: Using HtS3 to Deploy HPCC Systems and Save and Restore Files

Ryan Evans – September 01, 2020

This paper describes how to set up and use HtS3 to configure and deploy HPCC Systems to AWS, and how to save and restore files on the deployed Thor cluster.

Advanced Topics: Lambda Architecture and HPCC Systems

Ryan Evans – September 01, 2020

Lambda Architecture is a set of platform-agnostic principles and best practices for Big Data processing systems. This white paper explains how HPCC Systems is a naturally evolved example of the Lambda Architecture.

Performance

Performance: Aggregate Data Analysis

Ryan Evans – September 01, 2020

Learn how the HPCC Systems platform and ECL language surpass traditional RDBMS and SQL as a solution to today’s big data challenges.

Performance: Performing in the Pig-Pen

Ryan Evans – September 01, 2020

This set of benchmark tests shows how applications written in ECL outperform Hadoop applications written in PIG or Java, for identical big data tasks.

Performance: Intelligent ETL

Ryan Evans – September 01, 2020

This paper shows how HPCC Systems technology solves previously intractable extract, transform, and load (ETL) problems for massive datasets.

Performance: ECL for Hadoopers

Ryan Evans – September 01, 2020

If you’re a Hadoop user and you want to know the ECL equivalents of common Hadoop functions, this is the paper for you.

Performance: ECL for PIGgers

Ryan Evans – September 01, 2020

This paper helps PIG users get up to speed in ECL as quickly as possible.

Performance: HPCC Systems with Cisco Unified Computing System

Ryan Evans – September 01, 2020

LexisNexis has collaborated with Cisco to offer a high-performance analytics platform that is scalable, flexible, and cost-effective.

Customer White Papers

Customer White Papers: ClearFunnel and HPCC Systems

Ryan Evans – September 01, 2020

This study, a collaboration between HPCC Systems and ClearFunnel, brings into focus the real-world, multi-year experience of a cloud-based Big Data and Data Science startup that successfully built an advanced analytics business on a homogeneous technology stack.

Books

These handbooks are designed as a reference for researchers, programmers, business managers, entrepreneurs and investors within the big data industry.

Definitive HPCC Systems: Data Transformation and Delivery

Ryan Evans – September 01, 2020

Written by HPCC Systems Chief Trainer and Senior Consulting Software Engineer, Richard Taylor

The Definitive HPCC Systems books are a three-volume series that introduces the HPCC Systems platform to anyone interested in evaluating it for use on their own big data projects.

This second volume is an in-depth introduction to the ECL programming language used on HPCC Systems environments.

Definitive HPCC Systems: Overview/Platform Setup

Ryan Evans – September 01, 2020

Written by HPCC Systems Chief Trainer and Senior Consulting Software Engineer, Richard Taylor

The Definitive HPCC Systems books are a three-volume series that introduces the HPCC Systems platform to anyone interested in evaluating it for use on their own big data projects.

This first volume is an overview of the platform’s infrastructure and design that shows how to quickly ramp up HPCC Systems environments for proofs of concept, then move on to production.

Big Data Technologies and Applications

Ryan Evans – September 01, 2020

Furht, Borko; Villanustre, Flavio (Eds.) 1st Edition, 2016, XVIII, 400 p. 118 illus.
Describes real-life solutions using big data analytics
Covers wide-ranging applications such as security, fraud, and machine learning
Describes various data intensive applications
Intended for a wide variety of people including researchers, scientists, programmers, engineers, designers, developers, educators, and students

Handbook of Data Intensive Computing

Ryan Evans – September 01, 2020

Furht, Borko; Escalante, Armando (Eds.) 2011, XVIII, 793 p. 297 illus.
Describes and evaluates the current state of the art in this new field
Presents current systems and applications from the main research labs in this explosive new field
Written at a level that business managers, entrepreneurs, and investors will find beneficial

Handbook of Cloud Computing

Ryan Evans – September 01, 2020

Furht, Borko; Escalante, Armando (Eds.) 1st Edition, 2010, XIX, 634 p. 230 illus.
Includes contributions from world experts working in academia, research institutions and industry
Offers case studies, examples and exercises throughout
Covers systems, tools, and services of leading providers such as Google, Yahoo, Amazon, IBM, and Microsoft