HPCC Systems blog contributors are engineers and data scientists who for years have enabled LexisNexis customers to use big data to fulfill critical missions, gain competitive advantage, or unearth new discoveries. Check this blog regularly for insights into how HPCC Systems technology can put big data to work for your own organization.

Flavio Villanustre on 12/14/2012

As the end of the 2012 calendar year approaches, at least for a good chunk of the world (and the world itself may come to an end on 12/21, according to a crazy bunch), people in different cultures and countries start celebrating holidays. I consider this season a good time to go over things to come in the HPCC Systems platform arena.

Flavio Villanustre on 11/16/2012

I often get asked about comparing the HPCC Systems platform and Hadoop. As many of you probably know already, there are a number of substantial differences between them, and several of these differences are described here.

Renato Golin on 11/05/2012

See the Introduction for this article here.

Previous chapter: Step 3: The Optimisation, and More Tests.

This step is based on the following commit:

Renato Golin on 11/05/2012

See the Introduction for this article here.

Previous chapter: Step 2: The Distributed Flag, and Execution Tests.

This step is based on the following commit:

Renato Golin on 11/05/2012

See the Introduction for this article here.

Previous chapter: Step 1: The Parser, The Expression Tree and the Activity.

Renato Golin on 11/05/2012

See the Introduction for this article here.

This part of the tutorial refers to the commit below:

Git commit: DATASET (N, transform(COUNTER))
https://github.com/hpcc-systems/HPCC-Platform/pull/1285/files
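
The commit title suggests a DATASET form that builds N rows by calling a transform with a row counter. A minimal Python sketch of that idea follows; it is not ECL and not the compiler's implementation. The function name `dataset_from_count` is hypothetical, and the counter is assumed to start at 1.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]


def dataset_from_count(n: int, transform: Callable[[int], Row]) -> List[Row]:
    """Build n rows, calling transform once per row with a counter.

    Assumption: the counter is 1-based.
    """
    return [transform(counter) for counter in range(1, n + 1)]


# Hypothetical usage: three rows whose fields are derived from the counter.
rows = dataset_from_count(
    3, lambda counter: {"id": counter, "square": counter * counter}
)
```

Each row depends only on its own counter value, which is what makes this kind of generation easy to split across nodes.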

Renato Golin on 11/05/2012

This tutorial will walk you through adding a new feature in the compiler, making sure it executes correctly in the engines, and performing some basic optimisations such as replacing and inlining expressions.

When adding features to the compiler, there are two main places where you have to add code: the compiler itself (including the parser, the expression builder and the exporter) and the engines (Roxie, Thor and HThor), including the common graph node representation.

Flavio Villanustre on 07/20/2012

As I was preparing the Keynote that I delivered at World-Comp'12, about Machine Learning on the HPCC Systems platform, it occurred to me that it was important to remark that when dealing with big data and machine learning, most of the time and effort is usually spent on the data ETL (Extraction, Transformation and Loading) and feature extraction process, and not on the specific learning algorithm applied.

Renato Golin on 06/27/2012

The ECL compiler

When ECL code is compiled, its internal representation is a graph of expressions in which each ECL instruction is linked to the others it depends on. The compiler then walks through this expression graph, looking for patterns, dead code, common expressions and so on, until supposedly optimal code is printed at the end.
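
To make the idea of an expression graph more concrete, here is a small Python sketch. It is an illustration only, not how the ECL compiler is written: the `ExprGraph` class and its methods are hypothetical. It shows two of the things mentioned above: identical expressions collapse into one shared node (common expressions), and nodes that the final result never reaches can be dropped (dead code).

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[str, Tuple[int, ...]]  # (label, ids of child nodes)


class ExprGraph:
    """Toy expression graph; structurally identical nodes are shared."""

    def __init__(self) -> None:
        self.nodes: List[Node] = []
        self._index: Dict[Node, int] = {}

    def add(self, label: str, *children: int) -> int:
        """Return the id for this expression, reusing an existing node if any."""
        key = (label, children)
        if key not in self._index:
            self._index[key] = len(self.nodes)
            self.nodes.append(key)
        return self._index[key]

    def live(self, root: int) -> Set[int]:
        """Walk dependencies from root; anything not visited is dead code."""
        seen: Set[int] = set()
        stack = [root]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(self.nodes[node][1])
        return seen


g = ExprGraph()
a = g.add("a")
b = g.add("b")
sum1 = g.add("+", a, b)
sum2 = g.add("+", a, b)          # common expression: same node as sum1
unused = g.add("*", a, a)        # never used by the result below
result = g.add("*", sum1, sum2)
live = g.live(result)
```

Here `sum1` and `sum2` end up with the same node id, and `unused` is absent from `live`, so a later code generation pass could skip it.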

Renato Golin on 06/20/2012

HPCC's distributed file system has the concept of SuperFiles: collections of files with the same format that are used to aggregate data and automate disk reads.
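
As a rough illustration of the concept, the Python sketch below models a SuperFile as a named collection of sub-files that share one record format and are read as a single stream. This is an in-memory stand-in under stated assumptions, not the HPCC API: the `SuperFile` class, its methods and the file names are all hypothetical.

```python
from typing import Dict, Iterator, List, Tuple

Row = Tuple[object, ...]


class SuperFile:
    """In-memory stand-in for a named collection of same-format files."""

    def __init__(self, name: str, fields: Tuple[str, ...]) -> None:
        self.name = name
        self.fields = fields
        self.subfiles: Dict[str, List[Row]] = {}

    def add_subfile(
        self, subfile: str, fields: Tuple[str, ...], rows: List[Row]
    ) -> None:
        """Attach a sub-file, rejecting it if its format differs."""
        if fields != self.fields:
            raise ValueError(
                f"{subfile} does not match the format of {self.name}"
            )
        self.subfiles[subfile] = rows

    def read(self) -> Iterator[Row]:
        """Read every sub-file in the order added, as one logical file."""
        for rows in self.subfiles.values():
            yield from rows


daily = SuperFile("sales", ("id", "amount"))
daily.add_subfile("sales_day1", ("id", "amount"), [(1, 10), (2, 20)])
daily.add_subfile("sales_day2", ("id", "amount"), [(3, 30)])
all_rows = list(daily.read())
```

The reader only deals with `daily`; new data is aggregated by attaching another sub-file, without changing the code that reads it.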