For a given amount of cluster hardware and a given big data processing task, the HPCC Systems platform enables you to complete the task more quickly than competing big data software solutions. It requires substantially less code, and that code executes faster. The platform's greater efficiency and superior performance for a given cluster size mean you can solve big data problems faster, more economically, or both.
The following benchmark tests provide some apples-to-apples performance comparisons between the HPCC Systems platform and a prominent competing solution, Hadoop.
Results show that a four-node HPCC Systems Thor cluster took only 98 seconds to complete a Terasort with a job size of 100 gigabytes (GB), on a cluster one-fifth the size of the Hadoop cluster it was compared against. The HPCC Systems platform ran the test in three lines of ECL code, whereas previous tests needed more than 700 lines of Java MapReduce code for the Hadoop equivalent.
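To put the Terasort figures in throughput terms, the short Python sketch below does the arithmetic on the numbers quoted above (100 GB, 98 seconds, four nodes). The per-node rate is a derived figure, not a reported one: it assumes the work was spread evenly across the four Thor nodes, and the function and constant names are illustrative only.

```python
# Figures quoted in the Terasort result above.
JOB_SIZE_GB = 100      # Terasort job size in gigabytes
ELAPSED_SECONDS = 98   # reported run time on the Thor cluster
THOR_NODES = 4         # number of nodes in the Thor cluster


def throughput_gb_per_s(size_gb: float, seconds: float) -> float:
    """Return the sustained sort rate in GB per second."""
    return size_gb / seconds


# Aggregate rate for the whole cluster.
cluster_rate = throughput_gb_per_s(JOB_SIZE_GB, ELAPSED_SECONDS)

# Assumption: the load is evenly distributed, so each node
# handles an equal share of the aggregate rate.
per_node_rate = cluster_rate / THOR_NODES

print(f"cluster:  {cluster_rate:.2f} GB/s")
print(f"per node: {per_node_rate:.3f} GB/s")
```

On these figures the cluster sorts at roughly 1.02 GB per second overall, or about 0.26 GB per second per node under the even-distribution assumption.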
PigMix is a set of 17 Pig programs used as a benchmark to compare the performance of the Pig programming language with that of hand-coded Java running in a Hadoop environment. The algorithms were chosen and coded by the Pig community, so they should be representative of what Pig is used for and embody best practices for how to use it. The benchmark tests found that ECL significantly outperforms both Pig and Java on the Hadoop PigMix benchmark on an identical hardware configuration. Across all tests, ECL was on average 4.45x faster than Pig and 3.23x faster than hand-coded Java.