HPCC Frequently Asked Questions - General Information and Capabilities

HPCC Systems currently offers training in Atlanta, GA, San Francisco, CA and Boca Raton, FL. Onsite training at your location is also available for large groups. A minimum of 6 students is required, with a maximum of 20. Pricing is per person plus travel expenses. View our schedule of upcoming training classes or contact us for more information.

HPCC Systems offers a variety of training classes and tracks geared to high-level managers, developers and administrators who want to learn the HPCC platform components, including ECL, Thor, Roxie and the ECL IDE, and how to maintain the environment. View our recommended learning tracks to see which training program best suits your role.

HPCC Systems offers distinct HPCC training programs suitable for different roles within your organization, from basic introductory classes to advanced training. In addition, we offer free training videos on several topics. View our current schedule of upcoming training.

HPCC (High Performance Computing Cluster) stores and processes large quantities of data, processing billions of records per second using massively parallel processing technology. Large amounts of data across disparate data sources can be accessed, analyzed and manipulated in fractions of a second. HPCC functions as both a processing and a distributed data storage environment, capable of analyzing terabytes of information.

The HPCC has been in active development and use for over 10 years.

This technology has been proven in the marketplace for the past ten years. Our HPCC technology powers the products and solutions of the LexisNexis Risk Solutions business unit, whose mission is to provide essential insights to advance and protect people, industry and society. LexisNexis Risk Solutions customers include top government agencies, insurance carriers, banks and financial institutions, health care organizations, credit card issuers, top retail card issuers, cell phone providers and a range of other industries. HPCC technology is also used to provide enhanced content to the new Lexis electronic products that serve legal, academic and research industries.

Yes. Starting at the lowest level, HPCC generates C++ rather than Java, which immediately gives it an efficiency advantage. HPCC has also been used in critical production environments for over a decade, and the time and effort invested in its individual components give a tangible performance boost. Our analysis of ECL code translated directly from the PigMix benchmark shows an average performance improvement of 3.7x.

That said, the real performance of the HPCC begins to show when the ECL language is used to its fullest to express data problems in their most natural form. In the hands of a skilled coder, speed improvements in excess of an order of magnitude are common, and two orders of magnitude are not out of the question.

Yes. HPCC works over the internet and/or over a private network. It also operates on either distributed or centralized systems.

No. The HPCC is not a traditional transactional database.

HPCC is completely scalable, capable of meeting any database need regardless of size. It can be used for almost any data-centric task.

You can call queries deployed on HPCC using SOAP and REST/JSON. You can also use a web form which is provided for testing.
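
As a rough illustration of the REST/JSON option, the sketch below builds (but does not send) an HTTP POST request using only the Python standard library. The host name, port, target name, query name, URL layout and payload shape are all assumptions made for this example; check your own ESP configuration for the real values.

```python
import json
import urllib.request


def build_query_request(host, port, target, query_name, params):
    """Build (but do not send) an HTTP POST carrying a JSON query request."""
    # Assumed URL layout for illustration only; your ESP service may differ.
    url = f"http://{host}:{port}/WsEcl/submit/query/{target}/{query_name}/json"
    # Assumed payload shape: one object keyed by the query name.
    body = json.dumps({query_name: params}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, port, target, query name and parameter.
request = build_query_request(
    "esp.example.com", 8002, "roxie", "search_people", {"lastname": "SMITH"}
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send the request. Building it separately makes it easy to inspect the URL and body before contacting a live cluster.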

Although we do not currently test this configuration, our source code is available for developers to explore these possibilities. Currently, only the Client Tools are supported on Apple OS X.

Yes. The HPCC Thor works well on Amazon AWS EC2. More information is available in the Install Thor on AWS documentation.

The HPCC is built from the ground up to work as a single cohesive super computer. Managing and developing solutions for the HPCC is far simpler.

Historically, Beowulf clusters have defined their space in the field of computational analysis and mathematics. HPCC is designed specifically for data manipulation.

For example, in a Beowulf cluster the programmer explicitly controls the inter-node communication needed to perform a global data operation, using a facility such as MPI (Message Passing Interface). In an HPCC system, the inter-node communication is performed implicitly.

ECL (Enterprise Control Language) is a programming language designed and used with the HPCC system. It is specifically designed for data management and query processing. ECL code is written using the ECL IDE programming development tool.

ECL is a transparent and implicitly parallel programming language which is both powerful and flexible. It is optimized for data-intensive operations, declarative, non-procedural and dataflow oriented. ECL uses intuitive syntax which is modular, reusable, extensible and highly productive. It combines data representation and algorithm implementation.

Learn more about ECL.

The ECL IDE is an integrated development environment for the ECL language designed to make ECL coding easy and programmer-friendly. Using the ECL IDE you can build, edit and execute ECL queries, and mix and match your data with any of the ECL built-in functions and/or definitions that you have created.

The ECL IDE offers a built-in Attribute Editor, Syntax Checking, and ECL Repository Access. You can execute queries and review your results interactively, making the ECL IDE a robust and powerful programming tool.

For a more detailed look at the ECL IDE, see the HPCC Data Tutorial that provides a walk-through of the development process from beginning to end using the ECL IDE.

Roxie (Rapid Online XML Inquiry Engine) is the data delivery engine used in HPCC to serve data quickly and can support many thousands of requests per node per second.

Thor (The Data Refinery Cluster) is responsible for consuming vast amounts of data, transforming, linking and indexing that data. It functions as a distributed file system with parallel processing power spread across several nodes. A cluster can scale from a single node to thousands of nodes.

Whether you are using an ISAM (Indexed Sequential Access Method) or SQL file system, the HPCC can be a great resource for analyzing and reporting on your existing data, particularly if your data is starting to get very large and hard to manage on your existing system. All you need to do is export your data files to a fixed-length, CSV (comma-separated values) or XML format, copy them to the HPCC Landing (or Drop) Zone, and spray them to the Thor Data Refinery in HPCC. After that, your related files can be quickly joined and transformed using one of many ECL transformation functions. Your results can be stored in new tables and later indexed for faster access on the Roxie Data Delivery Engine, which is also built into your HPCC.
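
To make the idea of spraying more concrete, here is a purely conceptual Python sketch that reads a small CSV export and splits its records into contiguous, near-equal parts, one per node. It is not how HPCC implements spraying; the `spray` function, the sample data and the three-node cluster are all hypothetical and exist only for illustration.

```python
import csv
import io


def spray(records, node_count):
    """Split records into node_count contiguous, near-equal parts."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    base, extra = divmod(len(records), node_count)
    parts = []
    start = 0
    for node in range(node_count):
        # The first `extra` nodes each take one additional record.
        size = base + (1 if node < extra else 0)
        parts.append(records[start:start + size])
        start += size
    return parts


# Hypothetical CSV export standing in for a file on the landing zone.
csv_export = """id,lastname
1,SMITH
2,JONES
3,BROWN
4,DAVIS
5,MILLER
6,WILSON
7,MOORE
"""

records = list(csv.DictReader(io.StringIO(csv_export)))
parts = spray(records, 3)
for node, part in enumerate(parts):
    print(f"node {node}: {[row['id'] for row in part]}")
```

With seven records and three nodes, the parts hold 3, 2 and 2 records. Once the data is divided this way, each node can work on its own part in parallel.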

Big Data refers to very large data sets (for example, terabytes or petabytes), the secure storage facilities that hold them, and the hardware, software tools, processes and procedures used to create and manipulate them.

As a leading information provider, LexisNexis has more than 35 years of experience in managing big data, from publicly available information such as worldwide newspapers, magazines, articles, research, case law, legal regulations, periodicals, and journals – to public records such as bankruptcies, liens, judgments, real estate records – to other types of information.

To manage, sort, link, and analyze billions of records in under a second, LexisNexis Risk Solutions designed a data-intensive supercomputer built on our own High Performance Computing Cluster (HPCC) platform, which has been proven over the past 10 years with customers who need to sort through billions of records. Customers such as leading banks, insurance companies, utilities, law enforcement and Federal government depend on LexisNexis technology and information solutions to help them make better decisions faster.

To manage, sort, link, and analyze billions of records within seconds, LexisNexis Risk Solutions designed a data-intensive supercomputer that has been proven over the past 10 years with customers who need to process billions of records within seconds. Customers such as leading banks, insurance companies, utilities, law enforcement and Federal government depend on LexisNexis Risk Solutions. LexisNexis has offered this platform as an open source solution under HPCC Systems. LexisNexis Risk Solutions is a $1.5 billion business unit of LexisNexis, a $6 billion information solutions company. LexisNexis is owned by Reed Elsevier, which had revenues of $12 billion in 2010.

ESP (Enterprise Services Platform) provides an easy-to-use interface for accessing ECL queries using XML, HTTP, SOAP (Simple Object Access Protocol) and REST (Representational State Transfer).

Contact Us

Email us
Toll-free (US): 1.877.316.9669
International: 1.678.694.2200
