HPCC Systems Blog

HPCC Systems blog contributors are engineers and data scientists who for years have enabled LexisNexis customers to use big data to fulfill critical missions, gain competitive advantage, or unearth new discoveries. Check this blog regularly for insights into how HPCC Systems technology can put big data to work for your own organization.

by Lorraine Chapman
on 04/20/2017

Wherever you look these days, analysts are providing visual representations of the data they mine to help businesses make decisions. Since data is our business, this is something we have been doing for some time now, using the visualization capabilities built into HPCC Systems®.

The new HPCC Systems Visualizer Bundle allows you to visualize results generated by your ECL code rather than relying on JavaScript.

by Jessica Lorti
on 04/07/2017

On March 30, 2017, HPCC Systems hosted the latest edition of The Download: Tech Talks. These technically focused talks are for the community, by the community. The Download: Tech Talks is intended to provide continuing education through high-quality content and meaningful development insight throughout the year.

by Lorraine Chapman
on 04/05/2017

One of the many improvements coming your way this year is a complete restructuring of the HPCC Systems Machine Learning Library. Our ML library may be used alongside the HPCC Systems platform, ECL IDE, and the growing number of embedded language plugins and third-party modules that you can use to tailor HPCC Systems to your specific needs.

The restructuring is an ongoing process which is likely to run into 2018, so this won’t be the only post you read about it. Once complete, the HPCC Systems Machine Learning Library will perform better, be easier to use, and come with more extensive documentation and examples.

by Lorraine Chapman
on 03/27/2017

Shweta Oak is studying for a Bachelor of Engineering at the Sardar Patel Institute of Technology in India, which is affiliated to Mumbai University. Shweta applied to complete a project as part of the HPCC Systems intern program in 2016. Unfortunately, we were not able to offer her a place. Most students at this stage would probably have moved on, but not Shweta! She emailed to ask whether there was a project she could work on as a regular contributor to get some experience to support her interest in machine learning. So, I put her in touch with our Machine Learning project leader, John Holt, who supported her work on a project about non-negative matrix factorization (NMF).

The overall aim of this project is to implement an ECL version of the Small K project implemented by Georgia Tech. Shweta was tasked with evaluating different algorithms to assess which would be the best choice for implementing NMF in ECL.

by Richard Chapman
on 03/02/2017

When reading a disk file in ECL, the layout of the file is specified in the ECL code. This allows the code to be compiled to access the data very efficiently, but can cause issues if the file on disk actually uses a different layout. In particular, it can present a challenge to the version control process if you have ECL queries that are being changed to add functionality, but which need to be applied without modification to data files whose layout is changing on a different timeline.

We have had a partial solution to this dilemma available in Roxie for index files for a while, with the ability to apply runtime translation from the fields in the physical index file to the fields specified in the index. However, it suffered from significant potential overhead and was not available for flat files or on Thor. Until now...
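To make the idea of runtime field translation concrete, here is a minimal Python sketch of the general technique. It is an illustration only, not how the HPCC Systems platform implements it: the function `build_translator`, the field names, and the defaults are all hypothetical. The sketch maps each row from the layout actually found on disk to the layout the query expects, matching fields by name, dropping fields the query does not know about, and filling missing ones with a default.

```python
# Hypothetical sketch of runtime record translation; not HPCC Systems code.

def build_translator(physical_fields, expected_fields, defaults):
    """Return a function that maps a physical row (tuple) to the expected layout."""
    # Position of each field in the layout actually stored on disk.
    position = {name: i for i, name in enumerate(physical_fields)}
    # Work out the mapping once, so the per-row cost is a simple lookup.
    # A source index of None means the file does not contain that field.
    plan = [(name, position.get(name)) for name in expected_fields]

    def translate(row):
        return tuple(
            row[idx] if idx is not None else defaults[name]
            for name, idx in plan
        )

    return translate


# Layout found on disk versus layout the query was compiled against (both made up).
physical = ["id", "surname", "legacy_code"]
expected = ["id", "surname", "email"]

translate = build_translator(
    physical, expected, defaults={"id": 0, "surname": "", "email": ""}
)

rows = [(1, "Smith", "X1"), (2, "Jones", "Y2")]
translated = [translate(r) for r in rows]
print(translated)
```

Building the mapping once up front, rather than comparing field names for every row, is one way to keep the per-row overhead of this kind of translation low.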