HPCC Systems blog contributors are engineers, data scientists, and fellow community members who want to share knowledge, tips, and other helpful information from the HPCC Systems community. Check this blog regularly for insights into how HPCC Systems technology can put big data analytics to work for your own needs.

Flavio Villanustre on 12/09/2019
Flavio Villanustre (VP, Technology and CISO, LexisNexis Risk Solutions) recounts the advice he recently gave to a friend who found that increasing the amount of memory on his Thor cluster didn't improve performance as much as expected. Read on to hear Flavio's advice about how to approach improving the performance of Thor and why increasing the amount of memory isn't necessarily the answer.
Cassandra Walker on 12/02/2019
The RELX Group Information Assurance and Data Protection organization (IADP) provides oversight of privacy, security, and compliance practices as part of the company’s comprehensive risk mitigation program. The IADP generally works with the Risk Solutions and Legal and Professional businesses, focusing on PII (personally identifiable information) and SPII (sensitive personally identifiable information) that is available through LexisNexis online products.
Bahar Fardanian on 11/20/2019
Wouldn’t it be great to show our future generation what the tech industry looks like? In a great move, CodeDay is showing our younger generation, high school and middle school students, the tools and technologies that are being used in today’s world, giving them an idea of what their future looks like and how they can take a role in it when it comes to college.
Lili Xu on 11/14/2019
In this blog, I will introduce another clustering bundle: the DBSCAN Bundle, a highly scalable and parallelized implementation of the DBSCAN algorithm. DBSCAN is a density-based unsupervised machine learning algorithm that automatically clusters data into subclasses or groups.
Lorraine Chapman on 11/14/2019
The proposal application period for internships in 2020 has now closed. The proposal period for 2021 will open in the Fall. Read on to find out more about our intern program, how it works, and how to apply.
Arjuna Chala on 11/08/2019
The next generation of Data Scientists will be tasked with solving problems around what IDC predicts will be 175 zettabytes of data by 2025. As part of helping universities educate students on solving Big Data problems at scale, LexisNexis Risk Solutions is promoting the open source HPCC Systems Data Lake platform by sponsoring hackathons, workshops, internships, and research programs.
Russ Whitehead on 11/06/2019
To conduct the complex analytics that bring meaning to data, big data platforms require access to massive amounts of potentially sensitive data. And no matter how powerful or easy to use a big data platform is, it can become a serious liability if it isn’t properly secured. This raises the question: What is required to properly secure a big data platform from unauthorized access or data theft?
Jessica Lorti on 11/01/2019
In this month’s “5 Questions” interview series, Flavio Villanustre talks with Allan Wroebel. Allan is a senior software engineer at LexisNexis Risk Solutions and a long-time ECL user. Initially working with data operations, Allan now serves as an ECL developer on both Thor and ROXIE.
Cassandra Walker on 10/22/2019
This ECL Tip spotlights the Enterprise Control Language (ECL) AGGREGATE built-in function. Many in the community have seen ECL AGGREGATE as ‘complex,’ and as a result it has been underused. However, in using AGGREGATE you can be sure you’re playing to the strengths of HPCC Systems.
Russ Whitehead on 09/27/2019
As of HPCC Systems version 7.6.0, a new cryptographic module has been added to the ECL Standard Library. This module gives ECL developers an assortment of cryptographic features for safeguarding their sensitive data using industry-standard cryptographic algorithms. Features include digital hash algorithms, symmetric and asymmetric encryption and decryption, and digital signatures, all of which can be applied to individual columns within an ECL dataset.