HPCC Systems 10 Year Open Source Anniversary
Celebrating 10 years of innovation, growth, and success in a vibrant community
HPCC Systems is celebrating 10 years as an open source platform with a new podcast series. This series will feature core members of the HPCC Systems community who contribute to the success of HPCC Systems through innovation, advocacy, and education.
This initial installment of the series features Arjuna Chala and Flavio Villanustre, discussing the early days of HPCC Systems and where we are today.
The full video recording of Arjuna and Flavio’s discussion is available on the HPCC Systems YouTube channel, under the “10 Year Anniversary Podcast Series” playlist.
The audio version is available on SoundCloud.
HPCC Systems has achieved a great deal of success over the years. This is a tremendous accomplishment, considering that HPCC Systems began the open source journey when the Hadoop ecosystem was progressing and evolving in the open source big data community. In this podcast, Flavio and Arjuna reflect on sharing that space at a time when there was very little knowledge in the community about HPCC Systems. Taking HPCC Systems open source was an opportunity to rise to the challenge of educating people on what big data really means, based on the longstanding knowledge and experience that LexisNexis® Risk Solutions has in high-performance computing and big data analytics.
Topics discussed in the podcast include:
- The 80% Problem
- HPCC Systems COVID-19 Portal
- High-Quality Platform
- ECL (Enterprise Control Language)
- Success stories
The 80% Problem
The focus of HPCC Systems has been on solving the “80% problem.” ETL (extract, transform, load), which consists of taking raw data, cleaning it, manipulating it, and then converting it into usable data, is the most important step when processing data, and it is where most of the work involving big data occurs. HPCC Systems provides a way to execute all ETL processes within the same platform.
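The ETL steps described above can be illustrated with a minimal sketch. It is written in plain Python rather than on the HPCC Systems platform, and the sample data, field names, and function names are all hypothetical, chosen only to show raw data being extracted, cleaned, transformed, and loaded in one pipeline.

```python
import csv
import io

# Hypothetical raw input: inconsistent casing, stray whitespace, missing values.
RAW_CSV = """name,age,city
 Alice ,34,Boca Raton
BOB,,Atlanta
carol,29,
"""

def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows):
    """Clean: trim whitespace and drop rows missing a required field."""
    cleaned = []
    for row in rows:
        row = {key: (value or "").strip() for key, value in row.items()}
        if row["name"] and row["age"]:
            cleaned.append(row)
    return cleaned

def transform(rows):
    """Transform: normalize name casing and convert age to an integer."""
    return [
        {
            "name": r["name"].title(),
            "age": int(r["age"]),
            "city": r["city"] or "unknown",
        }
        for r in rows
    ]

def load(rows, target):
    """Load: append usable records to a list standing in for a data store."""
    target.extend(rows)
    return len(rows)

store = []
loaded = load(transform(clean(extract(RAW_CSV))), store)
print(loaded, store)
```

Each stage is a separate function, so the whole flow runs in one place, which loosely mirrors the idea of keeping every ETL step within a single platform.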
HPCC Systems COVID-19 Portal
The HPCC Systems COVID-19 portal brings all available COVID-19 information together, showcasing a solution for a classic data lake problem. HPCC Systems collaborates with universities in the USA and UK on this project, including Florida Atlantic University, the University of Oxford, and Harvard University. Projects like the COVID-19 portal demonstrate the commitment of HPCC Systems to provide solutions for real-world problems by putting data analytics to work for the good of humanity.
High-Quality Platform
The main customer for the HPCC Systems platform is LexisNexis® Risk Solutions. Everything on the platform is vetted, ensuring a high-quality product. The fact that LexisNexis successfully uses the HPCC Systems platform on a daily basis demonstrates the benefits to an organization that holds a data lake or has a need to process large quantities of data quickly and efficiently.
ECL (Enterprise Control Language)
HPCC Systems’ ECL (Enterprise Control Language) is well suited to ETL-style workflows. Common data processing patterns are built into the ECL programming language, which was developed specifically for querying big data.
Success stories
Flavio and Arjuna conclude their discussion with a few examples of customer success stories, illustrating how use cases not only help the customer, but also contribute back to the HPCC Systems open source platform in the form of new features, enhancements, and machine learning algorithms. HPCC Systems is designed to handle any big data problem, and some of the use cases we hear about provide solutions that we might never have imagined, keeping in step with our changing world as well as new and emerging technologies.
Meet the participants
Arjuna Chala is Sr. Director of Technology Innovation for the HPCC Systems® platform at LexisNexis® Risk Solutions. With almost 20 years of experience in software design, Arjuna leads the development of next-generation big data capabilities, including tools for exploratory data analysis, data streaming, and business intelligence. Arjuna strives to understand new technologies and bring innovative applications and designs to the HPCC Systems platform.
Dedicated to development excellence, Arjuna served as a key member of the team that brought the HPCC Systems platform to the open source community. Working with HPCC Systems community leaders and system integrator partners, Arjuna has helped spread HPCC Systems technology into enterprises domestically as well as in the international markets of China, Brazil, Europe, and India.
Arjuna has a BS in Computer Science from RVCE, Bangalore University.
Flavio Villanustre is CISO and VP of Technology for LexisNexis® Risk Solutions. He also leads the open source HPCC Systems platform initiative, which is focused on expanding the community gathering around the HPCC Systems Big Data platform, originally developed by LexisNexis Risk Solutions in 2001 and later released under an open source license in 2011. Flavio’s expertise covers a broad range of subjects, including hardware and systems, software engineering, data analytics, and machine learning. He has been involved with open source software for more than two decades, founding the first Linux users’ group in Buenos Aires in 1994.
Learn more about this 10 Year Anniversary series on our wiki:
Visit our Tech Talk wiki for more information and to browse past episodes: