2020 Community Day Review and Resources

Like a lot of conferences in 2020, HPCC Systems Community Day was a virtual event. It has always been well attended and this year, attendance was better than ever. While nothing beats the live networking opportunities at an in-person event, not everyone can be there on the day and numbers may be limited according to room size. One advantage of a virtual event is the attendance flexibility it offers to organisers, presenters and delegates.

It was a first for us and we were delighted to receive so many comments about the quality of the interface, features and presentations on offer:

Image Showing the Community Day 2020 Main Hall

Our 2020 virtual Community Day event included an extensive array of presentations. There were also some additional learning experiences, providing the opportunity to:

  • View our annual technical posters from students working on HPCC Systems related projects in 2020.
  • Attend ECL language workshops for those wanting to learn and extend their knowledge.
  • Visit our Interactive Expo including booths representing many different aspects of the HPCC Systems open source project.

Since it is not always possible to take in everything on the day, the purpose of this blog is to provide you with a taste of what the experience was like and offer you another chance to make the most of the many resources which are still available.

2020 Community Day Agenda

Presentations covered a wide spectrum of topics from the Cloud, HPCC Systems Tools and Machine Learning to COVID-19 research, using HPCC Systems in a variety of business settings and community collaborations. All the presentations are available via our YouTube channel for you to watch at your leisure.

Go to our 2020 Community Day Video Library and select a session to watch using the numbered key below:

Plenary Sessions

  1. Opening plenary keynotes including:
    Welcome by Flavio Villanustre, VP Technology & CISO, LexisNexis Risk Solutions Group
    HPCC Systems & the Cloud by Richard Chapman, VP Head of R&D, LexisNexis Risk Solutions Group
    HPCC Systems and the cloud should be a natural fit. Both involve large quantities of identical, commodity, interchangeable hardware, working together on a problem. Find out about the changes that we are making (or have made) to make cloud deployments of the HPCC Systems platform seamless.
    Data Lake Overview by Roger Dev, Sr Architect, LexisNexis Risk Solutions Group
    Data Lake is an architecture and methodology for the continuous extraction of value from complex and diverse data resources, whether public or private. It has been successfully used by many organizations and has spawned whole new lines of business. More important, it has enabled these businesses to continue to extract new value, even as earlier insights are put into production. This talk introduces Data Lake Concepts and Terminology for those who are new to the subject.
  1. Closing plenary sessions including:
    Community Recognition and Poster Contest Awards by Trish McCall and Lorraine Chapman, LexisNexis Risk Solutions Group

    Includes the Community Recognition and David Kan Ambassador Award winners. Also congratulates the Best Poster Award winners in the categories of data analytics, platform enhancement and use case. Find out who won our first ever Community Choice Award, which was voted for on the day by delegates attending our 2020 Community Day Summit.
    Machine Learning Advancements on HPCC Systems by Lili Xu & Roger Dev, LexisNexis Risk Solutions Group
    The HPCC Systems Machine Learning Library continues to evolve and provide richer capabilities. This talk focuses on two major areas of new development: enhanced methods for data preparation and the Generalized Neural Networks bundle version 2.0.
    2021 Platform Vision by Gavin Halliday, LexisNexis Risk Solutions Group
    Find out what the core HPCC Systems team has in store in terms of current plans and the year ahead on the platform.
    Wrap-up & Adjourn by Flavio Villanustre, LexisNexis Risk Solutions Group

HPCC Systems Breakout Sessions

  1. Data Lake Deep Dive by Roger Dev, LexisNexis Risk Solutions Group
    Learn how to build a data lake repository, explore your data, enrich it, convert data discoveries into production applications and scale to nearly unlimited data volumes and user base sizes.
  2. Optimal Lockdown Control for COVID-19 by Yuting Fu, Researcher, Oxford University, UK
    This research project extends the SIR epidemic model to include the effects of working hours and consumption on the spread of the pandemic, providing insights into how to set a lockdown rate that maximises the benefit to society as a whole.
  3. Smart Learning & Market Insight in Hospitality by David Dasher, CTO, CPL Learning
    Showcases three projects including smart learning for enhancing learner knowledge, tracking market data in the US hospitality industry using RestauranTrak and a suite of APIs comparing current performance with previous months and years, providing insights into the effects of the COVID-19 pandemic on the hospitality industry in the UK.
  4. NLP++ Plugin by David de Hilster, LexisNexis Risk Solutions Group
    Highlights the advantages of NLP++ over other NLP systems including machine learning and shows why this technology is uniquely suited to integrate with HPCC Systems.
  5. COVID-19 Data Lake by Arjuna Chala, Lili Xu, & Roger Dev, LexisNexis Risk Solutions Group
    Introduces the HPCC Systems COVID-19 Tracker project, illustrating the power of the HPCC Systems data lake methodology to rapidly create and evolve a data lake which provides deep insights into the evolution and spread of the COVID-19 pandemic around the world.
  6. Managing Data Pipelines by Adwait Joshi, Chief Seer, DataSeers
    Learn about the approach DataSeers is taking to “tame the data demon”, which comes in various sizes, shapes and velocities.
    Scaling Data Science & Analytics by Bill Franks, DataSeers Advisor, Chief Analytics Officer, IIA
    Discusses some of the current trends helping organisations to successfully deploy highly scalable data science processes.
  7. Leveraging and Evaluating Kubernetes Support on Microsoft Azure by Yash Mishra, Clemson University
    Take a deep dive into provisioning and deploying containerised HPCC Systems clusters including the cluster lifecycle, architectural differences between Kubernetes and legacy environments, storage and cost considerations as well as some of the challenges involved in this 2020 intern project which closely follows the ongoing development of our new Cloud native platform.
  8. GNN bundle talks including:
    Using the HPCC Systems Generalized Neural Network (GNN) Bundle with TensorFlow to Train a Model to Find Known Faces Leveraging the Robotics API by American Heritage School of Boca/Delray Robotics Team

    This project supports the ongoing development of an autonomous security robot project at American Heritage School (AHS) in Florida. It involves upgrading the Robot API created by a student from AHS in 2018 as well as using the HPCC Systems GNN bundle with TensorFlow.
    Distributed GPU Accelerated Neural Networks with GNN by Robert Kennedy, Florida Atlantic University
    This 2020 intern project demonstrates that it is now possible to spread GNN computations over multiple GPUs, whether they sit in one machine or are distributed across multiple machines.
  9. The Making of an Agriculture Data Lake by Dr Vincent Freeh, Kelly Zering and Gurman Singh, North Carolina State University
    Provides an overview of this project as well as live demonstrations of the data viewing and decision support applications.
  10. Athlete 360: Leveraging HPCC Systems for Player Performance and Return during COVID-19 by Christopher Connelly, North Carolina State University
    Learn how this ongoing project has extended its reach to provide a 360 degree view of an athlete’s wellbeing to ensure peak performance and see how this project is helping with decision making for athletes returning to high level training and performance during the COVID-19 pandemic.
  11. Virtual CodeDay and the Big Data Challenge – Bringing Students Together to Raise Awareness and Talent in the Big Data Analytics Field by Tyler Menezes, Executive Director, CodeDay
    CodeDay took their events online in 2020 and created more opportunities for high school students to learn what Big Data Analytics is all about. Tyler Menezes shares how he partnered with LexisNexis Risk Solutions Group to provide students with a platform to work on real life big data challenges leveraging HPCC Systems. Thousands of students have taken part in this event, creating an amazing experience for the students to help boost their skills and interest in big data analytics.
  12. A HAT Story – HPCC Analytics Tool by Apaar Sinha, LexisNexis Risk Solutions Group
    ECL experts and beginners will find this tool useful for getting access to insights on work unit performance, answering questions such as why a work unit ran slower than usual or how much time was spent handling spills or I/O operations.

2020 Community Awards

We work with many talented people whose contributions are highly valued by our community members. This year our award winners were also presented with digital badges for displaying on their social media profiles.

2020 Community Recognition Award

Image showing the Community Recognition Badge

This award recognises external community members for their innovative use of HPCC Systems in their research, solutions and open source projects and also for their contributions to the HPCC Systems open source community.

Image of Dr Shobha holding her 2020 Community Recognition Award

Congratulations to Dr. G. Shobha
Computer Science and Engineering Department
RV College of Engineering, India

Since 2018, we have been working with Dr Shobha and her team alongside our LexisNexis Risk Solutions Group colleagues, to complete a number of HPCC Systems related projects. Dr Shobha has been instrumental in the success of these projects as well as encouraging a number of students to complete projects as HPCC Systems interns.
Find out more about our collaboration with Dr Shobha and her team.

2020 David Kan Ambassador Award

Image showing the David Kan Ambassador Award Badge

This award recognises RELX and LexisNexis Risk Solutions Group colleagues who have significantly promoted and contributed to the growth of our community, serving as strong supporters, proud evangelists, subject matter experts and champions.

In 2016, this award was named in memory of David Kan, one of the first and very dedicated HPCC Systems Ambassadors.

Image of Tim Humphrey with his 2020 David Kan Ambassador Award

Congratulations to Tim Humphrey
Consulting Software Engineer
LexisNexis Risk Solutions Group

Tim is a long-standing supporter of our academic and business partners, using his expertise to provide working clusters for test purposes. This vital service has not only provided practical support, it has also allowed us to explore and test a variety of setup requirements, for example, using HPCC Systems with TensorFlow. He has also been an active supporter of our intern program since it began in 2015, by mentoring students specifically in the field of machine learning.

Community Recognition & Poster Awards Ceremony

Our Community Recognition awards were presented live on the day, alongside the winners of our 2020 Technical Poster Contest (details below). Watch the announcements presented by Trish McCall & Lorraine Chapman, LexisNexis Risk Solutions Group.

2020 Technical Poster Contest

Our virtual 2020 Technical Poster Contest included entries from 13 students, located across 4 continents and spanning the entire academic spectrum from high school through to PhD.

Each student supplied a detailed abstract, a five-minute video recording in lieu of the usual ‘on the day’ presentation and a photo of themselves. We also set up a series of virtual meeting sessions by time zone, to give judges the opportunity to get answers to questions that might help them when scoring the poster entries.

2020 Poster Contest Participants

Thank you to all the poster contest participants who worked on HPCC Systems related projects in 2020. Their contributions are valuable additions to our open source project, and we commend all of them for their hard work and dedication.

Details about each poster can be found on our 2020 Poster Contest Wiki.

Our conference platform provided a virtual poster hall with a booth for each student, displaying their picture, poster and video presentation. You can get an idea of how that worked from the Virtual Poster Booth page on our Technical Poster Contest Wiki.

Image showing the faces of all the 2020 Poster Presenters

We also thank our LexisNexis Risk Solutions colleagues, university professors and school teachers who provided mentoring in 2020, supporting the students as they worked on their projects.

2020 Poster Contest Judges

Our four judges included three LexisNexis Risk Solutions Group colleagues and our special guest judge from Imperial College, London.

Image showing the four 2020 Poster Contest Judges

During the judging period, scorecards were provided for judges to award points in the different scoring categories as well as their comments. They were able to view the posters and watch the accompanying video presentations according to their working schedule.

Final scoring was completed after the virtual Q and A sessions with poster presenters.

Find out more about the virtual judging process.

All in all, we had the scorecards back and totted up in advance of our Community Day event, with the results kept under wraps ready for the big reveal on the day!

Presenting the Award Winners

Four awards were presented to students taking part in this contest.

Each winner was also presented with our new digital 2020 Poster Winner badge, which can be used on their social media profile.

2020 Winner Best Poster Award – Data Analytics

Image showing the 2020 Poster Winner Badge

Awarded to the student whose poster project demonstrated the best contribution to, and use of, our platform and open source community in the field of data analytics.

Congratulations to Robert Kennedy
PhD Candidate, Florida Atlantic University

Headshot of Robert Kennedy

Distributed GPU Accelerated Neural Networks with GNN
This work expands HPCC Systems and the GNN bundle by leveraging multiple GPUs that span across a cluster. Using hardware acceleration with the GNN bundle allows ECL machine learning developers to drastically reduce training time. This work is not limited to one GPU or one physical computer, demonstrating that it is now possible to spread GNN computations over multiple GPUs. View Poster Contest Resources.

2020 Winner Best Poster Award – Platform Enhancement

Image showing the 2020 Poster Winner Badge

Awarded to the student whose poster project demonstrated the best direct contribution to the HPCC Systems platform in the form of a new feature or enhancement of great benefit to users of our open source platform.

Congratulations to Nathan Halliday

Image of Nathan Halliday

The Parallel Workflow Engine
Nathan’s project extends the parallel workflow engine to increase support for different ECL language constructs and is beneficial for all ECL programmers. For production systems, money will be saved by providing the clusters with more work sooner. For cloud environments, additional resources can be added dynamically to maximise the benefits of the faster processing. View Poster Contest Resources.

2020 Winner Best Poster Award – Use Case

Image showing the 2020 Poster Winner Badge

Awarded to the student whose poster project demonstrated the best use case scenario of the HPCC Systems platform and provides a significant contribution to the HPCC Systems open source project and community.

Congratulations to Jefferson Mao
Lambert High School, Georgia, USA

Headshot of Jefferson Mao

HPCC Systems on Google Anthos
Google Anthos allows HPCC Systems to be managed from separate cloud platforms through one centralized command center. It comes with a plethora of options that range from config management to service mesh. The main benefit for HPCC Systems is the ability to manage Kubernetes environments on any cloud. With Anthos, HPCC Systems has access to a common abstract layer that manages deployment, upgrades, configurations, networking, and scaling. View Poster Contest Resources.

2020 Community Choice Award

Image showing the 2020 Poster Winner Badge

This was awarded for the first time in 2020. All Community Day attendees voted for their favourite poster, contributing to an award that truly reflects the views of the HPCC Systems Open Source Community.

Congratulations to Jack Fields
American Heritage School of Boca/Delray, Florida, USA

Headshot of Jack Fields - 2020 Community Choice Award Winner

Using HPCC Systems GNN Bundle with TensorFlow to Train a Model to Find Known Faces Leveraging the Robotics API
Jack expanded HPCC Systems’ integration with the AHS robotics project by using the GNN bundle with TensorFlow. Jack joined the HPCC Systems Intern Program in 2020 to complete this project which involved using a database with student information to train a model to recognize known faces. He also upgraded the ROBOT API to work with the newest versions of ROS. View Poster Contest Resources.

Virtual ECL Language Workshop

Our workshops are provided for ECL developers who want to expand their knowledge of using ECL with the HPCC Systems Platform. This workshop was delivered in three distinct one-hour sessions over three days. Each session delved deeper than the last, providing a learning experience for different levels of ECL knowledge.

These courses are available via the HPCC Systems Community Virtual Summit Workshops YouTube channel and are delivered in English by Bob Foreman (Senior Software Engineer, LexisNexis Risk Solutions Group) and in Brazilian Portuguese by Hugo Watanuki (Senior Technical Support Engineer, LexisNexis Risk Solutions Group, Brazil).

There are some resources that accompany these courses. Links to the lesson materials are shown for each session below in both languages and code examples are also available:

Session 1 – The Relational Realm

Unleashes the power of ECL with Relational Datasets and discusses the advantages of using them.

Image Showing the course outline for The Relational Realm

Get the lesson materials:

Session 2 – The ROXIE Realm

Builds on Session 1, moving forward by delivering this data to ROXIE. We discuss and demonstrate the best practices of building indexed datasets and powerful data queries.

Image showing the course content for The ROXIE Realm

Get the lesson materials:

Session 3 – The Regression Realm

Dives into a practical introduction to Machine Learning using ECL and the Learning Trees bundle and demonstrates how to build a Regressive (Quantitative) Model using a Property Price Predictor.

Image showing the course content for The Regression Realm

Get the lesson materials:

Prerequisite Knowledge

These courses assume that you have a basic understanding of HPCC Systems and the ECL language. We recommend that you complete the online Introduction to ECL Courses Part 1 and 2 before taking these workshops, particularly if you are new to HPCC Systems. Our introductory ECL courses are available in the Training area of the HPCC Systems website, along with many others designed to extend your knowledge of the ECL language and machine learning, as well as those outlining recommended practices for managers and those who maintain HPCC Systems clusters.

HPCC Systems Interactive Expo 2020

Following on from last year’s Interactive Expo success, in 2020, the conference interface provided a virtual Interactive Expo hall with booths for seven different areas of interest. Attendees were able to view resources for each booth and chat live to HPCC Systems colleagues.

Pictures showing the Interactive Expo Hall and a Booth

We have made our Interactive Expo ‘booths’ available to you via pages on our Community Wiki. Each page provides a contact point so you can ask any questions you may have and the Learn more… section includes links to relevant videos, blogs, documentation, demos and more. Since we are Cloud focused at the moment, here’s a look at our Come to the Cloud Interactive Expo Booth:

Image showing the Come to Cloud Booth

Our Interactive Expo Wiki provides ‘booths’ for the following seven areas:

  • Academic Chat and Share
    Find out more about our Academic Program, including the HPCC Systems Intern Program.
  • Come to the Cloud
    Our new HPCC Systems cloud native platform is currently under development. Access resources to help you become an early adopter and get answers to any questions you may have.
  • Community Contribution Corner
    For anyone who wants to know more about contributing to our open source platform.
  • ECL Training Chat and Resources
    Whether you are learning ECL language from scratch or extending your knowledge and expertise, our trainers have resources to help.
  • HPCC Systems Platform Q and A
    Meet the team, find out more and post your questions to those who are in the know about the HPCC Systems Platform.
  • Meet the Machine Learning Team
    Learn about ML related research projects and the ML bundles currently available or take a tutorial to help you with your own ML project.

Looking ahead to HPCC Systems Community Day 2021

We are already planning our 2021 Community Day Summit.

During 2021 we are celebrating the 10th anniversary of HPCC Systems becoming an open source platform. We will be marking this achievement throughout the year and particularly at our conference in the Fall.

We would love to hear your story of using HPCC Systems over the years. Whether you are a new user, have been with us since the beginning or have joined our community over the years, you all have a story to tell.

Let us know if you’d like to share your HPCC Systems successes with us and our open source community in 2021 and join us in our celebrations.