Fly on the Wall – HPCC Systems Community Day Summit 2018

Our 2018 summit was held at the Atlanta Marriott Buckhead Hotel and Conference Center on October 8-9. This two-day event started with a day of hands-on ECL training, followed by our 3rd annual Poster Presentation Contest and the Community Day summit the next day. Our sponsors this year were Cognizant, DellEMC, Infosys, Merit and Datum Software. We always have a theme and this year it was Innovation and Reinvention Driving Transformation. Get an idea of the atmosphere on the day by watching this video.

As you can imagine, there is a lot of preparation work involved in hosting this event, so we are always bustling around the hotel the day before. It is almost as if the event has already started because I can feel the energy being created by those who have already arrived. There is also a photographer around taking headshots. I’m on the list for one of those and have already missed my slot by insisting others go first. Avoiding it? Probably, but it will happen sooner or later!

Pre-Event Training Workshops

The day before the conference, two workshops are taking place providing hands-on ECL training. Workshop attendees arrive with their laptops having already installed our ECL IDE and we provide the details of the cluster that will be used, manuals and code examples. Each workshop lasts for three hours.

The morning course is about data extraction and transformation with ECL. It provides an overview of HPCC Systems and gives introductory guidance on how to build powerful data queries using ECL. This course is led by Bob Foreman who is one of our regular trainers with years of experience and knowledge in coding with the ECL language.

The afternoon course moves on to look at how to solve real world data problems using HPCC Systems and ECL, building on the knowledge learned in the morning session. This course is led by Dan Camper, who is now an employee of LexisNexis but started using HPCC Systems and ECL as a community member some years ago. Dan shares his knowledge and experience of how to manage disparate data easily and efficiently, demonstrating how to profile, aggregate and analyse the data.

We have a number of online training resources if you are interested in learning ECL or extending your knowledge of the language.

The lunch break provides networking opportunities and a chance to chat with the trainers. I was setting up the room next door in preparation for our Technical Poster Contest taking place that afternoon, so I wandered out to meet the attendees. It was lovely to see a number of students from our intern program had joined as well as one or two people I recognised from our partner programs.

The workshops are very intensive, covering a large amount of material over the 6-hour block. Each attendee takes home a printed manual and thumb drive with examples and lab exercise solutions. Each workshop introduces different sample datasets to examine, explore, and analyse. It was truly a global event, with some travelling from Brazil and the U.K. as part of the Community Day event.

That evening, more and more people start arriving in time for the early start the next day, and I enjoy catching up with those who regularly attend and meeting lots of new people. Some appear to be new to me, but then I realise I know them after all, having spent time with them on conference calls during the year. It is good to be able to put faces to names. A recurrent theme at this conference and not just for me!

HPCC Systems Community Day Summit

As always, this day starts with registration and a very sociable breakfast. As I walk past tables, I can see animated discussions taking place already. By lunchtime, many attendees will be actively seeking out people others have recommended they meet. This is one of the many aims of the day. 

Welcome to Community Day by Flavio Villanustre

By 8.30am, we are all in the main conference hall and the conference is opened by Flavio Villanustre, who welcomes not only those attending in person but also our livestream audience. The first part of the day is spent in this room listening to a number of speakers addressing the whole conference, but later we divide into breakout groups for smaller sessions. It’s the first year we have tried this approach, which we adopted to give as many speakers as possible the opportunity to present.

Track 1 – HPCC Systems in Industry

One of the things that really comes through in this track is the contribution HPCC Systems has made to helping these businesses solve their big data problems fast and efficiently.

Jeff Lewis from Sutton Bank talks about how HPCC Systems has helped them to quickly identify fraud patterns from many different sources of data. He expects to be able to make a big impact on preventing identity fraud at the bank by finding relationships within the data that have not previously been accessible. He goes on to talk about how using HPCC Systems will allow them to scale up in the future, while helping them to keep their costs down and increase their revenue at the same time. It’s a very interesting and powerful presentation.

Another that piques my interest is the presentation by Luke Pezet from Archway Health Advisors. Luke talks about the challenges they have faced in making the transition from using SAS to HPCC Systems. He runs through some of the differences between the two languages when creating a model to get answers from a dataset. Luke demonstrates his SASsy bundle, which provides the same functions as SAS but in ECL and also formats the results. As well as providing a fascinating presentation, Luke has contributed his SASsy bundle to our open source project and you will find it in our HPCC Systems GitHub Repository. He also mentions using the Data Patterns Bundle featured during our Roadmap Tech Talks sessions by Dan Camper.

Watch the presentations in this track:

  • Platinum Sponsorship Keynote: Driving Innovation with Artificial Intelligence, Badhri Krishnamoorthy, Cognizant Watch Recording / View Slides
  • Gold Sponsorship Keynote: Soaring Through Emerging Technologies in the Big Data Era, Prasad Joshi, Infosys Watch Recording / View Slides
  • Prepaid Banking on Steroids – Managing Massively Scalable Datasets with Ease, Jeff Lewis, Sutton Bank Watch Recording / View Slides
  • HPCC Systems vs SAS: The Final Countdown, Luke Pezet, Archway Health Advisors Watch Recording / View Slides
  • Integrating CRM and Sales Systems to Drive ROI and Increase Sales Projection Accuracy by 10x, David Dasher, CPL Online Watch Recording / View Slides
  • How HPCC Systems is Building the next generation Credit Bureau, David Wheelock, Mauricio Nunes de Oliveira, Robert Berger & Lucas Sobrinho, LexisNexis Risk Solutions Watch Recording / View Slides

Track 2 – HPCC Systems in Academia

We work with a number of universities in the US and Europe, sponsoring students and professors working on research papers related to solving big data problems. We also sponsor school projects focused on STEM related subjects. These collaborations are very successful and close to my heart. The three presentations in this track are related in some way to students I have worked with this year as part of my involvement with the HPCC Systems academic program.

Jingqing Zhang is a PhD student from Imperial College, London who is working on a project that involves using TensorFlow with HPCC Systems.

Coincidentally, Robert Kennedy was accepted on to our intern program this year, to leverage HPCC Systems and TensorFlow to extend the HPCC Systems Machine Learning and Deep Learning capabilities in a distributed manner. In fact, it is possible that Robert’s work may be useful to Jingqing’s PhD project as well as to other users of the HPCC Systems Machine Learning Library.

Robert was also awarded a well-deserved third place in our Technical Poster Contest held at the same event.

Taiowa Donovan is the Robotics Director at American Heritage School in Florida. This is a really great presentation which shows that there are some amazing STEM related projects going on in schools and skilled young programmers with serious potential. One of Tai’s students, Aramis Tanelus, joined our intern program this year to complete a project to add APIs for HPCC Systems data ingestion for common robot sensors.

Taiowa brought his young team of high school students to our conference to showcase their autonomous agricultural robot. Everyone had a chance to see it in action during the breaks.

American Heritage School students with their robot, from the left, Aramis Tanelus, Anthony Nicotra, Justin Schuster and Kyle Tanner

Watch the presentations in this track:

  • Deep Content Learning in Traffic Prediction and Text Classification, Jingqing Zhang, Imperial College London Watch Recording / View Slides
  • Parallel Distributed Deep Learning on HPCC Systems, Taghi Khoshgoftaar & Robert Kennedy, Florida Atlantic University Watch Recording / View Slides
  • Autonomous Agricultural Robot: Is the Machine Uprising Coming Sooner Than You Think? Taiowa Donovan & Robotics Team, American Heritage School Watch Recording / View Slides / View Robotics Demo

Lunch followed, providing another networking opportunity and a chance to look at all the posters entered into our competition by students working on HPCC Systems related projects. The previous evening, Trish McCall and I worked through the judges’ scorecards to determine the winners. It’s a tough contest to judge because the entries are always of a very high standard and cover a wide spectrum of topics. I was in the room during the contest, watching as the judges paid careful attention to each poster and listened to each student as they presented. I simply can’t do justice to the impressive achievements of these hard-working students in this blog, but you can view the abstracts and see all the posters for yourself by visiting our Technical Presentations Wiki.

After lunch, Flavio Villanustre invited all the poster presenters on to the stage and announced the winners (Watch Recording / View Slides).

2018 Poster Contest Winners (from the left): Saminda Wijeratne, Robert Kennedy, Nicole Navarro

Every year, we present a Community Recognition Award. This award is presented to someone whose contribution to our open source project has made a difference to our community. We like to recognise those who encourage others to join and contribute to our community and act as innovators by pushing the boundaries of how HPCC Systems may be used either in business or academic research.

This year, the award was presented to Taiowa Donovan from American Heritage School, for the wonderful work he has done in his school with the children who work on the robotics program he created.

Taiowa Donovan (right) with his student, Aramis Tanelus, who joined our intern program in 2018 and was supported by our LexisNexis colleagues, David DeHilster (left) and Kevin Wang (second left)

Community Day is the perfect time to celebrate that our open source community continues to grow as a result of the support and achievements of our users like Taiowa Donovan.

Track 3 – HPCC Systems Roadmap Tech Talks

Since the conference, we have released HPCC Systems 7.0.0 Gold. In this track, the presentations focus mainly on the features and enhancements included in this release. In his presentation, Gavin Halliday talks about workunit timing statistics that make it easier to understand queries in ECL Watch, improved keyed joins, Thor optimizations and using Bloom filters. He also speaks about improvements made to the Visualizer bundle and our new Spark connector which allows you to read and write from Spark. James McMullan goes on to talk more specifically about this and other connectors in his tech talk.

Watch the presentations in this track:

  • Data Patterns – A Native Open Source Data Profiling Tool for HPCC Systems, Dan Camper, LexisNexis Risk Solutions Watch Recording / View Slides
  • Making IoT Data Actionable Using Predictive Analytics, Dan Camper & Hicham Elhassani, LexisNexis Risk Solutions Watch Recording / View Slides
  • Learning Trees – Decision Tree Learning Methods, Roger Dev, LexisNexis Risk Solutions Watch Recording / View Slides
  • A First Look at HPCC Systems 7.0, Innovation in Action, Gavin Halliday, LexisNexis Risk Solutions Watch Recording / View Slides
  • Innovation with Connection, The new HPCC Systems Plugins and Modules, James McMullan, LexisNexis Risk Solutions Watch Recording / View Slides

Time for a short break before the breakout sessions start. I’m moderating to help with the recordings, so anyone who wants to watch them later can do so. As I’m moving on, I notice a group of the poster presenters milling around together, congratulating each other and chatting. This happens every year. The students who enter are always so interested in all the projects and are so very supportive of each other despite the fact that it is a contest. It’s lovely to watch.

Track 4 – HPCC Systems Breakout Sessions

As Flavio mentioned earlier in the day, we decided to provide two rotations of talks at the end of the afternoon because of the number of presentation submissions we had received from people wanting to speak. Notice how diverse the topics are, including machine learning, visualising your data, using VS Code, automated test systems and documentation.

The difficult choice of which to attend on the day is mitigated by the fact that they are all recorded. So if you weren’t there on the day, or missed one, here’s your chance to catch up.

Watch the presentations in this track:

  • Documentation & Training: Optimizing Set-Similarity Join and Search with Different Prefix Schemes, Fabian Fier, Humboldt University Berlin Watch Recording / View Slides
  • Machine Learning: Using HPCC Systems ML to Map Thousands of Public Records Data Descriptions to Standard Codes, Lili Xu, Clemson University & Gus Reyna, LexisNexis Risk Solutions Watch Recording / View Slides
  • System Tools: Automated Test Systems for QA in your HPCC Systems Environment, Attila Vamos, LexisNexis Risk Solutions Suwanee Watch Recording / View Slides
  • User Interfaces: Visualizing your Data Natively on the HPCC Systems Platform with the “Visualizer Bundle”, Gordon Smith, LexisNexis Risk Solutions Watch Recording / View Slides
  • Documentation & Training: Preparing an Open Source Documentation Repository for Translations, Jim DeFabia, LexisNexis Risk Solutions Watch Recording / View Slides
  • Machine Learning: Predicting College STEM Enrollment using HPCC Systems in Educational Research, Itauma Itauma, Keiser University Watch Recording / View Slides
  • System Tools: Visualizing HPCC Systems Log Data Using ELK, Rodrigo Pastrana & Miguel Vazquez, LexisNexis Risk Solutions Watch Recording / View Slides
  • User Interfaces: Using the Open Source VS Code Editor with the HPCC Systems Platform, Arjuna Chala, LexisNexis Risk Solutions Watch Recording / View Slides

Our breakout sessions complete what has been a very full and exciting day of presentations which quite simply flew by.

Roll on next year!

Our Community Day Summit promotes the exchange of knowledge and experience from start to finish. It’s about all of these:

  • Reconnecting with old friends
  • Making new connections
  • Putting names to faces and faces to names
  • Getting information to support your project
  • Getting ideas for new projects
  • Finding out what the future plan is for HPCC Systems
  • Giving us information about what you need from the platform in the future

In fact, the day is designed so that you get what you need from attending.

I certainly feel the buzz generated by the excited discussions happening all around and it’s always a real treat to meet so many of our community users eager to share their stories. I thoroughly enjoyed what was a wonderful couple of days and a very successful event.

If you missed it this year, get the full experience by joining us next year.

Look out for announcements on our Events page and get in touch if you’d like to present. We want to hear how you are using HPCC Systems and so does our open source community!