By Arujna Chala and Lili Xu

Data science and data analytics are rapidly growing fields. IoT and AI are changing the way we do things. Did we imagine 10 years ago that we would be asking Alexa to wake us up in the morning, order items on Amazon, and more? Did we imagine 10 years ago driving next to a Tesla whose driver is asleep at the wheel? The next generation of data scientists will be tasked with solving problems around what IDC predicts will be 175 zettabytes of data by 2025. To help universities educate students on solving Big Data problems at scale, LexisNexis Risk Solutions is promoting the open source HPCC Systems Data Lake platform by sponsoring hackathons, workshops, internships, and research programs.

As an example, the HPCC Systems team recently conducted a successful workshop at Kennesaw State University (KSU), challenging students to perform data analysis on property assessment data for the city of Philadelphia.

A total of 12 participants, divided across 4 teams, worked on several solutions. The problem statement was deliberately vague and open-ended to gauge the creativity and analytics skills of each team.

The winning team based their solution on predicting the flipping value of a house. Exterior and interior condition, location, and price were among the parameters considered in building a scoring model. The team impressed us with their domain knowledge, understanding of the data, and analytics skills.

The team that finished second attempted to build a scoring model to predict the value of the house based on location. They used the HPCC Systems machine learning library to build the prediction model.

KSU has been an enthusiastic HPCC Systems partner for the last three years. The successful completion of the 2019 hackathon helped the HPCC Systems team test a few new tools and teaching aids designed to make the adoption of HPCC and ECL easy and straightforward. The new ECL Cloud IDE, coupled with the advances in machine learning and visualization, made it easy for the students to ramp up and work on business problems.

ECL Cloud IDE – Result Window (Top) and Script Window (Bottom)

ECL Cloud IDE – Multi-Visualization Result

ECL Cloud IDE – Multi-Visualization Result

Finally, I want to thank the KSU organizers for an admirable job organizing the event, and all the volunteers from the great HPCC Systems team. A special shout out to Dawn Tatum of KSU for her work throughout the year, and to HPCC Systems team members Lili Xu, Bahar Faradanian, Raja Sundarajan, Jerry Jacob, Dan Camper, and Jeremy Clements for their excellent work mentoring the students and building the necessary tools.