Meet the 2024 HPCC Systems Interns
Welcome to the HPCC Systems Summer Internship Program 2024! This year, we are thrilled to introduce thirteen talented interns, each bringing unique skills and perspectives to enhance the HPCC Systems platform. From developing cutting-edge tools to improving automation workflows and exploring integrations with third-party environments, our interns are poised to make significant contributions to the HPCC Systems Community. Let’s delve deeper into each project and get to know the interns driving innovation at HPCC Systems.
Charan Nagaraj
Project: Migrate and Improve Regression Testing in GitHub Actions
Charan Nagaraj, a BSc student in Computer Science from RV College of Engineering, India, has embarked on a mission to modernize the regression testing framework within HPCC Systems. Traditionally, HPCC Systems has relied on the Overnight Build and Test (OBT) system for regression testing ECL bundles such as the Machine Learning bundles. However, with the evolution of DevOps practices, there is a shift towards leveraging GitHub Actions for automated build, test, and deployment pipelines.
Charan’s project involves migrating the existing regression testing scripts from OBT to GitHub Actions. Alongside this, Charan will also develop automated testing of hyperlinks in HPCC Systems user documents and GitHub README files. This transition aims to streamline testing processes, improve efficiency, and ensure quicker feedback loops for developers. By automating these workflows using GitHub Actions, Charan seeks to enhance the reliability and scalability of HPCC Systems testing infrastructure. Charan is being mentored by Attila Vamos, Consulting Software Engineer at LexisNexis Risk Solutions.
Follow Charan’s journey and insights in his blog journal as he navigates through the challenges and successes of migrating regression testing to GitHub Actions.
Eatesam Khan
Project: Create a New HPCC Systems Command Line Tool
Eatesam Khan, pursuing an MSc in Computer Science from California State University, USA, is tasked with developing a versatile command line tool for HPCC Systems. This tool is designed to facilitate seamless interaction with Enterprise Services Platform (ESP) services, which are integral components for interfacing with HPCC Systems.
The objective of Eatesam’s project is to empower users with a command line interface that simplifies API testing, service discovery, and task automation within HPCC Systems. By leveraging XML files generated from ECM files, Eatesam aims to dynamically configure the command line tool to interact with various ESP services efficiently. Terrence Asselin and Tim Klemm, respectively Consulting Software Engineer and Senior Software Engineer at LexisNexis Risk Solutions, are mentoring Eatesam in this project.
Explore Eatesam’s progress and discoveries in his blog journal as he tackles the complexities of developing a new command line tool for HPCC Systems.
El Arbi Belfarsi
Project: Update and Improve the Generation of Platform Artifacts for HPCC Systems Builds
El Arbi Belfarsi, a PhD candidate in Computer Science at Kennesaw State University, USA, is spearheading efforts to enhance the CI/CD workflow at HPCC Systems. Historically, Jenkins has been utilized for automating the building and packaging of HPCC Systems platform artifacts. However, with the emergence of GitHub Actions as a robust CI/CD platform, there is an opportunity to optimize and streamline these processes further.
El Arbi’s project involves replacing existing web services with efficient bash scripts that integrate seamlessly into GitHub Actions workflows. These scripts will parse package files associated with specific versions of the HPCC Systems platform and generate metadata in JSON format. By leveraging GitHub Actions, El Arbi aims to improve deployment efficiency, reduce maintenance overhead, and enhance the scalability of the HPCC Systems CI/CD pipeline. El Arbi is being mentored by Michael Gardner and Ming Wang, respectively Senior Software Engineer and Consulting Software Engineer at LexisNexis Risk Solutions.
Follow El Arbi’s blog journal as he navigates through the intricacies of updating and improving platform artifact generation for HPCC Systems builds.
Elizabeth Lorti
Project: HPCC Systems Technology Marketing and Event Management (own idea)
Elizabeth Lorti, a Bachelor of International Development graduate from King’s College, UK, returns to HPCC Systems with her own project idea, focused on enhancing technology marketing and event management. Building on her previous experience, Elizabeth is leading initiatives to optimize digital campaigns and spearhead speaker and attendee engagement for the HPCC Systems Community Summit.
Elizabeth’s project involves implementing strategic marketing initiatives to enhance visibility and user engagement. While working closely with the organizing committee of the Technology Summit, Elizabeth will also provide project management support to ensure quality communication, secure deliverables, and identify areas for improvement. By leveraging her background in communications and marketing, Elizabeth aims to elevate HPCC Systems brand awareness and community involvement. Jessica Lorti, Director Technology Marketing at LexisNexis Risk Solutions, is mentoring Elizabeth in this project.
Discover Elizabeth’s strategic initiatives and reflections in her blog journal as she navigates through technology marketing and event management at HPCC Systems.
Gagana Premnath
Project: Integration of HPCC Systems Terraform CI with GitHub Actions
Gagana Premnath, a recent graduate with an MSc in Computer Science from Syracuse University, USA, is tasked with integrating HPCC Systems deployment on Azure Kubernetes Service (AKS) using Terraform with GitHub Actions. This project aims to streamline continuous integration and deployment pipelines, automate infrastructure provisioning, and optimize testing processes within HPCC Systems.
Gagana’s project builds upon an initial implementation using Terraform modules for AKS deployment, extending its functionality to include automated testing of ECL tasks and optimizing cost management on Azure. By leveraging GitHub Actions, Gagana seeks to enhance scalability, reliability, and performance within HPCC Systems CI/CD workflows. Godji Fortil and Ming Wang, respectively Software Engineer and Consulting Software Engineer at LexisNexis Risk Solutions, are mentoring Gagana in this project.
Follow Gagana’s journey of automation and deployment in her blog journal as she explores the integration of Terraform CI with GitHub Actions.
Ilhan Gelle
Project: Test Suite for the HPCC Systems Parquet Plugin
Ilhan Gelle, pursuing a BSc in Computer Science from the University of Texas, USA, is focused on developing a robust test suite for the HPCC Systems Parquet plugin. As organizations increasingly rely on Parquet for efficient data storage and retrieval, Ilhan’s project aims to ensure the plugin’s functionality and performance under various conditions.
Ilhan’s project involves creating comprehensive test cases that cover different ECL and Parquet data types, benchmarking performance, and identifying areas for optimization within the plugin codebase. By leveraging her expertise in testing methodologies, Ilhan aims to enhance the reliability and efficiency of Parquet integration within HPCC Systems. Ilhan is being mentored by Jack Del Vecchio, Software Engineer at LexisNexis Risk Solutions.
Explore Ilhan’s methodologies and findings in her blog journal as she navigates through the exciting journey of developing a test suite for the HPCC Systems Parquet plugin.
Nisha Bagdwal
Project: Develop an Automated ECL Watch Test Suite
Nisha Bagdwal, pursuing her MSc in Computer Science at Kennesaw State University, USA, is tasked with enhancing the ECL Watch UI testing framework within HPCC Systems. ECL Watch serves as a critical interface for managing and monitoring cluster workloads, making robust automated testing essential to ensure its reliability and functionality.
Nisha’s project involves expanding the existing Selenium-based test suite for ECL Watch, automating UI tests, and improving test coverage across different use cases. By leveraging automation, Nisha aims to accelerate testing cycles, reduce manual effort, and enhance overall quality assurance for ECL Watch within HPCC Systems. Attila Vamos and Chris Lo, both Consulting Software Engineers at LexisNexis Risk Solutions, are mentoring Nisha.
Follow Nisha’s progress in developing automated tests in her blog journal as she tackles the challenges of enhancing the ECL Watch test suite.
Sabrina Harris
Project: HPCC Systems Machine Learning Tutorials (own idea)
Sabrina Harris, pursuing an MSc in Applied Data Science at New College of Florida, USA, is implementing her own suggested project idea of expanding the HPCC Systems Machine Learning course curriculum. Building on the foundation laid by her mentor, Sabrina is developing comprehensive lesson plans, slideshows, code examples, and quizzes for machine learning topics such as Gaussian Process Regression, General Linear Models, and Support Vector Machines.
Sabrina’s project aims to empower users with practical knowledge and skills in machine learning using HPCC Systems. By creating educational resources, Sabrina seeks to foster learning, adoption, and innovation in machine learning capabilities within the HPCC Systems community. Bob Foreman, Software Engineering Lead at LexisNexis Risk Solutions, is mentoring Sabrina in this project.
Discover Sabrina’s educational endeavors and teaching insights in her blog journal as she expands the HPCC Systems Machine Learning tutorials.
Scarlett Huang
Project: Investigate Third-Party Environments (Google BigQuery)
Scarlett Huang, a high school student from Dreyfoos School of Arts, USA, is conducting research on integrating Google BigQuery with the HPCC Systems Cloud Native Platform. This project explores the potential benefits of leveraging third-party environments to enhance big data processing capabilities within HPCC Systems.
Scarlett’s project involves assessing the compatibility, functionality, and use cases of integrating Google BigQuery with HPCC Systems. By conducting in-depth research and evaluations, Scarlett aims to provide insights into optimizing data analytics and processing workflows for HPCC Systems users. Scarlett is being mentored by Ming Wang and Terrence Asselin, both Consulting Software Engineers at LexisNexis Risk Solutions.
Follow Scarlett’s exploration and findings in her blog journal as she investigates the integration of Google BigQuery with HPCC Systems.
Shounak Joshi
Project: Investigate Third-Party Environments (Azure Synapse Analytics)
Shounak Joshi, pursuing a BSc in Computer Science at the University of Florida, USA, is focused on investigating the integration of Azure Synapse Analytics with HPCC Systems. This project aims to explore the compatibility, advantages, and use cases of leveraging Azure Synapse Analytics to enhance big data processing capabilities within HPCC Systems.
Shounak’s project involves evaluating the functionality, scalability, and performance benefits of integrating Azure Synapse Analytics with HPCC Systems. By conducting comprehensive research and analysis, Shounak seeks to provide recommendations for optimizing data analytics workflows and enhancing user experience. Ming Wang and Michael Gardner, respectively Consulting Software Engineer and Senior Software Engineer at LexisNexis Risk Solutions, are mentoring Shounak in this project.
Explore Shounak’s research journey and insights in his blog journal as he investigates the integration of Azure Synapse Analytics with HPCC Systems.
Rohith Podugu
Project: Refactoring and Releasing PyHPCC
Rohith Podugu, a recent graduate holding an MSc in Computer Science from California State University, USA, is focused on refactoring and enhancing PyHPCC, a Python package for interacting with HPCC Systems. This project involves improving code structure, standardizing methods, enhancing documentation, and streamlining the release process to make PyHPCC more accessible and developer-friendly.
Rohith’s project aims to enhance the usability and adoption of PyHPCC within the HPCC Systems community. By implementing best practices and optimizing the codebase, Rohith seeks to empower developers to leverage Python for seamless interaction with HPCC Systems data management and analytics capabilities. Amila de Silva, Senior Software Engineer at LexisNexis Risk Solutions, is mentoring Rohith in this project.
Read about Rohith’s refactoring and contributions in his blog journal as he refines PyHPCC for the HPCC Systems community.
And there is more…
For the second year, the HPCC Systems academic program is also proudly partnering with the LexisNexis Risk Solutions Technology team in the Mumbai office to support two students from NMIMS (Narsee Monjee Institute of Management Studies) in India as part of a 3-month internship opportunity with HPCC Systems. Their internships took place during the first quarter of 2024. Here is a brief overview of their projects.
Harsh Raj
Project: Vehicle Build Contributory System (own idea)
Harsh Raj, currently pursuing a Bachelor of Technology in Data Science at NMIMS, India, embarked on a project focused on scraping, processing, and visualizing data about UK car manufacturers, providing valuable insights for the insurance industry.
The objective of Harsh’s project was to create an end-to-end pipeline that automates data extraction using Python libraries like Beautiful Soup and Selenium, performs data transformation and cleaning using HPCC Systems platform capabilities, and visualizes insights through tools like Power BI. Harsh was mentored by Srinivasan Kothandam and Aryaman Gautam, both Software Engineers at LexisNexis Risk Solutions, India.
Girikratna Sharma
Project: Integration of HPCC Systems with Power BI (own idea)
Girikratna Sharma, a Data Science student at NMIMS, India, worked on integrating HPCC Systems with Power BI using WsSQL for enhanced data analytics capabilities.
Girikratna’s project focused on establishing a seamless connection between Power BI and HPCC Systems via WsSQL, enabling data retrieval through SQL queries. Girikratna aimed to automate SOAP requests from Power BI to HPCC Systems, facilitating data analytics and visualization workflows. Girikratna was also mentored by Srinivasan Kothandam and Aryaman Gautam, both Software Engineers at LexisNexis Risk Solutions, India.
More to come on these projects
Throughout the summer, the HPCC Systems interns will keep sharing their experiences, challenges, and achievements through their blog journals.
While some interns will speak at the 2024 Community Day Summit in October, all the interns will produce posters about their work, which will be available in the HPCC Systems student wiki alongside an abstract and a 5-minute video presentation (see previous years’ posters here). Please keep an eye out later in the year for the posters submitted to the 2024 Poster Contest, and remember that, as an attendee of the event, you get to vote for the winner.
So stay tuned as we delve deeper into the innovative projects and the impact the summer interns are making within the HPCC Systems community and beyond. Join us in celebrating their journey of growth, learning, and collaboration!
A shout out to the 2024 mentors!
Of course, none of this would be possible without our talented group of volunteer mentors, whose expertise, dedication, and willingness to share knowledge are invaluable assets to the HPCC Systems internship program.
On behalf of our talented summer interns, I want to express heartfelt thanks to each mentor who has graciously taken on the responsibility of guiding and supporting these transformative projects. As our interns embark on their projects, they are fortunate to have the mentors listed below who are committed to nurturing their potential and fostering innovation within HPCC Systems:
- Amila de Silva, Senior Software Engineer
- Aryaman Gautam, Software Engineer
- Attila Vamos, Consulting Software Engineer
- Bob Foreman, Software Engineering Lead
- Chris Lo, Consulting Software Engineer
- Godji Fortil, Software Engineer
- Jack Del Vecchio, Software Engineer
- Jessica Lorti, Director Technology Marketing
- Michael Gardner, Senior Software Engineer
- Ming Wang, Consulting Software Engineer
- Srinivasan Kothandam, Software Engineer
- Terrence Asselin, Consulting Software Engineer
- Tim Klemm, Senior Software Engineer
HPCC Systems Intern Program
The application period for the HPCC Systems Intern Program opens in late Fall every year. In the meantime, if you are a student thinking of applying or know someone you’d like to encourage to join, visit our list of available projects (new projects will be added as we approach the application period), watch our info session webinar and read our blog about the program for more information. You are also welcome to email us at students@hpccsystems.com.
About the Author
Hugo Watanuki
Manager Community Tech Programs
LexisNexis Risk Solutions
Hugo is responsible for the HPCC Systems internship program at LexisNexis Risk Solutions. He is part of the team effort responsible for supporting the outreach and growth of the HPCC Systems open source community. Hugo holds a PhD in Information Systems and has worked for more than 15 years in various technical roles in the IT industry.