Let’s get one thing straight…
Evolution through natural selection is an incredibly slow process. As powerful a process as it is, it took millennia for dogs to evolve from wolves; it is undeniable that evolution in nature takes time.
Evolution through open source is, on the other hand, an incredibly accelerated process.
By working together openly as a community, people have been able to change the way the software development process works, evolving software at a much faster pace than its proprietary closed source alternatives. This communal approach speeds up evolution in every aspect of a product, from development to testing to documentation to deployment, and everyone reaps the rewards.
Like many other open source applications, our open source HPCC Systems platform project is hosted on GitHub, the great collaboration tool built on top of Git, the distributed version control system that Linus Torvalds created more than a decade ago.
In case you’ve been living in a box and don’t know GitHub, it hosts 38+ million projects, including our HPCC Systems platform which has over 16,000 commits and 7,000 pull requests since we migrated to open source five years ago (kudos to the whole community for helping out, innovating and evolving the platform).
By opening up their private tech to the public, companies can go well beyond simple commercial applications. Facebook has already proven that it can massively improve the world’s infrastructure through the Open Compute Project, and Amazon, Google and Netflix have also traditionally been good open source corporate citizens. The Android platform in particular has done a fantastic job of adapting the Linux kernel and associated ecosystem to mobile devices. Companies like HTC, Huawei, LG, Samsung and Sony have contributed to the platform and customized it to their own visions, further advancing the core platform.
It is important to remember that no code is completely neutral and immediately usable. When a developer creates something, it is always done for one of three reasons:
- For their own benefit.
- For the benefit of the company they work for.
- For entertainment value (just for fun).
This is good, but by sharing this code openly, this knowledge and work can then be built upon, extended and evolved for the community’s benefit.
The HPCC Systems platform was open sourced and released five years ago, in 2011, for the express purpose of making a complete and powerful big data solution widely available to the community. As part of this open source big data platform, the Thor and Roxie clusters effortlessly transform data into valuable insights usable across an organization. ECL, its open dataflow programming language, was conceived from the ground up to manipulate and query data in ways that are more natural to data analysts than more traditional languages. With ECL, users can focus on what they want to show with the data, rather than on the data processing itself.
I’m excited by how we’ve managed to make HPCC Systems accessible to anyone who handles data and how the community has helped evolve every aspect of the platform on GitHub. Here are a few mandatory links, if you’d like to take a look for yourself and join this great open source community:
- The HPCC Systems platform
- Visualization Framework
- ROXIE Dashboard
- Integration with other data stores: Pentaho, Cassandra, Kafka, Hadoop
- Integration with open source visualization toolkits: D3, Dojo
- Integration with Machine Learning and other languages: ECL-ML, R, Java-API
With the amount of data being stored in medical records, Fitbit devices, Twitter feeds, Nest thermostats and the rest of the Internet of Things, there is so much potential for developers to better everyone’s lives. By open sourcing application stacks and data protocols and making these all accessible and comparable, I think the next five years for HPCC Systems hold unlimited potential.
Editor’s Note: Next in this series of The Evolution of HPCC Systems, Flavio will discuss the future of ECL and the potential machine learning holds.