Skip to main content

HPCC Systems blog contributors are engineers and data scientists who for years have enabled LexisNexis customers to use big data to fulfill critical missions, gain competitive advantage, or unearth new discoveries. Check this blog regularly for insights into how HPCC Systems technology can put big data to work for your own organization.

Lili Xu on 11/14/2019
In this blog, I will introduce another clustering bundle: DBSCAN Bundle, a highly scalable and parallelized implementation of DBSCAN algorithm. DBSCAN is a density-based unsupervised machine learning algorithm to automatically cluster the data into subclasses or groups.
Lili Xu on 03/04/2019
Imagine you are sitting in front of thousands of articles and trying to organize them into different folders. How would you accomplish it and how long would you expect to finish it? Reading all the articles one by one and spending days or even months to finish the task? If you have some sort of data but have no clue how to efficiently cluster them, then this article should be a right place to start.