Clustering on large data samples

Topics related to the set of Machine Learning libraries and Matrix processing algorithms

Thu Nov 26, 2015 5:01 pm


Using default settings, I ran AggloN clustering on a dataset of around 10K text sentences. The generated clusters are not at all convincing: it ends up with around 9K clusters.

One observation is that when the input dataset is small (a few hundred sentences), it does a good job of clustering the data, but on a larger dataset it gives strange output.

Another observation is that, on the same input dataset, I see different cluster formations on a 10-node and a 100-node cluster. This is strange. Has anyone else encountered a similar situation?

Why does this happen?
