Clustering on large data samples
Hi,
Using default settings, I ran AggloN clustering on a dataset of around 10K text sentences, and the generated clusters are not at all convincing: it ends up with around 9K clusters.
One observation is that when the input dataset is small (a few hundred sentences), it does a good job of clustering the data, but on a larger dataset it gives strange output.
Another observation is that on the same input dataset I see different cluster formations on a 10-node and a 100-node cluster. This is strange. Has anyone else encountered a similar situation?
Why does this happen?
Thanks
Sameer
- sameermsc
- Posts: 66
- Joined: Wed Oct 05, 2011 10:09 am