
## Gravitational Clustering

This forum is for topics related to the Google Summer of Code (GSoC) projects and the HPCC Systems Intern program.
jamienoss

Posts: 2
Joined: Tue Sep 17, 2013 12:27 pm

Hi Jamie,
My name is Rishab and I have applied to GSoC 2015 to implement the g-clustering algorithm. In the comment section you asked me to think over certain points, and I have pondered them. You said that we need to do some pathological testing of the random sampling of points and of the calculation of the force between the current point and the randomly selected point. I think the algorithm can go wrong, but if we iterate many times while varying the force parameters, we can find an optimum value. However, we should be careful not to tinker with the values to the extent that its supervised behaviour is affected. So we will need to create a dummy dataset to find the range of parameters over which the algorithm works.
Please reply to let me know if I am right.
rgoel_0112

Posts: 2
Joined: Tue Apr 07, 2015 4:39 pm

Hi Rishab,

Iterating the entire procedure over G-deltaG space is a possible solution and would be regarded as a Monte Carlo approach. In fact, such computational 'runs' could be done in parallel. You could then look for the boundaries of cluster convergence/divergence in this G-deltaG space, which would be a really interesting analysis of the algorithm itself, almost a must. However, using a Monte Carlo method as an actual implementation detail may not be desirable for this particular algorithm. I guess an incredibly low-resolution implementation of this phase space may be doable/useful. Another solution could be to compute a rough estimate of the fractal dimension of the data space, which would give you an estimate of the clustering/sparseness of the sample. This could then be used in a trivial calculation to determine adequate values for the actual algorithm's parameters.
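To make the G-deltaG sweep concrete, here is a minimal Python sketch. It is only an illustration: `toy_gravity_run` is a hypothetical stand-in for one run of the clustering pass (pull each point toward one randomly chosen point with a step proportional to G / d^2, then shrink G by a factor of 1 - deltaG), not the actual g-clustering implementation discussed in this thread. The names `make_blobs`, `count_clusters`, `sweep`, the merge radius `eps`, and the grid values are likewise assumptions chosen for the example.

```python
import math
import random


def make_blobs(n_per_blob=20, seed=1):
    """Dummy dataset (assumption): two Gaussian blobs in the unit square."""
    rng = random.Random(seed)
    centres = [(0.25, 0.25), (0.75, 0.75)]
    return [(rng.gauss(cx, 0.05), rng.gauss(cy, 0.05))
            for cx, cy in centres for _ in range(n_per_blob)]


def toy_gravity_run(points, G, delta_G, iterations=50, seed=0):
    """Hypothetical stand-in for one run of the clustering pass.

    Each point is paired with one randomly chosen point; both move toward
    each other by a step of G / d^2, capped at d / 2 so that a pair meets
    at most at its midpoint. G decays by (1 - delta_G) per iteration.
    """
    rng = random.Random(seed)
    pts = [list(p) for p in points]  # copy, so the input is not modified
    for _ in range(iterations):
        for i in range(len(pts)):
            j = rng.randrange(len(pts))
            if j == i:
                continue
            dx = pts[j][0] - pts[i][0]
            dy = pts[j][1] - pts[i][1]
            d = math.hypot(dx, dy)
            if d < 1e-9:
                continue  # already coincident
            step = min(G / (d * d), d / 2)
            pts[i][0] += step * dx / d
            pts[i][1] += step * dy / d
            pts[j][0] -= step * dx / d
            pts[j][1] -= step * dy / d
        G *= 1.0 - delta_G
    return pts


def count_clusters(pts, eps=0.05):
    """Count groups of points linked by pairwise distance <= eps (union-find)."""
    parent = list(range(len(pts)))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            if math.hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1]) <= eps:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(pts))})


def sweep(points, G_values, delta_G_values):
    """Run the toy procedure once per (G, delta_G) cell of a coarse grid."""
    return {(G, dG): count_clusters(toy_gravity_run(points, G, dG))
            for G in G_values for dG in delta_G_values}


if __name__ == "__main__":
    grid = sweep(make_blobs(), [0.0, 1e-4, 1e-3, 1e-2], [0.0, 0.05, 0.2])
    for (G, dG), n in sorted(grid.items()):
        print(f"G={G:<7} deltaG={dG:<5} clusters={n}")
```

Each grid cell is independent of the others, which is why the runs could be farmed out in parallel. A table of cluster counts like this is one way to look for the convergence/divergence boundary, for example the region where the count collapses to 1.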

None of the above methods would be considered 'tinkering'. 'Tinkering' refers to the need to tailor such parameters on a sample-by-sample basis and 'by hand', i.e. to have the actual user test for a good set of parameters based on their particular dataset.
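As an illustration of estimating the parameters from the data rather than by hand, the rough fractal-dimension estimate could be done by box counting. The sketch below is one possible approach, not necessarily the estimator intended in this thread: it counts the occupied grid boxes N(eps) at several box sizes and takes the least-squares slope of log N(eps) against log(1/eps). The function names and the choice of scales are assumptions, and how the resulting dimension would map onto G and deltaG is not specified here, so that step is left out.

```python
import math


def box_count(points, eps):
    """Number of grid boxes of side eps that contain at least one point."""
    return len({tuple(math.floor(c / eps) for c in p) for p in points})


def box_counting_dimension(points, scales):
    """Least-squares slope of log N(eps) versus log(1/eps) over the given scales."""
    xs = [math.log(1.0 / eps) for eps in scales]
    ys = [math.log(box_count(points, eps)) for eps in scales]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    scales = [1 / 2, 1 / 4, 1 / 8, 1 / 16]
    line = [(i / 1000, 0.0) for i in range(1000)]                 # points on a line
    plane = [(i / 32, j / 32) for i in range(32) for j in range(32)]  # filled square
    print(box_counting_dimension(line, scales))
    print(box_counting_dimension(plane, scales))
```

Points along a line give a slope close to 1 and a uniformly filled square gives a slope close to 2. Strongly clustered samples occupy fewer boxes at fine scales and so give a lower value, which is what makes the estimate a usable proxy for clustering/sparseness.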

Kind regards,
Jamie
jamienoss

Posts: 2
Joined: Tue Sep 17, 2013 12:27 pm
