

Iterating Data Analysis

Topics specific to using ECL from a Data Analyst standpoint

Fri Jun 08, 2012 6:07 pm

First a caveat - my knowledge of medicine is minimal - so it is plausible I have misunderstood some of your words. Notwithstanding:

1) The only question here is whether or not you know your grouping criteria a priori. If you do, then a simple PROJECT in which you use a MAP to convert your data into categorical variables will work fine.
If you do not know the grouping a priori, then the Discretize routines inside ML allow you to categorize the various variables either into percentiles or evenly across a range.
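
To make the two cases in (1) concrete, here is a rough sketch in plain Python rather than ECL. It is only an illustration of the idea and not the ML library's Discretize API: the function names, the age cut points and the category labels are all made up for the example.

```python
def categorize_known(age):
    """Fixed, pre-agreed cut points (the 'a priori' case).

    The thresholds 18 and 65 are hypothetical, chosen only for illustration.
    """
    if age < 18:
        return "child"
    elif age < 65:
        return "adult"
    return "senior"


def equal_width_bins(values, n_bins):
    """Split the observed range into n_bins equally wide buckets (0-based)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bucket n_bins, so clamp it to the last one.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def quantile_bins(values, n_bins):
    """Assign buckets by rank so each bucket holds roughly the same number of rows.

    Tied values may be split across neighbouring buckets in this simple version.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


print(categorize_known(30))               # adult
print(equal_width_bins([0, 5, 10], 2))    # [0, 1, 1]
print(quantile_bins([10, 30, 20, 40], 2)) # [0, 1, 0, 1]
```

The first function corresponds to the MAP-style conversion when the groups are known; the other two correspond to splitting evenly across a range and splitting by percentile when they are not.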

2) This is the bit I am least likely to understand. In order to compute survival rate (as I understand it), you need to know who is entering the population (through diagnosis) and who is leaving it (through death). That is what my first and last are computing. If you have those two, I -think- you can compute the survival rate.
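
As a sketch of how an entry date and an exit date per patient could turn into a survival rate, here is a naive fixed-horizon calculation in plain Python (not ECL). The record layout, the integer day numbers and the `study_end` cut-off are assumptions made for the example, and a proper survival analysis (e.g. Kaplan-Meier) handles incomplete follow-up more carefully than this does.

```python
def survival_rate(patients, horizon_days, study_end):
    """Fraction of eligible patients still alive horizon_days after diagnosis.

    patients: iterable of (diagnosed_day, death_day or None), days as integers.
    Only patients diagnosed at least horizon_days before study_end are counted,
    because the outcome of later entrants is not yet known.
    Returns None when no patient is eligible.
    """
    eligible = 0
    survived = 0
    for diagnosed, died in patients:
        if study_end - diagnosed < horizon_days:
            continue  # not followed long enough to judge
        eligible += 1
        if died is None or died - diagnosed >= horizon_days:
            survived += 1
    return survived / eligible if eligible else None


# Hypothetical data: (first = diagnosis day, last = death day or None)
patients = [(0, None), (0, 100), (10, 500), (900, None)]
rate = survival_rate(patients, horizon_days=365, study_end=1000)
print(rate)  # 2 of the 3 eligible patients survive one year
```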

3) This is a straightforward clustering exercise. The agglomerative clustering piece of the ML libraries performs both the distance calculation (we have about half a dozen metrics) and the hierarchical clustering for you.
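
For anyone unfamiliar with what agglomerative clustering does under the hood, a minimal single-linkage version in plain Python looks roughly like this. It is a toy illustration and not the ML library's implementation or API: Euclidean distance and single linkage are choices made for the example, and the naive loop is only suitable for small inputs.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(points, n_clusters, distance=euclidean):
    """Naive single-linkage agglomerative clustering.

    Starts with one cluster per point and repeatedly merges the two clusters
    whose closest members are nearest, until n_clusters remain.
    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > max(n_clusters, 1):
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(distance(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


points = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
print(agglomerate(points, 3))  # [[0, 1], [2, 3], [4]]
```

Swapping in a different `distance` function changes how similarity is measured without touching the merging logic, which is the same separation the post describes between the distance metric and the hierarchical clustering.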

4) The above all work at extreme scale ...
Community Advisory Board Member
Posts: 109
Joined: Fri Apr 29, 2011 1:35 pm

Fri Jun 08, 2012 6:33 pm

Thanks dabayliss,

I will give that a shot and let you know how it works out.

Posts: 6
Joined: Wed Jun 06, 2012 5:50 pm

