Divide the dataset into time windows

Comments and questions related to the Enterprise Control Language

Wed Apr 10, 2019 1:43 pm

Hi,

I have a dataset in which each record contains a date and a time. I would like to divide it into one-hour windows, that is, smaller sets in which the difference in time between the first and last record is at most one hour. I can easily compute the difference between two times and check whether it is less than or equal to an hour, but I do not know how to construct the smaller sets.

My record looks like this:

IMPORT Std;

datalayout := RECORD
  STRING lineId;
  Std.Date.Date_t date;
  INTEGER time;
  STRING eventId;
  STRING eventTemplate;
END;

// "dataset" is a reserved word in ECL, so the definition is named ds here
ds := DATASET(dataPath, datalayout, THOR);
vzeufack
 
Posts: 41
Joined: Tue Sep 25, 2018 3:52 pm

Wed Apr 10, 2019 2:20 pm

vzeufack,

Assuming your time data is similar to our Time_t format, you could simply add a grouping field, like this:
IMPORT Std;
datalayout := RECORD
  STRING lineId;
  Std.Date.Date_t date;
  Std.Date.Time_t time;  //integer time in HHMMSS format
  STRING eventId;
  STRING eventTemplate;
END;

ds := DATASET([{'1',20190101, 12300,'A','ABC'},
               {'2',20190101, 15300,'B','ABC'},
               {'3',20190101,123300,'C','ABC'},
               {'4',20190101,125300,'D','ABC'},
               {'5',20190101,172300,'E','ABC'},
               {'6',20190101,175300,'F','ABC'}
              ], datalayout);

HrGrp(Std.Date.Time_t t) := TRUNCATE(t/10000);

PROJECT(ds,
        TRANSFORM({datalayout,UNSIGNED1 TimeGrp},
                  SELF.TimeGrp := HrGrp(LEFT.time),
                  SELF := LEFT));
Then use that grouping field however you need to, such as in the GROUP function, or the dedup condition for a ROLLUP or self-JOIN, or ... whatever you need.

HTH,

Richard
rtaylor
Community Advisory Board Member
 
Posts: 1619
Joined: Wed Oct 26, 2011 7:40 pm
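As one illustration of using the grouping field suggested above, the sketch below counts the events in each hour bucket with a cross-tab TABLE. The names withGrp, perHour and eventCount are hypothetical and were not part of the original reply, and grouping by date as well as TimeGrp is an assumption made so that the same hour on different days stays separate.

```ecl
IMPORT Std;

datalayout := RECORD
  STRING lineId;
  Std.Date.Date_t date;
  Std.Date.Time_t time;  // integer time in HHMMSS format
  STRING eventId;
  STRING eventTemplate;
END;

ds := DATASET([{'1',20190101, 12300,'A','ABC'},
               {'2',20190101, 15300,'B','ABC'},
               {'3',20190101,123300,'C','ABC'},
               {'4',20190101,125300,'D','ABC'},
               {'5',20190101,172300,'E','ABC'},
               {'6',20190101,175300,'F','ABC'}
              ], datalayout);

// Hour of day taken from the HHMMSS integer
HrGrp(Std.Date.Time_t t) := TRUNCATE(t/10000);

// Same PROJECT as in the reply, but kept as a definition so it can be reused
withGrp := PROJECT(ds,
                   TRANSFORM({datalayout, UNSIGNED1 TimeGrp},
                             SELF.TimeGrp := HrGrp(LEFT.time),
                             SELF := LEFT));

// One row per (date, hour) bucket with the number of events in that bucket
perHour := TABLE(withGrp,
                 {withGrp.date,
                  withGrp.TimeGrp,
                  UNSIGNED4 eventCount := COUNT(GROUP)},
                 withGrp.date, withGrp.TimeGrp);

OUTPUT(perHour);
```

Note that these buckets are aligned to clock hours (01:00 to 01:59, 12:00 to 12:59, and so on), not to the time of the first record in each bucket.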

Wed Apr 10, 2019 9:43 pm

Thanks very much RTAYLOR!

It works perfectly, although your proposal changes my logic a bit. I was trying to find out how to implement the following logic in ECL, which would be done with a loop in languages like Java:

- Get the time of the first record
- Compute the time difference with the following records until the difference exceeds 1h
- Group those records
- Repeat, starting with the next record

I think the logic you proposed groups records by their hour, right?

So concretely, if the first record has timestamp 12:10:15, then I would like to group all records within an hour of 12:10:15. Then I would take the next record, which would have timestamp 13:10:15 or later, and repeat the process. How can I achieve that?
vzeufack
 
Posts: 41
Joined: Tue Sep 25, 2018 3:52 pm
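The thread does not include a direct answer to this follow-up. One possible approach, offered here only as a sketch, is to sort the records and use ITERATE to carry the start time of the current window from one record to the next. The names WorkRec, SecsOfDay, ts, winStart, winId and AssignWindow are hypothetical. The sketch assumes that time is an HHMMSS integer, and that a record exactly one hour after the window start opens a new window (change >= to > if that record should stay in the current window).

```ecl
IMPORT Std;

datalayout := RECORD
  STRING lineId;
  Std.Date.Date_t date;
  Std.Date.Time_t time;  // integer time in HHMMSS format (assumption)
  STRING eventId;
  STRING eventTemplate;
END;

ds := DATASET([{'1',20190101, 12300,'A','ABC'},
               {'2',20190101, 15300,'B','ABC'},
               {'3',20190101,123300,'C','ABC'},
               {'4',20190101,125300,'D','ABC'},
               {'5',20190101,172300,'E','ABC'},
               {'6',20190101,175300,'F','ABC'}
              ], datalayout);

// Seconds since midnight from an HHMMSS integer
INTEGER8 SecsOfDay(Std.Date.Time_t t) :=
    (t DIV 10000) * 3600 + ((t DIV 100) % 100) * 60 + (t % 100);

WorkRec := RECORD(datalayout)
  INTEGER8  ts;        // day number * 86400 + seconds of day
  INTEGER8  winStart;  // ts of the first record in the current window
  UNSIGNED4 winId;     // window number, starting at 1
END;

// Sort chronologically and attach a single comparable timestamp
withTs := PROJECT(SORT(ds, date, time),
                  TRANSFORM(WorkRec,
                            SELF.ts := (INTEGER8)Std.Date.FromGregorianDate(LEFT.date) * 86400
                                       + SecsOfDay(LEFT.time),
                            SELF.winStart := 0,
                            SELF.winId := 0,
                            SELF := LEFT));

// L is the previous (already processed) record, R is the current one.
// For the very first record L is blank, so L.winId = 0 opens window 1.
WorkRec AssignWindow(WorkRec L, WorkRec R) := TRANSFORM
  newWindow := L.winId = 0 OR R.ts - L.winStart >= 3600;
  SELF.winStart := IF(newWindow, R.ts, L.winStart);
  SELF.winId := IF(newWindow, L.winId + 1, L.winId);
  SELF := R;
END;

windowed := ITERATE(withTs, AssignWindow(LEFT, RIGHT));

OUTPUT(windowed);
```

With the six sample rows, the records at 01:23:00 and 01:53:00 should share the first window, 12:33:00 and 12:53:00 the second, and 17:23:00 and 17:53:00 the third. The resulting winId can then be used as a grouping field in the same way as TimeGrp. Because each record depends on the one before it, this ITERATE runs over the globally sorted dataset rather than independently on each node.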

Fri May 03, 2019 11:11 am

Hi vzeufack,

I saw your post and it reminded me of a very similar problem I had a long time ago, one that Richard also solved for me.

See post:
https://hpccsystems.com/bb/viewtopic.php?f=10&t=3383

It's a bit different in that I needed to group all records that fell within some time period (a year, I think), so one record could end up in two groups.

Not quite what you want, but an interesting problem with an interesting solution.
Yours
Allan
Allan
 
Posts: 444
Joined: Sat Oct 01, 2011 7:26 pm

Fri May 03, 2019 2:06 pm

Thanks very much Allan!
vzeufack
 
Posts: 41
Joined: Tue Sep 25, 2018 3:52 pm
