Divide the dataset into time windows

Comments and questions related to the Enterprise Control Language

Wed Apr 10, 2019 1:43 pm

Hi,

I have a dataset in which each record contains a date and a time. I would like to divide the dataset into one-hour windows, that is, into smaller sets in which the time difference between the first and last record is at most one hour. I can easily compute the difference between two times and check whether it is less than or equal to one hour, but I do not know how to construct the smaller sets.

My record looks like this:

IMPORT Std;

datalayout := RECORD
  STRING lineId;
  Std.Date.Date_t date;
  INTEGER time;
  STRING eventId;
  STRING eventTemplate;
END;

// DATASET is a reserved word, so the definition needs a different name
ds := DATASET(dataPath, datalayout, THOR);
vzeufack
 
Posts: 14
Joined: Tue Sep 25, 2018 3:52 pm

Wed Apr 10, 2019 2:20 pm

vzeufack,

Assuming your time data is similar to our Time_t format, you could simply add a grouping field, like this:
IMPORT Std;
datalayout := RECORD
  STRING lineId;
  Std.Date.Date_t date;
  Std.Date.Time_t time;  //integer time in HHMMSS format
  STRING eventId;
  STRING eventTemplate;
END;

ds := DATASET([{'1',20190101, 12300,'A','ABC'},
               {'2',20190101, 15300,'B','ABC'},
               {'3',20190101,123300,'C','ABC'},
               {'4',20190101,125300,'D','ABC'},
               {'5',20190101,172300,'E','ABC'},
               {'6',20190101,175300,'F','ABC'}
              ], datalayout);

HrGrp(Std.Date.Time_t t) := TRUNCATE(t/10000);

PROJECT(ds,
        TRANSFORM({datalayout,UNSIGNED1 TimeGrp},
                  SELF.TimeGrp := HrGrp(LEFT.time),
                  SELF := LEFT));
Then use that grouping field however you need to, such as in the GROUP function, or the dedup condition for a ROLLUP or self-JOIN, or ... whatever you need.
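As one illustration of using the grouping field, here is a minimal sketch that counts the events in each hourly bucket with a crosstab TABLE. The names grpLayout, withGrp and perHour, and the sample record dated 20190102, are made up for this example. It also assumes the data may span more than one day, which is why the date is part of the grouping key.

```ecl
IMPORT Std;

datalayout := RECORD
  STRING lineId;
  Std.Date.Date_t date;
  Std.Date.Time_t time;  // integer time in HHMMSS format
  STRING eventId;
  STRING eventTemplate;
END;

// Sample data; the last record is a hypothetical event on a second day
ds := DATASET([{'1',20190101, 12300,'A','ABC'},
               {'2',20190101, 15300,'B','ABC'},
               {'3',20190101,123300,'C','ABC'},
               {'4',20190102,125300,'D','ABC'}
              ], datalayout);

// Original layout plus the hour-of-day grouping field
grpLayout := RECORD
  datalayout;
  UNSIGNED1 TimeGrp;
END;

withGrp := PROJECT(ds,
                   TRANSFORM(grpLayout,
                             SELF.TimeGrp := LEFT.time DIV 10000,  // HHMMSS -> HH
                             SELF := LEFT));

// One row per (date, hour) bucket with the number of events in it
perHour := TABLE(withGrp,
                 {withGrp.date, withGrp.TimeGrp, UNSIGNED4 cnt := COUNT(GROUP)},
                 date, TimeGrp);

OUTPUT(perHour);
```

Grouping on TimeGrp alone would put records '3' and '4' in the same bucket, because both fall in hour 12 even though they are a day apart.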

HTH,

Richard
rtaylor
Community Advisory Board Member
 
Posts: 1452
Joined: Wed Oct 26, 2011 7:40 pm

Wed Apr 10, 2019 9:43 pm

Thanks very much RTAYLOR!

It works perfectly, although your suggestion changes my logic a bit. I was actually trying to find out how to implement the following logic in ECL, which would be done with a loop in languages like Java:

- Get the time of the first record
- Compute the time difference with the following records until the difference exceeds one hour
- Group those records
- Repeat, starting with the next record

I think the logic you proposed groups records by their clock hour, right?

So concretely, if the first record has timestamp 12:10:15, then I would like to group all records within an hour from 12:10:15. Then I would get the next record which would have timestamp 13:10:15 or greater and do the same process. How can I achieve that?
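One possible way to express this in ECL, offered only as a sketch and not as an answer given in this thread, is to sort the records by time and let ITERATE carry the current window's start time from one record to the next. The names winLayout, toSecs, assignWindow, secs, winStart and winId, and the sample data, are all made up for this example. It assumes the time field is in HHMMSS format, and that a record exactly one hour after the window start opens a new window, as in the 13:10:15 case above.

```ecl
IMPORT Std;

datalayout := RECORD
  STRING lineId;
  Std.Date.Date_t date;
  Std.Date.Time_t time;  // integer time in HHMMSS format (assumed)
  STRING eventId;
  STRING eventTemplate;
END;

// Hypothetical sample data
ds := DATASET([{'1',20190101,121015,'A','ABC'},
               {'2',20190101,124000,'B','ABC'},
               {'3',20190101,131014,'C','ABC'},
               {'4',20190101,131015,'D','ABC'},
               {'5',20190101,135900,'E','ABC'},
               {'6',20190101,141500,'F','ABC'}
              ], datalayout);

winLayout := RECORD
  datalayout;
  INTEGER8  secs;      // date and time as one running count of seconds
  INTEGER8  winStart;  // secs value of the first record in the window
  UNSIGNED4 winId;     // window number, starting at 1
END;

// Combine a YYYYMMDD date and an HHMMSS time into a single number of seconds
INTEGER8 toSecs(Std.Date.Date_t d, Std.Date.Time_t t) :=
    (INTEGER8)Std.Date.FromGregorianDate(d) * 86400
    + (t DIV 10000) * 3600
    + ((t DIV 100) % 100) * 60
    + (t % 100);

prepared := PROJECT(ds,
                    TRANSFORM(winLayout,
                              SELF.secs     := toSecs(LEFT.date, LEFT.time),
                              SELF.winStart := 0,
                              SELF.winId    := 0,
                              SELF          := LEFT));

// L is the previous output record, R is the current input record.
// A new window starts on the first record, or once R is an hour or
// more past the start of the current window.
winLayout assignWindow(winLayout L, winLayout R, UNSIGNED4 c) := TRANSFORM
  SELF.winStart := IF(c = 1 OR R.secs - L.winStart >= 3600, R.secs, L.winStart);
  SELF.winId    := IF(c = 1 OR R.secs - L.winStart >= 3600, L.winId + 1, L.winId);
  SELF          := R;
END;

windows := ITERATE(SORT(prepared, secs), assignWindow(LEFT, RIGHT, COUNTER));

// For the sample data the intended winId values are 1, 1, 1, 2, 2, 3
OUTPUT(windows);
```

Changing >= 3600 to > 3600 would keep a record that is exactly one hour after the window start in the same window. The resulting winId field can then be used like any other grouping field, for example in GROUP or in a crosstab TABLE.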
vzeufack
 
Posts: 14
Joined: Tue Sep 25, 2018 3:52 pm

Fri May 03, 2019 11:11 am

Hi vzeufack,

I saw your post and it reminded me of a very similar problem I had a long time back, one that Richard also solved for me.

See post:
https://hpccsystems.com/bb/viewtopic.php?f=10&t=3383

It's a bit different in that I needed to group all records that were within some time period (a year, I think), so one record could end up in two groups.

Not quite what you want, but an interesting problem and an interesting solution.
Yours
Allan
Allan
 
Posts: 363
Joined: Sat Oct 01, 2011 7:26 pm

Fri May 03, 2019 2:06 pm

Thanks very much Allan!
vzeufack
 
Posts: 14
Joined: Tue Sep 25, 2018 3:52 pm

