

MERGE fails with set of dataset

Topics specific to the use of the ECL IDE

Tue Jun 12, 2012 1:30 pm

I am new to ECL and the HPCC system.
To summarize my issue: I can merge two datasets with the first syntax, but I can't merge a set of datasets with the second syntax. As I want to combine an undetermined number of datasets, I would prefer to use the second form if possible.

In detail:
I am using the sample from documentation MERGE as below :

SetDS := [ds1,ds2];
ds4 := MERGE([ds1,ds2],letter,number);

With this syntax, I first get a warning about deprecated usage.

So I replaced it with:
ds4 := MERGE([ds1,ds2],letter,number,SORTED(letter,number));

Then I submit the query, and it fails with the error messages below:

000001D7 2012-06-12 14:35:52 2092 2092 Processing graph - graph(graph1, 1)
000001D8 2012-06-12 14:35:52 2092 2881 3000: Graph[1], nwaymerge[4]: SLAVE 192.168.23.128:20100: assert(started()) failed - file: /var/jenkins/workspace/CE-Candidate-3.6.2/CE/Ubuntu-10.04-i386/HPCC-Platform/thorlcr/activities/./../activities/thactivityutil.ipp, line 211
000001D9 2012-06-12 14:35:52 2092 2881 INFORM [EXCEPTION]
000001DA 2012-06-12 14:35:52 2092 2881 3000: Graph[1], nwaymerge[4]: SLAVE 192.168.23.128:20100: assert(started()) failed - file: /var/jenkins/workspace/CE-Candidate-3.6.2/CE/Ubuntu-10.04-i386/HPCC-Platform/thorlcr/activities/./../activities/thactivityutil.ipp, line 211
000001DB 2012-06-12 14:35:52 2092 2881 Posting exception: Graph[1], nwaymerge[4]: SLAVE 192.168.23.128:20100: assert(started()) failed - file: /var/jenkins/workspace/CE-Candidate-3.6.2/CE/Ubuntu-10.04-i386/HPCC-Platform/thorlcr/activities/./../activities/thactivityutil.ipp, line 211 to agent 192.168.23.128 for workunit(W20120612-143551)
000001DC 2012-06-12 14:35:52 2092 2881 INFORM [EXCEPTION]

It seems something goes wrong with the graph, but I don't know how to work around this problem.
I would appreciate any help. Thanks.
JM.
ideal
 
Posts: 86
Joined: Tue Jun 12, 2012 1:17 pm

Tue Jun 12, 2012 2:42 pm

JM,

OK, I duplicated the problem and will report it.

The issue appears to be specific to the second form of MERGE (a set of datasets). The first form (a comma-delimited list of datasets) operates correctly in my testing, so using the first form would be your workaround.
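
A minimal sketch of that first-form workaround, assuming two small inline datasets modeled on the documentation sample (the names letterRec and dsMerged are illustrative, not taken from the docs):

Code:
letterRec := RECORD
  INTEGER1 number;
  STRING1  letter;
END;

// Both inputs must already be sorted on the merge fields
ds1 := SORT(DATASET([{1,'A'},{1,'B'},{1,'C'}],letterRec),letter,number);
ds2 := SORT(DATASET([{2,'A'},{2,'B'},{2,'C'}],letterRec),letter,number);

// First form of MERGE: a comma-delimited list of datasets
dsMerged := MERGE(ds1,ds2,SORTED(letter,number));

OUTPUT(dsMerged);

Note that this form needs every dataset named explicitly, so it does not directly cover the case where the number of datasets is unknown in advance.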

I have also updated the example code in the docs to add that missing SORTED option.

HTH,

Richard
rtaylor
Community Advisory Board Member
 
Posts: 1606
Joined: Wed Oct 26, 2011 7:40 pm

Wed Jun 13, 2012 11:07 am

Hello Richard,

I am no longer sure that I specifically need a MERGE operation.
What I need exactly is to make a dataset from a set of datasets without using artificial records that complexify the code.

Now I am facing a new choice: either I find a suitable workaround, or I recompile the source code branch with the fix on the VM image.

I would prefer the second solution as it is cleaner, but I wonder whether it is easy to do.

I have found a workaround, but I am facing a new problem with accessing one dataset in a set of datasets by its index ("sds[i]").

Here is a simple example to illustrate:

Code:
rec := RECORD
  INTEGER i;
END;

ds1 := DATASET([{1},{2},{3}],rec);
ds2 := DATASET([{1},{2},{3}],rec);
ds3 := DATASET([{1},{2},{3}],rec);

SET OF DATASET(rec) sds := [ds1,ds2,ds3];

// f does nothing: it always returns the same dataset, just for testing
DATASET(rec) f(SET OF DATASET(rec) fds,INTEGER c) := FUNCTION
   RETURN DATASET([{1},{2},{3}],rec);
END;

DATASET(rec) ds := GRAPH(sds[0],3,f(ROWSET(LEFT),COUNTER));

OUTPUT(ds);


I get an error :

C:\Users\JEAN-M~1\AppData\Local\Temp\TFRCC4.tmp (19,29) : 3000: assert(false) failed - file: /var/jenkins/workspace/CE-Candidate-3.6.2/CE/Ubuntu-10.04-i386/HPCC-Platform/common/deftype/deftype.ipp, line 553

When I change the code to:
Code:
DATASET(rec) ds := GRAPH(ds1,3,f(ROWSET(LEFT),COUNTER));
the error goes away.

I am surprised, because sds[0] should be accessible as a dataset according to the GRAPH documentation example.

I also have another question: is there an equivalent of the COUNT operation for a set of datasets?

Best Regards,
JM
ideal
 

Wed Jun 13, 2012 2:10 pm

JM,
"What I need exactly is to make a dataset from a set of datasets without using artificial records that complexify the code."
To my simple mind, this sounds like you just want to treat multiple datasets as a single entity and query that single entity. If that is correct, then there are a couple of ways to go about it. The first is to simply append the datasets, using either the + or & operators, like this:
Code:
rec := RECORD
  INTEGER i;
END;

ds1 := DATASET([{1},{2},{3}],rec);
ds2 := DATASET([{3},{2},{1}],rec);
ds3 := DATASET([{5},{4},{6}],rec);

dsA := ds1 + ds2 + ds3;
dsB := ds1 & ds2 & ds3;

OUTPUT(dsA);
OUTPUT(dsB);
The second way to go about it would be to use Superfiles, as described in the Programmer's Guide (or come to our ECL classes http://hpccsystems.com/community/traini ... s/training -- Superfiles are taught in the Advanced Thor class).
"I am surprised, because sds[0] should be accessible as a dataset according to the GRAPH documentation example."
I think you may have misread the docs. You were trying to take the GRAPH docs discussion of its third parameter and apply that information to its first parameter -- which obviously does not work. :)

The ROWSET(LEFT) may take an index value of 0 as an argument to the processor call specified by the third parameter to GRAPH. AFAIK, this is the only use of a 0 index value anywhere in ECL. ECL is 1-based in all other cases.
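
A minimal sketch of that 0-based indexing inside the processor function, assuming a made-up function addOne that simply increments each value (the function and its body are illustrative only):

Code:
rec := RECORD
  INTEGER i;
END;

ds1 := DATASET([{1},{2},{3}],rec);

// fds[0] is the original input to GRAPH;
// fds[c-1] is the result of the previous iteration
DATASET(rec) addOne(SET OF DATASET(rec) fds,UNSIGNED4 c) :=
  PROJECT(fds[c-1],TRANSFORM(rec,SELF.i := LEFT.i + 1));

// The first parameter is an ordinary dataset, not an indexed set element
ds := GRAPH(ds1,3,addOne(ROWSET(LEFT),COUNTER));

OUTPUT(ds);

Here the 0 index appears only inside addOne, where fds is the ROWSET(LEFT) passed in by GRAPH.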

"I also have another question: is there an equivalent of the COUNT operation for a set of datasets?"
If you want a COUNT of the number of datasets in the set, then yes. If you want a COUNT of the number of records across all the datasets in the set, then no. However, COUNT will function correctly on the appended datasets and superfiles I described above.
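
A short sketch of counting records across the appended datasets, reusing the same inline data as the append example (the name dsAll is illustrative):

Code:
rec := RECORD
  INTEGER i;
END;

ds1 := DATASET([{1},{2},{3}],rec);
ds2 := DATASET([{3},{2},{1}],rec);
ds3 := DATASET([{5},{4},{6}],rec);

// Append first, then COUNT the single resulting dataset
dsAll := ds1 & ds2 & ds3;

OUTPUT(COUNT(dsAll)); // 9 records in total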

HTH,

Richard
rtaylor
Community Advisory Board Member
 

