Lost order when concatenating datasets using "+"
Hello,
I want to concatenate two datasets as the return value of a function. However, I found that the order of the returned dataset is always the same, regardless of the order of the operands. Is there a way to preserve the order when concatenating?
Thank you,
Will
Code:
Layout_Person := RECORD
UNSIGNED1 PersonID;
STRING15 FirstName;
STRING25 LastName;
END;
allPeople := DATASET([ {1,'Fred','Smith'},
{2,'Joe','Blow'},
{3,'Joe','Blow'},
{4,'Joe','Blow'},
{3,'Jane','Smith'}],Layout_Person);
somePeople := allPeople(LastName = 'Smith');
// Outputs ---
concat_tb1(dataset(layout_person) dt) := function
t1 := table(dt,{string30 name:= dt.lastname, cnt:=count(group)},lastname);
t2 := table(dt,{string30 name:= dt.firstname, cnt:=count(group)},firstname);
return t1 + t2;
end;
concat_tb2(dataset(layout_person) dt) := function
t2 := table(dt,{string30 name:= dt.firstname, cnt:=count(group)},firstname);
t1 := table(dt,{string30 name:= dt.lastname, cnt:=count(group)},lastname);
return t2 + t1;
end;
output(concat_tb1(allpeople));
output(concat_tb2(allpeople));
- hhuang
- Posts: 2
- Joined: Wed May 08, 2019 7:35 pm
Use the & operator instead of the + operator.
- bforeman
- Community Advisory Board Member
- Posts: 1006
- Joined: Wed Jun 29, 2011 7:13 pm
Thank you Bob.
I tried replacing "+" with "&", but the result is still the same.
I also tried concatenating the record sets outside a function, and that worked without a problem. I don't know why this happens.
Here is my code:
Code:
Layout_Person := RECORD
UNSIGNED1 PersonID;
STRING15 FirstName;
STRING25 LastName;
END;
allPeople := DATASET([ {1,'Joe','Smith'},
{2,'Joe','Blow'},
{3,'Joe','Blow'},
{4,'Joe','Blow'},
{3,'Joe','Smith'}],Layout_Person);
// Outputs ---
concat_tb1(dataset(layout_person) dt) := function
t1 := table(dt,{string30 name:= dt.lastname, cnt:=count(group)},lastname);
t2 := table(dt,{string30 name:= dt.firstname, cnt:=count(group)},firstname);
return t1 & t2;
end;
concat_tb2(dataset(layout_person) dt) := function
t1 := table(dt,{string30 name:= dt.lastname, cnt:=count(group)},lastname);
t2 := table(dt,{string30 name:= dt.firstname, cnt:=count(group)},firstname);
return t2 & t1;
end;
output(concat_tb1(allpeople));
output(concat_tb2(allpeople));
I was expecting two different results:
Blow 3
Smith 2
Joe 5
and
Joe 5
Blow 3
Smith 2
But it always generates the first one.
- hhuang
- Posts: 2
- Joined: Wed May 08, 2019 7:35 pm
Will,
I think you might have stumbled on a compiler issue. I will ask the development team to look at this.
Thank You!
Bob
- bforeman
- Community Advisory Board Member
- Posts: 1006
- Joined: Wed Jun 29, 2011 7:13 pm
hhuang,
I duplicated this. You need to submit a JIRA to report the issue.
HTH,
Richard
- rtaylor
- Community Advisory Board Member
- Posts: 1619
- Joined: Wed Oct 26, 2011 7:40 pm
Hi,
This issue was introduced between these two versions:
Code:
community_6.4.14-1 server internal_6.4.38-1 compiler unknown <= works
eclide_7.2.0-rc4 server internal_7.0.18-rc1 Compiler 7.2.0 community_7.2.0.rc4 <= bug
- Allan
- Posts: 444
- Joined: Sat Oct 01, 2011 7:26 pm
Please report a jira - including which platform you are running against (hthor/thor/roxie). I tried your example, and didn't get the results you had, so I must have been doing something different.
I had:
Blow ,3
Smith ,2
Joe ,5
Joe ,5
Blow ,3
Smith ,2
- ghalliday
- Community Advisory Board Member
- Posts: 199
- Joined: Wed May 18, 2011 9:48 am
ok, I understand the issue. It will only occur on Thor.
The problem is that & only preserves local ordering.
Say you have a 3 way thor with a dataset A with parts a1,a2,a3 and a dataset B with parts b1, b2, b3. Then the global ordering of A & B will be a1, b1, a2, b2, a3, b3.
This is because the parts on node 1 will be appended in order, then the parts on node 2, followed by the parts on node 3.
In your example the TABLE() statements will cause the rows to be distributed to different nodes. The order of the results, and the node they live on coming out of the TABLE() will depend on the size of the thor cluster.
It may have changed between versions because the default implementation of TABLE() may have changed (I haven't checked). In general there are no guarantees about the order or distribution of rows coming out of a (non local) TABLE.
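If a particular global order is required on Thor, one possible workaround is to make the order explicit rather than rely on the append. The following is only a sketch under stated assumptions: the names Layout_Tagged, src, tag1, tag2 and concat_ordered are hypothetical and not part of the original code. It tags each row with the table it came from and then applies a global SORT.

```ecl
Layout_Person := RECORD
  UNSIGNED1 PersonID;
  STRING15  FirstName;
  STRING25  LastName;
END;

allPeople := DATASET([{1,'Joe','Smith'},
                      {2,'Joe','Blow'},
                      {3,'Joe','Blow'},
                      {4,'Joe','Blow'},
                      {3,'Joe','Smith'}], Layout_Person);

// Hypothetical result layout: src records which table a row came from.
Layout_Tagged := RECORD
  UNSIGNED1 src;
  STRING30  name;
  UNSIGNED8 cnt;
END;

concat_ordered(DATASET(Layout_Person) dt) := FUNCTION
  t1 := TABLE(dt, {STRING30 name := dt.LastName,  UNSIGNED8 cnt := COUNT(GROUP)}, LastName);
  t2 := TABLE(dt, {STRING30 name := dt.FirstName, UNSIGNED8 cnt := COUNT(GROUP)}, FirstName);

  // Tag the rows: 1 = last-name counts, 2 = first-name counts.
  tag1 := PROJECT(t1, TRANSFORM(Layout_Tagged, SELF.src := 1, SELF := LEFT));
  tag2 := PROJECT(t2, TRANSFORM(Layout_Tagged, SELF.src := 2, SELF := LEFT));

  // A global SORT on src (then name) fixes the order regardless of
  // which node each row ended up on after the TABLE.
  RETURN SORT(tag1 & tag2, src, name);
END;

OUTPUT(concat_ordered(allPeople));
```

With this approach the last-name counts should come before the first-name counts on any cluster size; swapping the src values would reverse that. The trade-off is an extra global sort, which is usually negligible for small summary tables like these.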
- ghalliday
- Community Advisory Board Member
- Posts: 199
- Joined: Wed May 18, 2011 9:48 am