

Comparing files

Questions around writing code and queries

Fri Jul 08, 2011 3:10 pm

What is the best way to determine that a new file contains new records when compared to the current file using ECL?
John.Freibaum
 
Posts: 4
Joined: Fri Jul 08, 2011 3:02 pm

Fri Jul 08, 2011 3:17 pm

The best way would be to use a LEFT ONLY join, which keeps only the records from the left dataset that have no match in the right dataset.
For example:

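j_new_recs := JOIN(ds, ds_father,
                   LEFT.FIELD = RIGHT.FIELD,
                   TRANSFORM(RECORDOF(ds), SELF := LEFT),
                   LEFT ONLY, // keep only records of ds with no match in ds_father
                   LOCAL);    // assumes both datasets are already distributed on FIELD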

ds = the current dataset
ds_father = the father dataset
FIELD = any field that is in both the ds and ds_father layouts
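
To make the pattern concrete, here is a minimal self-contained sketch. The layout Rec, the field names id and name, and the inline data are all made up for illustration. Because LOCAL only compares records that sit on the same node, the sketch distributes both datasets on the join field first; if your files are already distributed that way, you can skip that step.

```ecl
// Hypothetical layout and inline data, for illustration only
Rec := RECORD
  UNSIGNED4 id;
  STRING20  name;
END;

ds_father := DATASET([{1, 'Alice'}, {2, 'Bob'}], Rec);               // father dataset
ds        := DATASET([{1, 'Alice'}, {2, 'Bob'}, {3, 'Carol'}], Rec); // current dataset

// A LOCAL join only sees records on the same node, so put matching
// ids on the same node before joining.
ds_dist     := DISTRIBUTE(ds, HASH32(id));
father_dist := DISTRIBUTE(ds_father, HASH32(id));

j_new_recs := JOIN(ds_dist, father_dist,
                   LEFT.id = RIGHT.id,
                   TRANSFORM(Rec, SELF := LEFT),
                   LEFT ONLY,
                   LOCAL);

OUTPUT(j_new_recs); // only the id 3 record has no match in ds_father
```

If you are not sure how the files are distributed, dropping LOCAL gives the same logical result at the cost of letting the system redistribute the data for you.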
gmwitz
 
Posts: 1
Joined: Fri Jul 08, 2011 3:13 pm

Fri Jul 08, 2011 7:10 pm

Well,

You (probably) do want a LEFT ONLY join, but you need to be careful with the join condition.

If your file has a unique record ID, then you can do a simple LEFT ONLY join:

NewRecs := JOIN(NewFile,OldFile,LEFT.UniqueID=RIGHT.UniqueID,TRANSFORM(LEFT),LEFT ONLY);

If your file does NOT have a unique ID, but each record is unique, then you can join on every field:

NewRecs := JOIN(NewFile,OldFile,LEFT.Field1=RIGHT.Field1 AND LEFT.Field2=RIGHT.Field2 AND ... AND LEFT.FieldN=RIGHT.FieldN,TRANSFORM(LEFT),LEFT ONLY);

If either or both files might contain complete duplicates, then you really want to dedup both sides first (or you will get a cross-product out of the join). Thus:

N1 := DEDUP(NewFile,WHOLE RECORD,ALL);
N2 := DEDUP(OldFile,WHOLE RECORD,ALL);

// Perform the JOIN on these
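
Putting the dedup and the all-fields join together might look like the sketch below. The two-field layout Rec and the inline data are invented for illustration; a real file would list every field of its own layout in the join condition.

```ecl
// Hypothetical layout and inline data, for illustration only
Rec := RECORD
  STRING20 Field1;
  STRING20 Field2;
END;

OldFile := DATASET([{'a', 'x'}, {'a', 'x'}, {'b', 'y'}], Rec);
NewFile := DATASET([{'a', 'x'}, {'b', 'y'}, {'c', 'z'}, {'c', 'z'}], Rec);

// Remove complete duplicates from both sides first
N1 := DEDUP(NewFile, WHOLE RECORD, ALL);
N2 := DEDUP(OldFile, WHOLE RECORD, ALL);

// Join on every field; LEFT ONLY keeps records of N1 with no match in N2
NewRecs := JOIN(N1, N2,
                LEFT.Field1 = RIGHT.Field1 AND LEFT.Field2 = RIGHT.Field2,
                TRANSFORM(LEFT),
                LEFT ONLY);

OUTPUT(NewRecs); // a single {'c', 'z'} record is new
```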

If you only want to know whether there is new data, but you don't care what it is, then you can actually cheat by comparing the count of distinct records in the old file with the count across both files:

IF ( COUNT(DEDUP(OldFile,WHOLE RECORD,ALL))<>COUNT(DEDUP(OldFile+NewFile,WHOLE RECORD,ALL)),'New Data','None');

Incidentally, turning a stream of database snapshots into a date-denoted base file is a moderately sticky business, and it is one of the features of our SALT tool (which generates ECL).

David
dabayliss
Community Advisory Board Member
 
Posts: 109
Joined: Fri Apr 29, 2011 1:35 pm

Fri Jul 08, 2011 7:17 pm

Thank you to the two of you, this is very helpful and informative.
John.Freibaum
 
Posts: 4
Joined: Fri Jul 08, 2011 3:02 pm

