Thu Jul 19, 2018 5:30 am
Login Register Lost Password? Contact Us


How can I remove duplicate records from a file?

Questions around writing code and queries

Tue Jul 12, 2011 1:53 pm Change Time Zone

I have a file that contains a number of duplicate records in the first column, the other 4 columns contain data about each record in the first column but I only want to see one record for each category in the first column to get an accurate count. For example: I have one category called “A” that consists of 3 “A” records but I only want to see 1 “A” record.

Before duplicate “A” records are filtered from column1
Column1 Column2 Column3 Column4 Column5
A
A
A

After duplicate “A” records are filtered from Column2
Column1 Column2 Column3 Column4 Column5
A

How can this be accomplished using ECL?
John.Freibaum
 
Posts: 4
Joined: Fri Jul 08, 2011 3:02 pm

Tue Jul 12, 2011 2:17 pm Change Time Zone

That's what the DEDUP function does. You simply need to SORT the dataset by column 1, then DEDUP by column 1, like this:

s := SORT(ds,Column1);
d := DEDUP(s,LEFT.Column1 = RIGHT.Column1);
OUTPUT(d);

This will keep the first record for each duplicate Column1 value.
richard.taylor@lexisnexis.com
 
Posts: 11
Joined: Wed Jun 15, 2011 6:00 pm

Wed Jul 13, 2011 3:31 pm Change Time Zone

If you really want a count based on column1, TABLE is the way to go:

ECL:
theRecord := {
string1 column1;
string1 column2;
string1 column3;
string1 column4;
string1 column5;
};

inlineData := dataset([
{'A','B','C','D','E'},
{'A','C','D','E','F'},
{'A','D','E','F','G'},
{'B','B','C','D','E'},
{'B','C','D','E','F'}],theRecord);

tableRecord := {
inlineData.column1;
unsigned4 cnt := count(group);
};

column1Counts := table(inlineData,tableRecord,column1);
output(column1Counts);

Result:

column1 cnt
B 2
A 3
joecella
 
Posts: 5
Joined: Fri Jun 17, 2011 4:28 am

Tue Jul 19, 2011 3:09 pm Change Time Zone

Thank you for your help.
John.Freibaum
 
Posts: 4
Joined: Fri Jul 08, 2011 3:02 pm


Return to Programming

Who is online

Users browsing this forum: No registered users and 1 guest

cron