

Fuzzy Match

Comments and questions related to the Enterprise Control Language

Mon Mar 11, 2019 12:13 pm

Hi Richard,
Is there any way to do a fuzzy match and return the full company name and address?
Almost all of the rows are the same.

Code:
CompanyRecSet:= dataset([{'HPCC system','101 Sussex Street'},
                         {'HPCC','101 Sussex Street'},
                         {'AET','1900 West Loop South, Suite 920'},
                         {'AET UK LIMITED','1900 West Loop South, Suite 920'}],
                         {varstring CompanyName,varstring Address});

CompanyRecSet;



Thanks and regards,
Suleman Shreef
suleman Shreef
 

Mon Mar 11, 2019 3:52 pm

Suleman,

Here are a couple of ways:
Code:
IMPORT Std;
CompanyRecSet:= DATASET([{'HPCC system','101 Sussex Street'},
                         {'HPCC','101 Sussex Street'},
                         {'HPCC Systems Inc','101 Sussex Street'},
                         {'HPCsystems Limited','201 Sussex Street'},
                         {'AET','1900 West Loop South, Suite 920'},
                         {'AET UK LIMITED','1900 West Loop South, Suite 920'}],
                         {STRING CompanyName,STRING Address});

CompanyRecSet;
CompanyRecSet(Std.Str.StartsWith(CompanyName,'HPCC'));
CompanyRecSet(Std.Str.WildMatch(CompanyName,'*System*',TRUE));
The Standard Library Reference contains the documentation for these (and many other) string handling functions. Press F1 in the ECL IDE and you'll see that book is part of the online Help file.

BTW, you will note that I changed your VARSTRINGs to STRINGs. In HPCC, STRING is the default string type. VARSTRING is only used for data coming in or going out that has/needs actual null terminators -- internally within HPCC there is no advantage to using VARSTRING over STRING.

HTH,

Richard
rtaylor
Community Advisory Board Member
 

Tue Mar 12, 2019 5:15 am

Richard, I have done this using the built-in string functions, but I need to select rows based on a match score (for example, a score of 80).
suleman Shreef
 

Tue Mar 12, 2019 2:39 pm

You can also use the metaphone library support in the Standard Library to create "sounds like" fuzzy matching.

See http://cdn.hpccsystems.com/releases/CE- ... f#page=111
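
As a rough sketch of how that might look on the sample data from earlier in the thread: the first filter compares primary metaphone codes for a "sounds like" match, and the second computes a percentage-style score from Std.Str.EditDistance so rows can be selected at a threshold such as 80. The search term, the scoring formula, and the names SearchName, AddScore and ScoredRec are assumptions made up for illustration, and reading "80" as a percentage is a guess at what was asked for.

Code:
```ecl
IMPORT Std;

CompanyRec := RECORD
  STRING CompanyName;
  STRING Address;
END;

CompanyRecSet := DATASET([{'HPCC system','101 Sussex Street'},
                          {'HPCC','101 Sussex Street'},
                          {'HPCC Systems Inc','101 Sussex Street'},
                          {'HPCsystems Limited','201 Sussex Street'},
                          {'AET','1900 West Loop South, Suite 920'},
                          {'AET UK LIMITED','1900 West Loop South, Suite 920'}],
                         CompanyRec);

// Hypothetical search term, chosen only for illustration
STRING SearchName := 'HPCC Systems';

// "Sounds like" match: keep rows whose primary metaphone code
// equals the primary metaphone code of the search term
SearchKey  := Std.Metaphone.Primary(SearchName);
SoundsLike := CompanyRecSet(Std.Metaphone.Primary(CompanyName) = SearchKey);

// Score-based match (assumed formula, not a library function):
// 100 minus the edit distance as a percentage of the longer string
ScoredRec := RECORD(CompanyRec)
  UNSIGNED1 Score;
END;

ScoredRec AddScore(CompanyRec L) := TRANSFORM
  A      := Std.Str.ToUpperCase(TRIM(L.CompanyName));
  B      := Std.Str.ToUpperCase(TRIM(SearchName));
  MaxLen := MAX(LENGTH(A), LENGTH(B));
  Dist   := Std.Str.EditDistance(A, B);
  SELF.Score := IF(MaxLen = 0, 100, 100 - ((100 * Dist) DIV MaxLen));
  SELF := L;
END;

Scored := PROJECT(CompanyRecSet, AddScore(LEFT));

OUTPUT(SoundsLike, NAMED('SoundsLike'));
OUTPUT(Scored(Score >= 80), NAMED('ScoreAtLeast80'));
```

Comparing whole-name metaphone codes for equality is strict, so in practice it may work better on individual words or a cleaned-up name; the threshold and the formula would need tuning against real data.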

HTH,

Jim
JimD
 

