
REGEXFIND

REGEXFIND(regex, text [, flag ] [, NOCASE])

regex    A standard Perl regular expression.
text    The text to parse.
flag    Optional. Specifies the text to return. If omitted, REGEXFIND returns TRUE or FALSE to indicate whether the regex was found within the text. If 0, the portion of the text that the regex matched is returned. If >= 1, the text matched by the nth group in the regex is returned.
NOCASE    Optional. Specifies a case-insensitive search.
Return:    REGEXFIND returns a single value.
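
The parameter descriptions above can be illustrated with a minimal sketch. The value fullName and the attribute names are hypothetical, chosen only for this illustration; the calls follow the syntax line above.

```ecl
// Hypothetical sample value, used only for illustration
fullName := 'MacPherson, Jimmy';

// flag omitted: the result is TRUE or FALSE
hasPrefix := REGEXFIND('^(Mc|Mac)', fullName);

// NOCASE: the lowercase pattern is matched without regard to case
hasPrefixAnyCase := REGEXFIND('^(mc|mac)', fullName, NOCASE);

// flag = 0: the portion of the text that the whole regex matched
wholeMatch := REGEXFIND('^(Mc|Mac)([A-Za-z]+)', fullName, 0);

// flag = 1: the text matched by the first group
firstGroup := REGEXFIND('^(Mc|Mac)([A-Za-z]+)', fullName, 1);

OUTPUT(hasPrefix);        // TRUE
OUTPUT(hasPrefixAnyCase); // TRUE
OUTPUT(wholeMatch);       // 'MacPherson'
OUTPUT(firstGroup);       // 'Mac'
```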

The REGEXFIND function uses the regex to parse the text and find matches. The regex must be a standard Perl regular expression. Third-party libraries provide the regular expression support. For non-Unicode text, see the Boost documentation at http://www.boost.org/doc/libs/1_58_0/libs/regex/doc/html/index.html. Note that the version of the Boost library may vary depending on your distro. For Unicode text, see the sections 'Regular Expression Metacharacters' and 'Regular Expression Operators' of the ICU documentation at http://userguide.icu-project.org/strings/regexp and the links from there, in particular the section 'UnicodeSet patterns' at http://userguide.icu-project.org/strings/unicodeset. We use version 2.6, which should support all listed features.
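
As a rough sketch of the distinction drawn above, the same pattern can be applied to STRING and to UNICODE values. The attribute names are hypothetical, and the comments about which library handles each call simply restate the paragraph above rather than anything verified here.

```ecl
// STRING arguments: non-Unicode text (see the Boost documentation)
strGroup := REGEXFIND('^(Mc|Mac)', 'MacPherson', 1);

// UNICODE arguments (u'' literals): Unicode text (see the ICU documentation)
uniGroup := REGEXFIND(u'^(Mc|Mac)', u'MacPherson', 1);

OUTPUT(strGroup);
OUTPUT(uniGroup);
```

Both calls request the first group, so each should yield the matched prefix in the type of its input.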

Example:

namesRecord := RECORD
  STRING20 surname;
  STRING10 forename;
  STRING10 userdate;
END;
namesTbl := DATASET([ {'Halligan','Kevin','10/14/1998'},
                      {'Halligan','Liz','12/01/1998'},
                      {'Halligan','Jason','01/01/2000'},
                      {'MacPherson','Jimmy','03/14/2003'} ],
                    namesRecord);
searchpattern := '^(.*)/(.*)/(.*)$';
search := '10/14/1998';

filtered := namesTbl(REGEXFIND('^(Mc|Mac)', surname));

OUTPUT(filtered); //1 record -- MacPherson
OUTPUT(namesTbl,{(string30)REGEXFIND(searchpattern,userdate,0),
                 (string30)REGEXFIND(searchpattern,userdate,1),
                 (string30)REGEXFIND(searchpattern,userdate,2),
                 (string30)REGEXFIND(searchpattern,userdate,3)});

REGEXFIND(searchpattern, search, 0); //returns '10/14/1998'
REGEXFIND(searchpattern, search, 1); //returns '10'
REGEXFIND(searchpattern, search, 2); //returns '14'
REGEXFIND(searchpattern, search, 3); //returns '1998'
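
The group-extraction calls above could also be used inside a TRANSFORM to split each date into separate fields. This is only a sketch: datedRecord, partsRecord, dates, datePattern, and splitDate are hypothetical names introduced for illustration.

```ecl
// Hypothetical input layout and sample rows
datedRecord := RECORD
  STRING10 userdate;
END;
dates := DATASET([ {'10/14/1998'}, {'03/14/2003'} ], datedRecord);

// Hypothetical output layout: one field per captured group
partsRecord := RECORD
  STRING2 mm;
  STRING2 dd;
  STRING4 yyyy;
END;

datePattern := '^(.*)/(.*)/(.*)$';

// Each field takes the text matched by the corresponding group
partsRecord splitDate(datedRecord L) := TRANSFORM
  SELF.mm   := REGEXFIND(datePattern, L.userdate, 1);
  SELF.dd   := REGEXFIND(datePattern, L.userdate, 2);
  SELF.yyyy := REGEXFIND(datePattern, L.userdate, 3);
END;

OUTPUT(PROJECT(dates, splitDate(LEFT)));
```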

See Also: PARSE, REGEXFINDSET, REGEXREPLACE