REGEXFINDSET

REGEXFINDSET(regex, text [, NOCASE])

regex    A standard Perl regular expression.
text     The text to parse.
NOCASE   Optional. Specifies a case insensitive search.
Return:  REGEXFINDSET returns a set of strings.

The REGEXFINDSET function uses the regex to parse through the text and returns every match as a set of strings. The regex must be a standard Perl regular expression. Third-party libraries provide the regular expression support. For non-Unicode text, see the Boost documentation at http://www.boost.org/doc/libs/1_58_0/libs/regex/doc/html/index.html. Note that the version of the Boost library may vary depending on your distro. For Unicode text, see the ICU documentation, specifically the sections 'Regular Expression Metacharacters' and 'Regular Expression Operators' at http://userguide.icu-project.org/strings/regexp and the links from there, in particular the section 'UnicodeSet patterns' at http://userguide.icu-project.org/strings/unicodeset. We use version 2.6, which should support all listed features.

Example:

// Non-Unicode text: find every e-mail address in the string
sampleStr := 
  'To: jane@example.com From: john@example.com This is the winter of our discontent.';
eMails := REGEXFINDSET('\\w+@[a-zA-Z_]+?\\.[a-zA-Z]{2,3}', sampleStr);
OUTPUT(eMails);

// Unicode text: the pattern and the text are both Unicode constants
UNICODE sampleStr2 := 
  U'To: janë@example.com From john@example.com This is the winter of our discontent.';
eMails2 := REGEXFINDSET(U'\\w+@[a-zA-Z_]+?\\.[a-zA-Z]{2,3}', sampleStr2);
OUTPUT(eMails2);
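
The examples above do not use the optional NOCASE parameter. The following is a minimal sketch of how it might be used, together with two ordinary set operations (COUNT and indexing) on the returned set. The sample text and the names mixedStr and addrs are hypothetical and chosen only for illustration. The pattern uses lowercase character classes only, so with NOCASE the uppercase address should also be found.

```ecl
// Hypothetical sample text; the addresses are made up for illustration
mixedStr := 'Contact JANE@EXAMPLE.COM or john@example.com for details.';

// Lowercase-only character classes; NOCASE requests a case insensitive search
addrs := REGEXFINDSET('[a-z]+@[a-z]+\\.[a-z]{2,3}', mixedStr, NOCASE);

OUTPUT(addrs);          // the set of matched strings
OUTPUT(COUNT(addrs));   // how many matches were found
OUTPUT(addrs[1]);       // the first match in the set
```

Removing NOCASE from the call makes the search case sensitive again, so comparing the two results is a quick way to check what the parameter does.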

See Also: PARSE, REGEXFIND, REGEXREPLACE