Request help parsing text.

Fri Dec 20, 2019 11:40 am

Hi,

I've hit the buffers on my understanding of parsing text.
I have text in the following forms:
Code:
item=Allan and Anna
item='Allan and Anna'
item="Nina Colin"
item=R'allan and anna'
item=R"bill and Megan"

Much like Python, I want to have an 'R' qualifier tied to strings.
However, the 'R' is being matched as part of the normal unquoted string.
I've read up on 'pattern1' NOT IN 'pattern2', but after many experiments I have failed to
match R'text' with a single Pattern.

Err please help?

Yours

Allan
Allan
 
Posts: 442
Joined: Sat Oct 01, 2011 7:26 pm

Mon Dec 23, 2019 8:31 am

Allan,

OK, I just looked up what a Python R string is, but I'm not quite understanding what your problem is. Can you show me example input text and the result you'd like to produce from that input, please?

Richard
rtaylor
Community Advisory Board Member
 
Posts: 1606
Joined: Wed Oct 26, 2011 7:40 pm

Mon Dec 30, 2019 9:00 am

Happy New Year, Richard!

The application I'm writing has to generate text with or without the input's enclosing quotes (if there are any). It's not as simple as defining 'always output quotes if a string is quoted'; I need a BOOLEAN attached to each string to indicate 'enclose in quotes'.
So if my input is:
Code:
An unquoted string
'A quoted, string'
R'A quoted, string'

The output should be:
Code:
Quoted     Text
FALSE      An unquoted string
TRUE       A quoted, string
FALSE      A quoted, string


Actually, this is now just an academic exercise, as I've since implemented a completely different approach, but I would still be interested in seeing a parsing solution to this example.

Many thanks
Allan
Allan
 

Tue Dec 31, 2019 8:00 am

Allan,

OK, purely as an academic exercise ( 8-) ), here's how I would approach it:
Code:
ds := DATASET([
  {1,'item=Allan and Anna'},
  {2,'item=\'Allan and Anna\''},
  {3,'item="Nina Colin"'},
  {4,'item=R\'allan and anna\''},
  {5,'item=R"bill and Megan"'}],
{unsigned1 UID, string line});

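// Text body: one or more letters or spaces
PATTERN alpha := PATTERN('[A-Za-z ]')+;
// Either quote character
PATTERN qChar := ['\'','"'];
// Opening quote, optionally preceded by the R qualifier
PATTERN quote1 := OPT('R') qChar;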
PATTERN start := 'item=';
RULE quoterule := start OPT(quote1) alpha OPT(qChar);

Rec := {unsigned1 UID, BOOLEAN quoted,STRING  Txt};
Rec XF(ds L) := TRANSFORM
  SELF.UID := L.UID;
  SELF.quoted := MATCHED(quote1); // TRUE for both '...' and R'...' forms
  SELF.Txt := MATCHTEXT(alpha);
END;

PARSE(ds,line,quoterule,XF(LEFT),FIRST);

HTH,

Richard
rtaylor

Fri Jan 03, 2020 12:14 pm

Thanks Richard,

Nice example. I'll have to adjust it to distinguish between an R'...' quoted string and a plain '...' quoted string. Because the 'R' is OPT inside quote1, your example does not make that distinction.

Yours

Allan
Allan
 

Fri Jan 03, 2020 12:22 pm

This does the job:
Code:
ds := DATASET([
  {1,'item=Allan and Anna'},
  {2,'item=\'Allan and Anna\''},
  {3,'item="Nina Colin"'},
  {4,'item=R\'allan and anna\''},
  {5,'item=R"bill and Megan"'}],
{unsigned1 UID, string line});

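// Text body: one or more letters or spaces
PATTERN alpha := PATTERN('[A-Za-z ]')+;
// Either quote character
PATTERN qChar := ['\'','"'];
// The R qualifier as its own named pattern, so MATCHED() can test for it
PATTERN qStr := PATTERN('R');
// Opening quote, optionally preceded by the R qualifier
PATTERN quote1 := OPT(qStr) qChar;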
PATTERN start := 'item=';
RULE quoterule := start OPT(quote1) alpha OPT(qChar);

Rec := {unsigned1 UID, BOOLEAN quoted,BOOLEAN Qualified,STRING  Txt};
Rec XF(ds L) := TRANSFORM
  SELF.UID := L.UID;
  SELF.quoted := MATCHED(quote1);  // TRUE whenever an opening quote was found
  SELF.Qualified := MATCHED(qStr); // TRUE only when the R qualifier was present
  SELF.Txt := MATCHTEXT(alpha);
END;

PARSE(ds,line,quoterule,XF(LEFT),FIRST);
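
To get back to the single 'enclose in quotes' flag from the original question, one possible sketch is to combine the two MATCHED tests in the TRANSFORM. The names OutRec and XF2 are made up for illustration, and reading 'R' as 'do not enclose in quotes' is an assumption taken from the example output table earlier in the thread. The patterns are the same ones used above.

```ecl
ds := DATASET([
  {1,'item=Allan and Anna'},
  {2,'item=\'Allan and Anna\''},
  {3,'item="Nina Colin"'},
  {4,'item=R\'allan and anna\''},
  {5,'item=R"bill and Megan"'}],
  {UNSIGNED1 UID, STRING line});

PATTERN alpha  := PATTERN('[A-Za-z ]')+;
PATTERN qChar  := ['\'','"'];
PATTERN qStr   := PATTERN('R');
PATTERN quote1 := OPT(qStr) qChar;
PATTERN start  := 'item=';
RULE quoterule := start OPT(quote1) alpha OPT(qChar);

// Hypothetical layout: one flag instead of separate quoted/Qualified fields
OutRec := {UNSIGNED1 UID, BOOLEAN Quoted, STRING Txt};

OutRec XF2(ds L) := TRANSFORM
  SELF.UID    := L.UID;
  // Assumption: enclose in quotes only when an opening quote matched
  // and the R qualifier did not
  SELF.Quoted := MATCHED(quote1) AND NOT MATCHED(qStr);
  SELF.Txt    := MATCHTEXT(alpha);
END;

OUTPUT(PARSE(ds, line, quoterule, XF2(LEFT), FIRST));
```

Keeping qStr as a separate named pattern is what makes this possible, since MATCHED() can only test patterns that have a name. Note that alpha accepts only letters and spaces, so text containing commas, as in the earlier example table, would need a wider character class.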
Allan
 

