Mailing List

Need a regexp superhero
User: mcholste
Date: 12/18/2012 1:19 pm
Views: 779
Rating: 0
Ok, this is getting way harder than it should be, trying to parse something like this:

"funcOne(funcTwo | funcThree(a,b)),c,d)"

parsed into:
"funcOne", "funcTwo | funcThree(a,b))", "c", "d"

I've got a hack working where I use one regex:

/([^\(]+)\(?( [^()]*+ | (?0) )\)?$/x

to capture the first function name and detect the inner nested parens. I then use that to create a "mask" that replaces those strings within the larger string (to hide the commas), and do a normal split to find the last two params ("c", "d").  There's got to be a better way. Suggestions?
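
The mask-then-split idea described above might look roughly like the sketch below. This is not the actual code from the post (which isn't shown); it assumes the corrected, balanced form of the input and uses a recursive (?-1) group rather than the (?0) pattern above. The names @saved and $masked and the NUL-delimited placeholder are made up for illustration.

```perl
use strict;
use warnings;

# Corrected (balanced) form of the example input.
my $input = 'funcOne(funcTwo | funcThree(a,b),c,d)';

# Split off the outer function name and its raw argument string.
my ($name, $args) = $input =~ /^([^(]+)\((.*)\)\z/s
    or die "not a function call: $input\n";

# Mask: replace each balanced (...) group with a NUL-delimited index so
# that the commas inside it are hidden from split. (?-1) recurses into
# the capture group that encloses it.
my @saved;
(my $masked = $args) =~ s{(\((?:[^()]++|(?-1))*+\))}{
    push @saved, $1;
    "\0" . $#saved . "\0";
}ge;

# Split on the remaining top-level commas, then restore the masked text.
my @params = split /,/, $masked;
s/\0(\d+)\0/$saved[$1]/g for @params;

print "[$_]\n" for $name, @params;
```

With the input shown, @params ends up holding three elements: the nested "funcTwo | funcThree(a,b)" expression, "c" and "d". The placeholder trick assumes the user input never contains NUL bytes.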

Thanks,

Martin
Re: Need a regexp superhero
User: tmurray
Date: 12/18/2012 2:37 pm
Views: 0
Rating: 0
Perl's regex engine might be technically capable of this, but depending on
how generalized a solution you need, it might be easier to use Marpa or
Parse::RecDescent.
Re: Need a regexp superhero
User: afbach
Date: 12/18/2012 3:55 pm
Views: 133
Rating: 0
On Tue, Dec 18, 2012 at 1:19 PM, <mcholste@gmail.com> wrote:
trying to parse something like this:

"funcOne(funcTwo | funcThree(a,b)),c,d)"

How consistent is your spacing/formatting there? What sort of variance (number of params etc.) are you working with?

I can't get your RE to work - did it get munged? I added some more space, but "?0" isn't something I could figure out:

/([^\(]+)    \(?  ( [^()]*+ | (?0) )  \)? $  /x

--

a

Andy Bach,
afbach@gmail.com
608 658-1890 cell
608 261-5738 wk
Re: Need a regexp superhero
User: mcholste
Date: 12/18/2012 4:23 pm
Views: 0
Rating: 0
It's parsing user input, so the spacing and number of params are completely variable.  From what I can tell (didn't dig into any man pages), "?0" will refer to the other paren in the paren pairs when matching, but that's more of a guess.  That particular RE will get you the first function name followed by everything after (including the trailing paren).

perl -le 'my $re = qr/([^\(]+)    \(?  ( [^()]*+ | (?0) )  \)? $  /x; $str = "funcOne(funcTwo | funcThree(a,b)),c,d)"; if (@m = $str =~ $re){ print "y: " . join("#", @m); }'


Re: Need a regexp superhero
User: miner
Date: 12/18/2012 4:25 pm
Views: 0
Rating: 0
On 12/18/12 2:37 PM, tmurray@wumpus-cave.net wrote:

tmurray wrote:

Perl's regex engine might be technically capable of this, but depending on
how generalized a solution you need, it might be easier to use Marpa or
Parse::RecDescent.

Agreed, not something that will easily be solved by a RegEx.  Better solved with a parser.

Martin, do you have a typo in your string to be parsed?
"funcOne(funcTwo | funcThree(a,b)),c,d)"

Has unbalanced parens, unless my mind is playing tricks on me.
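
A quick way to check that by machine rather than by eye is to walk the parens and track the nesting depth. The sketch below is only an illustration; paren_balance is a hypothetical helper, and the "corrected" string is the balanced variant used elsewhere in this thread.

```perl
use strict;
use warnings;

# Returns 0 if every paren is matched, a positive count of unclosed "("
# otherwise, or -1 as soon as a ")" appears with nothing open.
sub paren_balance {
    my ($str) = @_;
    my $depth = 0;
    for my $ch ( $str =~ /[()]/g ) {
        $depth += $ch eq '(' ? 1 : -1;
        return -1 if $depth < 0;
    }
    return $depth;
}

my $posted    = 'funcOne(funcTwo | funcThree(a,b)),c,d)';
my $corrected = 'funcOne(funcTwo | funcThree(a,b),c,d)';

printf "%-40s %s\n", $_, paren_balance($_) == 0 ? 'balanced' : 'unbalanced'
    for $posted, $corrected;
```

The string as posted reports unbalanced (a stray closing paren); the variant without the extra ")" after "(a,b)" reports balanced.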

jon


-- 
.Jonathan J. Miner----------------------------------------------------.
|  jon@jjminer.org  |      photos - http://photos.jjminer.org/        |
|                   | R.A.W. #1629 - http://www.reggaeambassadors.org |
|                   | LOCS Webmaster - http://www.locs-buffett.org    |
|  jabber/gchat: camrycurbhopper@gmail.com      AIM: camrycurbhopper  |
`---------------------------------------------------------------------'

"We don't have a town drunk...   We all take turns!"
    -- James Slater, "Key West Address"
Re: Need a regexp superhero
User: david-delikat
Date: 12/18/2012 4:25 pm
Views: 0
Rating: 0

here's a solution...

perl -le 'my $parm = qr/((?:(?:\w+(?:\([\w,]+\))?)|[\s|]+)+)/;
  print q/"/, join( q/" "/, ("funcOne(funcTwo | funcThree(a,b),c,d)" =~ /^(\w+)\($parm(?:,$parm(?:,$parm))\)$/)), q/"/'

   ==>

"funcOne" "funcTwo | funcThree(a,b)" "c" "d"

problem is that you have to have '(?:,$parm)' for every possible parameter
otherwise you only get the last one.
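
A small illustration of why that happens, on a simplified comma list rather than the real input (the variable names here are made up): a capture group inside a quantified group is overwritten on every repetition, so only its final match is returned.

```perl
use strict;
use warnings;

my $list = 'a,b,c,d';

# One quantified group: $2 is overwritten on each repetition, so only
# the final item survives.
my @quantified = $list =~ /^(\w+)(?:,(\w+))*$/;

# One group per expected item: every item is captured, but the pattern
# has to know the item count in advance.
my @spelled_out = $list =~ /^(\w+),(\w+),(\w+),(\w+)$/;

print "quantified:  @quantified\n";     # a d
print "spelled out: @spelled_out\n";    # a b c d
```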

your solution would be simpler if you did something like:

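my $parm = qr/((?:(?:\w+(?:\([\w,]+\))?)|[\s|]+)+)/;  # qr// takes no /g or /c
my $res;
if( $line =~ /\G(\w+)\(/gc ) {               # capture the name; \G anchors at pos()
     $res->{name} = $1;
     until( $line =~ /\G\)/gc ) {            # stop at the closing paren
           $line =~ /\G,?$parm/gc or last;   # give up on anything unparseable
           push @{$res->{parms}}, $1;
     }
}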

or something like that… it would handle any number of parameters.

or better yet, try Marpa or Parse::RecDescent...

Re: Need a regexp superhero
User: tmurray
Date: 12/18/2012 4:49 pm
Views: 0
Rating: 0
If it's from user input, I'd definitely look into a proper parser. I'd go
for Marpa, though Parse::RecDescent is also a popular choice.  See my
parsing talk from last month:

https://github.com/frezik/parsing-talk
Re: Need a regexp superhero
User: afbach
Date: 12/18/2012 6:39 pm
Views: 115
Rating: 0
On Tue, Dec 18, 2012 at 4:23 PM, <mcholste@gmail.com> wrote:
It's parsing user input, so the spacing and number of params are completely variable.  From what I can tell (didn't dig into any man pages), "?0" will refer to the other paren in the paren pairs when matching, but that's more of a guess.

Cool!  I did not know about ?0 etc., but my reading is a bit different. But
qr/([^\(]+)    \(?  ( [^()]*+ | (?0) )  \)? $  /x;

Nope. "[^()]*+" seems a syntax error (to me), yet I see [^()]++ in perlre [1], and while I sort of get the first example there, I can't see why you're using "\(?". I thought you'd need one literal paren (not zero or one), but I can't get it to work, though I get:
$ perl -le 'my $re = qr/([^\(]+)    \(?  ( [^()]*+ | (?0) )  \)? $  /x; $str = "funcOne(funcTwo | funcThree(a,b),c,d)"; if (@m = $str =~ $re){ print "y: " . join("#", @m); }'
y: funcOne#funcTwo | funcThree(a,b),c,d)

Sorry, no help.
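
For reference, "*+" is not a syntax error: it is a possessive quantifier (available since Perl 5.10), which matches like "*" but never gives back what it has consumed. A minimal demonstration on a toy string (not the pattern from this thread):

```perl
use strict;
use warnings;

my $str = 'aaa';

# Greedy: a* first takes all three a's, then backtracks one so the
# trailing literal "a" can match.
my $greedy = $str =~ /^a*a$/ ? 1 : 0;

# Possessive: a*+ takes all three a's and refuses to give any back,
# so the trailing "a" has nothing left to match.
my $possessive = $str =~ /^a*+a$/ ? 1 : 0;

# (?>...) is the older, equivalent spelling of the same behaviour.
my $atomic = $str =~ /^(?>a*)a$/ ? 1 : 0;

print "greedy=$greedy possessive=$possessive atomic=$atomic\n";
```

In "[^()]*+" the effect is that once a run of non-paren characters has been consumed, the engine will not retry shorter runs, which avoids a lot of pointless backtracking on nested input.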

a

[1]
 perldoc perlre says:
  "(?PARNO)" "(?-PARNO)" "(?+PARNO)" "(?R)" "(?0)"
                 Similar to "(??{ code })" except it does not involve
                 compiling any code, instead it treats the contents of a
                 capture buffer as an independent pattern that must match at
                 the current position.  Capture buffers contained by the
                 pattern will have the value as determined by the outermost
                 recursion.

                 PARNO is a sequence of digits (not starting with 0) whose
                 value reflects the paren-number of the capture buffer to
                 recurse to. "(?R)" recurses to the beginning of the whole
                 pattern. "(?0)" is an alternate syntax for "(?R)". If PARNO
                 is preceded by a plus or minus sign then it is assumed to be
                 relative, with negative numbers indicating preceding capture
                 buffers and positive ones following. Thus "(?-1)" refers to
                 the most recently declared buffer, and "(?+1)" indicates the
                 next buffer to be declared.  Note that the counting for
                 relative recursion differs from that of relative
                 backreferences, in that with recursion unclosed buffers are
                 included.

                 The following pattern matches a function foo() which may
                 contain balanced parentheses as the argument.

                   $re = qr{ (                    # paren group 1 (full function)
                               foo
                               (                  # paren group 2 (parens)
                                 \(
                                   (              # paren group 3 (contents of parens)
                                   (?:
                                    (?> [^()]+ )  # Non-parens without backtracking
                                   |
                                    (?2)          # Recurse to start of paren group 2
                                   )*
                                   )
                                 \)
                               )
                             )
                           }x;

                 If the pattern was used as follows

                     'foo(bar(baz)+baz(bop))'=~/$re/
                         and print "\$1 = $1\n",
                                   "\$2 = $2\n",
                                   "\$3 = $3\n";


                 the output produced should be the following:

                     $1 = foo(bar(baz)+baz(bop))
                     $2 = (bar(baz)+baz(bop))
                     $3 = bar(baz)+baz(bop)


                 If there is no corresponding capture buffer defined, then it
                 is a fatal error.  Recursing deeper than 50 times without
                 consuming any input string will also result in a fatal error.
                 The maximum depth is compiled into perl, so changing it
                 requires a custom build.

                 The following shows how using negative indexing can make it
                 easier to embed recursive patterns inside of a "qr//"
                 construct for later use:

                     my $parens = qr/(\((?:[^()]++|(?-1))*+\))/;
                     if (/foo $parens \s+ \+ \s+ bar $parens/x) {
                        # do something here...
                     }

                 Note that this pattern does not behave the same way as the
                 equivalent PCRE or Python construct of the same form. In Perl
                 you can backtrack into a recursed group, in PCRE and Python
                 the recursed into group is treated as atomic. Also, modifiers
                 are resolved at compile time, so constructs like (?i:(?1)) or
                 (?:(?i)(?1)) do not affect how the sub-pattern will be
                 processed.








--

a

Andy Bach,
afbach@gmail.com
608 658-1890 cell
608 261-5738 wk
Re: Need a regexp superhero
User: chrisdolan
Date: 12/18/2012 9:08 pm
Views: 107
Rating: 0
If you want to parse with regexps, then a good pattern to consider is:

  m/ \G ... /c

That's the continued match, which lets you mix a collection of regexp snippets with code to do the stuff that doesn't come naturally to regexps. I used that pattern extensively in CAM::PDF which needed to support arbitrarily deep recursive data structures.

Chris


Re: Need a regexp superhero
User: mcholste
Date: 12/18/2012 11:02 pm
Views: 113
Rating: 0
Wow, a lot to go through here.  Yes, Jon, there was a typo in the example with the extra closing paren.  I think that Marpa and a full grammar may be a bit overkill, but then again, maybe not.  I'm trying to get Chris's continued match to work in a demo but am thus far unsuccessful.


Re: Need a regexp superhero
User: chrisdolan
Date: 12/19/2012 7:39 am
Views: 0
Rating: 0
If you need an example, go here: http://cpansearch.perl.org/src/CDOLAN/CAM-PDF-1.58/lib/CAM/PDF.pm
and search for "sub parseAny" and look at the methods it calls.

I should have said "m/ \G ... /cg". The "g" is a critical piece, and the "\G" position doesn't work without it. The "c" flag's purpose is "don't reset pos on failed matches when using /g".
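
As a rough sketch of how that pattern can be applied to the function-call string from this thread (this is not code from CAM::PDF; split_call and its variables are hypothetical names): each branch tries one small \G-anchored regexp at the current position, and plain code tracks the nesting depth.

```perl
use strict;
use warnings;

# Split the arguments of "name(arg, arg, ...)" on top-level commas.
# Returns the name followed by the argument strings.
sub split_call {
    my ($src) = @_;

    # In scalar context, /g leaves pos($src) just past the match.
    $src =~ /\G \s* (\w+) \s* \( /gcx
        or die "expected 'name(' at start of input\n";
    my $name = $1;

    my ( $depth, $current, @args ) = ( 0, '' );
    while (1) {
        if ( $src =~ /\G ( [^(),]+ ) /gcx ) {    # plain text
            $current .= $1;
        }
        elsif ( $src =~ /\G \( /gcx ) {          # open a nested group
            $depth++;
            $current .= '(';
        }
        elsif ( $depth > 0 && $src =~ /\G ( [),] ) /gcx ) {
            $depth-- if $1 eq ')';               # nested ")" or ","
            $current .= $1;
        }
        elsif ( $src =~ /\G , /gcx ) {           # top-level separator
            push @args, $current;
            $current = '';
        }
        elsif ( $src =~ /\G \) \s* \z /gcx ) {   # end of the outer call
            push @args, $current;
            return ( $name, @args );
        }
        else {
            die 'parse error at offset ' . ( pos($src) // 0 ) . "\n";
        }
    }
}

my ( $name, @args ) = split_call('funcOne(funcTwo | funcThree(a,b),c,d)');
print "$name: [$_]\n" for @args;
```

Because every failed match leaves pos() alone (the "c" flag), the branches can be tried in turn without losing the current position. Fed the original string with the extra closing paren, this sketch dies with a parse error instead of returning a bogus split.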

Chris


Madison Area Perl Mongers - MadMongers
http://www.madmongers.org

Madison Area Perl Mongers