Regexp bugs. #10011
From @Abigail

This is a bug report for perl from abigail@abigail.be.

While trying to work around bug #59792, I found some different problems:

    "a" =~ m {(?|(?<A>a)|(?<B>b))} and say $+ {B};

The code above prints 'a'. This shouldn't happen, as the (?<B>b) alternation

    "a b" =~ m {(?|(?:(?<A>(?<B>a)(?<C>[^ab]*)(?<D>b))) |

Afterwards, %- is equal to:

    (A => ["a b"],

Instead of the expected:

    (A => ["a b"],

Replacing (?|) with (?:) solves it partially (at the expense of having

    "0" =~ m {(?:(?<A>0)|(?<B>1))};

All these issues are still present in 5.11.2.
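For anyone following along, the first report can be reproduced with a few lines. This is only a minimal sketch, assuming a perl with branch reset and named captures (5.10 or later); the variable names `$got_a` and `$got_b` are illustrative and not from the report.

```perl
use strict;
use warnings;

# Only the (?<A>a) branch can match "a", yet both names are checked.
my ($got_a, $got_b);
if ("a" =~ m{(?|(?<A>a)|(?<B>b))}) {
    # Copy out of %+ right away; a later match would overwrite it.
    ($got_a, $got_b) = ($+{A}, $+{B});
}

print "A => ", $got_a // 'undef', "\n";
print "B => ", $got_b // 'undef', "\n";
```

On the perls where this behaviour is present, both lines report 'a', which matches the later explanation in this thread that the two names end up referring to the same numbered group.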
From ben@morrow.me.uk

Quoth perl5-porters@perl.org:

I can't speak to the rest, but that at least is as documented. 'keys %+'

Ben
The RT System itself - Status changed from 'new' to 'open'
From @davidnicol

perldoc perlre:

Inside (?|pattern) the "branch reset" pattern, the capture buffers are

Apparently the branch reset pattern is implemented by aliasing the possible paths forward:

1: document that using named capture buffers within the branch reset
2: modify the branch reset pattern to respect the uniqueness of named
3: make use of a named capture buffer within a branch reset pattern a
4: ?
From @Abigail

On Wed, Dec 09, 2009 at 12:01:41PM -0600, David Nicol wrote:

I would hate for 1) or 3) to happen. The regexp I had that made this problem surface had the form:

    /(?| (?<Open>PAT1)(?<Body>PAT2) (?<Close>PAT3)

If I make both alternations have the same number and order of named

    /(?|(?<Outer> (?<Open>PAT3)(?<Body>PAT4(?&Outer)?)(?<Close>PAT6)))

Not optimal (note the necessary switch of the alternations), but now

Abigail
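The workaround of giving every alternation the same names in the same order can be sketched as follows. The names Open, Body and Close come from the message above; the bracket sub-patterns are hypothetical stand-ins chosen for illustration, not the original PAT1..PAT6.

```perl
use strict;
use warnings;

# Both alternations declare the same names in the same order, so each
# name is bound to the same group number whichever branch matches.
my $re = qr/(?| (?<Open>\()(?<Body>\w+)(?<Close>\))
              | (?<Open>\[)(?<Body>\w+)(?<Close>\]) )/x;

my %seen;
for my $str ('(foo)', '[bar]') {
    if ($str =~ $re) {
        # Record the three named captures for this input string.
        $seen{$str} = join ' ', $+{Open}, $+{Body}, $+{Close};
    }
}
print "$_ => $seen{$_}\n" for sort keys %seen;
```

With this layout each name reports the text from the branch that actually matched, so the aliasing described earlier no longer produces surprises.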
From @demerphq

2009/12/9 David Nicol <davidnicol@gmail.com>:

no

Well there is the rub: we don't "assign" capture buffers into

We maintain two arrays of integer offsets into the strings with n+1 items

Each OPEN regop and CLOSE regop is hard bound to a particular index in

The named capture buffer hack basically just adds a HoA to this, where

When you access $1 you are using a tied interface that uses the

The (?|...) hack merely plays around with the internal buffer counter

Anyway the point here is that untangling or changing ANY of these is

no. Number 4 could just be "document the weird way it works, and offer

cheers,
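The offset bookkeeping described above has a user-visible counterpart in the `@-` and `@+` arrays documented in perlvar. The following is a minimal sketch, assuming those arrays reflect the start and end offsets discussed in this message; the sample string and variable names are illustrative.

```perl
use strict;
use warnings;

my $str = "hello world";
my ($via_var, $via_offsets);
if ($str =~ /(\w+) (\w+)/) {
    $via_var = $2;
    # @- holds start offsets and @+ holds end offsets, indexed by
    # group number, so group 2 can be rebuilt with substr.
    $via_offsets = substr($str, $-[2], $+[2] - $-[2]);
}
print "$via_var / $via_offsets\n";
```

Both values are the same substring, which illustrates why a name can only ever point at a numbered slot: the captured text itself is not stored separately.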
From @Abigail

On Thu, Dec 10, 2009 at 11:47:18AM +0100, demerphq wrote:

I disagree. Consider (sub) patterns (you don't control the code that generates

    my $pat1 = "(?:(?<A>a)1|(?<A>b)2|(?<A>c)3)";

and you create larger patterns

    $PAT1 = "$pat1 $pat1";
    $PAT2 = "$pat2 $pat2";

Suppose you use this larger pattern to match something, and you want to see

    if ($str =~ /$PAT1/) {

or

    if ($str =~ /$PAT2/) {

Not only is $pat2 easier to deal with, it's also a lot easier to document.

Furthermore, despite the captures being named, you just may want to

    my @a1 = $str =~ /$PAT1/; # 6 elements, only 2 defined.

IMO, (?|) in combination with named captures is still very useful.

And hence unlikely (if at all) to happen before 5.12. I'll understand.

I will accept that challenge.

Abigail
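The "6 elements, only 2 defined" remark can be checked with a short script. This is a sketch under assumptions: `$pat1` is copied from the message above, while `$reset_pat` is a hypothetical branch-reset counterpart written for illustration, since the definition of `$pat2` is not shown in this thread. The input string is likewise made up.

```perl
use strict;
use warnings;

my $pat1      = "(?:(?<A>a)1|(?<A>b)2|(?<A>c)3)";   # from the message above
my $reset_pat = "(?|(?<A>a)1|(?<A>b)2|(?<A>c)3)";   # hypothetical (?|) variant

my $str = "a1 b2";

# List-context matches return one element per capture group.
my @plain      = $str =~ /$pat1 $pat1/;
my @with_reset = $str =~ /$reset_pat $reset_pat/;

my $plain_defined = grep { defined } @plain;
my $reset_defined = grep { defined } @with_reset;

printf "plain: %d elements, %d defined\n", scalar @plain, $plain_defined;
printf "reset: %d elements, %d defined\n", scalar @with_reset, $reset_defined;
```

The plain alternation allocates a group per branch, so most slots come back undefined, whereas the branch-reset variant reuses one group number per block and returns only the captures that matched.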
From @demerphq

2009/12/10 Abigail <abigail@abigail.be>:

Well, I don't have any time for it, for sure. :-) In the new year I probably will.

Cool. However if you can figure out a simple description of how it

cheers,
From @davidnicol

On Thu, Dec 10, 2009 at 6:02 AM, demerphq <demerphq@gmail.com> wrote:

Currently %- and %+ are named aliases for slots in @- and @+, while @-

What if %- and %+ held the offsets directly? Ideally the part that
From @demerphq

2009/12/10 David Nicol <davidnicol@gmail.com>:

That would mean that all users would pay a price for named captures

The way we do it is because we could update a slot in these array

cheers,
From @Abigail

On Thu, Dec 10, 2009 at 12:50:41PM +0100, Abigail wrote:

Done so as commit ab10618.

Abigail
From @doy

Is there anything more we want to do here?

-doy
No response in 10 years; the docs were updated. Closing.
Migrated from rt.perl.org#71136 (status was 'open')
Searchable as RT71136$