Non-word-boundary doesn't match EOS in 5.20 #13917

Closed
p5pRT opened this issue Jun 12, 2014 · 15 comments

p5pRT commented Jun 12, 2014

Migrated from rt.perl.org#122090 (status was 'resolved')

Searchable as RT122090$


p5pRT commented Jun 12, 2014

From @mauke

With 5.20:

C:\>perl -v

This is perl 5, version 20, subversion 0 (v5.20.0) built for MSWin32-x64-multi-thread

...

C:\>perl -wE "q{} =~ /\B/ or die"
Died at -e line 1.

With 5.10 / 5.12:
$ perl -wE "q{} =~ /\B/ or die"
$

I don't have 5.14/5.16/5.18 here to test.

I think the 5.10/5.12 behavior is correct: beginning-of-string/end-of-string count as non-word-characters for \b and \B. The empty string "" has a non-word-boundary between beginning-of-string and end-of-string (both virtual \W).

Was this an intentional change in 5.20? I've skimmed the perldeltas but haven't found anything related.


p5pRT commented Jun 12, 2014

From dennis@kaarsemaker.net

It changed in 5.14:

$ perl -v | grep subv
This is perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux
$ perl -wE "q{} =~ /\B/ or die"
Died at -e line 1.

--
Dennis Kaarsemaker
www.kaarsemaker.net


p5pRT commented Jun 12, 2014

The RT System itself - Status changed from 'new' to 'open'


p5pRT commented Jun 12, 2014

From @jkeenan

Confirmed:

#####
$ perlbrew use perl-5.14.4
$ perl -wE "q{} =~ /\B/ or die"
Died at -e line 1.
#####


p5pRT commented Jun 12, 2014

From @khwilliamson

I bisected it to:

63ac0da is the first bad commit
commit 63ac0da
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Dec 28 16:13:49 2010 -0700

  regex: Use BOUNDU regnodes

  This refactors one area in regexec.c to use BOUNDU, NBOUNDU for
  efficiency, and easier adding of the future BOUNDA.

bisect run success
That took 1097 seconds


p5pRT commented Jun 19, 2014

From @khwilliamson

Is this a distillation of a more complex example? If so, please give me
that, or something somewhat more complex than this one. (I'm tempted to
make a special case for the empty string, but shouldn't do that if this
problem occurs for non-empty strings)


p5pRT commented Jun 19, 2014

From @ikegami

$ perl -wE 'say "..." =~ s/\B/!/rg or die'
!.!.!.

I would expect the same result as the explicit lookaround form, which also matches at the end of the string:

$ perl -wE 'say "..." =~ s/(?<=\w)(?=\w)|(?<!\w)(?!\w)/!/rg or die'
!.!.!.!


p5pRT commented Jun 19, 2014

From Eirik-Berg.Hanssen@allverden.no

  Huh. It does match, if part of an alternation, even with a
never-matching pattern:

eirik@greencat[19:24:16]~$ perl -wE 'say "..." =~ s/\B/!/rg or die'
!.!.!.
eirik@greencat[19:24:19]~$ perl -wE 'say "..." =~ s/(?!)/!/rg or die'
...
eirik@greencat[19:24:21]~$ perl -wE 'say "..." =~ s/\B|(?!)/!/rg or die'
!.!.!.!
eirik@greencat[19:24:22]~$

  Ah, yes; this was "for efficiency", right? The alternation probably
rules out that "for efficiency" alternative. :)

  Still, I figured I'd share, just in case it helps.

  For {the record,your information,fun}, I found this while playing around
with \B (I never used it before):

eirik@greencat[19:29:28]~$ perl -wE 'say "..foo.." =~ s/\B|(\b)/defined($1)?"=":"!"/erg or die'
!.!.=f!o!o=.!.!
eirik@greencat[19:29:31]~$

Eirik


p5pRT commented Jun 27, 2014

From @khwilliamson

Thanks for finding this.
Fixed by c8519dc.
--
Karl Williamson


p5pRT commented Jun 27, 2014

@khwilliamson - Status changed from 'open' to 'resolved'


p5pRT commented Jun 27, 2014

From @khwilliamson

Reopening so I can change the resolution to what I should have made it to begin with: Pending release
--
Karl Williamson


p5pRT commented Jun 27, 2014

@khwilliamson - Status changed from 'resolved' to 'open'


p5pRT commented Jun 27, 2014

@khwilliamson - Status changed from 'open' to 'pending release'


p5pRT commented Jun 2, 2015

From @khwilliamson

Thanks for submitting this ticket.

The issue should be resolved with the release today of Perl v5.22. If you find that the problem persists, feel free to reopen this ticket.

--
Karl Williamson for the Perl 5 porters team


p5pRT commented Jun 2, 2015

@khwilliamson - Status changed from 'pending release' to 'resolved'
