Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing warning for /[[:digit]]/ #5305

Closed
p5pRT opened this issue Apr 3, 2002 · 29 comments
Closed

Missing warning for /[[:digit]]/ #5305

p5pRT opened this issue Apr 3, 2002 · 29 comments

Comments

@p5pRT
Copy link

p5pRT commented Apr 3, 2002

Migrated from rt.perl.org#8904 (status was 'resolved')

Searchable as RT8904$

@p5pRT
Copy link
Author

p5pRT commented Apr 3, 2002

From @Abigail

Created by @Abigail

  $ /opt/bleadperl/bin/perl -wle 'print "Found" if "9" =~ /[[​:digit]]/'
  $

The missing '​:' causes Perl to think you want the equivalent of /[[​:dgit]/.
Since this is highly unlikely the programmer really wants this, I think
this should generate at least a warning.

Abigail

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl v5.7.3:

Configured by abigail at Tue Apr  2 20:17:38 CEST 2002.

Summary of my perl5 (revision 5.0 version 7 subversion 3 patch 15679) configuration:
  Platform:
    osname=linux, osvers=2.4.5, archname=i686-linux-64int-ld
    uname='linux hermione 2.4.5 #6 fri jun 22 01:38:20 pdt 2001 i686 unknown '
    config_args='-des -Dusedevel -Dprefix=/opt/bleadperl -Dusemorebits -Doptimize=-g -Duse64bitall -Uversiononly -Dmydomain=.foad.org -Dcf_email=abigail@foad.org -Dperladmin=abigail@foad.org'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=undef uselongdouble=define
    usemymalloc=n, bincompat5005=define
  Compiler:
    cc='cc', ccflags ='-DDEBUGGING -fno-strict-aliasing -I/usr/local/include -I/opt/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-g',
    cppflags='-DDEBUGGING -fno-strict-aliasing -I/usr/local/include -I/opt/local/include'
    ccversion='', gccversion='2.95.3 20010315 (release)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long long', ivsize=8, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib -L/opt/local/lib'
    libpth=/usr/local/lib /opt/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldl -lm -lc -lcrypt -lutil
    perllibs=-lnsl -ldl -lm -lc -lcrypt -lutil
    libc=/lib/libc-2.2.3.so, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib -L/opt/local/lib'

Locally applied patches:
    DEVEL15661


@INC for perl v5.7.3:
    /home/abigail/Perl
    /home/abigail/Sybase
    /opt/bleadperl/lib/5.7.3/i686-linux-64int-ld
    /opt/bleadperl/lib/5.7.3
    /opt/bleadperl/lib/site_perl/5.7.3/i686-linux-64int-ld
    /opt/bleadperl/lib/site_perl/5.7.3
    /opt/bleadperl/lib/site_perl/5.7.2/i686-linux-64int-ld
    /opt/bleadperl/lib/site_perl/5.7.2
    /opt/bleadperl/lib/site_perl
    .


Environment for perl v5.7.3:
    HOME=/home/abigail
    LANG (unset)
    LANGUAGE (unset)
    LC_ALL=POSIX
    LD_LIBRARY_PATH=/home/abigail/Lib:/usr/local/lib:/usr/lib:/lib:/usr/X11R6/lib:/opt/gnome/lib
    LOGDIR (unset)
    PATH=/home/abigail/Bin:/opt/perl/bin:/usr/local/bin:/usr/local/X11/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/X11R6/bin:/usr/games:/opt/povray/bin:/opt/gnome/bin:/opt/opera/bin:/usr/share/texmf/bin:/opt/Acrobat4/bin:/opt/java/blackdown/j2sdk1.3.1/bin:/usr/local/games/bin:/opt/gnuplot/bin:/opt/mysql/bin
    PERL5LIB=/home/abigail/Perl:/home/abigail/Sybase
    PERLDIR=/opt/perl
    PERL_BADLANG (unset)
    SHELL=/usr/bin/bash


@p5pRT
Copy link
Author

p5pRT commented Nov 28, 2002

From @jhi

To satisfy Robert's jugu^Wregular expression :-)

8904​: Missing warning for /[[​:digit]]/.

$ /opt/bleadperl/bin/perl -wle 'print "Found" if "9" =~ /[[​:digit]]/'
$

The missing '​:' causes Perl to think you want the equivalent of /[[​:dgit]/.
Since this is highly unlikely the programmer really wants this, I think
this should generate at least a warning.

I'm uncertain of how to proceed with this since there's explicit code
(and tests) to "grandfather" /[[​:]/ to mean "'[' or '​:'", and figuring
out here what the user "intended" to write seems to me be akin to AI,
or lacking that, fuzzy matching against the known POSIX character
class names. Note that there maybe more stuff after the first character
class, e.g. /[[​:]foobar[]]/, so parsing this syntax is definitely unfun.

I think someone needs to sit down, and go through all the possible
combinations and permutations, and decide which ones mean what, and
which ones are syntax errors, and only then we can wrote robust
parsing code for it.

--
Jarkko Hietaniemi <jhi@​iki.fi> http​://www.iki.fi/jhi/ "There is this special
biologist word we use for 'stable'. It is 'dead'." -- Jack Cohen

@p5pRT
Copy link
Author

p5pRT commented Oct 21, 2012

From @jkeenan

On Thu Nov 28 14​:03​:37 2002, jhi wrote​:

To satisfy Robert's jugu^Wregular expression :-)

8904​: Missing warning for /[[​:digit]]/.

$ /opt/bleadperl/bin/perl -wle 'print "Found" if "9" =~
/[[​:digit]]/'
$

The missing '​:' causes Perl to think you want the equivalent of
/[[​:dgit]/.
Since this is highly unlikely the programmer really wants this, I
think
this should generate at least a warning.

I'm uncertain of how to proceed with this since there's explicit
code
(and tests) to "grandfather" /[[​:]/ to mean "'[' or '​:'", and
figuring
out here what the user "intended" to write seems to me be akin to
AI,
or lacking that, fuzzy matching against the known POSIX character
class names. Note that there maybe more stuff after the first
character
class, e.g. /[[​:]foobar[]]/, so parsing this syntax is definitely
unfun.

I think someone needs to sit down, and go through all the possible
combinations and permutations, and decide which ones mean what, and
which ones are syntax errors, and only then we can wrote robust
parsing code for it.

Abigail, Jarkko​:

This ticket was last commented on nearly ten years ago. Is there still
an issue here that warrants keeping the RT open?

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Jan 15, 2013

From @jkeenan

On Sun Oct 21 16​:17​:30 2012, jkeenan wrote​:

On Thu Nov 28 14​:03​:37 2002, jhi wrote​:

To satisfy Robert's jugu^Wregular expression :-)

8904​: Missing warning for /[[​:digit]]/.

$ /opt/bleadperl/bin/perl -wle 'print "Found" if "9" =~
/[[​:digit]]/'
$

The missing '​:' causes Perl to think you want the equivalent of
/[[​:dgit]/.
Since this is highly unlikely the programmer really wants this, I
think
this should generate at least a warning.

I'm uncertain of how to proceed with this since there's explicit
code
(and tests) to "grandfather" /[[​:]/ to mean "'[' or '​:'", and
figuring
out here what the user "intended" to write seems to me be akin to
AI,
or lacking that, fuzzy matching against the known POSIX character
class names. Note that there maybe more stuff after the first
character
class, e.g. /[[​:]foobar[]]/, so parsing this syntax is definitely
unfun.

I think someone needs to sit down, and go through all the possible
combinations and permutations, and decide which ones mean what, and
which ones are syntax errors, and only then we can wrote robust
parsing code for it.

Abigail, Jarkko​:

This ticket was last commented on nearly ten years ago. Is there still
an issue here that warrants keeping the RT open?

Thank you very much.
Jim Keenan

We've heard nothing about this ticket since my last post in October.

I propose we close this ticket in 14 days unless someone objects *and*
is willing to take on an investigation of the issues discussed. (I will
take the ticket to remind myself to close it.)

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Jan 15, 2013

From @jhi

On Monday-201301-14 19​:39, James E Keenan via RT wrote​:

On Sun Oct 21 16​:17​:30 2012, jkeenan wrote​:

On Thu Nov 28 14​:03​:37 2002, jhi wrote​:

To satisfy Robert's jugu^Wregular expression :-)

8904​: Missing warning for /[[​:digit]]/.

$ /opt/bleadperl/bin/perl -wle 'print "Found" if "9" =~
/[[​:digit]]/'
$

The missing '​:' causes Perl to think you want the equivalent of
/[[​:dgit]/.
Since this is highly unlikely the programmer really wants this, I
think
this should generate at least a warning.

I'm uncertain of how to proceed with this since there's explicit
code
(and tests) to "grandfather" /[[​:]/ to mean "'[' or '​:'", and
figuring
out here what the user "intended" to write seems to me be akin to
AI,
or lacking that, fuzzy matching against the known POSIX character
class names. Note that there maybe more stuff after the first
character
class, e.g. /[[​:]foobar[]]/, so parsing this syntax is definitely
unfun.

I think someone needs to sit down, and go through all the possible
combinations and permutations, and decide which ones mean what, and
which ones are syntax errors, and only then we can wrote robust
parsing code for it.

Abigail, Jarkko​:

This ticket was last commented on nearly ten years ago. Is there still
an issue here that warrants keeping the RT open?

Thank you very much.
Jim Keenan

We've heard nothing about this ticket since my last post in October.

I propose we close this ticket in 14 days unless someone objects *and*
is willing to take on an investigation of the issues discussed. (I will
take the ticket to remind myself to close it.)

I CCed Karl since he's pretty much the Regex Imperator now.

Thank you very much.
Jim Keenan

---
via perlbug​: queue​: perl5 status​: open
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=8904

@p5pRT
Copy link
Author

p5pRT commented Jan 15, 2013

From @bulk88

On Mon Jan 14 16​:39​:24 2013, jkeenan wrote​:

We've heard nothing about this ticket since my last post in October.

I propose we close this ticket in 14 days unless someone objects *and*
is willing to take on an investigation of the issues discussed. (I will
take the ticket to remind myself to close it.)

Thank you very much.
Jim Keenan

The warning sounds useful from a superficial read. I am not going to
investigate it. I have no opinion on closing it.

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT
Copy link
Author

p5pRT commented Jan 15, 2013

From @khwilliamson

I thought I had commented on this. I think it would have been fixed if
porters had thought there wasn't a back compat issue.

But the new (?[ ]) experimental syntax already tests for this particular
issue. My take is to tell people that this is coming, and is available
experimentally now. I don't think it's going to go in plain [], but
they can start writing their classes using the new syntax and get this
check.

@p5pRT
Copy link
Author

p5pRT commented Jan 19, 2013

From @jkeenan

On Mon Jan 14 20​:47​:08 2013, khw wrote​:

I thought I had commented on this. I think it would have been fixed if
porters had thought there wasn't a back compat issue.

But the new (?[ ]) experimental syntax already tests for this particular
issue. My take is to tell people that this is coming, and is available
experimentally now. I don't think it's going to go in plain [], but
they can start writing their classes using the new syntax and get this
check.

Well, since there are still a variety of opinions being expressed, I
don't feel qualified to be the closer of the ticket. Un-taking it.

@p5pRT
Copy link
Author

p5pRT commented Mar 11, 2014

From @khwilliamson

As mentioned in the messages above, we are unlikely to add this message that might break long-standing behavior for regular bracketed character classes. However, the extended bracketed classes (?[ ]) already do the requested check. This is an experimental feature, which may become non-experimental as soon as 5.22. Hence, I am marking this as stalled and adding to the blockers for 5.22. If the feature remains experimental, this ticket can be moved to a later release. Otherwise, we close the ticket when 5.22 ships, as the core will then have the requested behavior available
--
Karl Williamson

@p5pRT
Copy link
Author

p5pRT commented Mar 11, 2014

@khwilliamson - Status changed from 'open' to 'stalled'

@p5pRT
Copy link
Author

p5pRT commented Mar 22, 2015

From @rjbs

I have un-blocker-ed this.

It is true that a facility that improves on the state of things is in 5.22, even if it is now still experimental. On the other hand, I *do* think this is a place to consider adding a new warning. I know several people who were bitten by this mistake in the last year alone. Obviously, this is pure anecdote, but we should consider whether there is a pattern on which we'd warn, to improve the "unknown posix class" warning condition.

I don't know how we go about parsing this, but if [[​:askii​:]] warns, then why not [[​:ascii​::]] or [[​:ascii]]? They seem like obvious mistakes.

Obviously, nothing here should change for 5.22!

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Mar 22, 2015

The RT System itself - Status changed from 'stalled' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Mar 23, 2015

From @khwilliamson

I plan to address this and similar in 5.23

--
Karl Williamson

@p5pRT
Copy link
Author

p5pRT commented Sep 15, 2015

From @khwilliamson

On 09/14/2015 01​:02 PM, Karl Williamson wrote​:

Can you tell what the attached pattern would match without trying it?
The answer and commentary will be posted tomorrow or so.

Several people were on the right track, but no one knew that the comment
itself affected what gets compiled.

I discovered this while working on
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=8904
"Missing warning for /[[​:digit]]/"

The problem is that [[​:posix​:]] classes have a very strict syntax, any
deviation from which is silently interpreted as something else. Abigail
asked for a warning in this ticket from 13 years ago. Abigail usually
does not like new warnings.

I thought we'd agreed to add the warning to be in effect only if use re
'strict' is in effect, but the discovery of the case presented in the
puzzle has made me think this needs more discussion.

The reason the hint was somewhat misleading is that the hint itself (in
a comment no less) changes the behavior of the perl interpreter. If you
change the '[=]' in the hint to '(=)', you get very different behavior.

There are actually three kinds of posix classes, denoted​: [.posix.]
[=posix=] and [​:posix​:]. perl only implements the last, but looks for
the first two in order to warn that they aren't supported.

The '[=' in the line

[abc[=,+*;def]

looks to perl like it could contain the start of a [=posix=] class, and
it starts looking for the termination. It looks ahead in the pattern as
far as it takes to find one, and in the puzzle example, it finds what
appears to be that termination an arbitrary number of lines down, and in
this case, in a comment. It ignores the '[' in the comment, but sees
the '=]' and thinks that is the termination of the '[=' many lines
before. It then looks for the termination of the first '[' in

[abc[=,+*;def]

but there isn't one, so it croaks with

Unmatched [ in regex; marked by <-- HERE in m/ # This is a
pattern wth a bracketed character class
  [ <-- HERE abc[=,+*;def]

On the other hand if there is no '=]' before the end of the pattern, it
thinks that no [=posix=] class was intended, and the first ']' in the
pattern terminates the character class, and the whole thing compiles
(without a warning) into​:

Final program​:
  1​: ANYOFD[*+,;=[a-f] (12)
  12​: EXACT <ghi> (14)
  14​: END (0)

Comments should not affect the behavior of code. So at a minimum this
should be fixed. But what else qualifies for cleaning up this mess
without someone having to 'use re "strict"' ?

@p5pRT
Copy link
Author

p5pRT commented Sep 15, 2015

From @ikegami

On Tue, Sep 15, 2015 at 9​:52 AM, Karl Williamson <public@​khwilliamson.com>
wrote​:

Comments should not affect the behavior of code.

You cannot place comments in a character class. Therefore, there are no
comments in that code.

The problem is that the meaning of `[=` changes based on whether `=]` is
found or not. We've recently made changes to the re engine that enforces
the escaping of characters such as `{` where it was treated literally
before, and it seems that's that the appropriate fix here too. That or drop
support for [= =] and [. .] completely.

@p5pRT
Copy link
Author

p5pRT commented Sep 15, 2015

From perl5-porters@perl.org

Karl Williamson

Comments should not affect the behavior of code. So at a minimum this
should be fixed. But what else qualifies for cleaning up this mess
without someone having to 'use re "strict"' ?

The search for =] should be restricted to one line. We don't actually
support line breaks in POSIX char classes in any meaningful way.

$ perl5.20.2 -Mre=debug -e '/[[​:alpha​:]]/'
Compiling REx "[[​:alpha​:]]"
Final program​:
  1​: POSIXD[​:alpha​:] (2)
  2​: END (0)
stclass POSIXD[​:alpha​:] minlen 1
Freeing REx​: "[[​:alpha​:]]"
$ perl5.20.2 -Mre=debug -e '/[[​:alpha' -e '​:]]/'
Compiling REx "[[​:alpha%n​:]]"
POSIX class [​:alpha
:] unknown in regex; marked by <-- HERE in m/[[​:alpha
:] <-- HERE ]/ at -e line 2.

@p5pRT
Copy link
Author

p5pRT commented Sep 16, 2015

From @ikegami

On Tue, Sep 15, 2015 at 12​:19 PM, Father Chrysostomos <sprout@​cpan.org>
wrote​:

Karl Williamson

Comments should not affect the behavior of code. So at a minimum this
should be fixed. But what else qualifies for cleaning up this mess
without someone having to 'use re "strict"' ?

The search for =] should be restricted to one line. We don't actually
support line breaks in POSIX char classes in any meaningful way.

The problem has nothing to do with newlines or /x.

$ perl -e'qr/[abc[=]/'

$ perl -e'qr/[abc[=] =]/'
Unmatched [ in regex; marked by <-- HERE in m/[ <-- HERE abc[=] =]/ at -e
line 1.

@p5pRT
Copy link
Author

p5pRT commented Sep 16, 2015

From @ikegami

On Wed, Sep 16, 2015 at 12​:26 PM, Eric Brine <ikegami@​adaelis.com> wrote​:

On Tue, Sep 15, 2015 at 12​:19 PM, Father Chrysostomos <sprout@​cpan.org>
wrote​:

Karl Williamson

Comments should not affect the behavior of code. So at a minimum this
should be fixed. But what else qualifies for cleaning up this mess
without someone having to 'use re "strict"' ?

The search for =] should be restricted to one line. We don't actually
support line breaks in POSIX char classes in any meaningful way.

The problem has nothing to do with newlines or /x.

$ perl -e'qr/[abc[=]/'

$ perl -e'qr/[abc[=] =]/'
Unmatched [ in regex; marked by <-- HERE in m/[ <-- HERE abc[=] =]/ at -e
line 1.

Better yet, the following demonstrates how concatenation two valid regex
patterns producing an invalid one​:

$ perl -e'
  $re = qr/[abc[=]/ . qr/ =]/;
  qr/$re/;
'
Unmatched [ in regex; marked by <-- HERE in m/(?^u​:[ <-- HERE abc[=])(?^u​:
=])/ at -e line 3.

@p5pRT
Copy link
Author

p5pRT commented Sep 16, 2015

From zefram@fysh.org

Eric Brine wrote​:

Better yet, the following demonstrates how concatenation two valid regex
patterns producing an invalid one​:

It's even worse than that​:

$ perl -e '$a=qr/[abc[=]/; $b=qr/=]/; qr/$a$b/'
Unmatched [ in regex; marked by <-- HERE in m/(?^​:[ <-- HERE abc[=])(?^​:=])/ at -e line 1.

Fails without any explicit string concatenation. Interpolation of a
regexp object into qr// normally interpolates the meaning of the regexp
as a syntactically atomic element, and the stringification of a regexp
object has parens matching that behaviour. But here it turns out it's
doing a dirty old string concatenation underneath.

-zefram

@p5pRT
Copy link
Author

p5pRT commented Sep 16, 2015

From @Abigail

On Tue, Sep 15, 2015 at 07​:52​:41AM -0600, Karl Williamson wrote​:

On 09/14/2015 01​:02 PM, Karl Williamson wrote​:

Can you tell what the attached pattern would match without trying it?
The answer and commentary will be posted tomorrow or so.

Several people were on the right track, but no one knew that the comment
itself affected what gets compiled.

I discovered this while working on
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=8904
"Missing warning for /[[​:digit]]/"

The problem is that [[​:posix​:]] classes have a very strict syntax, any
deviation from which is silently interpreted as something else. Abigail
asked for a warning in this ticket from 13 years ago. Abigail usually
does not like new warnings.

I don't, but I may have thought differently 13 years ago. I had completely
forgotten about this ticket.

I thought we'd agreed to add the warning to be in effect only if use re
'strict' is in effect, but the discovery of the case presented in the
puzzle has made me think this needs more discussion.

[ Snip ]

Comments should not affect the behavior of code. So at a minimum this
should be fixed. But what else qualifies for cleaning up this mess
without someone having to 'use re "strict"' ?

Perhaps we should not special case [= =] and [. .]. Perl doesn't support
it, and warning that we don't support it sounds a bit silly. There are
millions of constructs Perl doesn't support, and we don't warn about
that either.

Has there every been a serious request that Perl should support the
[= =] and [. .] classes?

Abigail

@p5pRT
Copy link
Author

p5pRT commented Sep 16, 2015

From @khwilliamson

On 09/16/2015 10​:40 AM, Zefram wrote​:

Eric Brine wrote​:

Better yet, the following demonstrates how concatenation two valid regex
patterns producing an invalid one​:

It's even worse than that​:

$ perl -e '$a=qr/[abc[=]/; $b=qr/=]/; qr/$a$b/'
Unmatched [ in regex; marked by <-- HERE in m/(?^​:[ <-- HERE abc[=])(?^​:=])/ at -e line 1.

Fails without any explicit string concatenation. Interpolation of a
regexp object into qr// normally interpolates the meaning of the regexp
as a syntactically atomic element, and the stringification of a regexp
object has parens matching that behaviour. But here it turns out it's
doing a dirty old string concatenation underneath.

-zefram

Hmm. Actually it's not doing a string concatenation, if you look at the
message it shows that $b was stringified into (?^​:
The problem is that the parsing code is using a defective pattern to
find its expected closing delimiter, a pattern that unfortunately is
used in various places in perl's parsing. You can't generally just use
strchr or memchr, or in this case a while loop to look ahead without
bothering to see what's in the middle. strchr suffers from the
additional problem that NULs are legitimate in Perl strings.

@p5pRT
Copy link
Author

p5pRT commented Sep 16, 2015

From @ap

* Abigail <abigail@​abigail.be> [2015-09-16 20​:40]​:

Perl doesn't support it, and warning that we don't support it sounds
a bit silly. There are millions of constructs Perl doesn't support,
and we don't warn about that either.

It makes sense when you assume that someone will port code from another
language (which has POSIX regexps) to Perl somewhat mechanically. Say,
with a2p.

The same notion must have been what led to the ~150 stubs in POSIX.pm
which just die saying “this function is not supported” (and “maybe you
are looking for this instead”).

This mindset makes sense for an ascendant language looking to establish
itself by stealing existing codebases from languages in the same niche.

As a point of comparison, consider the tons of “this looks like you were
writing Perl 5 here; did you want this other construct maybe?” warning/
error messages in Perl 6.

So it made sense for Perl 5 to do this back when it was very young.

But it’s long past being in that position.

So it makes sense for it to stop doing that now.

Regards,
--
Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT
Copy link
Author

p5pRT commented Jan 23, 2016

From @jkeenan

On Wed Sep 16 11​:40​:33 2015, public@​khwilliamson.com wrote​:

On 09/16/2015 10​:40 AM, Zefram wrote​:

Eric Brine wrote​:

Better yet, the following demonstrates how concatenation two valid
regex
patterns producing an invalid one​:

It's even worse than that​:

$ perl -e '$a=qr/[abc[=]/; $b=qr/=]/; qr/$a$b/'
Unmatched [ in regex; marked by <-- HERE in m/(?^​:[ <-- HERE
abc[=])(?^​:=])/ at -e line 1.

Fails without any explicit string concatenation. Interpolation of a
regexp object into qr// normally interpolates the meaning of the
regexp
as a syntactically atomic element, and the stringification of a
regexp
object has parens matching that behaviour. But here it turns out
it's
doing a dirty old string concatenation underneath.

-zefram

Hmm. Actually it's not doing a string concatenation, if you look at
the
message it shows that $b was stringified into (?^​:
The problem is that the parsing code is using a defective pattern to
find its expected closing delimiter, a pattern that unfortunately is
used in various places in perl's parsing. You can't generally just
use
strchr or memchr, or in this case a while loop to look ahead without
bothering to see what's in the middle. strchr suffers from the
additional problem that NULs are legitimate in Perl strings.

khw​: Do you have any thoughts as to how we might resolve this ticket?

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Jan 23, 2016

From @khwilliamson

On 1/22/2016 8​:42 PM, James E Keenan via RT wrote​:

On Wed Sep 16 11​:40​:33 2015, public@​khwilliamson.com wrote​:

On 09/16/2015 10​:40 AM, Zefram wrote​:

Eric Brine wrote​:

Better yet, the following demonstrates how concatenation two valid
regex
patterns producing an invalid one​:

It's even worse than that​:

$ perl -e '$a=qr/[abc[=]/; $b=qr/=]/; qr/$a$b/'
Unmatched [ in regex; marked by <-- HERE in m/(?^​:[ <-- HERE
abc[=])(?^​:=])/ at -e line 1.

Fails without any explicit string concatenation. Interpolation of a
regexp object into qr// normally interpolates the meaning of the
regexp
as a syntactically atomic element, and the stringification of a
regexp
object has parens matching that behaviour. But here it turns out
it's
doing a dirty old string concatenation underneath.

-zefram

Hmm. Actually it's not doing a string concatenation, if you look at
the
message it shows that $b was stringified into (?^​:
The problem is that the parsing code is using a defective pattern to
find its expected closing delimiter, a pattern that unfortunately is
used in various places in perl's parsing. You can't generally just
use
strchr or memchr, or in this case a while loop to look ahead without
bothering to see what's in the middle. strchr suffers from the
additional problem that NULs are legitimate in Perl strings.

khw​: Do you have any thoughts as to how we might resolve this ticket?

Thank you very much.

I am well along on a fix to this.

@p5pRT
Copy link
Author

p5pRT commented Feb 10, 2016

From @khwilliamson

Fixed by 46d34d0
--
Karl Williamson

@p5pRT
Copy link
Author

p5pRT commented Feb 10, 2016

@khwilliamson - Status changed from 'open' to 'pending release'

@p5pRT
Copy link
Author

p5pRT commented Feb 10, 2016

From @rjbs

* Karl Williamson via RT <perlbug-followup@​perl.org> [2016-02-10T01​:48​:11]

Fixed by 46d34d0

Fantastic, thanks Karl! This has bitten me on more than one occasion.

Also, just nice to see an actual fix for a bug that has lingered for 14 years.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented May 13, 2016

From @khwilliamson

Thank you for submitting this report. You have helped make Perl better.
 
With the release of Perl 5.24.0 on May 9, 2016, this and 149 other issues have been resolved.

Perl 5.24.0 may be downloaded via https://metacpan.org/release/RJBS/perl-5.24.0

@p5pRT
Copy link
Author

p5pRT commented May 13, 2016

@khwilliamson - Status changed from 'pending release' to 'resolved'

@p5pRT p5pRT closed this as completed May 13, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant