Deprecating unescaped literal { in regex breaks CPAN #13947

Closed
p5pRT opened this issue Jun 20, 2014 · 21 comments


p5pRT commented Jun 20, 2014

Migrated from rt.perl.org#122146 (status was 'resolved')

Searchable as RT122146$


p5pRT commented Jun 20, 2014

From @andk

git bisect
----------
The first bad commit could be any of​:
4a7e65a
412f55b
We cannot bisect more!
bisect run cannot continue any more

diagnostics
-----------
http​://www.cpantesters.org/cpan/report/7f2c5b9a-f790-11e3-a27f-315a0a370852

perl -V
-------
Summary of my perl5 (revision 5 version 21 subversion 1) configuration​:
  Commit id​: 46f0524
  Platform​:
  osname=linux, osvers=3.14-1-amd64, archname=x86_64-linux
  uname='linux k83 3.14-1-amd64 #1 smp debian 3.14.5-1 (2014-06-05) x86_64 gnulinux '
  config_args='-Dprefix=/home/sand/src/perl/repoperls/installed-perls/perl/v5.21.0-474-g46f0524/165a -Dmyhostname=k83 -Dinstallusrbinperl=n -Uversiononly -Dusedevel -des -Ui_db -Uuseithreads -Uuselongdouble -DDEBUGGING=-g'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=undef, usemultiplicity=undef
  use64bitint=define, use64bitall=define, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='cc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize='-O2 -g',
  cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.8.3', gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries​:
  ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
  libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
  perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
  libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a
  gnulibc_version='2.19'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
  cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl)​:
  Compile-time options​: HAS_TIMES PERLIO_LAYERS PERL_DONT_CREATE_GVSV
  PERL_HASH_FUNC_ONE_AT_A_TIME_HARD PERL_MALLOC_WRAP
  PERL_NEW_COPY_ON_WRITE PERL_PRESERVE_IVUV
  PERL_USE_DEVEL USE_64_BIT_ALL USE_64_BIT_INT
  USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE
  USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_PERLIO
  USE_PERL_ATOF
  Built under linux
  Compiled at Jun 18 2014 20​:58​:06
  %ENV​:
  PERL5LIB="/tmp/loop_over_bdir-5541-k56ri1/FusionInventory-Agent-2.3.8-ZicBdA/blib/arch​:/tmp/loop_over_bdir-5541-k56ri1/FusionInventory-Agent-2.3.8-ZicBdA/blib/lib"
  PERL5OPT=""
  PERL5_CPANPLUS_IS_RUNNING="29623"
  PERL5_CPAN_IS_RUNNING="29623"
  PERL_MM_USE_DEFAULT="1"
  @​INC​:
  /tmp/loop_over_bdir-5541-k56ri1/FusionInventory-Agent-2.3.8-ZicBdA/blib/arch
  /tmp/loop_over_bdir-5541-k56ri1/FusionInventory-Agent-2.3.8-ZicBdA/blib/lib
  /home/sand/src/perl/repoperls/installed-perls/perl/v5.21.0-474-g46f0524/165a/lib/site_perl/5.21.1/x86_64-linux
  /home/sand/src/perl/repoperls/installed-perls/perl/v5.21.0-474-g46f0524/165a/lib/site_perl/5.21.1
  /home/sand/src/perl/repoperls/installed-perls/perl/v5.21.0-474-g46f0524/165a/lib/5.21.1/x86_64-linux
  /home/sand/src/perl/repoperls/installed-perls/perl/v5.21.0-474-g46f0524/165a/lib/5.21.1
  .
--
andreas


p5pRT commented Jun 20, 2014

From @khwilliamson

On 06/20/2014 01​:33 PM, (Andreas J. Koenig) (via RT) wrote​:

> git bisect
> ----------
> The first bad commit could be any of:
> 4a7e65a
> 412f55b
> We cannot bisect more!
> bisect run cannot continue any more

The code that is failing comes down to this​:

perl -le 'my $escaped_filename= quotemeta
"/var/tmp/devel-hdb-test-ytvh:10"; qr(\(eval \d+\)[$escaped_filename:\d+])'

In earlier Perls, backslash-escaping the () inside a pattern delimited by
qr(...) left them acting as if there were no backslashes, that is, as
grouping metacharacters. Since 5.16 this has raised the warning
Useless use of '\'; doesn't escape metacharacter '(' at -e line 1

Now that that warning has been out for two releases, Perl has changed so
that the backslashes cause the () to be treated as literals, and
that is causing this failure.

I'd send a patch to the distribution's bug queue, but this whole regular
expression is so screwed up that I'm not sure what is intended. What
you might think without looking very hard is that it wants the filename
followed by a colon followed by some digits. But since all that is
enclosed in brackets, it is asking for a single character. This is what
the regex compiles to in 5.18​:
Final program:
  1: OPEN1 (3)
  3: EXACT <eval > (6)
  6: PLUS (8)
  7: POSIXD[\d] (0)
  8: CLOSE1 (10)
  10: ANYOF[+\-/-:abdehlmprstvy][{unicode}+utf8::XPosixDigit ] (21)
  21: END (0)

and this is what it compiles to in 5.21:
Final program:
  1: EXACT <(eval > (4)
  4: PLUS (6)
  5: POSIXD[\d] (0)
  6: EXACT <)> (8)
  8: ANYOF[+\-\x{2F}-\x{3A}abdehlmprstvy][{utf8}0660-0669 06F0-06F9
07C0-07C9 0966-096F 09E6-09EF 0A66-0A6F 0AE6-0AEF 0B66-0B6F 0BE6-0BEF
0C66-0C6F 0CE6-0CEF 0D66-0D6F 0DE6-0DEF 0E50-0E59 0ED0-0ED9 0F20-0F29
1040-1049 1090-1099 17E0-17E9 1810-1819 1946-194F 19D0-19D9 1A80-1A89
1A90-1A99 1B50-1B59 1BB0-1BB9...] (19)
  19: END (0)

The differences are the literal () and that what \d can match is now
known at compile time instead of being deferred to runtime.

I'm inclined to just submit a bug to the distribution and reject this one.
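
To make the bracket point concrete, here is a minimal sketch that isolates only the bracketed tail of the pattern. The variable name `$tail`, the `\A`/`\z` anchors, and the candidate strings are assumptions added for illustration; they are not part of the failing test.

```perl
use strict;
use warnings;

# Same escaped filename as in the one-liner from the ticket.
my $escaped_filename = quotemeta "/var/tmp/devel-hdb-test-ytvh:10";

# Only the bracketed tail of the pattern, anchored so that a match must
# consume the entire string (the anchors are an addition for this sketch).
my $tail = qr/\A[$escaped_filename:\d+]\z/;

# The brackets form a character class, so exactly one character can match.
for my $candidate ("v", "7", "+", "x", "/var/tmp/devel-hdb-test-ytvh:10:3") {
    printf "%-36s %s\n", $candidate,
        $candidate =~ $tail ? "matches" : "does not match";
}
```

Single characters drawn from the filename, any digit, and "+" are accepted, while a full "filename:line" string is not, which is consistent with the ANYOF node in the compiled programs above.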


p5pRT commented Jun 20, 2014

The RT System itself - Status changed from 'new' to 'open'


p5pRT commented Jun 20, 2014

From @jkeenan

On Fri Jun 20 14​:54​:58 2014, public@​khwilliamson.com wrote​:
[snip]

> I'm inclined to just submit a bug to the distribution and reject this one.

I have reproduced the failure and agree with your recommendation as to how to proceed.


p5pRT commented Jun 21, 2014

From @andk

Also affected​:

DMAKI/Data-Localize-0.00025.tar.gz (but only when BerkeleyDB or YAML​::XS
is already installed)

--
andreas


p5pRT commented Jun 21, 2014

From @khwilliamson

On 06/20/2014 10​:25 PM, Andreas Koenig wrote​:

> Also affected:
>
> DMAKI/Data-Localize-0.00025.tar.gz (but only when BerkeleyDB or YAML::XS
> is already installed)

Is there some way I can look at the report for this? I was unable to
browse cpantesters for this. I presume that the BBC ones are hidden
from normal view.


p5pRT commented Jun 21, 2014

From @andk

On Sat, 21 Jun 2014 08​:44​:24 -0700, "karl williamson via RT" <perlbug-followup@​perl.org> said​:

> On 06/20/2014 10:25 PM, Andreas Koenig wrote:
>
>> Also affected:
>>
>> DMAKI/Data-Localize-0.00025.tar.gz (but only when BerkeleyDB or YAML::XS
>> is already installed)
>
> Is there some way I can look at the report for this? I was unable to
> browse cpantesters for this. I presume that the BBC ones are hidden
> from normal view.

They are not hidden. They sometimes take a while to show up. But three
of these four have been there for days now:

http​://www.cpantesters.org/cpan/report/9fb99cfc-f85c-11e3-b74a-1c9f0a370852 2014-06-20 09​:24
http​://www.cpantesters.org/cpan/report/9725595e-f4cf-11e3-9315-479a0a370852 2014-06-15 20​:56
http​://www.cpantesters.org/cpan/report/49bae282-f40d-11e3-b2c8-4cc00a370852 2014-06-14 21​:46
http​://www.cpantesters.org/cpan/report/6b01e306-f362-11e3-bcc2-6fce0a370852 2014-06-14 01​:22

IMHO the best tool to find them is usually the matrix:
http​://matrix.cpantesters.org/?dist=Data-Localize

--
andreas


p5pRT commented Jun 23, 2014

From @khwilliamson

On 06/20/2014 10​:25 PM, Andreas Koenig wrote​:

> Also affected:
>
> DMAKI/Data-Localize-0.00025.tar.gz (but only when BerkeleyDB or YAML::XS
> is already installed)

We're going to get some of these where suddenly there is a new
deprecation warning about unescaped left brace "{" being understood as a
literal. From looking at the smoke report, that appears to be what is
going on for this one.

In 5.17 we tried to do this deprecation as well, and many CPAN
distributions changed, but we had to back the change out before shipping
5.18, because it turned out that the obvious solution for escaping the
curly (preceding it with a backslash) didn't work when the regular
expression delimiters were {}. Fixing that required a 2-release
deprecation cycle, which is now done, so I'm trying to institute the
original deprecation once more.

It shouldn't be nearly as many as 5.17, but there will be some. I
intend to make patches for the affected distros with pull requests,
though if someone wants to help out, that would be great.

It didn't occur to me back then that there was available another simple
way to escape a curly​: qr{[{]}. But the backslash is more legible.


p5pRT commented Jun 27, 2014

From @khwilliamson

I changed the title of this ticket to "Deprecating unescaped literal { in regex breaks CPAN".
There will be other affected distributions, so their names can just be added to those already here.

We've concluded that the first one found was a bug in the test, so the maintainer should be notified. I'm not sure of the best procedure for doing so; I'm guessing an email from me with a summary of what we've found?

The second one should be patched, and I will submit a pull request on its GitHub repository to accomplish this.


p5pRT commented Jun 27, 2014

From @ikegami

On Sun Jun 22 20​:26​:21 2014, public@​khwilliamson.com wrote​:

> It didn't occur to me back then that there was available another simple
> way to escape a curly: qr{[{]}. But the backslash is more legible.

  $ perl -e'qr{[{]}' # 5.20.0
  Search pattern not terminated at -e line 1.

Delimiters must be balanced since the end of the pattern must be found before the pattern can be parsed.
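
As a small sketch of the point about delimiters: both spellings of a literal left brace work when the delimiter is something other than braces, while the bracketed spelling inside brace delimiters never terminates. The variable names and the "{foo" test string are made up for illustration, and the string eval is only there so the script survives the parse error.

```perl
use strict;
use warnings;

# With / as the delimiter, both spellings match a literal left brace.
my $backslashed = qr/\{foo/;
my $bracketed   = qr/[{]foo/;

for my $re ($backslashed, $bracketed) {
    print "{foo" =~ $re ? "literal brace matched\n" : "no match\n";
}

# With brace delimiters, the inner { raises the nesting depth and the one }
# only lowers it back, so the end of the pattern is never found. The string
# eval traps the resulting compile-time error instead of aborting the script.
my $parsed = eval 'qr{[{]}; 1';
print defined $parsed ? "parsed\n" : "did not parse\n";
```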


p5pRT commented Jul 1, 2014

From @khwilliamson

On Fri Jun 27 15​:08​:32 2014, ikegami@​adaelis.com wrote​:

> On Sun Jun 22 20:26:21 2014, public@khwilliamson.com wrote:
>
>> It didn't occur to me back then that there was available another simple
>> way to escape a curly: qr{[{]}. But the backslash is more legible.
>
> $ perl -e'qr{[{]}' # 5.20.0
> Search pattern not terminated at -e line 1.
>
> Delimiters must be balanced since the end of the pattern must be found
> before the pattern can be parsed.

Sorry that I forgot to mention that detail
--
Karl Williamson


p5pRT commented Jul 1, 2014

From @khwilliamson

Data​::Localize has been issued a pull request​:
lestrrat-p5/Data-Localize#4
--
Karl Williamson


p5pRT commented Jul 1, 2014

From @khwilliamson

And I have submitted a bug report for Devel​::hdb
--
Karl Williamson


p5pRT commented Jul 1, 2014

From @khwilliamson

On Mon Jun 30 20​:46​:06 2014, khw wrote​:

> And I have submitted a bug report for Devel::hdb

The maintainer has uploaded a new version, 0.13, to CPAN. It turns out that the [] were supposed to match literally, as were the (). 5.20 is the first release where the () did match literally, so we changed Perl to be what he was expecting in this regard. The warnings in 5.16 and 5.18 that it didn't do what was likely expected went unnoticed.
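
For illustration only, a pattern in which both the () and the [] match literally could look like the sketch below. This is not the code from Devel::hdb 0.13; the variable names, the use of `\Q...\E` in place of `quotemeta`, and the sample string are all assumptions.

```perl
use strict;
use warnings;

my $filename = "/var/tmp/devel-hdb-test-ytvh:10";

# Literal parentheses and brackets are backslash-escaped; \Q...\E quotes the
# interpolated filename so its own punctuation is matched literally too.
my $location = qr/\(eval \d+\)\[\Q$filename\E:\d+\]/;

# Hypothetical sample string, made up for this sketch.
my $sample = "(eval 12)[/var/tmp/devel-hdb-test-ytvh:10:3]";
print $sample =~ $location ? "matches\n" : "does not match\n";
```

Using a delimiter other than parentheses sidesteps the question of how a backslashed delimiter character is treated inside the pattern.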
--
Karl Williamson


p5pRT commented Jul 1, 2014

From @khwilliamson

On Mon Jun 30 20​:11​:45 2014, khw wrote​:

> Data::Localize has been issued a pull request:
> lestrrat-p5/Data-Localize#4

That has now been merged, and I presume will be in a CPAN release at some point
--
Karl Williamson


p5pRT commented Oct 17, 2014

From @jkeenan

On Tue Jul 01 14​:31​:18 2014, khw wrote​:

> On Mon Jun 30 20:11:45 2014, khw wrote:
>
>> Data::Localize has been issued a pull request:
>> lestrrat-p5/Data-Localize#4
>
> That has now been merged, and I presume will be in a CPAN release at
> some point

Karl, have we sufficiently addressed the CPAN breakage such that we can close this ticket?

Thank you very much.
--
James E Keenan (jkeenan@​cpan.org)


p5pRT commented Oct 17, 2014

From @khwilliamson

On Thu Oct 16 17​:58​:20 2014, jkeenan wrote​:

> On Tue Jul 01 14:31:18 2014, khw wrote:
>
>> On Mon Jun 30 20:11:45 2014, khw wrote:
>>
>>> Data::Localize has been issued a pull request:
>>> lestrrat-p5/Data-Localize#4
>>
>> That has now been merged, and I presume will be in a CPAN release at
>> some point
>
> Karl, have we sufficiently addressed the CPAN breakage such that we
> can close this ticket?
>
> Thank you very much.

Well, I think so. I haven't seen any new CPAN failures from this. I emailed Andreas about this, some time ago, but it bounced (this isn't the first time email to him has bounced), and I got distracted and didn't follow up. Thanks for keeping track for me.
--
Karl Williamson


p5pRT commented Oct 17, 2014

From @jkeenan

Marking ticket Resolved as per discussion with khw.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT p5pRT closed this as completed Oct 17, 2014

p5pRT commented Oct 17, 2014

@jkeenan - Status changed from 'open' to 'resolved'


p5pRT commented Dec 6, 2014

From @andk

On Fri, 17 Oct 2014 16​:36​:56 -0700, "James E Keenan via RT" <perlbug-followup@​perl.org> said​:

  > Marking ticket Resolved as per discussion with khw.

New CPAN upload affected by this​:

RRWO/Pod-Readme-v1.1.1.tar.gz
http​://www.cpantesters.org/cpan/report/a9a07ed6-7c7b-11e4-ad19-d06752b25338

--
andreas


p5pRT commented Dec 8, 2014

From @khwilliamson

On Sat Dec 06 01​:58​:01 2014, andreas.koenig.7os6VVqR@​franz.ak.mind.de wrote​:

> On Fri, 17 Oct 2014 16:36:56 -0700, "James E Keenan via RT"
> <perlbug-followup@perl.org> said:
>
>> Marking ticket Resolved as per discussion with khw.
>
> New CPAN upload affected by this:
>
> RRWO/Pod-Readme-v1.1.1.tar.gz
> http://www.cpantesters.org/cpan/report/a9a07ed6-7c7b-11e4-ad19-d06752b25338

I didn't figure out why this was breaking without a warning message being visible, but I did find some unescaped left braces, and when I escaped them, the failing tests started working again. I have submitted a pull request to the maintainer:
bigpresh/Pod-Readme#13

Just for clarity, the regex is like qr/{{foo/. If the very first character in a pattern is a left brace, it doesn't have to be escaped, so qr/{\{foo/ would have worked too. But it's safer to escape all of them in case the pattern changes in the future and this detail is missed; and that's one less exception to have to worry about.
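
A minimal sketch of the fully escaped form, using the same placeholder pattern as above. The variable names and test strings are assumptions for illustration and are not taken from Pod-Readme.

```perl
use strict;
use warnings;

# Every literal left brace is escaped, including the first one.
my $marker = qr/\{\{foo/;

# Braces that form a quantifier stay unescaped.
my $twice = qr/\Aab{2}\z/;

print "{{foo bar" =~ $marker ? "marker found\n" : "marker missing\n";
print "abb" =~ $twice ? "quantifier applied\n" : "quantifier not applied\n";
```

Only braces meant as literal text get the backslash; a brace that opens a quantifier such as {2} is left alone.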
--
Karl Williamson
