Further elimination of unescaped left braces in regular expressions #17023

Closed
p5pRT opened this issue May 25, 2019 · 5 comments

p5pRT commented May 25, 2019

Migrated from rt.perl.org#134141 (status was 'open')

Searchable as RT134141$

p5pRT commented May 25, 2019

From @jkeenan

pod/perldeprecation.pod contains the following entry for perl-5.32:

#####
Unescaped left braces in regular expressions

The simple rule to remember, if you want to match a literal
"{" character (U+007B "LEFT CURLY BRACKET") in a regular
expression pattern, is to escape each literal instance of
it in some way. Generally easiest is to precede it with a
backslash, like "\{" or enclose it in square brackets
("[{]"). If the pattern delimiters are also braces, any
matching right brace ("}") should also be escaped to avoid
confusing the parser, for example,

qr{abc\{def\}ghi}

Forcing literal "{" characters to be escaped will enable
the Perl language to be extended in various ways in future
releases. To avoid needlessly breaking existing code, the
restriction is not enforced in contexts where there are
unlikely to ever be extensions that could conflict with the
use there of "{" as a literal. A non-deprecation warning
that the left brace is being taken literally is raised in
contexts where there could be confusion about it.

Literal uses of "{" were deprecated in Perl 5.20, and some
uses of it started to give deprecation warnings since.
These cases were made fatal in Perl 5.26. Due to an
oversight, not all cases of a use of a literal "{" got a
deprecation warning. Some cases started warning in Perl
5.26, and were made fatal in Perl 5.30. Other cases started
in Perl 5.28, and will be made fatal in 5.32.
#####
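
The escaping rule quoted above can be illustrated with a minimal sketch. The sample string and the variable names are assumptions made up for illustration; only the three escaping forms (`\{`, `[{]`, and `qr{abc\{def\}ghi}`) come from the quoted entry. The `quotemeta` variant is an addition for the case where the literal text arrives in a variable.

```perl
use strict;
use warnings;

# Hypothetical sample text containing literal braces.
my $text = 'abc{def}ghi';

# Form 1: precede the literal left brace with a backslash.
my $backslash = qr/abc\{def\}ghi/;

# Form 2: enclose the brace in a bracketed character class.
my $class = qr/abc[{]def[}]ghi/;

# Form 3: with brace delimiters, escape the matching right brace too.
my $braces = qr{abc\{def\}ghi};

# Form 4: quotemeta escapes every non-word character,
# so braces in interpolated text are always taken literally.
my $quoted_str = quotemeta('{def}');
my $quoted     = qr/abc${quoted_str}ghi/;

for my $re ($backslash, $class, $braces, $quoted) {
    print "matched: $re\n" if $text =~ $re;
}
```

Escaping every literal brace this way should keep a pattern independent of which contexts a given Perl release warns about or rejects.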

This entry was recorded in the following commit:

#####
commit 0367231
Author: Karl Williamson <khw@cpan.org>
Date: Thu Jan 4 12:53:29 2018 -0700

Raise deprecation for qr/(?foo})/

An unescaped left brace that is meant to be taken literally
is officially deprecated, though there are no plans to
remove it in contexts where we don't expect to use it to
mean something else, and no warning is raised in those
contexts.

reg_mesg.t tests the known set of these contexts, currently
(after this commit):

/^{/ /foo|{/ /foo|^{/ /foo(:?{bar)/ /\s*{/ /a{3,4}{/

This commit deprecates this context: /foo({bar})/

This probably should have been illegal all along when 'bar'
is a valid quantifier, as we do with the other quantifiers
that follow a left paren whose illegality we haven't
already taken advantage of to mean something else:

qr/(+0)/ Quantifier follows nothing in regex

This deprecation will allow ({...}) to be usable for a
possible future regex extension
#####
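
Because the treatment of these contexts differs between Perl releases, one way to see what a particular perl does is to compile each pattern at run time and record any warning or error. The following is a rough sketch under that assumption: the `probe` helper is hypothetical, and the pattern list is copied from the commit message above. No expected output is shown because the result depends on the perl version in use.

```perl
use strict;
use warnings;

# Patterns from the commit message, held as plain strings so that
# they are compiled at run time rather than when the script is parsed.
my @patterns = ('^{', 'foo|{', 'foo|^{', '\s*{', 'a{3,4}{', 'foo({bar})');

# Hypothetical helper: compile one pattern, capturing warnings and errors.
sub probe {
    my ($pattern) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    my $ok = eval { my $re = qr/$pattern/; 1 };
    my $error = $ok ? '' : $@;
    return {
        compiled => $ok ? 1 : 0,
        error    => $error,
        warnings => \@warnings,
    };
}

for my $pattern (@patterns) {
    my $result = probe($pattern);
    printf "%-12s %s\n", $pattern, $result->{compiled} ? 'compiles' : 'fatal';
    print "    warning: $_" for @{ $result->{warnings} };
    print "    error: $result->{error}" if $result->{error};
}
```

Running the script under several perl versions gives a quick comparison of which contexts are silent, which warn, and which are fatal.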

Make it so.

p5pRT commented May 25, 2019

From @jkeenan

Summary of my perl5 (revision 5 version 31 subversion 0) configuration:
  Commit id: 58f4626
  Platform:
  osname=linux
  osvers=4.15.0-50-generic
  archname=x86_64-linux
  uname='linux zareason 4.15.0-50-generic #54-ubuntu smp mon may 6 18:46:08 utc 2019 x86_64 x86_64 x86_64 gnulinux '
  config_args='-des -Dusedevel -Dusemymalloc=y'
  hint=recommended
  useposix=true
  d_sigaction=define
  useithreads=undef
  usemultiplicity=undef
  use64bitint=define
  use64bitall=define
  uselongdouble=undef
  usemymalloc=y
  default_inc_excludes_dot=define
  bincompat5005=undef
  Compiler:
  cc='cc'
  ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
  optimize='-O2'
  cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
  ccversion=''
  gccversion='7.4.0'
  gccosandvers=''
  intsize=4
  longsize=8
  ptrsize=8
  doublesize=8
  byteorder=12345678
  doublekind=3
  d_longlong=define
  longlongsize=8
  d_longdbl=define
  longdblsize=16
  longdblkind=3
  ivtype='long'
  ivsize=8
  nvtype='double'
  nvsize=8
  Off_t='off_t'
  lseeksize=8
  alignbytes=8
  prototype=define
  Linker and Libraries:
  ld='cc'
  ldflags =' -fstack-protector-strong -L/usr/local/lib'
  libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/7/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /lib64 /usr/lib64
  libs=-lpthread -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc
  perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
  libc=libc-2.27.so
  so=so
  useshrplib=false
  libperl=libperl.a
  gnulibc_version='2.27'
  Dynamic Linking:
  dlsrc=dl_dlopen.xs
  dlext=so
  d_dlsymun=undef
  ccdlflags='-Wl,-E'
  cccdlflags='-fPIC'
  lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl):
  Compile-time options:
  HAS_TIMES
  MYMALLOC
  PERLIO_LAYERS
  PERL_COPY_ON_WRITE
  PERL_DONT_CREATE_GVSV
  PERL_MALLOC_WRAP
  PERL_OP_PARENT
  PERL_PRESERVE_IVUV
  PERL_USE_DEVEL
  USE_64_BIT_ALL
  USE_64_BIT_INT
  USE_LARGE_FILES
  USE_LOCALE
  USE_LOCALE_COLLATE
  USE_LOCALE_CTYPE
  USE_LOCALE_NUMERIC
  USE_LOCALE_TIME
  USE_PERLIO
  USE_PERL_ATOF
  Built under linux
  Compiled at May 22 2019 14:58:04
  %ENV:
  PERL2DIR="/home/jkeenan/gitwork/perl2"
  PERLBREW_HOME="/home/jkeenan/.perlbrew"
  PERLBREW_MANPATH="/home/jkeenan/perl5/perlbrew/perls/perl-5.28.0/man"
  PERLBREW_PATH="/home/jkeenan/perl5/perlbrew/bin:/home/jkeenan/perl5/perlbrew/perls/perl-5.28.0/bin"
  PERLBREW_PERL="perl-5.28.0"
  PERLBREW_ROOT="/home/jkeenan/perl5/perlbrew"
  PERLBREW_SHELLRC_VERSION="0.84"
  PERLBREW_VERSION="0.84"
  PERL_WORKDIR="/home/jkeenan/gitwork/perl"
  @INC:
  lib
  /usr/local/lib/perl5/site_perl/5.31.0/x86_64-linux
  /usr/local/lib/perl5/site_perl/5.31.0
  /usr/local/lib/perl5/5.31.0/x86_64-linux
  /usr/local/lib/perl5/5.31.0

p5pRT commented Aug 9, 2019

From @jkeenan

Karl, do you expect to be taking this ticket on soon? It's on our 5.32 blockers list.

Thanks.

--
James E Keenan (jkeenan@cpan.org)

p5pRT commented Aug 9, 2019

The RT System itself - Status changed from 'new' to 'open'

@toddr toddr removed the khw label Oct 25, 2019
@p5pRT p5pRT added the 5.32.0 label Oct 25, 2019
@toddr toddr added this to the 5.32.0 milestone Oct 25, 2019
@toddr toddr removed the 5.32.0 label Oct 25, 2019
@khwilliamson

I'm now closing this ticket. There will not be warnings when a left brace immediately follows a parenthesis, as there is a lot of code out there that has that and wants the brace to be treated literally. I think we have cleared up the places where we want to change the meaning of unescaped left braces. Those contexts have been fatal for a while now, but I'm thinking of giving them all of 5.32 as well to generate errors, rather than jumping in and changing the meaning.
