Quantifiers in (?(DEFINE)...) #16106

Open
p5pRT opened this issue Aug 9, 2017 · 8 comments

p5pRT commented Aug 9, 2017

Migrated from rt.perl.org#131868 (status was 'open')

Searchable as RT131868$

p5pRT commented Aug 9, 2017

From @Abigail

Created by @Abigail

Consider this pattern:

  my $pat = qr {
      (?(DEFINE)
        (?<digit>   [0-9])
        (?<digits>  (?&digit)+)
      )
      ^(?&digits)$
  }x;

This matches a string of digits, and Perl doesn't complain.

Now, let's make a small change: instead of matching any number of
digits, let's match exactly 4:

  my $pat = qr {
      (?(DEFINE)
        (?<digit>   [0-9])
        (?<digits>  (?&digit){4})
      )
      ^(?&digits)$
  }x;

This pattern works, but a warning is issued:

  Quantifier unexpected on zero-length expression in regex m/
      (?(DEFINE)
        (?<digit>   [0-9])
        (?<digits>  (?&digit){4})
      )
      ^(?&digits)$
  / at /tmp/bar line 15.

The same warning appears if we replace {4} with {4,4} or {2,4}, and it
disappears when {4} is replaced with {4,}. That is, the warning is
issued only when the number of repeats has an upper bound. Replacing
"(?&digit)" with "[0-9]" makes the warning disappear as well.

This problem is still present in blead.
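
The observations above can be checked with a short script along these
lines. It is only a sketch: the helper names build_pattern and
compile_with_warnings are made up for this illustration, and whether any
warning is reported depends on the perl version running it, so no
expected output is shown.

```perl
use strict;
use warnings;

# Build the pattern from the report, with a caller-supplied quantifier
# applied to (?&digit). The interpolated variable forces the pattern to
# be compiled at run time.
sub build_pattern {
    my ($quant) = @_;
    return qr{
        (?(DEFINE)
          (?<digit>   [0-9])
          (?<digits>  (?&digit)$quant)
        )
        ^(?&digits)$
    }x;
}

# Compile the pattern while collecting any warnings it triggers.
# Returns the compiled regex followed by the warning messages (if any).
sub compile_with_warnings {
    my ($quant) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    my $re = build_pattern($quant);
    return ($re, @warnings);
}

# Try the quantifiers mentioned in the report.
for my $quant ('+', '{4}', '{4,4}', '{2,4}', '{4,}') {
    my ($re, @warnings) = compile_with_warnings($quant);
    printf "%-6s matches '1234': %-3s  warnings: %d\n",
        $quant, ('1234' =~ $re ? 'yes' : 'no'), scalar @warnings;
}
```

The quantifier is interpolated rather than written literally because a
literal qr{} is compiled at compile time, before the warning handler is
installed.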

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl 5.26.0:

Configured by abigail at Wed Jun  7 23:04:17 CEST 2017.

Summary of my perl5 (revision 5 version 26 subversion 0) configuration:
   
  Platform:
    osname=darwin
    osvers=15.6.0
    archname=darwin-ld-2level
    uname='darwin athena 15.6.0 darwin kernel version 15.6.0: thu jun 23 18:25:34 pdt 2016; root:xnu-3248.60.10~1release_x86_64 x86_64 '
    config_args='-des -Uversiononly -Dperladmin=abigail@abigail.be -Dcf_email=abigail@abigail.be -Dmydomain=abigail.be -Dcc=gcc -Dprefix=/opt/perl/5.26.0 -Dusedevel -Dusemorebits'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=define
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='gcc'
    ccflags ='-fno-common -DPERL_DARWIN -no-cpp-precomp -mmacosx-version-min=10.11 -fno-strict-aliasing -pipe -fstack-protector-strong -I/opt/local/include -DPERL_USE_SAFE_PUTENV'
    optimize='-O3'
    cppflags='-no-cpp-precomp -fno-common -DPERL_DARWIN -no-cpp-precomp -mmacosx-version-min=10.11 -fno-strict-aliasing -pipe -fstack-protector-strong -I/opt/local/include'
    ccversion=''
    gccversion='4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='long double'
    nvsize=16
    Off_t='off_t'
    lseeksize=8
    alignbytes=16
    prototype=define
  Linker and Libraries:
    ld='gcc'
    ldflags =' -mmacosx-version-min=10.11 -fstack-protector-strong -L/opt/local/lib'
    libpth=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/lib /opt/local/lib /usr/lib
    libs=-lpthread -lgdbm -ldbm -ldl -lm -lutil -lc
    perllibs=-lpthread -ldl -lm -lutil -lc
    libc=
    so=dylib
    useshrplib=false
    libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=bundle
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags=' -mmacosx-version-min=10.11 -bundle -undefined dynamic_lookup -L/opt/local/lib -fstack-protector-strong'



@INC for perl 5.26.0:
    /Users/abigail/Perl/CPAN/Regexp-Common2/lib
    /Users/abigail/Perl/CPAN/Test-Regexp/lib
    /opt/perl/5.26.0/lib/site_perl/5.26.0/darwin-ld-2level
    /opt/perl/5.26.0/lib/site_perl/5.26.0
    /opt/perl/5.26.0/lib/5.26.0/darwin-ld-2level
    /opt/perl/5.26.0/lib/5.26.0


Environment for perl 5.26.0:
    DYLD_LIBRARY_PATH (unset)
    HOME=/Users/abigail
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH=/Users/abigail/Lib:/usr/local/lib:/usr/lib:/lib:/usr/X11R6/lib
    LOGDIR (unset)
    PATH=/Users/abigail/Bin:/opt/perl/bin:/opt/local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/X11R6/bin:/usr/games:/opt/git/bin:/Users/abigail/Perl/Photos:/Users/abigail/Perl/Bin:/opt/mysql/bin:/opt/local/bin:/Users/abigail/bin
    PERL5LIB=/Users/abigail/Perl/CPAN/Regexp-Common2/lib:/Users/abigail/Perl/CPAN/Test-Regexp/lib
    PERLDIR=/opt/perl
    PERL_BADLANG (unset)
    SHELL=/bin/bash

p5pRT commented Aug 19, 2017

From @demerphq

It would be helpful to know if this is new behaviour or not. Something
makes me think it's newish at least.

Yves

p5pRT commented Aug 19, 2017

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Aug 20, 2017

From @Abigail

On Sat, Aug 19, 2017 at 02:58:23PM -0700, yves orton via RT wrote:

It would be helpful to know if this is new behaviour or not. Something
makes me think it's newish at least.

It warns for me in 5.22.0, but not in 5.20.0

Abigail

p5pRT commented Sep 11, 2017

From @iabyn

On Sun, Aug 20, 2017 at 10:16:16PM +0200, Abigail wrote:

It warns for me in 5.22.0, but not in 5.20.0

Bisects to:

a51d618 is the first bad commit
commit a51d618
Author: Yves Orton <demerphq@gmail.com>
Date: Fri Sep 19 19:57:34 2014 +0200

  rt 122283 - do not recurse into GOSUB/GOSTART when not SCF_DO_SUBSTR
 
  See also comments in patch. A complex regex "grammar" like that in
  RT 122283 causes perl to take literally forever, and exhaust all
  memory during the pattern optimization phase.
 
  Unfortunately I could not track down exacty why this occured, but
  it was very clear that the excessive recursion was unnecessary and
  excessive. By simply eliminating the unncessary recursion performance
  goes back to being acceptable.
 
  I have not thought of a good way to test this change, so this patch
  does not include any tests. Perhaps we can test it using alarm, but
  I will follow up on that later.

--
The crew of the Enterprise encounter an alien life form which is
surprisingly neither humanoid nor made from pure energy.
  -- Things That Never Happen in "Star Trek" #22

p5pRT commented Sep 13, 2017

From @demerphq

On 11 September 2017 at 09:17, Dave Mitchell <davem@iabyn.com> wrote:

Bisects to:

a51d618 is the first bad commit

Fixed in 0e3f444

Thanks for the report. Feel like writing a test? ;-)

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented Sep 13, 2017

From @Abigail

On Wed, Sep 13, 2017 at 06:05:43PM +0200, demerphq wrote:

Fixed in 0e3f444

Thanks for the report. Feel like writing a test? ;-)

Thanks for the patch. Test added in commit c2b4244.

Abigail
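
For readers who want to check their own perl, a regression test for this
report could look roughly like the following. This is a sketch written
for illustration with Test::More; it is not the test that was added in
commit c2b4244, and the variable names are assumptions.

```perl
use strict;
use warnings;
use Test::More;

# Pattern source from the report, kept in a string so that it is compiled
# at run time, after the warning handler below has been installed.
my $source = q{
    (?(DEFINE)
      (?<digit>   [0-9])
      (?<digits>  (?&digit){4})
    )
    ^(?&digits)$
};

# Compile the pattern and collect any warnings raised while doing so.
my @warnings;
my $re = do {
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    qr/$source/x;
};

ok('1234'  =~ $re, 'four digits match');
ok('123'   !~ $re, 'three digits do not match');
ok('12345' !~ $re, 'five digits do not match');
is(scalar @warnings, 0, 'no warnings while compiling the pattern')
    or diag(@warnings);

done_testing;
```

On a perl that still has the problem described in this ticket, the last
check is the one expected to fail.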

p5pRT commented Sep 14, 2017

From @demerphq

Thank you! Cheers, yves
