Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

$1 not set properly on backtrack #7407

Closed
p5pRT opened this issue Jul 6, 2004 · 6 comments
Closed

$1 not set properly on backtrack #7407

p5pRT opened this issue Jul 6, 2004 · 6 comments

Comments

@p5pRT
Copy link

p5pRT commented Jul 6, 2004

Migrated from rt.perl.org#30608 (status was 'resolved')

Searchable as RT30608$

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2004

From japhy@perlmonk.org

Created by japhy@perlmonk.org

Upon backtracking, $1 has a bad value (as do $-[1] and $+[1])​:

  "ABCDE" =~ m{
  (?​: (.+) (?{ print "<$1> $-[1]..$+[1]\n" }) ){2}
  }x;

This should print

<ABCDE> 0..5
<ABCD> 0..4
<E> 4..5

But instead it prints

<ABCDE> 0..5
<> 5..4
<E> 4..5

Perl Info

Flags:
    category=core
    severity=high

Site configuration information for perl v5.8.3:

Configured by Debian Project at Sat Mar 27 17:07:14 EST 2004.

Summary of my perl5 (revision 5.0 version 8 subversion 3) configuration:
  Platform:
    osname=linux, osvers=2.4.25-ti1211, archname=i386-linux-thread-multi
    uname='linux kosh 2.4.25-ti1211 #1 thu feb 19 18:20:12 est 2004 i686 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.3 -Dsitearch=/usr/local/lib/perl/5.8.3 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.3 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O3',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='3.3.3 (Debian 20040314)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libperl.so.5.8.3
    gnulibc_version='2.3.2'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    


@INC for perl v5.8.3:
    /home/japhy/lib/share/perl/5.8.3
    /etc/perl
    /usr/local/lib/perl/5.8.3
    /usr/local/share/perl/5.8.3
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.8
    /usr/share/perl/5.8
    /usr/local/lib/site_perl
    /usr/local/lib/perl/5.8.0
    /usr/local/share/perl/5.8.0
    .


Environment for perl v5.8.3:
    HOME=/home/japhy
    LANG=C
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/local/bin:/usr/bin:/usr/ucb:/usr/sbin:/usr/openwin/bin:/bin:/usr/local/netscape:/usr/ccs/bin:/home/japhy/bin:.:/home/japhy/ICG/home/japhy/bin
    PERL5LIB=/home/japhy/lib/share/perl/5.8.3
    PERL_BADLANG (unset)
    SHELL=/bin/tcsh

@p5pRT
Copy link
Author

p5pRT commented Jul 7, 2004

From @hvds

"japhy@​perlmonk.org (via RT)" <perlbug-followup@​perl.org> wrote​:
:Upon backtracking, $1 has a bad value (as do $-[1] and $+[1])​:
:
: "ABCDE" =~ m{
: (?​: (.+) (?{ print "<$1> $-[1]..$+[1]\n" }) ){2}
: }x;
:
:This should print
:
:<ABCDE> 0..5
:<ABCD> 0..4
:<E> 4..5
:
:But instead it prints
:
:<ABCDE> 0..5
:<> 5..4
:<E> 4..5

It is arguable that you should not make assumptions about internal
states during the traversal of a pattern match, since satisfying all
reasonable assumptions would remove pretty much all possibility of
optimisations.

However, I think this represents a genuine bug, since an attempt to
expose similar patterns of backtracking using '\1' fails to match​:
  "abcabcabcx" =~ m{
  ^ (?​: (.+) (\1 | x) ){2}
  }x;

Strangely, replacing (.+) with (.+?) does allow it to match.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jul 7, 2004

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Jul 7, 2004

From japhy@pobox.com

On Jul 7, Hugo van der Sanden via RT said​:

"japhy@​perlmonk.org (via RT)" <perlbug-followup@​perl.org> wrote​:
​:Upon backtracking, $1 has a bad value (as do $-[1] and $+[1])​:
​:
​: "ABCDE" =~ m{
​: (?​: (.+) (?{ print "<$1> $-[1]..$+[1]\n" }) ){2}
​: }x;

It is arguable that you should not make assumptions about internal
states during the traversal of a pattern match, since satisfying all
reasonable assumptions would remove pretty much all possibility of
optimisations.

However, I think this represents a genuine bug, since an attempt to
expose similar patterns of backtracking using '\1' fails to match​:
"abcabcabcx" =~ m{
^ (?​: (.+) (\1 | x) ){2}
}x;

Strangely, replacing (.+) with (.+?) does allow it to match.

I know *why* the problem is caused. Here is an example using this
pattern match​: "abc" =~ /(?​:(.+)){2}/

Matching REx `(?​:(.+)){2}' against `abc'
  Setting an EVAL scope, savestack=3
  0 <> <abc> | 1​: CURLYX[0] {2,2}
  0 <> <abc> | 9​: WHILEM
  0 out of 2..2 cc=bfffdb30
  0 <> <abc> | 3​: OPEN1
  0 <> <abc> | 5​: PLUS
  REG_ANY can match 3 times out of 2147483647...
  Setting an EVAL scope, savestack=3
  3 <abc> <> | 7​: CLOSE1
  3 <abc> <> | 9​: WHILEM
  1 out of 2..2 cc=bfffdb30

HERE, we have matched once, and (.+) absorbed the whole string.

  3 <abc> <> | 3​: OPEN1

Now we re-open $1...

  3 <abc> <> | 5​: PLUS
  REG_ANY can match 0 times out of 2147483647...
  Setting an EVAL scope, savestack=3
  failed...
  failed...
  2 <ab> <c> | 7​: CLOSE1

...and close it here, to the LEFT of where we opened it.

  2 <ab> <c> | 9​: WHILEM
  1 out of 2..2 cc=bfffdb30
  2 <ab> <c> | 3​: OPEN1
  2 <ab> <c> | 5​: PLUS
  REG_ANY can match 1 times out of 2147483647...
  Setting an EVAL scope, savestack=3
  3 <abc> <> | 7​: CLOSE1
  3 <abc> <> | 9​: WHILEM
  2 out of 2..2 cc=bfffdb30
  3 <abc> <> | 10​: NOTHING
  3 <abc> <> | 11​: END

The problem is that the backtracking is not restoring PL_regstartp[1] (or
more appropriately, PL_reg_start_tmp[1]) when the nodes in between the
OPEN1 and CLOSE1 fail to match. I'm baffled as to why. I looked at the
CURLYX code (around line 3080 in regexec.c) and it seems like it should be
saving the data correctly... but apparently not.

--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
http​://japhy.perlmonk.org/ % have long ago been overpaid?
http​://www.perlmonks.org/ % -- Meister Eckhart

@p5pRT
Copy link
Author

p5pRT commented Feb 8, 2016

From @khwilliamson

This was fixed by​:
commit 95b2444
Author​: Dave Mitchell <davem@​fdisolutions.com>
Date​: Fri Mar 24 23​:05​:11 2006 +0000

  make S_regmatch() iterative rather than recursive.
  Goodbye stack-bustng regexes!
 
  p4raw-id​: //depot/perl@​27598

--
Karl Williamson

@p5pRT
Copy link
Author

p5pRT commented Feb 8, 2016

@khwilliamson - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant