Failure to backtrack into lookahead #17008

Open
p5pRT opened this issue May 22, 2019 · 4 comments
@p5pRT

p5pRT commented May 22, 2019

Migrated from rt.perl.org#134123 (status was 'new')

Searchable as RT134123$

@p5pRT
Author

p5pRT commented May 22, 2019

From @hvds

Created by @hvds

It seems we don't backtrack into lookaheads, which can cause us to miss
valid matches if the lookahead includes a capture that we then refer to
later in the pattern. For example:
  perl -we '"010" =~ /^(?=(.+)).*?(.)(\1)/ or die'
should match with $1 = '0', $2 = '1', $3 = '0', but instead dies.
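
To make the expected captures concrete, here is a minimal sketch that emulates the missing backtracking by hand. It is not a proposed fix: the loop and the names `$len` and `@found` are illustrative assumptions, and the interpolated `.{$len}` simply forces each candidate length for $1 in turn, longest first.

```perl
use strict;
use warnings;

my $s = "010";

# Pattern from the report. The lookahead captures greedily and is not
# re-entered, so on the perls discussed in this ticket this does not match.
my $direct = $s =~ /^(?=(.+)).*?(.)(\1)/ ? 1 : 0;

# Hand-rolled emulation (illustrative only): fix the length of $1
# explicitly and try each length, longest first.
my @found;
for my $len (reverse 1 .. length $s) {
    if ($s =~ /^(?=(.{$len})).*?(.)(\1)/) {
        @found = ($1, $2, $3);
        last;
    }
}

print "direct match: $direct\n";
print "emulated captures: @found\n" if @found;
```

The loop succeeds only at length 1, which gives the captures the report expects from the single pattern.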

(Were this to work, it would be an elegant way to find values in the
Ehrenfeucht-Mycielski sequence, https://oeis.org/A038219.)

I believe this has never worked, and think I remember that it was an
intentional optimization that failed to take into account this backref
case that breaks as a result; I'm also convinced that this has been
reported before, but I don't see any open ticket about it.

Silently giving the wrong answer is bad: we should aim either to support
it, or if we cannot, to detect and warn about it.

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl 5.30.0:

Configured by hv at Wed May 22 13:01:07 BST 2019.

Summary of my perl5 (revision 5 version 30 subversion 0) configuration:
   
  Platform:
    osname=linux
    osvers=3.13.0-169-generic
    archname=x86_64-linux
    uname='linux shad2 3.13.0-169-generic #219-ubuntu smp wed apr 3 14:01:26 utc 2019 x86_64 x86_64 x86_64 gnulinux '
    config_args='-des -Dcc=gcc -Dprefix=/opt/perl-5.30.0-d -Doptimize=-g -O6 -DDEBUGGING -Dusedevel -Uversiononly'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='gcc'
    ccflags ='-fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    optimize='-g -O6'
    cppflags='-fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion=''
    gccversion='4.8.4'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='gcc'
    ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /lib64 /usr/lib64
    libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.19.so
    so=so
    useshrplib=false
    libperl=libperl.a
    gnulibc_version='2.19'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E'
    cccdlflags='-fPIC'
    lddlflags='-shared -g -O6 -L/usr/local/lib -fstack-protector'



@INC for perl 5.30.0:
    /opt/perl-5.30.0-d/lib/site_perl/5.30.0/x86_64-linux
    /opt/perl-5.30.0-d/lib/site_perl/5.30.0
    /opt/perl-5.30.0-d/lib/5.30.0/x86_64-linux
    /opt/perl-5.30.0-d/lib/5.30.0


Environment for perl 5.30.0:
    HOME=/home/hv
    LANG=C
    LANGUAGE=en_GB:en
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/hv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Author

p5pRT commented May 25, 2019

From @hvds

Karl Williamson wrote:

> Would it be possible to know if the lookahead has a capturing group, and
> if not, don't backtrack. My guess is that that should be fairly easy to
> add, and the performance hit would only come during that situation.

Yes, I plan to look into this, and that is the approach I intend to attempt.

I note that your reply did not make it onto the ticket.

Hugo

@khwilliamson
Contributor

This is still a problem in 5.37.12

@demerphq
Collaborator

I think lookaheads have an implicit (?>...) wrapped around them. I am not sure I would consider this a code bug or a doc bug.
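
The analogy with (?>...) can be illustrated with a small sketch. The test string and the `%result` hash are assumptions made for illustration; the third pattern is a reduced variant of the one in the report, not taken from it.

```perl
use strict;
use warnings;

my $s = "aa";

my %result = (
    # Ordinary group: a+ first takes "aa", then gives back one "a"
    # so the trailing "a" can match.
    plain     => ($s =~ /^(a+)a/       ? 1 : 0),
    # Atomic group: a+ keeps "aa" and cannot give anything back.
    atomic    => ($s =~ /^(?>(a+))a/   ? 1 : 0),
    # Lookahead with a capture: if it behaves atomically, $1 stays "aa"
    # and \1 followed by "a" cannot match.
    lookahead => ($s =~ /^(?=(a+))\1a/ ? 1 : 0),
);

print "$_: $result{$_}\n" for sort keys %result;
```

If the lookahead result agrees with the atomic one rather than the plain one, that is consistent with the implicit (?>...) reading.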
