Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/m causing false negative #13682

Closed
p5pRT opened this issue Mar 21, 2014 · 9 comments
Closed

/m causing false negative #13682

p5pRT opened this issue Mar 21, 2014 · 9 comments

Comments

@p5pRT
Copy link

p5pRT commented Mar 21, 2014

Migrated from rt.perl.org#121484 (status was 'resolved')

Searchable as RT121484$

@p5pRT
Copy link
Author

p5pRT commented Mar 21, 2014

From zefram@fysh.org

Created by zefram@fysh.org

$ perl5.19.10 -lwe 'print "wibble" =~ /\Awibble\z/m ? "yes" : "no"'
no
$ perl5.19.9 -lwe 'print "wibble" =~ /\Awibble\z/m ? "yes" : "no"'
yes

Obviously a regression. Regexp matches correctly if /m is omitted,
or if either anchor is omitted, or if any of the letters "ble" in the
pattern are replaced with dots. But replacing "wib" with dots keeps it
falsely non-matching.

This breaks the pod-coverage test for TheSchwartz, which builds up a bunch
of patterns like this to describe exceptions to the coverage requirement.

Perl Info

Flags:
    category=core
    severity=high

Site configuration information for perl 5.19.10:

Configured by root at Fri Mar 21 09:44:41 UTC 2014.

Summary of my perl5 (revision 5 version 19 subversion 10) configuration:
   
  Platform:
    osname=linux, osvers=2.6.32-5-686, archname=i686-linux-64int-ld
    uname='linux beryllium.photobox.priv 2.6.32-5-686 #1 smp fri may 10 08:33:48 utc 2013 i686 gnulinux '
    config_args='-des -Dusedevel -Uversiononly -Duseshrplib -Duse64bitint -Duselongdouble -Uusethreads -Uusemultiplicity -Dprefix=/opt/perl-5.19.10 -Dsiteprefix=/opt/perl-5.19.10 -Dvendorprefix=/opt/perl-5.19.10/vendor -Doptimize=-ggdb -O2 -Dcccdlflags=-fPIC -O2 -pipe'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=undef, uselongdouble=define
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-ggdb -O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.7.2', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long long', ivsize=8, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/i486-linux-gnu/4.7/include-fixed /usr/include/i386-linux-gnu /usr/lib /lib/i386-linux-gnu /lib/../lib /usr/lib/i386-linux-gnu /usr/lib/../lib /lib
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.13.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.13'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/opt/perl-5.19.10/lib/5.19.10/i686-linux-64int-ld/CORE'
    cccdlflags='-fPIC -O2 -pipe', lddlflags='-shared -ggdb -O2 -L/usr/local/lib -fstack-protector'



@INC for perl 5.19.10:
    /opt/perl-5.19.10/lib/site_perl/5.19.10/i686-linux-64int-ld
    /opt/perl-5.19.10/lib/site_perl/5.19.10
    /opt/perl-5.19.10/vendor/lib/vendor_perl/5.19.10/i686-linux-64int-ld
    /opt/perl-5.19.10/vendor/lib/vendor_perl/5.19.10
    /opt/perl-5.19.10/lib/5.19.10/i686-linux-64int-ld
    /opt/perl-5.19.10/lib/5.19.10
    .


Environment for perl 5.19.10:
    HOME=/root
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/root/pub/i686-pc-linux-gnu/bin:/root/pub/common/bin:/usr/bin:/usr/sbin:/bin:/sbin:/usr/local/bin:/usr/local/sbin:/usr/games:/opt/geoip/bin:/opt/httpd/bin:/opt/perl/bin
    PERL_BADLANG (unset)
    SHELL=/usr/bin/zsh

@p5pRT
Copy link
Author

p5pRT commented Mar 21, 2014

From @tomhukins

On Fri, Mar 21, 2014 at 06​:01​:01AM -0700, Zefram wrote​:

$ perl5.19.10 -lwe 'print "wibble" =~ /\Awibble\z/m ? "yes" : "no"'
no
$ perl5.19.9 -lwe 'print "wibble" =~ /\Awibble\z/m ? "yes" : "no"'
yes

I've just bisected this to the following plausible-looking commit​:

d0d4464 is the first bad commit
commit d0d4464
Author​: David Mitchell <davem@​iabyn.com>
Date​: Tue Mar 18 20​:15​:27 2014 +0000

  re_intuit_start()​: reduce scope of /^...$/m test
 
  Intuit has a quick reject test for a fixed pattern that is anchored at
  both ends. For example, with the pattern /^abcd$/, only the exact strings
  "abcd" or "abcd\n" will match; anything else, and the match immediately
  fails.
 
  A fix for [perl #115242] correctly made intuit skip the test in the
  presence of //m, since in this case the $ doesn't necessarily correspond
  to the end of the string.
 
  However, the fix was too wide in scope; it caused //m patterns to skip
  searching for a known string anchored just at the start, as well as one
  anchored at both ends.
 
  With this commit, the following code now runs in a few milliseconds rather
  than a few seconds on my machine​:
 
  $s = "abcdefg" x 1_000_000;
  $s =~ /(?-m​:^)abcX?fg/m for 1..100;

Tom

@p5pRT
Copy link
Author

p5pRT commented Mar 21, 2014

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Mar 21, 2014

From @arc

Zefram <perlbug-followup@​perl.org> wrote​:

$ perl5.19.10 -lwe 'print "wibble" =~ /\Awibble\z/m ? "yes" : "no"'
no
$ perl5.19.9 -lwe 'print "wibble" =~ /\Awibble\z/m ? "yes" : "no"'
yes

This bisects to the following commit​:

commit d0d4464
Author​: David Mitchell <davem@​iabyn.com>
Date​: Tue Mar 18 20​:15​:27 2014 +0000

  re_intuit_start()​: reduce scope of /^...$/m test

  Intuit has a quick reject test for a fixed pattern that is anchored at
  both ends. For example, with the pattern /^abcd$/, only the exact strings
  "abcd" or "abcd\n" will match; anything else, and the match immediately
  fails.

  A fix for [perl #115242] correctly made intuit skip the test in the
  presence of //m, since in this case the $ doesn't necessarily correspond
  to the end of the string.

  However, the fix was too wide in scope; it caused //m patterns to skip
  searching for a known string anchored just at the start, as well as one
  anchored at both ends.

  With this commit, the following code now runs in a few milliseconds rather
  than a few seconds on my machine​:

  $s = "abcdefg" x 1_000_000;
  $s =~ /(?-m​:^)abcX?fg/m for 1..100;

AFAICT from a fairly cursory look, re_intuit_start() seems to be
failing to account for the "\n" at the end of the fixed check
substring.

This is Zefram's test case expressed in the language of t/re/re_tests​:

Inline Patch
diff --git i/t/re/re_tests w/t/re/re_tests
index 7ab7dc3..294032f 100644
--- i/t/re/re_tests
+++ w/t/re/re_tests
@@ -1884,5 +1884,21 @@ A+(*PRUNE)BC(?{})        AAABC   y       $&      AAABC
 [bcd].{2,3}aaaa        XbXaaaaa        y       -       -
 [bcd].{2,3}aaaa        Xb\x{100}aaaaa  y       -       -

+# RT #121484 - /m causing false negative
+'\Awibble\z'm  wibble  y       $&      wibble
+'\A.ibble\z'm  wibble  y       $&      wibble
+'\Aw.bble\z'm  wibble  y       $&      wibble
+'\Awi.ble\z'm  wibble  y       $&      wibble
+'\Awib.le\z'm  wibble  y       $&      wibble
+'\Awibb.e\z'm  wibble  y       $&      wibble
+'\Awibbl.\z'm  wibble  y       $&      wibble
+'\Awibble$'m   wibble  y       $&      wibble
+'\A.ibble$'m   wibble  y       $&      wibble
+'\Aw.bble$'m   wibble  y       $&      wibble
+'\Awi.ble$'m   wibble  y       $&      wibble
+'\Awib.le$'m   wibble  y       $&      wibble
+'\Awibb.e$'m   wibble  y       $&      wibble
+'\Awibbl.$'m   wibble  y       $&      wibble
+
 # Keep these lines at the end of the file
 # vim: softtabstop=0 noexpandtab

-- 

Aaron Crane ** http​://aaroncrane.co.uk/

@p5pRT
Copy link
Author

p5pRT commented Mar 21, 2014

From @demerphq

On 21 March 2014 14​:01, Zefram <perlbug-followup@​perl.org> wrote​:

# New Ticket Created by Zefram
# Please include the string​: [perl #121484]
# in the subject line of all future correspondence about this issue.
# <URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=121484 >

This is a bug report for perl from zefram@​fysh.org,
generated with the help of perlbug 1.40 running under perl 5.19.10.

-----------------------------------------------------------------
[Please describe your issue here]

$ perl5.19.10 -lwe 'print "wibble" =~ /\Awibble\z/m ? "yes" : "no"'
no
$ perl5.19.9 -lwe 'print "wibble" =~ /\Awibble\z/m ? "yes" : "no"'
yes

Obviously a regression. Regexp matches correctly if /m is omitted,
or if either anchor is omitted, or if any of the letters "ble" in the
pattern are replaced with dots. But replacing "wib" with dots keeps it
falsely non-matching.

Why is this a regression? I call this a bug fix. It *should* have
matched with /m. There are no regexp structures affected by /m in this
pattern.

Yves

@p5pRT
Copy link
Author

p5pRT commented Mar 21, 2014

From @demerphq

On 21 March 2014 17​:03, demerphq <demerphq@​gmail.com> wrote​:

On 21 March 2014 14​:01, Zefram <perlbug-followup@​perl.org> wrote​:

# New Ticket Created by Zefram
# Please include the string​: [perl #121484]
# in the subject line of all future correspondence about this issue.
# <URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=121484 >

This is a bug report for perl from zefram@​fysh.org,
generated with the help of perlbug 1.40 running under perl 5.19.10.

-----------------------------------------------------------------
[Please describe your issue here]

$ perl5.19.10 -lwe 'print "wibble" =~ /\Awibble\z/m ? "yes" : "no"'
no
$ perl5.19.9 -lwe 'print "wibble" =~ /\Awibble\z/m ? "yes" : "no"'
yes

Obviously a regression. Regexp matches correctly if /m is omitted,
or if either anchor is omitted, or if any of the letters "ble" in the
pattern are replaced with dots. But replacing "wib" with dots keeps it
falsely non-matching.

Why is this a regression? I call this a bug fix. It *should* have
matched with /m. There are no regexp structures affected by /m in this
pattern.

I cant read. Forget this.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Mar 21, 2014

From @andk

Aaron Crane <arc@​cpan.org> writes​:

This bisects to the following commit​:

commit d0d4464

Affected on the CPAN​:

FANGLY/Getopt-Euclid-0.4.4.tar.gz
JKEGL/Marpa-R2-2.082000.tar.gz
THALJEF/Perl-Critic-1.121.tar.gz
DCONWAY/Smart-Comments-1.000005.tar.gz

--
andreas

@p5pRT
Copy link
Author

p5pRT commented Mar 24, 2014

From @iabyn

On Fri, Mar 21, 2014 at 02​:16​:21PM +0000, Tom Hukins wrote​:

On Fri, Mar 21, 2014 at 06​:01​:01AM -0700, Zefram wrote​:

$ perl5.19.10 -lwe 'print "wibble" =~ /\Awibble\z/m ? "yes" : "no"'
no
$ perl5.19.9 -lwe 'print "wibble" =~ /\Awibble\z/m ? "yes" : "no"'
yes

I've just bisected this to the following plausible-looking commit​:

Thanks. Now fixed with

commit 7742aa6
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Mon Mar 24 15​:36​:32 2014 +0000
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Mon Mar 24 15​:36​:32 2014 +0000

  [perl #121484] /m causing false negative
 
  My recent commit d0d4464 in re_intuit_start() reduced the
  scope of a 'skip if multiline' check, so that certain optimisations
  weren't being unnecessarily skipped. Unfortunately it didn't reduce the
  scope enough, so a vital slen-- was being skipped in the
  SvTAIL-but-don't-fail case.
 
  This commit just moves the !multiline test further down, and updates the
  commentary and condition formatting a bit.

--
The optimist believes that he lives in the best of all possible worlds.
As does the pessimist.

@p5pRT p5pRT closed this as completed Mar 24, 2014
@p5pRT
Copy link
Author

p5pRT commented Mar 24, 2014

@iabyn - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant