Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bleadperl v5.15.7-414-g2e2b257 breaks DAGOLDEN/Capture-Tiny-0.16.tar.gz #11963

Closed
p5pRT opened this issue Feb 19, 2012 · 17 comments
Closed

Bleadperl v5.15.7-414-g2e2b257 breaks DAGOLDEN/Capture-Tiny-0.16.tar.gz #11963

p5pRT opened this issue Feb 19, 2012 · 17 comments

Comments

@p5pRT
Copy link

p5pRT commented Feb 19, 2012

Migrated from rt.perl.org#111070 (status was 'resolved')

Searchable as RT111070$

@p5pRT
Copy link
Author

p5pRT commented Feb 19, 2012

From @andk

git bisect


commit 2e2b257
Author​: Karl Williamson <public@​khwilliamson.com>
Date​: Wed Feb 15 11​:31​:27 2012 -0700

  perl #77654​: quotemeta quotes non-ASCII consistently

diagnostics


Test Summary Report


t/08-stdin-closed.t (Wstat​: 46336 Tests​: 872 Failed​: 181)
  Failed tests​: 346-347, 350-351, 354-367, 380-381, 384-385
  388-389, 406-407, 410-411, 414-427, 440-441
  444-445, 448-449, 463, 466, 469-478, 488-489
  491-492, 494-495, 508, 511, 514-523, 533-534
  536-537, 539-540, 560-561, 563-564, 566-567
  578-579, 581-582, 584-585, 605-606, 608-609
  611-612, 623-624, 626-627, 629-630, 641
  643-644, 646-647, 649-650, 652-653, 655-656
  658, 686, 688-689, 691-692, 694-695, 697-698
  700-701, 703, 738, 740, 743, 745, 748, 750
  752-755, 757-760, 762-765, 782, 784, 787
  789, 792, 794, 813, 815, 818, 820, 823
  825, 827-830, 832-835, 837-840, 857, 859
  862, 864, 867, 869, 872
  Non-zero exit status​: 181

perl -V


Summary of my perl5 (revision 5 version 15 subversion 7) configuration​:
  Commit id​: 2e2b257
  Platform​:
  osname=linux, osvers=2.6.32-5-amd64, archname=x86_64-linux-ld
  uname='linux k83 2.6.32-5-amd64 #1 smp thu nov 3 03​:41​:26 utc 2011 x86_64 gnulinux '
  config_args='-Dprefix=/home/src/perl/repoperls/installed-perls/perl/v5.15.7-414-g2e2b257/127e -Dmyhostname=k83 -Dinstallusrbinperl=n -Uversiononly -Dusedevel -des -Ui_db -Uuseithreads -Duselongdouble -DDEBUGGING=-g'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=undef, usemultiplicity=undef
  useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
  use64bitint=define, use64bitall=define, uselongdouble=define
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize='-O2 -g',
  cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.4.5', gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=8, nvtype='long double', nvsize=16, Off_t='off_t', lseeksize=8
  alignbytes=16, prototype=define
  Linker and Libraries​:
  ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /lib/../lib /usr/lib/../lib /lib /usr/lib /lib64 /usr/lib64
  libs=-lnsl -ldb -ldl -lm -lcrypt -lutil -lc
  perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
  libc=/lib/libc-2.11.3.so, so=so, useshrplib=false, libperl=libperl.a
  gnulibc_version='2.11.3'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
  cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl)​:
  Compile-time options​: HAS_TIMES PERLIO_LAYERS PERL_DONT_CREATE_GVSV
  PERL_MALLOC_WRAP PERL_PRESERVE_IVUV PERL_USE_DEVEL
  USE_64_BIT_ALL USE_64_BIT_INT USE_LARGE_FILES
  USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE
  USE_LOCALE_NUMERIC USE_LONG_DOUBLE USE_PERLIO
  USE_PERL_ATOF
  Built under linux
  Compiled at Feb 19 2012 07​:19​:36
  @​INC​:
  /home/src/perl/repoperls/installed-perls/perl/v5.15.7-414-g2e2b257/127e/lib/site_perl/5.15.7/x86_64-linux-ld
  /home/src/perl/repoperls/installed-perls/perl/v5.15.7-414-g2e2b257/127e/lib/site_perl/5.15.7
  /home/src/perl/repoperls/installed-perls/perl/v5.15.7-414-g2e2b257/127e/lib/5.15.7/x86_64-linux-ld
  /home/src/perl/repoperls/installed-perls/perl/v5.15.7-414-g2e2b257/127e/lib/5.15.7
  .

--
andreas

@p5pRT
Copy link
Author

p5pRT commented Feb 19, 2012

From @khwilliamson

On 02/18/2012 11​:40 PM, (Andreas J. Koenig) (via RT) wrote​:

# New Ticket Created by (Andreas J. Koenig)
# Please include the string​: [perl #111070]
# in the subject line of all future correspondence about this issue.
#<URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=111070>

git bisect
----------
commit 2e2b257
Author​: Karl Williamson<public@​khwilliamson.com>
Date​: Wed Feb 15 11​:31​:27 2012 -0700

 perl \#77654&#8203;: quotemeta quotes non\-ASCII consistently

diagnostics
-----------
Test Summary Report
-------------------
t/08-stdin-closed.t (Wstat​: 46336 Tests​: 872 Failed​: 181)
Failed tests​: 346-347, 350-351, 354-367, 380-381, 384-385
388-389, 406-407, 410-411, 414-427, 440-441
444-445, 448-449, 463, 466, 469-478, 488-489
491-492, 494-495, 508, 511, 514-523, 533-534
536-537, 539-540, 560-561, 563-564, 566-567
578-579, 581-582, 584-585, 605-606, 608-609
611-612, 623-624, 626-627, 629-630, 641
643-644, 646-647, 649-650, 652-653, 655-656
658, 686, 688-689, 691-692, 694-695, 697-698
700-701, 703, 738, 740, 743, 745, 748, 750
752-755, 757-760, 762-765, 782, 784, 787
789, 792, 794, 813, 815, 818, 820, 823
825, 827-830, 832-835, 837-840, 857, 859
862, 864, 867, 869, 872
Non-zero exit status​: 181

On 02/08/2012 04​:36 AM, Nicholas Clark wrote​:
"backwards" compatible or "bugwards" compatible? I'm finding it hard to
think of a use case where it's going to make a difference whether
quotemeta("£") is "£" or "\£", other than golden results in tests.

When Nicholas said that, I thought he was being metaphorical in the use
of the term "golden". I didn't realize he was being literal and
prescient in referring to "D. A. Golden's" tests. ;-)

And the character that is failing these tests isn't the Pound Sterling
sign, but WHITE SMILING FACE.

The offending blead commit changed things so that WHITE SMILING FACE is
quoted; the test is assuming it isn't quoted. No real code is affected.
  I don't believe we should back out the commit for this type of
failure, and hence I believe the test should be changed. By adding a
'\\' to the expected string, all tests passed (only one line is
affected; it's in a loop which is why 181 tests failed from just it.)

@p5pRT
Copy link
Author

p5pRT commented Feb 19, 2012

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Feb 20, 2012

From @xdg

The errors occur in a test for the case of STDIN being closed before
capture, but other tests use the same Unicode test string and don't seem
to fail. And many of the errors seem to occur with string that do not
contain Unicode.

I'm perplexed as to how this commit could be responsible for breaking
what appears to be breaking and yet not break what appears to still be
working.

@p5pRT
Copy link
Author

p5pRT commented Feb 21, 2012

From @khwilliamson

On 02/19/2012 09​:11 PM, David Golden via RT wrote​:

The errors occur in a test for the case of STDIN being closed before
capture, but other tests use the same Unicode test string and don't seem
to fail. And many of the errors seem to occur with string that do not
contain Unicode.

I'm perplexed as to how this commit could be responsible for breaking
what appears to be breaking and yet not break what appears to still be
working.

I agree that my analysis was superficial. There is something deeper
going on. But I know not what.

What the quotemeta commit did is essentially extend quoting to non-ASCII
code points in the same spirit as it works on ASCII. That means that
characters matching \w aren't quoted, but \W are. Prior to the commit,
no above-Latin1 characters were quoted. (To say that the patch quotes
\W characters is an over-generalization, as there are edge-case
exceptions, but that is the basic idea behind it.)

(In UTF-8 strings, all the non-ASCII Latin1 chars were treated the same
as above Latin-1 ones, and not quoted. In non-UTF-8 strings, they were
all treated as \W, and were quoted; hence different things happened
depending on UTF8ness, so this was a manifestation of the Unicode bug.
The commit solves the problem of the non-ASCII Latin1 characters by
having feature 'unicode_strings' apply to quotemeta. Outside
unicode_strings, they are all treated as \W. Inside unicode_strings,
the ones that match \W (with some edge case exceptions) are quoted; the
rest are not.)

The line that was failing in Capture​::Tiny's tests as a result of the
quotemeta commit is this hash initialization in t/lib/Cases.pm​:

  unicode => 'Hi! \x{263A}\n'

U+263A is "WHITE SMILING FACE", which is a \W character, and so should
be quoted under the new scheme, but not the old one. So, I just changed
that line to

  unicode => 'Hi! \\\x{263A}\n'

And all tests passed. I did not understand precisely why, but I don't
understand the module.

The module author asked me to come up with a patch that would work prior
to 5.16 as well. I thought that should be easy, just choose a \w
character instead that doesn't get quoted, and so it should be the same
in any release. Hence, I tried

  unicode => 'Hi! \x{100}\n'

\x{100} should not be quoted. But things still fail, including tests
that are testing another entry in the has 'Hello World' instead of this
one. I tried these under 5.14.0, and couldn't get it to fail.

pp_quotemeta appears to allocate enough space to allow every character
to be escaped. valgrind could be tried; I don't have time just now.
Another possibility is that the conversion to utf8 whenever the
single-quoted string gets converted is flawed.

@p5pRT
Copy link
Author

p5pRT commented Feb 21, 2012

From @xdg

More data, but no answers​: I have confirmed that simply removing "close
STDIN" is sufficient to cause tests to pass without changing the Unicode
test string.

I.e. this change is sufficient to pass tests​:

  diff --git a/t/08-stdin-closed.t b/t/08-stdin-closed.t
  index 81fd6dc..9b272b2 100644
  --- a/t/08-stdin-closed.t
  +++ b/t/08-stdin-closed.t
  @​@​ -20,7 +20,7 @​@​ my $builder = Test​::More->builder;
  binmode($builder-&gt;failure_output, '​:utf8') if $] >= 5.008;
 
  save_std(qw/stdin/);
  -ok( close STDIN, "closed STDIN" );
  +ok( 1, "closed STDIN" );
 
  my $fd = next_fd;

Capture​::Tiny does temporarily reopen a closed STDIN while setting up
capture handles, so that change does cause other code to run in
Capture​::Tiny, but I can't fathom what in 2e2b257 would cause this sort
of failure when closing STDIN.

@p5pRT
Copy link
Author

p5pRT commented Feb 21, 2012

From @xdg

Next "clue"​: with Capture​::Tiny debugging output uncommented and
enabled, I capture debugging output from t/08-stdin-closed.t, once from
adfec83 and once from
2e2b257.

Diff'ing the two and ignoring refaddr changes, somewhere in the middle I
found this​:

-# open STDIN, </dev/null as 0
-# proxied STDIN as 0
-# open GEN24, <&=STDIN as 0
-# proxy std​: stdin GLOB(0x2213ce8)
+# open STDIN, </dev/null as 4
+# proxied STDIN as 4
+# open GEN24, <&=STDIN as 4
+# proxy std​: stdin GLOB(0x20eacf8)

Note that when STDIN is closed (i.e. when C<! defined fileno STDIN>),
Capture​::Tiny opens STDIN to /dev/null to ensure FD 0 is reserved before
doing other handle duping for captures. In the good blead commit, STDIN
is correctly opened to FD 0, but in the broken blead commit, STDIN is
opened to FD 4 instead.

I have not been able to replicate this outside Capture​::Tiny yet, however.

@p5pRT
Copy link
Author

p5pRT commented Feb 22, 2012

From @khwilliamson

On 02/21/2012 01​:01 PM, David Golden via RT wrote​:

Next "clue"​: with Capture​::Tiny debugging output uncommented and
enabled, I capture debugging output from t/08-stdin-closed.t, once from
adfec83 and once from
2e2b257.

Diff'ing the two and ignoring refaddr changes, somewhere in the middle I
found this​:

-# open STDIN,</dev/null as 0
-# proxied STDIN as 0
-# open GEN24,<&=STDIN as 0
-# proxy std​: stdin GLOB(0x2213ce8)
+# open STDIN,</dev/null as 4
+# proxied STDIN as 4
+# open GEN24,<&=STDIN as 4
+# proxy std​: stdin GLOB(0x20eacf8)

Note that when STDIN is closed (i.e. when C<! defined fileno STDIN>),
Capture​::Tiny opens STDIN to /dev/null to ensure FD 0 is reserved before
doing other handle duping for captures. In the good blead commit, STDIN
is correctly opened to FD 0, but in the broken blead commit, STDIN is
opened to FD 4 instead.

I have not been able to replicate this outside Capture​::Tiny yet, however.

---
via perlbug​: queue​: perl5 status​: open
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=111070

I did verify that there were no errors found by

valgrind perl -Ilib -E "say quotemeta 'Hi! \x{263A}\n'"

However, the result was unexpected, as I forgot that quotemeta on a \x{}
doesn't do what I would think​:

Hi\!\ \\x\{263A\}\\n

It doesn't treat it as a sequence, but individual characters. Thus
there is no non-ASCII character here as far as quotemeta is concerned,
and there should be no change in behavior.

However, this is a single-quoted string. I don't know how it actually
gets expanded out so that the \x{263A} really is supposed to mean WHITE
SMILING FACE. Perhaps this is an issue between eval and evalbytes?

@p5pRT
Copy link
Author

p5pRT commented Feb 22, 2012

From @xdg

n Tue, Feb 21, 2012 at 7​:55 PM, Karl Williamson
<public@​khwilliamson.com> wrote​:

I did verify that there were no errors found by

valgrind perl -Ilib -E "say quotemeta 'Hi! \x{263A}\n'"

FWIW, the text in the Capture​::Tiny tests gets run through eval and qq
in a way more equivalent to this​:

$ /opt/perl/perl5.15.7-afterbreak/bin/perl -C -wE 'say quotemeta eval
"qq{Hi! \x{263A}\n}"'
Hi\!\ \☺\

(There are various reasons why the tests work in that convoluted
manner, mostly having to do with printing the same test string through
a subroutine or printing it as part of an argument to an external
program.)

-- David

@p5pRT
Copy link
Author

p5pRT commented Feb 22, 2012

From @khwilliamson

On 02/21/2012 08​:16 PM, David Golden wrote​:

n Tue, Feb 21, 2012 at 7​:55 PM, Karl Williamson
<public@​khwilliamson.com> wrote​:

I did verify that there were no errors found by

valgrind perl -Ilib -E "say quotemeta 'Hi! \x{263A}\n'"

FWIW, the text in the Capture​::Tiny tests gets run through eval and qq
in a way more equivalent to this​:

$ /opt/perl/perl5.15.7-afterbreak/bin/perl -C -wE 'say quotemeta eval
"qq{Hi! \x{263A}\n}"'
Hi\!\ \☺\

(There are various reasons why the tests work in that convoluted
manner, mostly having to do with printing the same test string through
a subroutine or printing it as part of an argument to an external
program.)

-- David

So, I later tested

valgrind ./perl -Ilib -E 'my $x = "Hi! \x{263A}\n"; say quotemeta $x'

And it works correctly, valgrind gave no buffer read/write errors

@p5pRT
Copy link
Author

p5pRT commented Feb 22, 2012

From @xdg

I narrowed down the source of error. This change (disabling the
quoting) is enough to make the Capture​::Tiny test pass​:

Inline Patch
diff --git a/pp.c b/pp.c
index 93e59fa..e8a3db6 100644
--- a/pp.c
+++ b/pp.c
@@ -4102,7 +4102,7 @@ PP(pp_quotemeta)
            to_quote = TRUE;
            }
        }
-       else if (_is_utf8_quotemeta(s)) {
+       else if (0) {
            to_quote = TRUE;
        }

But this change does not fix it:
Inline Patch
diff --git a/pp.c b/pp.c
index 93e59fa..5e370c4 100644
--- a/pp.c
+++ b/pp.c
@@ -4106,7 +4106,7 @@ PP(pp_quotemeta)
            to_quote = TRUE;
        }
 
-       if (to_quote) {
+       if (0) {
            *d++ = '\\';
        }
            if (ulen > len)

So it's something in _is_utf8_quotemeta.

-- David

@p5pRT
Copy link
Author

p5pRT commented Feb 22, 2012

From @xdg

On Tue Feb 21 21​:14​:22 2012, dagolden@​cpan.org wrote​:

So it's something in _is_utf8_quotemeta.

My hypothesis/wild-guess is that something in the on-demand swash
loading is opening a file descriptor and not closing it properly, which
is why after the patch, Capture​::Tiny proxies STDIN on FD 4 instead of
FD 0. And then when Capture​::Tiny starts tee-ing, the FDs are messed up
and the captured output is not as expected.

-- David

@p5pRT
Copy link
Author

p5pRT commented Feb 22, 2012

From @xdg

More triangulation​: in the failing Capture​::Tiny test file, calling
"quotemeta" on a wide character before the tee tests causes tests to
pass. This is true regardless of whether it is called before or after
STDIN is closed. This seems to confirm that something in the call to
quotemeta is leaking a filehandle, but only if filehandles have already
been dup'd for capture.

It *does* give a workaround to get Capture​::Tiny passing again on blead,
however, if a more general fix is not evident.

Inline Patch
diff --git a/t/08-stdin-closed.t b/t/08-stdin-closed.t
index 81fd6dc..c0606aa 100644
--- a/t/08-stdin-closed.t
+++ b/t/08-stdin-closed.t
@@ -19,6 +19,8 @@ plan 'no_plan';
 my $builder = Test::More->builder;
 binmode($builder->failure_output, ':utf8') if $] >= 5.008;
 
+my $qm = quotemeta("\x{263a}");
+
 save_std(qw/stdin/);
 ok( close STDIN, "closed STDIN" );

@p5pRT
Copy link
Author

p5pRT commented Feb 22, 2012

From @khwilliamson

On 02/21/2012 10​:42 PM, David Golden via RT wrote​:

On Tue Feb 21 21​:14​:22 2012, dagolden@​cpan.org wrote​:

So it's something in _is_utf8_quotemeta.

My hypothesis/wild-guess is that something in the on-demand swash
loading is opening a file descriptor and not closing it properly, which
is why after the patch, Capture​::Tiny proxies STDIN on FD 4 instead of
FD 0. And then when Capture​::Tiny starts tee-ing, the FDs are messed up
and the captured output is not as expected.

-- David

---
via perlbug​: queue​: perl5 status​: open
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=111070

This makes some sense. The code that does that load is in
utf8_heavy.pl. Here it is​:

  local $@​;
  local $!;
  $list = do $file; die $@​ if $@​;

where $file has been set up to point to the file that contains the
quotemeta information. I would guess that if there were bugs in 'do'
with regard to this, that they would have shown up elsewhere and been
fixed. Changing DEBUG in utf8_heavy.pl will print debugging info to STDERR.

However, that jogged my memory with these lines from the pod of charnames​:

"Since evaluation of the translation function (see L</CUSTOM
TRANSLATORS>) happens in the middle of compilation (of a string
literal), the translation function should not do any C<eval>s or
C<require>s. This restriction should be lifted (but is low priority) in
a future version of Perl."

I don't know enough about this area of Perl to know if that statement is
at all relevant or not. I tend to think not.

@p5pRT
Copy link
Author

p5pRT commented Feb 22, 2012

From @xdg

Issues were determined on IRC to be related to an existing bug​:

https://rt-archive.perl.org/perl5/Ticket/Display.html?id=37033

I'm closing this ticket as Capture​::Tiny 0.17 has a passable workaround
for now.

@p5pRT
Copy link
Author

p5pRT commented Feb 22, 2012

From [Unknown Contact. See original ticket]

Issues were determined on IRC to be related to an existing bug​:

https://rt-archive.perl.org/perl5/Ticket/Display.html?id=37033

I'm closing this ticket as Capture​::Tiny 0.17 has a passable workaround
for now.

@p5pRT
Copy link
Author

p5pRT commented Feb 22, 2012

@xdg - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant