Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bleadperl v5.25.5-100-g2b5e7bc2e6 breaks JDDPAUSE/re-engine-GNU-0.021.tar.gz #15911

Closed
p5pRT opened this issue Mar 5, 2017 · 13 comments
Closed

Comments

@p5pRT
Copy link

p5pRT commented Mar 5, 2017

Migrated from rt.perl.org#130921 (status was 'resolved')

Searchable as RT130921$

@p5pRT
Copy link
Author

p5pRT commented Mar 5, 2017

From @andk

rt-cpan


https://rt.cpan.org/Ticket/Display.html?id=120498

bisect


commit 2b5e7bc
Author​: Karl Williamson <khw@​cpan.org>
Date​: Wed Oct 5 19​:09​:02 2016 -0600

  utf8n_to_uvchr()​: Note multiple malformations

perl -V


% /home/sand/src/perl/repoperls/installed-perls/host/k93msid/v5.25.5-100-g2b5e7bc2e6/f11c/bin/perl -V
Summary of my perl5 (revision 5 version 25 subversion 6) configuration​:
  Derived from​: 2b5e7bc
  Platform​:
  osname=linux
  osvers=4.8.0-2-amd64
  archname=x86_64-linux-thread-multi
  uname='linux k93msid 4.8.0-2-amd64 #1 smp debian 4.8.11-1 (2016-12-02) x86_64 gnulinux '
  config_args='-Dprefix=/home/sand/src/perl/repoperls/installed-perls/host/k93msid/v5.25.5-100-g2b5e7bc2e6/f11c -Dmyhostname=k93msid -Dinstallusrbinperl=n -Uversiononly -Dusedevel -des -Ui_db -Dlibswanted=cl pthread socket inet nsl gdbm dbm malloc dl ld sun m crypt sec util c cposix posix ucb BSD gdbm_compat -Duseithreads -Uuselongdouble -DDEBUGGING=-g'
  hint=recommended
  useposix=true
  d_sigaction=define
  useithreads=define
  usemultiplicity=define
  use64bitint=define
  use64bitall=define
  uselongdouble=undef
  usemymalloc=n
  bincompat5005=undef
  Compiler​:
  cc='cc'
  ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
  optimize='-O2 -g'
  cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
  ccversion=''
  gccversion='6.2.1 20161124'
  gccosandvers=''
  intsize=4
  longsize=8
  ptrsize=8
  doublesize=8
  byteorder=12345678
  doublekind=3
  d_longlong=define
  longlongsize=8
  d_longdbl=define
  longdblsize=16
  longdblkind=3
  ivtype='long'
  ivsize=8
  nvtype='double'
  nvsize=8
  Off_t='off_t'
  lseeksize=8
  alignbytes=8
  prototype=define
  Linker and Libraries​:
  ld='cc'
  ldflags =' -fstack-protector-strong -L/usr/local/lib'
  libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/6/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
  libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
  perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
  libc=libc-2.24.so
  so=so
  useshrplib=false
  libperl=libperl.a
  gnulibc_version='2.24'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs
  dlext=so
  d_dlsymun=undef
  ccdlflags='-Wl,-E'
  cccdlflags='-fPIC'
  lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl)​:
  Compile-time options​:
  HAS_TIMES
  MULTIPLICITY
  PERLIO_LAYERS
  PERL_COPY_ON_WRITE
  PERL_DONT_CREATE_GVSV
  PERL_HASH_FUNC_ONE_AT_A_TIME_HARD
  PERL_IMPLICIT_CONTEXT
  PERL_MALLOC_WRAP
  PERL_OP_PARENT
  PERL_PRESERVE_IVUV
  PERL_USE_DEVEL
  USE_64_BIT_ALL
  USE_64_BIT_INT
  USE_ITHREADS
  USE_LARGE_FILES
  USE_LOCALE
  USE_LOCALE_COLLATE
  USE_LOCALE_CTYPE
  USE_LOCALE_NUMERIC
  USE_LOCALE_TIME
  USE_PERLIO
  USE_PERL_ATOF
  USE_REENTRANT_API
  Locally applied patches​:
  uncommitted-changes
  Built under linux
  Compiled at Mar 4 2017 17​:39​:36
  @​INC​:
  /home/sand/src/perl/repoperls/installed-perls/host/k93msid/v5.25.5-100-g2b5e7bc2e6/f11c/lib/site_perl/5.25.6/x86_64-linux-thread-multi
  /home/sand/src/perl/repoperls/installed-perls/host/k93msid/v5.25.5-100-g2b5e7bc2e6/f11c/lib/site_perl/5.25.6
  /home/sand/src/perl/repoperls/installed-perls/host/k93msid/v5.25.5-100-g2b5e7bc2e6/f11c/lib/5.25.6/x86_64-linux-thread-multi
  /home/sand/src/perl/repoperls/installed-perls/host/k93msid/v5.25.5-100-g2b5e7bc2e6/f11c/lib/5.25.6
  .

--
andreas

@p5pRT
Copy link
Author

p5pRT commented Mar 6, 2017

From @iabyn

On Sat, Mar 04, 2017 at 09​:49​:12PM -0800, Andreas J. Koenig via RT wrote​:

rt-cpan
-------
https://rt.cpan.org/Ticket/Display.html?id=120498

bisect
------
commit 2b5e7bc
Author​: Karl Williamson <khw@​cpan.org>
Date​: Wed Oct 5 19​:09​:02 2016 -0600

utf8n\_to\_uvchr\(\)&#8203;: Note multiple malformations

Hi Karl,
that commit added this to Perl_utf8n_to_uvchr_error​:

  Newx(adjusted_s0, OFFUNISKIP(min_uv) + 1, U8);
  SAVEFREEPV((U8 *) adjusted_s0);

The problem is that re-engine-GNU is calling utf8n_to_uvchr() during
regex cloning which is called during thread cloning. At this point in
cloning a new thread, a savestack hasn't been allocated for the new
thread, so PL_savestack is NULL< so the SAVEFREEPV() crashes.

AFAIKT, the Newx is just just being used to create a temporary buffer big
enough to hold one utf8 char; can't you instead just use a small buffer
on the stack, e.g.

  Perl_utf8n_to_uvchr_error(...)
  {
  ...
  U8 adjusted_s0_buf[UTF8_MAXBYTES];
  ...
  if (...)
  ...
  adjusted_s0 = adjusted_s0_buf;
  ...
  }

--
Never do today what you can put off till tomorrow.

@p5pRT
Copy link
Author

p5pRT commented Mar 6, 2017

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Mar 6, 2017

From @khwilliamson

On 03/06/2017 05​:39 AM, Dave Mitchell wrote​:

On Sat, Mar 04, 2017 at 09​:49​:12PM -0800, Andreas J. Koenig via RT wrote​:

rt-cpan
-------
https://rt.cpan.org/Ticket/Display.html?id=120498

bisect
------
commit 2b5e7bc
Author​: Karl Williamson <khw@​cpan.org>
Date​: Wed Oct 5 19​:09​:02 2016 -0600

utf8n\_to\_uvchr\(\)&#8203;: Note multiple malformations

Hi Karl,
that commit added this to Perl_utf8n_to_uvchr_error​:

Newx\(adjusted\_s0\, OFFUNISKIP\(min\_uv\) \+ 1\, U8\);
SAVEFREEPV\(\(U8 \*\) adjusted\_s0\);

The problem is that re-engine-GNU is calling utf8n_to_uvchr() during
regex cloning which is called during thread cloning. At this point in
cloning a new thread, a savestack hasn't been allocated for the new
thread, so PL_savestack is NULL< so the SAVEFREEPV() crashes.

AFAIKT, the Newx is just just being used to create a temporary buffer big
enough to hold one utf8 char; can't you instead just use a small buffer
on the stack, e.g.

Perl\_utf8n\_to\_uvchr\_error\(\.\.\.\)
\{
    \.\.\.
    U8 adjusted\_s0\_buf\[UTF8\_MAXBYTES\];
    \.\.\.
    if \(\.\.\.\)
        \.\.\.
        adjusted\_s0 = adjusted\_s0\_buf;
        \.\.\.
    \}

I agree, and I probably should have done it your way to begin with.
Attached is a patch to this effect, though I changed the buffer name to
be more general in case it might be of future use elsewhere. Since you
have already got things set please verify that this fixes the problem.
We will need to get approval before applying it for 5.26. Note that
this is a regression in 5.25.

@p5pRT
Copy link
Author

p5pRT commented Mar 6, 2017

From @khwilliamson

0002-PATCH-perl-130921-Don-t-use-Newx-in-decoding-UTF-8.patch
From bfb3432aec08361b344a6cdf1b80160bf53e981f Mon Sep 17 00:00:00 2001
From: Karl Williamson <khw@cpan.org>
Date: Mon, 6 Mar 2017 12:25:21 -0700
Subject: [PATCH 2/2] PATCH: [perl #130921]: Don't use Newx in decoding UTF-8

The bottom level UTF-8 decoding routine can be used during periods when
using Newx is prohibited, as diagnosed by Dave Mitchell for this ticket.
The particular use of Newx was unnecessary, as it is just large enough
to hold a single character, and that can be done by an automatic
variable on the C stack, and probably more efficiently.
---
 utf8.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/utf8.c b/utf8.c
index 89c8413..4949bf6 100644
--- a/utf8.c
+++ b/utf8.c
@@ -1079,6 +1079,8 @@ Perl_utf8n_to_uvchr_error(pTHX_ const U8 *s,
     U8 * adjusted_s0 = (U8 *) s0;
     U8 * adjusted_send = NULL;  /* (Initialized to silence compilers' wrong
                                    warning) */
+    U8 temp_char_buf[UTF8_MAXBYTES + 1]; /* Used to avoid a Newx in this
+                                            routine; see [perl #130921] */
     UV uv_so_far = 0;   /* (Initialized to silence compilers' wrong warning) */
 
     PERL_ARGS_ASSERT_UTF8N_TO_UVCHR_ERROR;
@@ -1245,10 +1247,7 @@ Perl_utf8n_to_uvchr_error(pTHX_ const U8 *s,
                                      I8_TO_NATIVE_UTF8(UTF_CONTINUATION_MARK));
             }
 
-            Newx(adjusted_s0, OFFUNISKIP(min_uv) + 1, U8);
-            SAVEFREEPV((U8 *) adjusted_s0);    /* Needed because we may not get
-                                                  to free it ourselves if
-                                                  warnings are made fatal */
+            adjusted_s0 = temp_char_buf;
             adjusted_send = uvoffuni_to_utf8_flags(adjusted_s0, min_uv, 0);
         }
     }
-- 
2.7.4

@p5pRT
Copy link
Author

p5pRT commented Mar 6, 2017

From @iabyn

On Mon, Mar 06, 2017 at 12​:31​:09PM -0700, Karl Williamson wrote​:

I agree, and I probably should have done it your way to begin with. Attached
is a patch to this effect, though I changed the buffer name to be more
general in case it might be of future use elsewhere. Since you have already
got things set please verify that this fixes the problem. We will need to
get approval before applying it for 5.26. Note that this is a regression in
5.25.

It fails in a different way now - the test script no longer crashes while
creating the thread, but instead assert fails later when running the
thread, while trying to set a regex's capture string after a successful
match.

I strongly suspect its unrelated to to the utf8n_to_uvch() issue/regression.
http​://matrix.cpantesters.org/?dist=re-engine-GNU shows it failing on
threaded builds since 5.25.5 - so it's probably a different regression we
need to look at.

re-engine-GNU appears to be using the non-documented, non-API function
sv_setsv_cow() to set regex->saved_copy from an SV which isn't cowable​:

SV = PV(0x18b03b8) at 0x18acc20
  REFCNT = 1
  FLAGS = (PADSTALE,POK,READONLY,PROTECT,pPOK,UTF8)
  PV = 0x18b43e8 "\341\200\200\341\200\273\341\200\275\341\200\224\341\200\272\341\200\257\341\200\225\341\200\272xy"\0 [UTF8 "\x{1000}\x{103b}\x{103d}\x{1014}\x{103a}\x{102f}\x{1015}\x{103a}xy"]
  CUR = 26
  LEN = 28

I don't know yet whether re-engine-GNU or sv_setsv_cow() has unrealistic
expectations. I'll look further tomorrow.

--
The crew of the Enterprise encounter an alien life form which is
surprisingly neither humanoid nor made from pure energy.
  -- Things That Never Happen in "Star Trek" #22

@p5pRT
Copy link
Author

p5pRT commented Mar 7, 2017

From @iabyn

On Mon, Mar 06, 2017 at 08​:33​:49PM +0000, Dave Mitchell wrote​:

On Mon, Mar 06, 2017 at 12​:31​:09PM -0700, Karl Williamson wrote​:

I agree, and I probably should have done it your way to begin with. Attached
is a patch to this effect, though I changed the buffer name to be more
general in case it might be of future use elsewhere. Since you have already
got things set please verify that this fixes the problem. We will need to
get approval before applying it for 5.26. Note that this is a regression in
5.25.

It fails in a different way now - the test script no longer crashes while
creating the thread, but instead assert fails later when running the
thread, while trying to set a regex's capture string after a successful
match.

So I think your fix is good to apply to blead.

re-engine-GNU appears to be using the non-documented, non-API function
sv_setsv_cow() to set regex->saved_copy from an SV which isn't cowable​:

SV = PV(0x18b03b8) at 0x18acc20
REFCNT = 1
FLAGS = (PADSTALE,POK,READONLY,PROTECT,pPOK,UTF8)
PV = 0x18b43e8 "\341\200\200\341\200\273\341\200\275\341\200\224\341\200\272\341\200\257\341\200\225\341\200\272xy"\0 [UTF8 "\x{1000}\x{103b}\x{103d}\x{1014}\x{103a}\x{102f}\x{1015}\x{103a}xy"]
CUR = 26
LEN = 28

I don't know yet whether re-engine-GNU or sv_setsv_cow() has unrealistic
expectations. I'll look further tomorrow.

It fails as far back as at least 5.24.0 on threaded builds with debugging
enabled. The function GNU_exec_set_capture_string() in GNU.xs
is trying to make a copy of the string being matched, to be stored in the
regex's saved_copy field. It seems to be cargo-culting code similar to
that in perl core's S_reg_set_capture_string(), but the GNU.xs version
ends up trying to call sv_setsv_cow() with a src SV which is *not*
cowable.

It can be reduced to

  use threads;
  #use re​::engine​::GNU;

  sub f {
  #use re​::engine​::GNU;
  "x" =~ /x/;
  }
  threads->new(\&f)->join;
  #f();

when the match is run within a child thread, the const SV holding "x" gets
cloned and loses its COW flag and so ends up as shown above, i.e.

  FLAGS = (PADSTALE,POK,READONLY,PROTECT,pPOK,UTF8)

This is not COWable, so sv_setsv_cow() triggers an assertion.

--
Art is anything that has a label (especially if the label is "untitled 1")

@p5pRT
Copy link
Author

p5pRT commented Mar 7, 2017

From @khwilliamson

On 03/07/2017 04​:15 AM, Dave Mitchell wrote​:

So I think your fix is good to apply to blead.

Now done, as e9f2c44 with the
pumpking's approval on #irc.

Aaron Crane had an interesting observation. Since this was a regression
in 5.25 and if we didn't fix it now, it would require 3 committers to
vote for fixing it in 5.26.1. He added his vote to yours and mine to
come up with the requisite 3. We might as well not wait until 5.26.1,
and put it into .0

@p5pRT
Copy link
Author

p5pRT commented Mar 7, 2017

From @khwilliamson

On Tue, 07 Mar 2017 09​:59​:16 -0800, public@​khwilliamson.com wrote​:

On 03/07/2017 04​:15 AM, Dave Mitchell wrote​:

So I think your fix is good to apply to blead.

Now done, as e9f2c44 with the
pumpking's approval on #irc.

Aaron Crane had an interesting observation. Since this was a regression
in 5.25 and if we didn't fix it now, it would require 3 committers to
vote for fixing it in 5.26.1. He added his vote to yours and mine to
come up with the requisite 3. We might as well not wait until 5.26.1,
and put it into .0

I was careless with my commit message for e9f2c44. The problem is not the Newx, but the SAVEFREEPV of the string allocated by the Newx

--
Karl Williamson

@p5pRT
Copy link
Author

p5pRT commented Mar 18, 2017

From @jkeenan

On Tue, 07 Mar 2017 23​:08​:35 GMT, khw wrote​:

On Tue, 07 Mar 2017 09​:59​:16 -0800, public@​khwilliamson.com wrote​:

On 03/07/2017 04​:15 AM, Dave Mitchell wrote​:

So I think your fix is good to apply to blead.

Now done, as e9f2c44 with the
pumpking's approval on #irc.

Aaron Crane had an interesting observation. Since this was a
regression
in 5.25 and if we didn't fix it now, it would require 3 committers to
vote for fixing it in 5.26.1. He added his vote to yours and mine to
come up with the requisite 3. We might as well not wait until
5.26.1,
and put it into .0

I was careless with my commit message for
e9f2c44. The problem is not the
Newx, but the SAVEFREEPV of the string allocated by the Newx

This ticket is marked as a blocker for 5.26.0. Can we get an update on its status?

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Mar 18, 2017

From @xsawyerx

On 03/18/2017 01​:09 AM, James E Keenan via RT wrote​:

On Tue, 07 Mar 2017 23​:08​:35 GMT, khw wrote​:

On Tue, 07 Mar 2017 09​:59​:16 -0800, public@​khwilliamson.com wrote​:

On 03/07/2017 04​:15 AM, Dave Mitchell wrote​:

So I think your fix is good to apply to blead.
Now done, as e9f2c44 with the
pumpking's approval on #irc.

Aaron Crane had an interesting observation. Since this was a
regression
in 5.25 and if we didn't fix it now, it would require 3 committers to
vote for fixing it in 5.26.1. He added his vote to yours and mine to
come up with the requisite 3. We might as well not wait until
5.26.1,
and put it into .0

I was careless with my commit message for
e9f2c44. The problem is not the
Newx, but the SAVEFREEPV of the string allocated by the Newx
This ticket is marked as a blocker for 5.26.0. Can we get an update on its status?

Karl, will it be possible to merge please?

@p5pRT
Copy link
Author

p5pRT commented Mar 19, 2017

From @iabyn

On Sat, Mar 18, 2017 at 06​:40​:35PM +0100, Sawyer X wrote​:

On 03/18/2017 01​:09 AM, James E Keenan via RT wrote​:

On Tue, 07 Mar 2017 23​:08​:35 GMT, khw wrote​:

On Tue, 07 Mar 2017 09​:59​:16 -0800, public@​khwilliamson.com wrote​:

On 03/07/2017 04​:15 AM, Dave Mitchell wrote​:

So I think your fix is good to apply to blead.
Now done, as e9f2c44 with the
pumpking's approval on #irc.

Aaron Crane had an interesting observation. Since this was a
regression
in 5.25 and if we didn't fix it now, it would require 3 committers to
vote for fixing it in 5.26.1. He added his vote to yours and mine to
come up with the requisite 3. We might as well not wait until
5.26.1,
and put it into .0

I was careless with my commit message for
e9f2c44. The problem is not the
Newx, but the SAVEFREEPV of the string allocated by the Newx
This ticket is marked as a blocker for 5.26.0. Can we get an update on its status?

Karl, will it be possible to merge please?

The fix for blead was merged by Karl back on 7 March. The secondary issue
remaining was re-engine-GNU is a bug in that distribution, and I've
updated its rt.cpan.org ticket with further info.

So I'll close this ticket now.

--
Justice is when you get what you deserve.
Law is when you get what you pay for.

@p5pRT p5pRT closed this as completed Mar 19, 2017
@p5pRT
Copy link
Author

p5pRT commented Mar 19, 2017

@iabyn - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant