Recursive comparison routine can cause segfault in sort #8162

p5pRT · 2005-10-22T23:37:00Z

Migrated from rt.perl.org#37508 (status was 'resolved')

Searchable as RT37508$

p5pRT · 2005-10-22T23:37:02Z

From houstorx@rpc142.cs.man.ac.uk

Created by houstorx@rpc142.cs.man.ac.uk

The following code causes a segfault on my machine, with every
version of perl that I have to hand (including blead):

print sort foo (1..2);
sub foo {
warn "Entered foo\n";
my $y = [1..999];
$x++ || foo();
$b <=> $a
}

Presumably it's caused by the fact that pp_sort overwrites the
root op of the comparison routine's CV with an OP_NULL while the
sort is running. (It would ordinarily be an OP_LEAVESUB.)

Robin

Perl Info


Flags:
    category=core
    severity=low

This perlbug was built using Perl v5.9.3 - Wed Oct 19 16:49:33 BST 2005
It is being executed now by  Perl v5.9.3 - Mon Aug  8 15:00:42 BST 2005.

Site configuration information for perl v5.9.3:

Configured by houstorx at Mon Aug  8 15:00:42 BST 2005.

Summary of my perl5 (revision 5 version 9 subversion 3 patch 25277) configuration:
  Platform:
    osname=linux, osvers=2.4.20-31.9, archname=i686-linux-thread-multi
    uname='linux rpc142 2.4.20-31.9 #1 tue apr 13 17:38:16 edt 2004 i686 athlon i386 gnulinux '
    config_args='-Dusethreads -Dusedevel -Dprefix=/local/perl -ders'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -fno-strict-aliasing -pipe -I/usr/local/include -I/opt/gnu/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -fno-strict-aliasing -pipe -I/usr/local/include -I/opt/gnu/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.2.2 20030222 (Red Hat Linux 3.2.2-5)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib -L/opt/gnu/lib'
    libpth=/usr/local/lib /opt/gnu/lib /lib /usr/lib
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.3.2.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.3.2'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib -L/opt/gnu/lib'

Locally applied patches:
    


@INC for perl v5.9.3:
    /local/perl/lib/5.9.3/i686-linux-thread-multi
    /local/perl/lib/5.9.3
    /local/perl/lib/site_perl/5.9.3/i686-linux-thread-multi
    /local/perl/lib/site_perl/5.9.3
    /local/perl/lib/site_perl
    .


Environment for perl v5.9.3:
    HOME=/home/X02/houstorx
    LANG=en_GB
    LANGUAGE (unset)
    LC_COLLATE=C
    LD_LIBRARY_PATH=/lib:/usr/lib:/opt/sfw/lib:/opt/gnu/lib:/home/X02/houstorx/lib
    LOGDIR (unset)
    PATH=.
    PERL_BADLANG (unset)
    SHELL=/bin/bash

p5pRT · 2005-10-23T11:20:06Z

From @iabyn

On Sat, Oct 22, 2005 at 04:37:02PM -0700, houstorx @ rpc142. cs. man. ac. uk wrote:

The following code causes a segfault on my machine, with every
version of perl that I have to hand (including blead):

print sort foo (1..2);
sub foo {
warn "Entered foo\n";
my $y = [1..999];
$x++ || foo();
$b <=> $a
}

Presumably it's caused by the fact that pp_sort overwrites the
root op of the comparison routine's CV with an OP_NULL while the
sort is running. (It would ordinarily be an OP_LEAVESUB.)

Yes, the current behaviour is deeply unsatisfactory. It's also not
threead-safe. The only fix I can think of for this that keeps the sort
code fast, is to include an indication in the pushed sub context that this is
a sort contaxt, then moduify pp_leavesub, pp_return and pp_leavesublv to
check for this and immediately return Nullop. This of course then slows
down normal sub execution :-(

--
"Foul and greedy Dwarf - you have eaten the last candle."
-- "Hordes of the Things", BBC Radio.

p5pRT · 2005-10-23T11:20:10Z

The RT System itself - Status changed from 'new' to 'open'

p5pRT · 2005-10-23T11:43:35Z

From maddingue@free.fr

The following code causes a segfault on my machine, with every
version of perl that I have to hand (including blead):

print sort foo (1..2);
sub foo {
warn "Entered foo\n";
my $y = [1..999];
$x++ || foo();
$b <=> $a
}

Presumably it's caused by the fact that pp_sort overwrites the
root op of the comparison routine's CV with an OP_NULL while the
sort is running. (It would ordinarily be an OP_LEAVESUB.)

FWIW, it doesn't crash on my default Perl:

$ perl -l rt-37508.pl
Entered foo
Entered foo
21

$ perl -V
Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
Platform:
osname=linux, osvers=2.6.3-25mdk-i686-up-4gb,
archname=i386-linux-thread-multi
uname='linux mercury.mandriva.com 2.6.3-25mdk-i686-up-4gb #1 fri jan 14
03:39:39 mst 2005 i686 intel(r) pentium(r) 4 cpu 3.00ghz unknown gnulinux '
config_args='-des -Dinc_version_list=5.8.4/i386-linux-thread-multi 5.8.4
5.8.3/i386-linux-thread-multi 5.8.3 5.8.2/i386-linux-thread-multi 5.8.2
5.8.1/i386-linux-thread-multi 5.8.1 5.8.0/i386-linux-thread-multi 5.8.0 5.6.1
5.6.0 -Darchname=i386-linux -Dcc=gcc -Doptimize=-O2 -fomit-frame-pointer -pipe
-march=i586 -mtune=pentiumpro -Dprefix=/usr -Dvendorprefix=/usr
-Dsiteprefix=/usr -Dman3ext=3pm -Dcf_by=Mandrakesoft -Dmyhostname=localhost
-Dperladmin=root@localhost -Dd_dosuid -Ud_csh -Duseshrplib
-Accflags=-DPERL_DISABLE_PMC -Dusethreads'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define
usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-DPERL_DISABLE_PMC -fno-strict-aliasing -pipe -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
optimize='-O2 -fomit-frame-pointer -pipe -march=i586 -mtune=pentiumpro ',
cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DPERL_DISABLE_PMC
-fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
ccversion='', gccversion='3.4.1 (Mandrakelinux 10.1 3.4.1-4mdk)',
gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='gcc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lnsl -lndbm -lgdbm -ldl -lm -lcrypt -lutil -lpthread -lc
perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
libc=/lib/libc-2.3.3.so, so=so, useshrplib=true, libperl=libperl.so
gnulibc_version='2.3.3'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E
-Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Characteristics of this binary (from libperl):
Compile-time options: MULTIPLICITY USE_ITHREADS USE_LARGE_FILES
PERL_IMPLICIT_CONTEXT
Locally applied patches:
MandrakeSoft patches (cf the source RPM)
Built under linux
Compiled at Apr 26 2005 15:08:54
@INC:
/usr/lib/perl5/5.8.5/i386-linux-thread-multi
/usr/lib/perl5/5.8.5
/usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.5
/usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.5
/usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.4
/usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.3
/usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.1
/usr/lib/perl5/vendor_perl
.

--
Sébastien Aperghis-Tramoni

Close the world, txEn eht nepO.

p5pRT · 2005-10-23T12:45:55Z

From @iabyn

On Sun, Oct 23, 2005 at 01:13:36PM +0100, Robin Houston wrote:

I can think of a different fix that doesn't slow down normal
execution, but you might not like it. While the comparison
routine is running, point its CvSTART at a bogus op which
just dies.

That still doesn't solve the threads problem, namely that the same op
tree is shared by CVs in multiple threads. When one thread calls sort, the
leavesub in the op tree is (non-thread-safely) diddled with, meaning that
a second thread might
* corrupt the pp_leavesub op
* corrupt that op's refcount
* still call the sub and crash on exit

--
My get-up-and-go just got up and went.

p5pRT · 2005-10-23T13:44:56Z

From @hvds

Dave Mitchell <davem@iabyn.com> wrote:
:On Sat, Oct 22, 2005 at 04:37:02PM -0700, houstorx @ rpc142. cs. man. ac. uk wrote:
:> The following code causes a segfault on my machine, with every
:> version of perl that I have to hand (including blead):
:>
:> print sort foo (1..2);
:> sub foo {
:> warn "Entered foo\n";
:> my $y = [1..999];
:> $x++ || foo();
:> $b <=> $a
:> }
:>
:> Presumably it's caused by the fact that pp_sort overwrites the
:> root op of the comparison routine's CV with an OP_NULL while the
:> sort is running. (It would ordinarily be an OP_LEAVESUB.)
:
:Yes, the current behaviour is deeply unsatisfactory. It's also not
:threead-safe. The only fix I can think of for this that keeps the sort
:code fast, is to include an indication in the pushed sub context that this is
:a sort contaxt, then moduify pp_leavesub, pp_return and pp_leavesublv to
:check for this and immediately return Nullop. This of course then slows
:down normal sub execution :-(

Would it be possible to mitigate that by:
- adding a 'sortsub' attribute, compile attributed subs with different
ops (in particular with 'die if called as normal function')
- compile inline sort subs with the attribute
- implicitly generate a :sortsub wrapper for outline sortsubs

I think this would allow zero-cost to normal sub execution and shifts
it instead to nonzero cost on any existing code specifying outline
sort subs, which can be grabbed back by adding a ':sortsub' sttribute.
But I have no idea if it is practical, or whether it actually solves
the problem.

Hugo

p5pRT · 2005-10-23T17:06:38Z

From maddingue@free.fr

Robin Houston <robin@cpan.org> wrote:

Sébastien Aperghis-Tramoni wrote:

FWIW, it doesn't crash on my default Perl:

I guess you're just "lucky": I'm sure 5.8.5 has the same bug.
Perhaps you can make it crash by increasing the '999' to a
larger number, or even increasing the '2'?

I increased the values to, respectively, 1_000_000 and 1_000 but
still cannot make it crash. I confirm that it crashes with bleadperl.

Note that this is a Perl 5.8.5 packaged by Rafael Garcia-Suarez for
Mandrake Linux 10.1, so it includes a few vendor patches, but none
of them seem related to this issue.

--
Sébastien Aperghis-Tramoni

Close the world, txEn eht nepO.

p5pRT · 2005-10-23T19:33:15Z

From robin@cpan.org

On Sun, Oct 23, 2005 at 01:46:13PM +0100, Dave Mitchell wrote:

That still doesn't solve the threads problem,

No, it doesn't. That would be really hard. :-)

Could we / should we document the fact that, if you want to safely
use sort with threads, you need to do

sort {my_cmp($a,$b)} @foo;

instead of

sort my_cmp @foo;

?

We could even issue a warning in the latter case, I imagine.
(Presumably it's possible to tell whether there's more than
one active thread?)

Robin

p5pRT · 2005-10-23T19:42:01Z

From robin@cpan.org

On Sun, Oct 23, 2005 at 01:55:20PM +0100, Robin Houston wrote:

We could even issue a warning in the latter case, I imagine.

Indeed we can. Patch below.

It seems to be the case that (PL_main_root->op_targ > 1) just
when multiple threads are active, but I might have missed another
way that could happen. Anyone?

Robin

Inline Patch

--- ./pod/perldiag.pod.1	2005-10-23 15:30:21.000000000 +0100
+++ ./pod/perldiag.pod	2005-10-23 15:37:04.000000000 +0100
@@ -3219,6 +3219,22 @@
 really meant to write this, disable the warning, or, better, put the
 parentheses explicitly and write C<$x & ($y == 0)>).
 
+=item Possible race condition in sort
+
+(W threads) If you call the sort() function with a custom comparison
+function, as
+
+    @result = sort my_cmp @input;
+
+the Perl interpreter will temporarily modify the internal representation
+of the my_cmp() function. If another thread calls my_cmp() during the
+sort, Perl's internal stack will become corrupt and your program will
+crash or otherwise misbehave. The solution is to rewrite the above as
+
+    @result = sort { &mycmp } @input;
+
+which avoids the problem (but is a little slower).
+
 =item Possible unintended interpolation of %s in string
 
 (W ambiguous) You said something like `@foo' in a double-quoted string
--- ./t/op/threads.t.1	2005-10-23 15:38:07.000000000 +0100
+++ ./t/op/threads.t	2005-10-23 16:07:00.000000000 +0100
@@ -17,7 +17,7 @@
        print "1..0 # Skip: no dynamic loading on miniperl, no threads\n";
        exit 0;
      }
-     plan(3);
+     plan(8);
 }
 use threads;
 
@@ -59,3 +59,42 @@
 threads->new(sub { $ref = $object } )->join; # $ref = $object causes problems
 print "ok";
 EOI
+
+# test that sort() warns when it's used in a thread-unsafe way
+# (and doesn't when it isn't)
+use warnings FATAL => 'threads';
+sub my_cmp {$a <=> $b}
+my @result;
+eval {
+    @result = sort my_cmp (1..10);
+};
+ok($@ eq "", "No sort warning with a single thread", $@);
+
+my $th = threads->new(sub{1});
+eval {
+    @result = sort my_cmp (1..10);
+};
+ok($@ =~ /^Possible race condition/,
+   "Sort warning with multiple threads");
+
+eval {
+    @result = sort {&my_cmp} (1..10);
+};
+ok($@ eq "", "But no warning when called indirectly", $@);
+
+$th->join();
+
+eval {
+    @result = sort my_cmp (1..10);
+};
+ok($@ eq "", "No sort warning with a single thread, again", $@);
+
+my $th = threads->new(sub{1});
+$th->detach();
+eval {
+    @result = sort my_cmp (1..10);
+};
+ok($@ =~ /^Possible race condition/,
+   "Sort warning even when thread is detached");
+
+threads->yield(); # Give the detached thread a chance to finish
--- ./pp_sort.c.1	2005-10-23 15:07:19.000000000 +0100
+++ ./pp_sort.c	2005-10-23 15:28:53.000000000 +0100
@@ -1552,6 +1552,11 @@
 	    else {
 		OP *o;
 
+#ifdef USE_ITHREADS
+		if (PL_main_root->op_targ > 1 && ckWARN(WARN_THREADS))
+		Perl_warner(aTHX_ packWARN(WARN_THREADS),
+		    "Possible race condition in sort");
+#endif
 		PL_sortcop = CvSTART(cv);
 		SAVEVPTR(CvROOT(cv)->op_ppaddr);
 		CvROOT(cv)->op_ppaddr = PL_ppaddr[OP_NULL];

p5pRT · 2005-10-23T19:53:19Z

From robin@cpan.org

Dave Mitchell wrote:

The only fix I can think of for this that keeps the sort code fast,
is to include an indication in the pushed sub context that this is a
sort contaxt, then modify pp_leavesub, pp_return and pp_leavesublv
to check for this and immediately return Nullop. This of course then
slows down normal sub execution :-(

I can think of a different fix that doesn't slow down normal
execution, but you might not like it. While the comparison
routine is running, point its CvSTART at a bogus op which
just dies.

Patch is below. I've also fixed #7579.

Note that this patch introduces the informal convention that,
if CvSTART(cv) is a NULLOP, the cv must be an active sort
comparison routine. AFAICT that is "safe", but it's potentially
confusing.

List::Util also has the same problems. I'll patch that too,
if you think this approach is okay.

Robin

Inline Patch

--- ./pod/perldiag.pod.orig	2005-10-23 12:15:17.000000000 +0100
+++ ./pod/perldiag.pod	2005-10-23 12:58:57.000000000 +0100
@@ -491,6 +491,20 @@
 (F) Only hard references may be blessed.  This is how Perl "enforces"
 encapsulation of objects.  See L<perlobj>.
 
+=item Can't call a comparison function recursively
+
+(F) A comparison routine used by the sort() function is not
+allowed to call itself, either directly or indirectly. You
+can, of course, call a recursive function I<from> a comparison
+routine, so you can sort with a recursive comparison using
+the following idiom:
+
+  sub recursive_cmp { ... }
+  sort {recursive_cmp($a,$b)} @list;
+
+(That might be useful if you're sorting binary trees, for
+example.)
+
 =item Can't call method "%s" in empty package "%s"
 
 (F) You called a method correctly, and it correctly indicated a package
@@ -761,6 +775,11 @@
 you tried to jump out of a sort() block or subroutine, which is a no-no.
 See L<perlfunc/goto>.
 
+=item Can't goto subroutine from a comparison function
+
+(F) The "goto subroutine" call can't be used to jump out of a
+comparison routine used by the sort() function.
+
 =item Can't goto subroutine from an eval-%s
 
 (F) The "goto subroutine" call can't be used to jump out of an eval
--- ./t/op/sort.t.orig	2005-10-23 12:01:41.000000000 +0100
+++ ./t/op/sort.t	2005-10-23 12:56:43.000000000 +0100
@@ -5,7 +5,7 @@
     @INC = '../lib';
 }
 use warnings;
-print "1..129\n";
+print "1..131\n";
 
 # these shouldn't hang
 {
@@ -670,3 +670,25 @@
 
 @output = reverse (0, sort(qw(C A B)));
 ok "@output", "C B A 0", 'reversed sort with leading argument';
+
+# Bug #37508
+$x = 0;
+sub recursive {
+    recursive() if !$x++;
+    $a <=> $b;
+}
+
+eval { @output = sort recursive (1..5) };
+print(($@ =~ /^Can't call a comparison function recursively/
+    ? "ok "
+    : "not ok "),
+    $test++, " - Can't call a comparison function recursively\n");
+
+# Bug #7579
+sub going_away { goto &gone_away }
+sub gone_away {$a <=> $b}
+eval { @output = sort going_away (1..5) };
+print(($@ =~ /^Can't goto subroutine from a comparison function/
+    ? "ok "
+    : "not ok "),
+    $test++, " - Can't goto subroutine from a comparison function\n");
--- ./pp_sort.c.orig	2005-10-23 11:18:09.000000000 +0100
+++ ./pp_sort.c	2005-10-23 11:29:16.000000000 +0100
@@ -1477,6 +1477,13 @@
     sortsvp(aTHX_ array, nmemb, cmp, 1);
 }
 
+static
+OP *
+S_pp_sort_no_recurse(pTHX)
+{
+    DIE(aTHX_ "Can't call a comparison function recursively");
+}
+
 #define SvNSIOK(sv) ((SvFLAGS(sv) & SVf_NOK) || ((SvFLAGS(sv) & (SVf_IOK|SVf_IVisUV)) == SVf_IOK))
 #define SvSIOK(sv) ((SvFLAGS(sv) & (SVf_IOK|SVf_IVisUV)) == SVf_IOK)
 #define SvNSIV(sv) ( SvNOK(sv) ? SvNVX(sv) : ( SvSIOK(sv) ? SvIVX(sv) : sv_2nv(sv) ) )
@@ -1543,10 +1550,24 @@
 	    if (is_xsub)
 		PL_sortcop = (OP*)cv;
 	    else {
+		OP *o;
+
 		PL_sortcop = CvSTART(cv);
 		SAVEVPTR(CvROOT(cv)->op_ppaddr);
 		CvROOT(cv)->op_ppaddr = PL_ppaddr[OP_NULL];
 
+		/* This prevents the comparison routine
+		   from being called recursively. (Bug #37508) */
+		NewOp(1101, o, 1, OP);
+		SAVEFREEOP(o);
+		o->op_type = OP_NULL;
+		o->op_ppaddr = &S_pp_sort_no_recurse;
+		o->op_flags = 0;
+		o->op_private = 0;
+
+		SAVEVPTR(CvSTART(cv));
+		CvSTART(cv) = o;
+
 		SAVECOMPPAD();
 		PAD_SET_CUR_NOSAVE(CvPADLIST(cv), 1);
             }
--- ./op.c.orig	2005-10-23 11:39:49.000000000 +0100
+++ ./op.c	2005-10-23 11:40:51.000000000 +0100
@@ -4355,7 +4355,8 @@
 		goto done;
 	    }
 	    /* ahem, death to those who redefine active sort subs */
-	    if (PL_curstackinfo->si_type == PERLSI_SORT && PL_sortcop == CvSTART(cv))
+	    if (PL_curstackinfo->si_type == PERLSI_SORT
+		&& CvSTART(cv)->op_type == OP_NULL)
 		Perl_croak(aTHX_ "Can't redefine active sort subroutine %s", name);
 	    if (block) {
 		if (ckWARN(WARN_REDEFINE)
--- ./sv.c.orig	2005-10-23 11:39:52.000000000 +0100
+++ ./sv.c	2005-10-23 11:58:14.000000000 +0100
@@ -3868,7 +3868,7 @@
 	    }
 	    /* ahem, death to those who redefine active sort subs */
 	    else if (PL_curstackinfo->si_type == PERLSI_SORT
-		     && GvCV(dstr) && PL_sortcop == CvSTART(GvCV(dstr)))
+		     && GvCV(dstr) && CvSTART(GvCV(dstr))->op_type == OP_NULL)
 		Perl_croak(aTHX_ "Can't redefine active sort subroutine %s",
 		      GvNAME(dstr));
 
@@ -3976,7 +3976,7 @@
 				/* ahem, death to those who redefine
 				 * active sort subs */
 				if (PL_curstackinfo->si_type == PERLSI_SORT &&
-				      PL_sortcop == CvSTART(cv))
+				      CvSTART(cv)->op_type == OP_NULL)
 				    Perl_croak(aTHX_
 				    "Can't redefine active sort subroutine %s",
 					  GvENAME((GV*)dstr));
--- ./pp_ctl.c.orig	2005-10-23 12:33:05.000000000 +0100
+++ ./pp_ctl.c	2005-10-23 12:56:15.000000000 +0100
@@ -2337,6 +2337,12 @@
 		else
 		    DIE(aTHX_ "Can't goto subroutine from an eval-block");
 	    }
+	    /* Ban goto from comparison routine: see bug #7579 */
+	    else if (CxTYPE(cx) == CXt_SUB &&
+		    CvSTART(cx->blk_sub.cv) &&
+		    CvSTART(cx->blk_sub.cv)->op_type == OP_NULL)
+		DIE(aTHX_ "Can't goto subroutine from a comparison function");
+
 	    if (CxTYPE(cx) == CXt_SUB && cx->blk_sub.hasargs) {
 		/* put @_ back onto stack */
 		AV* av = cx->blk_sub.argarray;

p5pRT · 2005-10-23T20:01:46Z

From robin@cpan.org

Dave Mitchell wrote:

The only fix I can think of for this that keeps the sort code fast, is to
include an indication in the pushed sub context that this is a sort
contaxt, then moduify pp_leavesub, pp_return and pp_leavesublv to check
for this and immediately return Nullop.

It looks as though pp_return already has an explicit check for sort:

if (PL_curstackinfo->si_type == PERLSI_SORT) {
if (cxstack_ix == PL_sortcxix
|| dopoptosub(cxstack_ix) <= PL_sortcxix)
{
if (cxstack_ix > PL_sortcxix)
dounwind(PL_sortcxix);
AvARRAY(PL_curstack)[1] = *SP;
PL_stack_sp = PL_stack_base + 1;
return 0;
}
}

so the vast majority of sub exits are *already* being slowed down by
an explicit check! Surely it wouldn't noticeably hurt to add the same
check to pp_leavesub? (Who uses implicit return anyway?)

Other modules that pull the same trick (in particular List::Util) could
get the benefit of this by a simple PUSHSTACKi(PERLSI_SORT). That way
they'd also benefit from the "Can't redefine active sort subroutine"
machinery.

Does this sound reasonable?

Robin

p5pRT · 2005-10-23T20:10:13Z

From robin@cpan.org

Sébastien Aperghis-Tramoni wrote:

FWIW, it doesn't crash on my default Perl:

I guess you're just "lucky": I'm sure 5.8.5 has the same bug.
Perhaps you can make it crash by increasing the '999' to a
larger number, or even increasing the '2'?

I now have a patch for this, which I'll send through as soon as
'make test' has finished (assuming it passes of course!)

Robin

p5pRT · 2005-10-23T20:57:17Z

From @iabyn

On Sun, Oct 23, 2005 at 08:41:00PM +0100, Robin Houston wrote:

Dave Mitchell wrote:

The only fix I can think of for this that keeps the sort code fast, is to
include an indication in the pushed sub context that this is a sort
contaxt, then moduify pp_leavesub, pp_return and pp_leavesublv to check
for this and immediately return Nullop.

It looks as though pp_return already has an explicit check for sort:
if $PL\_curstackinfo\->si\_type == PERLSI\_SORT$ \{
    if $cxstack\_ix == PL\_sortcxix
        || dopoptosub\(cxstack\_ix$ \<= PL\_sortcxix\)
    \{
        if $cxstack\_ix > PL\_sortcxix$
            dounwind$PL\_sortcxix$;
        AvARRAY$PL\_curstack$\[1\] = \*SP;
        PL\_stack\_sp = PL\_stack\_base \+ 1;
        return 0;
    \}
\}
so the vast majority of sub exits are *already* being slowed down by
an explicit check! Surely it wouldn't noticeably hurt to add the same
check to pp_leavesub? (Who uses implicit return anyway?)

Other modules that pull the same trick (in particular List::Util) could
get the benefit of this by a simple PUSHSTACKi(PERLSI_SORT). That way
they'd also benefit from the "Can't redefine active sort subroutine"
machinery.

Does this sound reasonable?

Mostly.

Personally I'd like to see new macros/functions added to the API that
allow you to set up a lightweight call series, call the CV several times,
then clear up, ie (with better names than this):

{
dLIGHTWEIGHT_SUB_CALL;

...
START_LIGHTWEIGHT_SUB_CALL(cv); /* does PUSHBLOCK, PUSHSUB etc */
for (...) {
PUSHs(...); ...; /* set up args */
CALL_LEIGHTWEIGHT_SUB;
/* handle returned value(s) on stack */
}
END_LIGHTWEIGHT_SUB_CALL(cv); /* does POPBLOCK, POPSUB etc */
}

Then this same API can be used by pp_sort, List::Util, and anyone else
who needs it.

Since the sort sub is no longer being messed with, it may no longer be a
crime to redefine it (although I'm sure about that).

I don't recommend using the PL_curstackinfo->si_type == PERLSI_SORT test
for the general case; perhaps the sub context should be marked in some
way eg by adding a flags field to struct block_sub ?

--
Hofstadter's Law: It always takes longer than you expect, even when you
take into account Hofstadter's Law.

p5pRT · 2005-10-23T21:49:35Z

From robin@cpan.org

On Sun, Oct 23, 2005 at 09:57:52PM +0100, Dave Mitchell wrote:

Since the sort sub is no longer being messed with, it may no longer be a
crime to redefine it (although I'm [not] sure about that).

That's an interesting idea! Probably true.

I don't recommend using the PL_curstackinfo->si_type == PERLSI_SORT test
for the general case; perhaps the sub context should be marked in some
way eg by adding a flags field to struct block_sub ?

I think there's a lot of unused space in cx_type. It's a 32-bit
field, of which the least-significant octet is the actual type,
and the next 2 bits are used for flags. The remaining 22 bits are
going begging, as far as I can see. We could easily nab one of
them as a flag for this.

We could then get rid of PL_sortcxix entirely.

I like your idea of defining a proper API for lightweight callbacks.
(I was contemplating writing a "how-to" guide for lightweight
callbacks, which is what got me thinking about these problems.)

Robin

PS. Do you know why sort does a PUSHSTACK, by the way? I must confess
that I don't understand the reason for the "stack of stacks" at all.
The rule seems to be that, whenever a pp routine invokes a new runops
loop, it pushes a new stack; but I don't understand why that's
necessary. (List::Util doesn't, for example, and doesn't seem to
suffer for it.)

PPS. Other relevant bugs include 7579 and 30333.

p5pRT · 2005-10-23T22:47:13Z

From @iabyn

On Sun, Oct 23, 2005 at 10:48:03PM +0100, Robin Houston wrote:

I think there's a lot of unused space in cx_type. It's a 32-bit
field, of which the least-significant octet is the actual type,
and the next 2 bits are used for flags. The remaining 22 bits are
going begging, as far as I can see. We could easily nab one of
them as a flag for this.

There's also two U8s, meaning there's probably 16 bits wasted due to
alignment problems.

We could then get rid of PL_sortcxix entirely.

Hopefully.

PS. Do you know why sort does a PUSHSTACK, by the way? I must confess
that I don't understand the reason for the "stack of stacks" at all.
The rule seems to be that, whenever a pp routine invokes a new runops
loop, it pushes a new stack; but I don't understand why that's
necessary. (List::Util doesn't, for example, and doesn't seem to
suffer for it.)

As I understand it, it's needed whenever realloc()ing the stack during
growing could confuse the caller, the classic example being magic, eg

dSP;
sv_setsv_mg(foo,bar); /* may invoke magic and realloc the stack */
/* XX SP no longer valid */

As to why sort needs a new stack, I don't know.

--
The crew of the Enterprise encounter an alien life form which is
suprisingly neither humanoid nor made from pure energy.
-- Things That Never Happen in "Star Trek" #22

p5pRT · 2005-10-23T23:40:23Z

From robin@cpan.org

On Sun, Oct 23, 2005 at 11:47:41PM +0100, Dave Mitchell wrote:

As I understand it, it's needed whenever realloc()ing the stack during
growing could confuse the caller, the classic example being magic

Oh I see!

Presumably sort needs it for exactly the same reason: i.e. the
comparison routine could do something that would force the stack
to be realloc()ed?

This sounds a lot like a rather difficult-to-trigger bug in
List::Util. Perhaps it can be made easier to trigger by using
perl's malloc and setting STRESS_REALLOC? I'll have a play...

Robin

p5pRT · 2005-10-24T06:46:09Z

From @iabyn

On Mon, Oct 24, 2005 at 12:11:12AM +0100, Robin Houston wrote:

On Sun, Oct 23, 2005 at 11:47:41PM +0100, Dave Mitchell wrote:

As I understand it, it's needed whenever realloc()ing the stack during
growing could confuse the caller, the classic example being magic

Oh I see!

Presumably sort needs it for exactly the same reason: i.e. the
comparison routine could do something that would force the stack
to be realloc()ed?

Except in this case, the stack reallocation isn't unexpected; all pp_foo
functions may push arbitrary things on the stack, and the trick is that
these functions are called from a top-level RUNOPS loop that doesn't hold
on to any ptrs to the stack. So that can't be the reason (unless pp-sort
is desinged ot be called from other places too???)

--
There's a traditional definition of a shyster: a lawyer who, when the law
is against him, pounds on the facts; when the facts are against him,
pounds on the law; and when both the facts and the law are against him,
pounds on the table.
-- Eben Moglen referring to SCO

p5pRT · 2005-10-24T12:42:36Z

From robin@cpan.org

On Mon, Oct 24, 2005 at 07:45:45AM +0100, Dave Mitchell wrote:

Except in this case, the stack reallocation isn't unexpected; all pp_foo
functions may push arbitrary things on the stack,

Sure. I was thinking of the fact that Perl_pp_sort() itself
has a local copy of the stack pointer. On the other hand, it
doesn't seem to be used after the POPSTACK, so perhaps it
isn't really needed at all.

Robin

p5pRT · 2011-03-19T19:17:29Z

From @cpansprout

Resolved by 9850bf2 (and even documented in perl5100delta).

p5pRT · 2011-03-19T19:17:29Z

@cpansprout - Status changed from 'open' to 'resolved'

p5pRT closed this as completed Mar 19, 2011

p5pRT added Severity Low distro-Linux type-core labels Oct 18, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Recursive comparison routine can cause segfault in sort #8162

Recursive comparison routine can cause segfault in sort #8162

p5pRT commented Oct 22, 2005

p5pRT commented Oct 22, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

p5pRT commented Oct 24, 2005

p5pRT commented Oct 24, 2005

p5pRT commented Mar 19, 2011

p5pRT commented Mar 19, 2011

Recursive comparison routine can cause segfault in sort #8162

Recursive comparison routine can cause segfault in sort #8162

Comments

p5pRT commented Oct 22, 2005

p5pRT commented Oct 22, 2005

From houstorx@rpc142.cs.man.ac.uk

Created by houstorx@rpc142.cs.man.ac.uk

p5pRT commented Oct 23, 2005

From @iabyn

p5pRT commented Oct 23, 2005

p5pRT commented Oct 23, 2005

From maddingue@free.fr

p5pRT commented Oct 23, 2005

From @iabyn

p5pRT commented Oct 23, 2005

From @hvds

p5pRT commented Oct 23, 2005

From maddingue@free.fr

p5pRT commented Oct 23, 2005

From robin@cpan.org

p5pRT commented Oct 23, 2005

From robin@cpan.org

p5pRT commented Oct 23, 2005

From robin@cpan.org

p5pRT commented Oct 23, 2005

From robin@cpan.org

p5pRT commented Oct 23, 2005

From robin@cpan.org

p5pRT commented Oct 23, 2005

From @iabyn

p5pRT commented Oct 23, 2005

From robin@cpan.org

p5pRT commented Oct 23, 2005

From @iabyn

p5pRT commented Oct 23, 2005

From robin@cpan.org

p5pRT commented Oct 24, 2005

From @iabyn

p5pRT commented Oct 24, 2005

From robin@cpan.org

p5pRT commented Mar 19, 2011

From @cpansprout

p5pRT commented Mar 19, 2011