Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PATCH] Carp: Do not crash when reading @DB::args #15928

Closed
p5pRT opened this issue Mar 23, 2017 · 37 comments
Closed

[PATCH] Carp: Do not crash when reading @DB::args #15928

p5pRT opened this issue Mar 23, 2017 · 37 comments

Comments

@p5pRT
Copy link

p5pRT commented Mar 23, 2017

Migrated from rt.perl.org#131046 (status was 'open')

Searchable as RT131046$

@p5pRT
Copy link
Author

p5pRT commented Mar 23, 2017

From @pali

Created by @pali

Trying to read values from array @​DB​::args can lead to perl fatal error
"Bizarre copy of ARRAY in scalar assignment". But missing, incomplete or
possible incorrect value in @​DB​::args is not a fatal error for Carp.

Carp is primary used for reporting warnings and errors from other modules,
so it should not crash perl when trying to print error message.

This patch safely iterates all elements of @​DB​::args array via eval { }
block and replace already freed scalars for Carp usage by string
"** argument not available anymore **".

This prevent crashing perl and allows to use Carp module. It it not a
proper fix but rather workaround for Carp module. At least it allows to
safely use Carp.

Bug​: https://rt.perl.org/Public/Bug/Display.html?id=52610

Perl Info

Flags:
    category=library
    severity=high
    Type=Patch
    PatchStatus=HasPatch
    module=Carp

Site configuration information for perl 5.25.10:

Configured by pali at Sat Jan 14 11:57:32 CET 2017.

Summary of my perl5 (revision 5 version 25 subversion 9) configuration:
  Derived from: 155408ed6773363ab605c3c786489765e35d4ec5
  Platform:
    osname=linux
    osvers=3.13.0-107-generic
    archname=x86_64-linux
    uname='linux pali 3.13.0-107-generic #154~precise1-ubuntu smp tue dec 20 10:36:28 utc 2016 x86_64 x86_64 x86_64 gnulinux '
    config_args='-ds -e -Dprefix=/usr -Dusedevel'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    bincompat5005=undef
  Compiler:
    cc='cc'
    ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    optimize='-O2'
    cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion=''
    gccversion='4.6.3'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.6/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
    libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.15.so
    so=so
    useshrplib=false
    libperl=libperl.a
    gnulibc_version='2.15'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'

Locally applied patches:
    uncommitted-changes


@INC for perl 5.25.10:
    lib
    /usr/lib/perl5/site_perl/5.25.9/x86_64-linux
    /usr/lib/perl5/site_perl/5.25.9
    /usr/lib/perl5/5.25.9/x86_64-linux
    /usr/lib/perl5/5.25.9


Environment for perl 5.25.10:
    HOME=/home/pali
    LANG=C
    LANGUAGE=C
    LD_LIBRARY_PATH=/usr/lib/i386-linux-gnu/
    LOGDIR (unset)
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Mar 23, 2017

From @pali

0001-Carp-Do-not-crash-when-reading-DB-args.patch
From a0be329581b1a1065ccb2efac495a670a953fa3e Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Tue, 21 Mar 2017 18:43:45 +0100
Subject: [PATCH] Carp: Do not crash when reading @DB::args

Trying to read values from array @DB::args can lead to perl fatal error
"Bizarre copy of ARRAY in scalar assignment". But missing, incomplete or
possible incorrect value in @DB::args is not a fatal error for Carp.

Carp is primary used for reporting warnings and errors from other modules,
so it should not crash perl when trying to print error message.

This patch safely iterates all elements of @DB::args array via eval { }
block and replace already freed scalars for Carp usage by string
"** argument not available anymore **".

This prevent crashing perl and allows to use Carp module. It it not a
proper fix but rather workaround for Carp module. At least it allows to
safely use Carp.

Bug: https://rt.perl.org/Public/Bug/Display.html?id=52610
---
 dist/Carp/lib/Carp.pm       |   12 +++++++++++-
 dist/Carp/t/rt52610_crash.t |   25 +++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 dist/Carp/t/rt52610_crash.t

diff --git a/dist/Carp/lib/Carp.pm b/dist/Carp/lib/Carp.pm
index 05052b9..e288ede 100644
--- a/dist/Carp/lib/Carp.pm
+++ b/dist/Carp/lib/Carp.pm
@@ -226,7 +226,17 @@ sub caller_info {
                 = "** Incomplete caller override detected$where; \@DB::args were not set **";
         }
         else {
-            @args = @DB::args;
+            @args = map {
+                my $arg;
+                local $@;
+                eval {
+                    $arg = $_;
+                    1;
+                } or do {
+                    $arg = '** argument not available anymore **';
+                };
+                $arg;
+            } @DB::args;
             my $overflow;
             if ( $MaxArgNums and @args > $MaxArgNums )
             {    # More than we want to show?
diff --git a/dist/Carp/t/rt52610_crash.t b/dist/Carp/t/rt52610_crash.t
new file mode 100644
index 0000000..faa19cb
--- /dev/null
+++ b/dist/Carp/t/rt52610_crash.t
@@ -0,0 +1,25 @@
+use warnings;
+use strict;
+
+use Test::More tests => 1;
+
+use Carp ();
+
+sub do_carp {
+    Carp::longmess;
+}
+
+sub call_with_args {
+    my ($arg_hash, $func) = @_;
+    $func->(@{$arg_hash->{'args'}});
+}
+
+my $msg;
+my $h = {};
+my $arg_hash = {'args' => [undef]};
+call_with_args($arg_hash, sub {
+    $arg_hash->{'args'} = [];
+    $msg = do_carp(sub { $h; });
+});
+
+like $msg, qr/^ at.+\b(?i:rt52610_crash\.t) line \d+\.\n\tmain::__ANON__\(.*\) called at.+\b(?i:rt52610_crash\.t) line \d+\n\tmain::call_with_args\(HASH\(0x[[:xdigit:]]+\), CODE\(0x[[:xdigit:]]+\)\) called at.+\b(?i:rt52610_crash\.t) line \d+$/;
-- 
1.7.9.5

@p5pRT
Copy link
Author

p5pRT commented Mar 25, 2017

From zefram@fysh.org

via RT wrote​:

This patch safely iterates all elements of @​DB​::args array

It's not safe. The underlying problem is stack refcounting, leading to
@​DB​::args pointing into freed memory. Interpreting freed memory as SV
structures fundamentally cannot be made safe. A patch like this would
avoid crashing in a few specific cases, where you're lucky enough that
the freed memory is still recognisable as freed, but this is neither
necessary nor sufficient in order to avoid crashing.

This manifestation of the stack refcounting problem can only be solved
by actually fixing stack refcounting. In the meantime I'm disinclined
to apply any such inadequate bandaid.

-zefram

@p5pRT
Copy link
Author

p5pRT commented Mar 25, 2017

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Mar 27, 2017

From @pali

On Saturday 25 March 2017 14​:40​:18 Zefram via RT wrote​:

via RT wrote​:

This patch safely iterates all elements of @​DB​::args array

It's not safe. The underlying problem is stack refcounting, leading to
@​DB​::args pointing into freed memory. Interpreting freed memory as SV
structures fundamentally cannot be made safe. A patch like this would
avoid crashing in a few specific cases, where you're lucky enough that
the freed memory is still recognisable as freed, but this is neither
necessary nor sufficient in order to avoid crashing.

This manifestation of the stack refcounting problem can only be solved
by actually fixing stack refcounting. In the meantime I'm disinclined
to apply any such inadequate bandaid.

-zefram

This patch is fixing problem with Carp when it dies with error "Bizarre
copy". I know that problem is with refcounting perl stack. But in case
Carp dies, and does not crash perl interpreter *before* throwing that
error, that such operation was safe and could be handled by eval. In
case reading that free memory cause SEGFAULT of perl process then it
would be before "Bizarre copy" and this patch have no effect on it.

Working Carp module is crucial for debugging and reporting errors. So it
should be fixed to those problems, specially when it dies in Makefile.PL
or any other critical part...

@p5pRT
Copy link
Author

p5pRT commented Jan 31, 2018

From @demerphq

On 25 March 2017 at 22​:39, Zefram <zefram@​fysh.org> wrote​:

via RT wrote​:

This patch safely iterates all elements of @​DB​::args array

It's not safe. The underlying problem is stack refcounting, leading to
@​DB​::args pointing into freed memory. Interpreting freed memory as SV
structures fundamentally cannot be made safe. A patch like this would
avoid crashing in a few specific cases, where you're lucky enough that
the freed memory is still recognisable as freed, but this is neither
necessary nor sufficient in order to avoid crashing.

I really feel this argument is unsound. We can encounter a stack
refcounting bug /only/ because we are serializing the stack to report
some other exception. It is quite reasonable that while serialize the
stacktrace for some other exception we might tickle a stack
refcounting bug that we would never encounter any other way.

It is also unsympathetic to the needs of the developer, if I am
throwing an exception about X, it is absolutely the worst thing that
could happen that the error throwing mechanism ends up replacing the
information about the exception I am trying to report with information
about the fact that Carp tickled a stack-refcounting bug. If I am
throwing an exception then I want to know about THAT exception, not
the one that is produced by Carp.

This manifestation of the stack refcounting problem can only be solved
by actually fixing stack refcounting. In the meantime I'm disinclined
to apply any such inadequate bandaid.

This ticket isn't about fixing the stack refcounting, it is about not
dieing while trying to report some other issue. These exceptions as
far as I know are catchable (or I wouldnt know about them at all, I
only see them at the end of a long error reporting pipeline that
depends on SIG{__DIE__}). If they are catchable then they can be made
to not actively HARM the user by hiding critical information they
really want to know about.

I wrote almost exactly the same patch that pali did, and I really
think this situation calls for a more convincing explanation why
applying it would be a bad thing than the one you have provided here.

Also I guess you missed it, at the time, but pali said as much as I
have here in his reply on this ticket as well.

cheers,
Yves

@p5pRT
Copy link
Author

p5pRT commented Feb 1, 2018

From zefram@fysh.org

demerphq wrote​:

                                   We can encounter a stack

refcounting bug /only/ because we are serializing the stack to report
some other exception.

We can indeed, but that doesn't make it any more possible to work around
the bug. It does give you the opportunity to entirely avoid tickling
these stack refcounting bugs, by not attempting to serialise arguments
from the stack at all. But if you do look at arguments on the stack,
you're subject to the refcounting bug, and can't work around that.

This ticket isn't about fixing the stack refcounting, it is about not
dieing while trying to report some other issue.

Yes, and that objective is not achievable. The stack-not-refcounted
bug cannot be reliably detected.

-zefram

@p5pRT
Copy link
Author

p5pRT commented Feb 1, 2018

From @demerphq

On 1 February 2018 at 10​:17, Zefram <zefram@​fysh.org> wrote​:

demerphq wrote​:

                                   We can encounter a stack

refcounting bug /only/ because we are serializing the stack to report
some other exception.

We can indeed, but that doesn't make it any more possible to work around
the bug. It does give you the opportunity to entirely avoid tickling
these stack refcounting bugs, by not attempting to serialise arguments
from the stack at all. But if you do look at arguments on the stack,
you're subject to the refcounting bug, and can't work around that.

This ticket isn't about fixing the stack refcounting, it is about not
dieing while trying to report some other issue.

Yes, and that objective is not achievable. The stack-not-refcounted
bug cannot be reliably detected.

I really feel like you are missing the point. By adding the eval we do
not die in at least /some/ situations. That is all our users care
about. In most of the cases we are going to die a second later anyway.

If the stack refcounting bug leads to a SEGV, then so be it, there is
not much an end user can do about it. But we do not need to mess up
peoples stack traces and hide user thrown exceptions when we /can/
detect it and handle it gracefully, which as I said already /is
clearly happening at least some of the time, or I wouldn't be aware of
this issue at all/.

By that I mean the fact I know about this issue is because our error
reporting pipeline at Booking collects all errors via $SIG{__DIE__}
and then sends them over the network to centralized collectors. All of
the code involved to send the error is Perl. Thus intercepting these
errors does not appear to be harmful. So I see no reason why Carp
should not intercept them.

Your argument comes down to "we cant do this perfectly so we shouldn't
do it at all". To me that is the fallacy of the excluded middle.

Carp throwing trappable exceptions while serializing the stack is of
no use to anyone other than someone interested in debugging stack
refcounting issues, and losing valuable user data is an unacceptable
price to pay. If we can reduce the surface area of data loss then we
can and should do so.

With respect, unless you can provide a better argument or you decide
to invoke the Pumpking I will merge smoke-me/rt52610 (assuming it
passes tests). Leaving this as is is not in the
best interest of our user community.

cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Feb 23, 2018

From @demerphq

On 1 February 2018 at 10​:35, demerphq <demerphq@​gmail.com> wrote​:

On 1 February 2018 at 10​:17, Zefram <zefram@​fysh.org> wrote​:

demerphq wrote​:

                                   We can encounter a stack

refcounting bug /only/ because we are serializing the stack to report
some other exception.

We can indeed, but that doesn't make it any more possible to work around
the bug. It does give you the opportunity to entirely avoid tickling
these stack refcounting bugs, by not attempting to serialise arguments
from the stack at all. But if you do look at arguments on the stack,
you're subject to the refcounting bug, and can't work around that.

This ticket isn't about fixing the stack refcounting, it is about not
dieing while trying to report some other issue.

Yes, and that objective is not achievable. The stack-not-refcounted
bug cannot be reliably detected.

I really feel like you are missing the point. By adding the eval we do
not die in at least /some/ situations. That is all our users care
about. In most of the cases we are going to die a second later anyway.

If the stack refcounting bug leads to a SEGV, then so be it, there is
not much an end user can do about it. But we do not need to mess up
peoples stack traces and hide user thrown exceptions when we /can/
detect it and handle it gracefully, which as I said already /is
clearly happening at least some of the time, or I wouldn't be aware of
this issue at all/.

By that I mean the fact I know about this issue is because our error
reporting pipeline at Booking collects all errors via $SIG{__DIE__}
and then sends them over the network to centralized collectors. All of
the code involved to send the error is Perl. Thus intercepting these
errors does not appear to be harmful. So I see no reason why Carp
should not intercept them.

Your argument comes down to "we cant do this perfectly so we shouldn't
do it at all". To me that is the fallacy of the excluded middle.

Carp throwing trappable exceptions while serializing the stack is of
no use to anyone other than someone interested in debugging stack
refcounting issues, and losing valuable user data is an unacceptable
price to pay. If we can reduce the surface area of data loss then we
can and should do so.

With respect, unless you can provide a better argument or you decide
to invoke the Pumpking I will merge smoke-me/rt52610 (assuming it
passes tests). Leaving this as is is not in the
best interest of our user community.

Patch pushed as 4764858

Note this ticket is a duplicate of Perl #52610

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Feb 23, 2018

From @demerphq

On 23 February 2018 at 10​:36, demerphq <demerphq@​gmail.com> wrote​:

On 1 February 2018 at 10​:35, demerphq <demerphq@​gmail.com> wrote​:

On 1 February 2018 at 10​:17, Zefram <zefram@​fysh.org> wrote​:

demerphq wrote​:

                                   We can encounter a stack

refcounting bug /only/ because we are serializing the stack to report
some other exception.

We can indeed, but that doesn't make it any more possible to work around
the bug. It does give you the opportunity to entirely avoid tickling
these stack refcounting bugs, by not attempting to serialise arguments
from the stack at all. But if you do look at arguments on the stack,
you're subject to the refcounting bug, and can't work around that.

This ticket isn't about fixing the stack refcounting, it is about not
dieing while trying to report some other issue.

Yes, and that objective is not achievable. The stack-not-refcounted
bug cannot be reliably detected.

I really feel like you are missing the point. By adding the eval we do
not die in at least /some/ situations. That is all our users care
about. In most of the cases we are going to die a second later anyway.

If the stack refcounting bug leads to a SEGV, then so be it, there is
not much an end user can do about it. But we do not need to mess up
peoples stack traces and hide user thrown exceptions when we /can/
detect it and handle it gracefully, which as I said already /is
clearly happening at least some of the time, or I wouldn't be aware of
this issue at all/.

By that I mean the fact I know about this issue is because our error
reporting pipeline at Booking collects all errors via $SIG{__DIE__}
and then sends them over the network to centralized collectors. All of
the code involved to send the error is Perl. Thus intercepting these
errors does not appear to be harmful. So I see no reason why Carp
should not intercept them.

Your argument comes down to "we cant do this perfectly so we shouldn't
do it at all". To me that is the fallacy of the excluded middle.

Carp throwing trappable exceptions while serializing the stack is of
no use to anyone other than someone interested in debugging stack
refcounting issues, and losing valuable user data is an unacceptable
price to pay. If we can reduce the surface area of data loss then we
can and should do so.

With respect, unless you can provide a better argument or you decide
to invoke the Pumpking I will merge smoke-me/rt52610 (assuming it
passes tests). Leaving this as is is not in the
best interest of our user community.

Patch pushed as 4764858

Note this ticket is a duplicate of Perl #52610

Actually, this ticket is a workaround for the bug document at Perl
#52610, sorry for the confusion. :-(

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Feb 24, 2018

From zefram@fysh.org

demerphq wrote​:

Patch pushed as 4764858

There are a couple of problems with this patch, going beyond my
distaste for the inadequacy of the code change. Firstly, the commit
message contains exactly the kind of misconception that I argued against.
Rather than describing the effect as partial, the message says "This patch
safely iterates all elements", "This prevent crashing perl", and "It [is]
... workaround for Carp module". (Bad grammar all in the actual message.)
This is describing the patch as having an effectiveness that it doesn't.

Secondly, the newly added test is predicated on that very same mythical
effectiveness. It sets up a stack refcounting failure, at which point
behaviour is undefined, whether the Carp patch is applied or not.
This is not a correct test​: it can fail without anything getting more
wrong than it already is.

The last argument advanced in favour of this patch, unlike many that
had preceded it, was supposedly based on a correct understanding of its
limitations. I didn't argue against that, because my greater objection
is about the proposed patch being misunderstood and oversold. Where it's
correctly understood, it's more a matter of opinion whether it's useful.
But the patch as applied doesn't seem to have taken that correct
understanding into account. It is firmly based on the misconception,
and I seem to have been fooled into dropping my objections by some talk
that doesn't actually reflect the totality of the patch.

Bonus problem​: the patch desynchronises the version numbers of Carp and
Carp​::Heavy, giving the former an unjustified underscored version.

Clearly the new test needs to be deleted. I don't know what to do about
the false commit message, though. The message doesn't reference this
ticket, so putting a correction here doesn't serve as an annotation.
I'm tempted to revert the commit, on the basis that the code change can
later be applied again with a correct message. Then at least git blame
would point at something correct.

-zefram

@p5pRT
Copy link
Author

p5pRT commented Feb 24, 2018

From @demerphq

On 24 Feb 2018 08​:01, "Zefram" <zefram@​fysh.org> wrote​:

demerphq wrote​:

Patch pushed as 4764858

There are a couple of problems with this patch, going beyond my
distaste for the inadequacy of the code change.

Zefram, that is an uncalled for comment and beneath you.

Firstly

, the commit
message contains exactly the kind of misconception that I argued against.

You had *weeks* to point this out and did not.

Rather than describing the effect as partial, the message says "This patch
safely iterates all elements", "This prevent crashing perl", and "It [is]
... workaround for Carp module". (Bad grammar all in the actual message.)
This is describing the patch as having an effectiveness that it doesn't.

You had weeks to point this out and did not.

Secondly, the newly added test is predicated on that very same mythical
effectiveness. It sets up a stack refcounting failure, at which point
behaviour is undefined, whether the Carp patch is applied or not.
This is not a correct test​: it can fail without anything getting more
wrong than it already is.

You had weeks to point this out and did not.

The last argument advanced in favour of this patch, unlike many that
had preceded it, was supposedly based on a correct understanding of its
limitations. I didn't argue against that, because my greater objection
is about the proposed patch being misunderstood and oversold.

Again, you had weeks to comment on the patch and did not.

What is the point of pushing a patch for review if people like you wait to
state your feedback until it has been merged.

Where it's
correctly understood, it's more a matter of opinion whether it's useful.

No it's not. It is clearly useful in some cases.

But the patch as applied doesn't seem to have taken that correct
understanding into account. It is firmly based on the misconception,
and I seem to have been fooled into dropping my objections by some talk
that doesn't actually reflect the totality of the patch

From my point of view you didn't drop your objections, you disengaged from
the process when challenged. If you wish to take a pasdive-aggressive
stance when people push back on your view then this kind of thing is the
inevitable result.

Again, you had weeks to raise any and all of these points but you chose not
to respond to my last mail.

Bonus problem​: the patch desynchronises the version numbers of Carp and
Carp​::Heavy, giving the former an unjustified underscored version.

That is a mistake, which I honestly thought I had fixed. Another patch was
made to Carp while this was "cooking" which did a full version bump, which
lead to conflicts, I then did a second patch which also did a version bump.
I used underscored versions because I didn't and don't think that core
module changes only present in a blead interim releases should result in
version number churn.

I would really appreciate it if you could take a step back, and perhaps
climb down from your pedestal when you communicate with me. Throwing around
terms like "unjustified" in this context is unhelpful and comes off as
unduly arrogant, and frankly just irritated me. We all know you are one of
the smartest and best hackers on this list, but you really don't need to
take that kind of attitude with people who are just trying to do their best
to move things forward. There are a million ways you could have started a
discussion about the version number changes without offending me.

Clearly the new test needs to be deleted.

No, that isn't clear.

  I don't know what to do about
the false commit message, though.

Again, your choice of words here, especially given the amount of time that
you have had to raise these concerns is really irritating. Even a minor
change like "misleading" would make me far more likely to engage you
seriously on this point and far less likely to be irritated.

The

message doesn't reference this
ticket, so putting a correction here doesn't serve as an annotation.
I'm tempted to revert the commit,

That would be the height of rudeness and most inappropriate. You had weeks
to write this mail and chose not to. That duration means to me you have
lost any right to take unilateral action.

If anybody is going to do anything like that it should be me.

on

the basis that the code change can
later be applied again with a correct message. Then at least git blame
would point at something correct.

You had weeks to raise these points and chose to stay silent. Next time try
a different strategy. This pasdive-aggressive say nothing until the patch
is merged and then ruthlessly criticise what has been merged is not how we
are supposed to work together.

I will do something about this later.

Yves

@p5pRT
Copy link
Author

p5pRT commented Feb 24, 2018

From zefram@fysh.org

demerphq wrote​:

Zefram, that is an uncalled for comment and beneath you.

It is uncalled for to say that there are a couple of problems with
the commit? Am I misunderstanding what constitutes civility?

You had weeks to raise these points and chose to stay silent. Next time try
a different strategy. This pasdive-aggressive say nothing until the patch
is merged and then ruthlessly criticise what has been merged is not how we
are supposed to work together.

This was not some ruse to make you look bad. The reason why I didn't
issue these comments before is that I wasn't aware of your commit message
and the rest until I examined it just now due to it having landed in
blead. It's true that I missed an opportunity to review it when you
pushed a branch, in that I didn't look at the branch. I was to some
extent trusting that your commit would reflect the understanding that
you had just expressed.

From my point of view you didn't drop your objections, you disengaged from
the process when challenged.

As I recall, I responded repeatedly when you challenged my objections.
I don't see any disengagement there. The point where I stopped responding
was where you stopped promoting the misconceptions to which I objected.
You switched to a question of taste, on which I had already stated my
opinion, and I had nothing further to add in response to your contrary
opinion.

That would be the height of rudeness and most inappropriate.

Well, that's part of why I didn't just do it, but floated the idea first.

-zefram

@p5pRT
Copy link
Author

p5pRT commented Feb 24, 2018

From @demerphq

On 24 February 2018 at 02​:34, Zefram <zefram@​fysh.org> wrote​:

demerphq wrote​:

Zefram, that is an uncalled for comment and beneath you.

It is uncalled for to say that there are a couple of problems with
the commit? Am I misunderstanding what constitutes civility?

I was referring to "inadequacy of the code change", which I dont feel
is outright uncivil, but IMO is the type of comment that is beneath
someone of your obvious intelligence.

You had weeks to raise these points and chose to stay silent. Next time try
a different strategy. This pasdive-aggressive say nothing until the patch
is merged and then ruthlessly criticise what has been merged is not how we
are supposed to work together.

This was not some ruse to make you look bad.

Ok, I accept that. But please consider my feedback about your
communication style. You do not need to embed snide comments in your
feedback. It is quite off-putting, even for someone like myself who
has massive respect for you.

The reason why I didn't
issue these comments before is that I wasn't aware of your commit message
and the rest until I examined it just now due to it having landed in
blead. It's true that I missed an opportunity to review it when you
pushed a branch, in that I didn't look at the branch. I was to some
extent trusting that your commit would reflect the understanding that
you had just expressed.

And I still feel it does, however I respect that you feel differently
and will do my best to address your concerns.

Honestly if you had worded things differently I would have been
*happy* to do so.

From my point of view you didn't drop your objections, you disengaged from
the process when challenged.

As I recall, I responded repeatedly when you challenged my objections.
I don't see any disengagement there. The point where I stopped responding
was where you stopped promoting the misconceptions to which I objected.
You switched to a question of taste, on which I had already stated my
opinion, and I had nothing further to add in response to your contrary
opinion.

Well, given I provided a patch to review, and you did not engage my
final mail, and then hit me with a highly opinionated critique after I
pushed it certainly felt very passive-aggressive.

That would be the height of rudeness and most inappropriate.

Well, that's part of why I didn't just do it, but floated the idea first.

I will do my best to address this today. However I have other
obligations today, and may not get to it until tomorrow.

Rest assured I will do my best to respond to your feedback in a
mutually satisfactory way.

cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Feb 24, 2018

From @xsawyerx

On 02/24/2018 05​:00 AM, demerphq wrote​:

On 24 February 2018 at 02​:34, Zefram <zefram@​fysh.org> wrote​:

demerphq wrote​:

Zefram, that is an uncalled for comment and beneath you.
It is uncalled for to say that there are a couple of problems with
the commit? Am I misunderstanding what constitutes civility?
I was referring to "inadequacy of the code change", which I dont feel
is outright uncivil, but IMO is the type of comment that is beneath
someone of your obvious intelligence.

I think a different phrasing reflecting Zefram's intent would be "I
don't think this change is sufficient enough to cover the problem."

@p5pRT
Copy link
Author

p5pRT commented Feb 24, 2018

From @demerphq

On 24 Feb 2018 3​:11 pm, "Sawyer X" <xsawyerx@​gmail.com> wrote​:

On 02/24/2018 05​:00 AM, demerphq wrote​:

On 24 February 2018 at 02​:34, Zefram <zefram@​fysh.org> wrote​:

demerphq wrote​:

Zefram, that is an uncalled for comment and beneath you.
It is uncalled for to say that there are a couple of problems with
the commit? Am I misunderstanding what constitutes civility?
I was referring to "inadequacy of the code change", which I dont feel
is outright uncivil, but IMO is the type of comment that is beneath
someone of your obvious intelligence.

I think a different phrasing reflecting Zefram's intent would be "I
don't think this change is sufficient enough to cover the problem."

Zefram and I had a nice chat and a virtual handshake in private message
after these mails were posted.

I feel whatever misunderstanding there was between us on this matter has
been resolved and he and I have since worked together to figure out what to
do with Carp; which I might add is devilishly complex to change to address
these issues.

Sorry for involving the list in this, and thank you for offering your
thoughts.

Yves

@p5pRT
Copy link
Author

p5pRT commented Feb 24, 2018

From @demerphq

On 24 February 2018 at 01​:00, Zefram <zefram@​fysh.org> wrote​:

demerphq wrote​:

Patch pushed as 4764858

There are a couple of problems with this patch, going beyond my
distaste for the inadequacy of the code change. Firstly, the commit
message contains exactly the kind of misconception that I argued against.
Rather than describing the effect as partial, the message says "This patch
safely iterates all elements", "This prevent crashing perl", and "It [is]
... workaround for Carp module". (Bad grammar all in the actual message.)
This is describing the patch as having an effectiveness that it doesn't.

Ok, so, I have pushed​:

02c84d7

Which I believe documents your concerns here in both the code and the
commit message.

Secondly, the newly added test is predicated on that very same mythical
effectiveness. It sets up a stack refcounting failure, at which point
behaviour is undefined, whether the Carp patch is applied or not.
This is not a correct test​: it can fail without anything getting more
wrong than it already is.

We can debate this part now. ;-)

Do you really think this test should be removed? It seems a reasonable
thing to test given the intent of this patch series.

The last argument advanced in favour of this patch, unlike many that
had preceded it, was supposedly based on a correct understanding of its
limitations. I didn't argue against that, because my greater objection
is about the proposed patch being misunderstood and oversold. Where it's
correctly understood, it's more a matter of opinion whether it's useful.
But the patch as applied doesn't seem to have taken that correct
understanding into account. It is firmly based on the misconception,
and I seem to have been fooled into dropping my objections by some talk
that doesn't actually reflect the totality of the patch.

I really hope that the commentary I added satisfies your objections here.

I hope you understand that I had no intention to fool you in any way,
(and why you suggesting such might have annoyed me in previous
replies.)

Bonus problem​: the patch desynchronises the version numbers of Carp and
Carp​::Heavy, giving the former an unjustified underscored version.

/This/ point we need to debate. :-)

I believe that every distinct code state of a module in dist/ should
have a distinct version number.

I also believe that minor changes in dist modules during the lifetime
of a dev version should be marked as a "dev" versions, and that only
prior to release should we bump them to a "non-dev" version. I believe
that others have felt the same way in at least some cases as Carp has
code to defend against dev/underbar versions.

Related to this, I have bumped the versions twice now, meaning two
conceptually related changes have resulted in a jump from 1.45 to
1.48. Personally I think this is silly as it means there are version
gaps in the CPAN published copies of this code, for instance as far as
I can tell the latest version on CPAN is 1.38.

Clearly the new test needs to be deleted.

I really don't understand why, and have not done so myself.

I don't know what to do about
the false commit message, though. The message doesn't reference this
ticket, so putting a correction here doesn't serve as an annotation.
I'm tempted to revert the commit, on the basis that the code change can
later be applied again with a correct message. Then at least git blame
would point at something correct.

Given the code churn in Carp this was no longer an option, or at least
was a bad option.

I elected to push 02c84d7 as a
compromise. (See below)

I hope with the sequence of changes here you feel that your concerns
have been adequately addressed.

Cheers,
Yves

commit 02c84d7
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Sat Feb 24 12​:48​:46 2018 +0100

  Carp​: add comment explaining the fix for perl #131046

  In 4764858 I added code that
  eval's the argument extraction from DB​::args() (code written by
  both Pali and myself independently).

  Unfortunately the commit message for that commit was somewhat
  misleading, making it sound like the patch was a complete fix for
  the underlying problem of stack-not-refcounted bugs, when in fact
  it is actually a workaround, and a not-universally popular one
  either due to its incompleteness. For more details of why it is
  not popular see Zeframs commentary in
  https://rt.perl.org/Public/Bug/Display.html?id=131046

  Despite the concerns expressed by Zefram when considered from the
  POV of a large enterprise using Carp to trap exceptions from
  application code, having Carp throw its own exceptions about
  stack-not-refcounted bugs and thus hiding the applications error
  data is a significant issue.

  So in 4764858 we use eval to
  handle those cases where Perl /can/ detect a stack-not-refcounted
  error, and thus preserve as much of the applications error data
  as possible.

  This commit adds a comment to Carp which spells all of this out.

Inline Patch
diff --git a/dist/Carp/Changes b/dist/Carp/Changes
index db39203..2b549d9 100644
--- a/dist/Carp/Changes
+++ b/dist/Carp/Changes
@@ -1,3 +1,9 @@
+version 1.49
+
+ * comment only change, document the change from 1.47 better
+   and create a commit in blead-perl which can be used to
+   reference this issue from the bug report.
+
 version 1.48

  * guard against hand-rolled UNIVERSAL::can() implementations
diff --git a/dist/Carp/lib/Carp.pm b/dist/Carp/lib/Carp.pm
index 9956f12..419ba9c 100644
--- a/dist/Carp/lib/Carp.pm
+++ b/dist/Carp/lib/Carp.pm
@@ -116,7 +116,7 @@ BEGIN {
        ;
 }

-our $VERSION = '1.48';
+our $VERSION = '1.49';
 $VERSION =~ tr/_//d;

 our $MaxEvalLen = 0;
@@ -232,7 +232,18 @@ sub caller_info {

     my $sub_name = Carp::get_subname( \%call_info );
     if ( $call_info{has_args} ) {
-        # guard our serialization of the stack from stack refcounting bugs
+        # Guard our serialization of the stack from stack refcounting bugs
+        # NOTE this is NOT a complete solution, we cannot 100% guard against
+        # these bugs.  However in many cases Perl *is* capable of detecting
+        # them and throws an error when it does.  Unfortunately serializing
+        # the arguments on the stack is a perfect way of finding these bugs,
+        # even when they would not affect normal program flow that did not
+        # poke around inside the stack.  Inside of Carp.pm it makes little
+        # sense reporting these bugs, as Carp's job is to report the callers
+        # errors, not the ones it might happen to tickle while doing so.
+        # See: https://rt.perl.org/Public/Bug/Display.html?id=131046
+        # and: https://rt.perl.org/Public/Bug/Display.html?id=52610
+        # for more details and discussion. - Yves
         my @args = map {
                 my $arg;
                 local $@= $@;
diff --git a/dist/Carp/lib/Carp/Heavy.pm b/dist/Carp/lib/Carp/Heavy.pm
index 5188f40..e2b7292 100644
--- a/dist/Carp/lib/Carp/Heavy.pm
+++ b/dist/Carp/lib/Carp/Heavy.pm
@@ -2,7 +2,7 @@ package Carp::Heavy;

 use Carp ();

-our $VERSION = '1.48';
+our $VERSION = '1.49';
 $VERSION =~ tr/_//d;

 # Carp::Heavy was merged into Carp in version 1.12.  Any mismatched versions


-- 

perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Feb 24, 2018

From @cpansprout

On Sat, 24 Feb 2018 04​:33​:59 -0800, demerphq wrote​:

On 24 February 2018 at 01​:00, Zefram <zefram@​fysh.org> wrote​:

Bonus problem​: the patch desynchronises the version numbers of Carp
and
Carp​::Heavy, giving the former an unjustified underscored version.

/This/ point we need to debate. :-)

I believe that every distinct code state of a module in dist/ should
have a distinct version number.

Every commit? Or just every release?

I also believe that minor changes in dist modules during the lifetime
of a dev version should be marked as a "dev" versions, and that only
prior to release should we bump them to a "non-dev" version. I believe
that others have felt the same way in at least some cases as Carp has
code to defend against dev/underbar versions.

That puts a burden on perl maintainers to remember which version scheme to use in each case. I seem to remember it was Ricardo Signes that ruled that upstream-blead modules should not have underscores in their versions. That makes things much simpler for me, at least.

Related to this, I have bumped the versions twice now, meaning two
conceptually related changes have resulted in a jump from 1.45 to
1.48. Personally I think this is silly as it means there are version
gaps in the CPAN published copies of this code, for instance as far as
I can tell the latest version on CPAN is 1.38.

Again, we don’t need new versions for every commit, just every release.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Feb 24, 2018

From @demerphq

On 24 February 2018 at 16​:39, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

On Sat, 24 Feb 2018 04​:33​:59 -0800, demerphq wrote​:

On 24 February 2018 at 01​:00, Zefram <zefram@​fysh.org> wrote​:

Bonus problem​: the patch desynchronises the version numbers of Carp
and
Carp​::Heavy, giving the former an unjustified underscored version.

/This/ point we need to debate. :-)

I believe that every distinct code state of a module in dist/ should
have a distinct version number.

Every commit? Or just every release?

Every commit. And I thought we had tests for that too. Maybe I am
wrong tho. I haven't double checked. Anytime i have run a full
make-test after modifying something in dist i have also bumped the
version number.

I also believe that minor changes in dist modules during the lifetime
of a dev version should be marked as a "dev" versions, and that only
prior to release should we bump them to a "non-dev" version. I believe
that others have felt the same way in at least some cases as Carp has
code to defend against dev/underbar versions.

That puts a burden on perl maintainers to remember which version scheme to use in each case. I seem to remember it was Ricardo Signes that ruled that upstream-blead modules should not have underscores in their versions. That makes things much simpler for me, at least.

Er, I would have thought that it would be simple. Use underbars, and
only when preparing a release should we change to a non-underbar.

Anyway, since you say Ricardo ruled otherwise, then I accept that.
Perhaps we should remove the support for underbars from the code.

Related to this, I have bumped the versions twice now, meaning two
conceptually related changes have resulted in a jump from 1.45 to
1.48. Personally I think this is silly as it means there are version
gaps in the CPAN published copies of this code, for instance as far as
I can tell the latest version on CPAN is 1.38.

Again, we don’t need new versions for every commit, just every release.

So every minor version?

Personally i think that is the wrong policy, but I dont care enough to
argue for the alternative. I will stop bumping versions in dist then.

But for the record, i dont seem to be the only one who bumps the
version numbers in their patches.

Yves

@p5pRT
Copy link
Author

p5pRT commented Feb 24, 2018

From @cpansprout

On Sat, 24 Feb 2018 08​:07​:08 -0800, demerphq wrote​:

On 24 February 2018 at 16​:39, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

On Sat, 24 Feb 2018 04​:33​:59 -0800, demerphq wrote​:

On 24 February 2018 at 01​:00, Zefram <zefram@​fysh.org> wrote​:

Bonus problem​: the patch desynchronises the version numbers of
Carp
and
Carp​::Heavy, giving the former an unjustified underscored version.

/This/ point we need to debate. :-)

I believe that every distinct code state of a module in dist/ should
have a distinct version number.

Every commit? Or just every release?

Every commit. And I thought we had tests for that too. Maybe I am
wrong tho. I haven't double checked. Anytime i have run a full
make-test after modifying something in dist i have also bumped the
version number.

I also believe that minor changes in dist modules during the
lifetime
of a dev version should be marked as a "dev" versions, and that only
prior to release should we bump them to a "non-dev" version. I
believe
that others have felt the same way in at least some cases as Carp
has
code to defend against dev/underbar versions.

That puts a burden on perl maintainers to remember which version
scheme to use in each case. I seem to remember it was Ricardo Signes
that ruled that upstream-blead modules should not have underscores in
their versions. That makes things much simpler for me, at least.

Er, I would have thought that it would be simple. Use underbars, and
only when preparing a release should we change to a non-underbar.

Anyway, since you say Ricardo ruled otherwise, then I accept that.
Perhaps we should remove the support for underbars from the code.

But we will need to put it back if we backport Carp patches to maint branches.

Related to this, I have bumped the versions twice now, meaning two
conceptually related changes have resulted in a jump from 1.45 to
1.48. Personally I think this is silly as it means there are version
gaps in the CPAN published copies of this code, for instance as far
as
I can tell the latest version on CPAN is 1.38.

Again, we don’t need new versions for every commit, just every
release.

So every minor version?

Every blead release. 5.27.9 had Carp 1.26. 5.27.8 can have Carp 1.27.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Feb 25, 2018

From @demerphq

On 24 February 2018 at 16​:39, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

On Sat, 24 Feb 2018 04​:33​:59 -0800, demerphq wrote​:

On 24 February 2018 at 01​:00, Zefram <zefram@​fysh.org> wrote​:

Bonus problem​: the patch desynchronises the version numbers of Carp
and
Carp​::Heavy, giving the former an unjustified underscored version.

/This/ point we need to debate. :-)

I believe that every distinct code state of a module in dist/ should
have a distinct version number.

Every commit? Or just every release?

I also believe that minor changes in dist modules during the lifetime
of a dev version should be marked as a "dev" versions, and that only
prior to release should we bump them to a "non-dev" version. I believe
that others have felt the same way in at least some cases as Carp has
code to defend against dev/underbar versions.

That puts a burden on perl maintainers to remember which version scheme to use in each case. I seem to remember it was Ricardo Signes that ruled that upstream-blead modules should not have underscores in their versions. That makes things much simpler for me, at least.

This just came up in another context for me. Doesnt this policy just
move the burden around? Who is responsible for bumping the version
then? The first person to change the file in a given release? The RM?
It still seems to me that a version bump every patch is the sanest way
to deal with this stuff while at the same time not externalizing the
costs of the version updates onto someone else...

Yves

@p5pRT
Copy link
Author

p5pRT commented Feb 25, 2018

From @cpansprout

On Sun, 25 Feb 2018 02​:19​:50 -0800, demerphq wrote​:

On 24 February 2018 at 16​:39, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

That puts a burden on perl maintainers to remember which version
scheme to use in each case. I seem to remember it was Ricardo Signes
that ruled that upstream-blead modules should not have underscores in
their versions. That makes things much simpler for me, at least.

This just came up in another context for me. Doesnt this policy just
move the burden around? Who is responsible for bumping the version
then? The first person to change the file in a given release?

Yes, and we have a porting test that politely reminds the first person who changes each module.

The RM?
It still seems to me that a version bump every patch is the sanest way
to deal with this stuff while at the same time not externalizing the
costs of the version updates onto someone else...

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Mar 30, 2018

From zefram@fysh.org

demerphq wrote​:

02c84d7

That'll do, for accurately describing the purpose of the code change.

Do you really think this test should be removed?

Yes. It is not a reasonable thing to test, because, although the code
change aims to avoid crashing in that kind of situation, it doesn't (and
can't) guarantee to avoid a crash. A small change in memory allocation
patterns in the core, or a platform-dependent difference in such patterns,
could make that test fail, without introducing any flaw into the Carp
logic that we don't already know about. We also have no basis to say that
such minor differences in memory usage are themselves bugs. We therefore
have no reasonable expectation that this test would reliably pass.

I have deleted the offending test in commit
a77eff3.

-zefram

Vogtinator added a commit to Vogtinator/os-autoinst-distri-opensuse that referenced this issue Jun 10, 2021
Avoid a harmless warning, which however triggers printing a backtrace,
which crashes perl because one of the callers has improper stack variable
management (Perl/perl5#15928).
Vogtinator added a commit to os-autoinst/os-autoinst-distri-opensuse that referenced this issue Jun 10, 2021
Avoid a harmless warning, which however triggers printing a backtrace,
which crashes perl because one of the callers has improper stack variable
management (Perl/perl5#15928).
@jkeenan
Copy link
Contributor

jkeenan commented Dec 8, 2021

There has been no further discussion in this ticket since March 2018. Much of the discussion went on a tangent about version numbering; that would be better handled in a separate ticket.

I am taking this ticket for the purpose of closing it within 7 days unless someone raises a serious objection.

Thank you very much.
Jim Keenan

@jkeenan jkeenan self-assigned this Dec 8, 2021
@jkeenan
Copy link
Contributor

jkeenan commented Dec 20, 2021

There has been no further discussion in this ticket since March 2018. Much of the discussion went on a tangent about version numbering; that would be better handled in a separate ticket.

I am taking this ticket for the purpose of closing it within 7 days unless someone raises a serious objection.

Thank you very much. Jim Keenan

Closing ticket as per previous post.

@jkeenan jkeenan closed this as completed Dec 20, 2021
@poti1
Copy link

poti1 commented Aug 11, 2023

So I'm trying to find out how the above patch can cause a segmentation fault.

I'm in a production environment which is using confess and it works.
But occasionally confess causes a segmentation fault.

I traced the issue to Carp::caller_info I add a print before and after that map statement.
Once in a while only the print statement before the map is shown and the map or something else produces a SEGV.

I have an eval { ... create_an_object_which_uses_confess ... }, but that obviously does not catch SEGV and instead my test environment would not work correctly.

@poti1
Copy link

poti1 commented Aug 11, 2023

Anyone have an idea of why/how @DB::args can cause a SEGV?

btw, the test environment has about 21 stack levels and would fail after stack 20.

@demerphq
Copy link
Collaborator

demerphq commented Aug 11, 2023 via email

@poti1
Copy link

poti1 commented Aug 11, 2023

I had made a Devel::Peek::Dump of @DB::args and had seen these interesting lines (which supports your above comment):

    SV = UNKNOWN(0xff) (0x5609f15806e8) at 0x5609eac59660
      REFCNT = 0
      FLAGS = ()

@poti1
Copy link

poti1 commented Aug 11, 2023

Do you have a script/one-liner I can run to test this out? (your above snippet didnt cause SEGV).

Im my testing environment there is an eval around the function that runs confess.

I'm trying to get a small snippet to simulate this bug.

@demerphq
Copy link
Collaborator

demerphq commented Aug 11, 2023 via email

@demerphq
Copy link
Collaborator

demerphq commented Aug 11, 2023 via email

@poti1
Copy link

poti1 commented Aug 11, 2023

Thanks for that description: it helps to better understand how things work.

@poti1
Copy link

poti1 commented Aug 11, 2023

In my specific scenario, a bug was fixed by localizing a global cache:

local $GlobalCache = GlobalCache->new();

But that one statement somehow, 20 layers or so further down would cause the SEGV from confess.

@poti1
Copy link

poti1 commented Aug 12, 2023

So to figure out where it's breaking, I should see in the call frame what used to be the input (20 frames up) which now is REFCNT = 0.

Then somehow find what removed the data.

@demerphq
Copy link
Collaborator

demerphq commented Aug 12, 2023 via email

@iabyn
Copy link
Contributor

iabyn commented Aug 14, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants