New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PATCH] Carp: Do not crash when reading @DB::args #15928
Comments
From @paliCreated by @paliTrying to read values from array @DB::args can lead to perl fatal error Carp is primary used for reporting warnings and errors from other modules, This patch safely iterates all elements of @DB::args array via eval { } This prevent crashing perl and allows to use Carp module. It it not a Bug: https://rt.perl.org/Public/Bug/Display.html?id=52610 Perl Info
|
From @pali0001-Carp-Do-not-crash-when-reading-DB-args.patchFrom a0be329581b1a1065ccb2efac495a670a953fa3e Mon Sep 17 00:00:00 2001
From: Pali <pali@cpan.org>
Date: Tue, 21 Mar 2017 18:43:45 +0100
Subject: [PATCH] Carp: Do not crash when reading @DB::args
Trying to read values from array @DB::args can lead to perl fatal error
"Bizarre copy of ARRAY in scalar assignment". But missing, incomplete or
possible incorrect value in @DB::args is not a fatal error for Carp.
Carp is primary used for reporting warnings and errors from other modules,
so it should not crash perl when trying to print error message.
This patch safely iterates all elements of @DB::args array via eval { }
block and replace already freed scalars for Carp usage by string
"** argument not available anymore **".
This prevent crashing perl and allows to use Carp module. It it not a
proper fix but rather workaround for Carp module. At least it allows to
safely use Carp.
Bug: https://rt.perl.org/Public/Bug/Display.html?id=52610
---
dist/Carp/lib/Carp.pm | 12 +++++++++++-
dist/Carp/t/rt52610_crash.t | 25 +++++++++++++++++++++++++
2 files changed, 36 insertions(+), 1 deletion(-)
create mode 100644 dist/Carp/t/rt52610_crash.t
diff --git a/dist/Carp/lib/Carp.pm b/dist/Carp/lib/Carp.pm
index 05052b9..e288ede 100644
--- a/dist/Carp/lib/Carp.pm
+++ b/dist/Carp/lib/Carp.pm
@@ -226,7 +226,17 @@ sub caller_info {
= "** Incomplete caller override detected$where; \@DB::args were not set **";
}
else {
- @args = @DB::args;
+ @args = map {
+ my $arg;
+ local $@;
+ eval {
+ $arg = $_;
+ 1;
+ } or do {
+ $arg = '** argument not available anymore **';
+ };
+ $arg;
+ } @DB::args;
my $overflow;
if ( $MaxArgNums and @args > $MaxArgNums )
{ # More than we want to show?
diff --git a/dist/Carp/t/rt52610_crash.t b/dist/Carp/t/rt52610_crash.t
new file mode 100644
index 0000000..faa19cb
--- /dev/null
+++ b/dist/Carp/t/rt52610_crash.t
@@ -0,0 +1,25 @@
+use warnings;
+use strict;
+
+use Test::More tests => 1;
+
+use Carp ();
+
+sub do_carp {
+ Carp::longmess;
+}
+
+sub call_with_args {
+ my ($arg_hash, $func) = @_;
+ $func->(@{$arg_hash->{'args'}});
+}
+
+my $msg;
+my $h = {};
+my $arg_hash = {'args' => [undef]};
+call_with_args($arg_hash, sub {
+ $arg_hash->{'args'} = [];
+ $msg = do_carp(sub { $h; });
+});
+
+like $msg, qr/^ at.+\b(?i:rt52610_crash\.t) line \d+\.\n\tmain::__ANON__\(.*\) called at.+\b(?i:rt52610_crash\.t) line \d+\n\tmain::call_with_args\(HASH\(0x[[:xdigit:]]+\), CODE\(0x[[:xdigit:]]+\)\) called at.+\b(?i:rt52610_crash\.t) line \d+$/;
--
1.7.9.5
|
From zefram@fysh.orgvia RT wrote:
It's not safe. The underlying problem is stack refcounting, leading to This manifestation of the stack refcounting problem can only be solved -zefram |
The RT System itself - Status changed from 'new' to 'open' |
From @paliOn Saturday 25 March 2017 14:40:18 Zefram via RT wrote:
This patch is fixing problem with Carp when it dies with error "Bizarre Working Carp module is crucial for debugging and reporting errors. So it |
From @demerphqOn 25 March 2017 at 22:39, Zefram <zefram@fysh.org> wrote:
I really feel this argument is unsound. We can encounter a stack It is also unsympathetic to the needs of the developer, if I am
This ticket isn't about fixing the stack refcounting, it is about not I wrote almost exactly the same patch that pali did, and I really Also I guess you missed it, at the time, but pali said as much as I cheers, |
From zefram@fysh.orgdemerphq wrote:
We can indeed, but that doesn't make it any more possible to work around
Yes, and that objective is not achievable. The stack-not-refcounted -zefram |
From @demerphqOn 1 February 2018 at 10:17, Zefram <zefram@fysh.org> wrote:
I really feel like you are missing the point. By adding the eval we do If the stack refcounting bug leads to a SEGV, then so be it, there is By that I mean the fact I know about this issue is because our error Your argument comes down to "we cant do this perfectly so we shouldn't Carp throwing trappable exceptions while serializing the stack is of With respect, unless you can provide a better argument or you decide cheers, -- |
From @demerphqOn 1 February 2018 at 10:35, demerphq <demerphq@gmail.com> wrote:
Patch pushed as 4764858 Note this ticket is a duplicate of Perl #52610 Yves -- |
From @demerphqOn 23 February 2018 at 10:36, demerphq <demerphq@gmail.com> wrote:
Actually, this ticket is a workaround for the bug document at Perl Yves -- |
From zefram@fysh.orgdemerphq wrote:
There are a couple of problems with this patch, going beyond my Secondly, the newly added test is predicated on that very same mythical The last argument advanced in favour of this patch, unlike many that Bonus problem: the patch desynchronises the version numbers of Carp and Clearly the new test needs to be deleted. I don't know what to do about -zefram |
From @demerphqOn 24 Feb 2018 08:01, "Zefram" <zefram@fysh.org> wrote: demerphq wrote:
There are a couple of problems with this patch, going beyond my Zefram, that is an uncalled for comment and beneath you. Firstly , the commit You had *weeks* to point this out and did not. Rather than describing the effect as partial, the message says "This patch You had weeks to point this out and did not. Secondly, the newly added test is predicated on that very same mythical You had weeks to point this out and did not. The last argument advanced in favour of this patch, unlike many that Again, you had weeks to comment on the patch and did not. What is the point of pushing a patch for review if people like you wait to Where it's No it's not. It is clearly useful in some cases. But the patch as applied doesn't seem to have taken that correct From my point of view you didn't drop your objections, you disengaged from Again, you had weeks to raise any and all of these points but you chose not Bonus problem: the patch desynchronises the version numbers of Carp and That is a mistake, which I honestly thought I had fixed. Another patch was I would really appreciate it if you could take a step back, and perhaps Clearly the new test needs to be deleted. No, that isn't clear. I don't know what to do about Again, your choice of words here, especially given the amount of time that The message doesn't reference this That would be the height of rudeness and most inappropriate. You had weeks If anybody is going to do anything like that it should be me. on the basis that the code change can You had weeks to raise these points and chose to stay silent. Next time try I will do something about this later. Yves |
From zefram@fysh.orgdemerphq wrote:
It is uncalled for to say that there are a couple of problems with
This was not some ruse to make you look bad. The reason why I didn't
As I recall, I responded repeatedly when you challenged my objections.
Well, that's part of why I didn't just do it, but floated the idea first. -zefram |
From @demerphqOn 24 February 2018 at 02:34, Zefram <zefram@fysh.org> wrote:
I was referring to "inadequacy of the code change", which I dont feel
Ok, I accept that. But please consider my feedback about your
And I still feel it does, however I respect that you feel differently Honestly if you had worded things differently I would have been
Well, given I provided a patch to review, and you did not engage my
I will do my best to address this today. However I have other Rest assured I will do my best to respond to your feedback in a cheers, -- |
From @xsawyerxOn 02/24/2018 05:00 AM, demerphq wrote:
I think a different phrasing reflecting Zefram's intent would be "I |
From @demerphqOn 24 Feb 2018 3:11 pm, "Sawyer X" <xsawyerx@gmail.com> wrote: On 02/24/2018 05:00 AM, demerphq wrote:
I think a different phrasing reflecting Zefram's intent would be "I Zefram and I had a nice chat and a virtual handshake in private message I feel whatever misunderstanding there was between us on this matter has Sorry for involving the list in this, and thank you for offering your Yves |
From @demerphqOn 24 February 2018 at 01:00, Zefram <zefram@fysh.org> wrote:
Ok, so, I have pushed: Which I believe documents your concerns here in both the code and the
We can debate this part now. ;-) Do you really think this test should be removed? It seems a reasonable
I really hope that the commentary I added satisfies your objections here. I hope you understand that I had no intention to fool you in any way,
/This/ point we need to debate. :-) I believe that every distinct code state of a module in dist/ should I also believe that minor changes in dist modules during the lifetime Related to this, I have bumped the versions twice now, meaning two
I really don't understand why, and have not done so myself.
Given the code churn in Carp this was no longer an option, or at least I elected to push 02c84d7 as a I hope with the sequence of changes here you feel that your concerns Cheers, commit 02c84d7 Carp: add comment explaining the fix for perl #131046 In 4764858 I added code that Unfortunately the commit message for that commit was somewhat Despite the concerns expressed by Zefram when considered from the So in 4764858 we use eval to This commit adds a comment to Carp which spells all of this out. Inline Patchdiff --git a/dist/Carp/Changes b/dist/Carp/Changes
index db39203..2b549d9 100644
--- a/dist/Carp/Changes
+++ b/dist/Carp/Changes
@@ -1,3 +1,9 @@
+version 1.49
+
+ * comment only change, document the change from 1.47 better
+ and create a commit in blead-perl which can be used to
+ reference this issue from the bug report.
+
version 1.48
* guard against hand-rolled UNIVERSAL::can() implementations
diff --git a/dist/Carp/lib/Carp.pm b/dist/Carp/lib/Carp.pm
index 9956f12..419ba9c 100644
--- a/dist/Carp/lib/Carp.pm
+++ b/dist/Carp/lib/Carp.pm
@@ -116,7 +116,7 @@ BEGIN {
;
}
-our $VERSION = '1.48';
+our $VERSION = '1.49';
$VERSION =~ tr/_//d;
our $MaxEvalLen = 0;
@@ -232,7 +232,18 @@ sub caller_info {
my $sub_name = Carp::get_subname( \%call_info );
if ( $call_info{has_args} ) {
- # guard our serialization of the stack from stack refcounting bugs
+ # Guard our serialization of the stack from stack refcounting bugs
+ # NOTE this is NOT a complete solution, we cannot 100% guard against
+ # these bugs. However in many cases Perl *is* capable of detecting
+ # them and throws an error when it does. Unfortunately serializing
+ # the arguments on the stack is a perfect way of finding these bugs,
+ # even when they would not affect normal program flow that did not
+ # poke around inside the stack. Inside of Carp.pm it makes little
+ # sense reporting these bugs, as Carp's job is to report the callers
+ # errors, not the ones it might happen to tickle while doing so.
+ # See: https://rt.perl.org/Public/Bug/Display.html?id=131046
+ # and: https://rt.perl.org/Public/Bug/Display.html?id=52610
+ # for more details and discussion. - Yves
my @args = map {
my $arg;
local $@= $@;
diff --git a/dist/Carp/lib/Carp/Heavy.pm b/dist/Carp/lib/Carp/Heavy.pm
index 5188f40..e2b7292 100644
--- a/dist/Carp/lib/Carp/Heavy.pm
+++ b/dist/Carp/lib/Carp/Heavy.pm
@@ -2,7 +2,7 @@ package Carp::Heavy;
use Carp ();
-our $VERSION = '1.48';
+our $VERSION = '1.49';
$VERSION =~ tr/_//d;
# Carp::Heavy was merged into Carp in version 1.12. Any mismatched versions
-- perl -Mre=debug -e "/just|another|perl|hacker/" |
From @cpansproutOn Sat, 24 Feb 2018 04:33:59 -0800, demerphq wrote:
Every commit? Or just every release?
That puts a burden on perl maintainers to remember which version scheme to use in each case. I seem to remember it was Ricardo Signes that ruled that upstream-blead modules should not have underscores in their versions. That makes things much simpler for me, at least.
Again, we don’t need new versions for every commit, just every release. -- Father Chrysostomos |
From @demerphqOn 24 February 2018 at 16:39, Father Chrysostomos via RT
Every commit. And I thought we had tests for that too. Maybe I am
Er, I would have thought that it would be simple. Use underbars, and Anyway, since you say Ricardo ruled otherwise, then I accept that.
So every minor version? Personally i think that is the wrong policy, but I dont care enough to But for the record, i dont seem to be the only one who bumps the Yves |
From @cpansproutOn Sat, 24 Feb 2018 08:07:08 -0800, demerphq wrote:
But we will need to put it back if we backport Carp patches to maint branches.
Every blead release. 5.27.9 had Carp 1.26. 5.27.8 can have Carp 1.27. -- Father Chrysostomos |
From @demerphqOn 24 February 2018 at 16:39, Father Chrysostomos via RT
This just came up in another context for me. Doesnt this policy just Yves |
From @cpansproutOn Sun, 25 Feb 2018 02:19:50 -0800, demerphq wrote:
Yes, and we have a porting test that politely reminds the first person who changes each module.
-- Father Chrysostomos |
From zefram@fysh.orgdemerphq wrote: That'll do, for accurately describing the purpose of the code change.
Yes. It is not a reasonable thing to test, because, although the code I have deleted the offending test in commit -zefram |
Avoid a harmless warning, which however triggers printing a backtrace, which crashes perl because one of the callers has improper stack variable management (Perl/perl5#15928).
Avoid a harmless warning, which however triggers printing a backtrace, which crashes perl because one of the callers has improper stack variable management (Perl/perl5#15928).
There has been no further discussion in this ticket since March 2018. Much of the discussion went on a tangent about version numbering; that would be better handled in a separate ticket. I am taking this ticket for the purpose of closing it within 7 days unless someone raises a serious objection. Thank you very much. |
Closing ticket as per previous post. |
So I'm trying to find out how the above patch can cause a segmentation fault. I'm in a production environment which is using I traced the issue to I have an |
Anyone have an idea of why/how btw, the test environment has about 21 stack levels and would fail after stack 20. |
On Fri, 11 Aug 2023, 09:24 poti1, ***@***.***> wrote:
Anyone have an idea of why/how @db::args can cause a SEGV?
Currently the stack is not refcounted and it is possible that an item gets
freed even though it is still on the stack. Often this is not an issue from
the point of view of correct code not using the freed item so you dont see
errors in normal operation, but an unrelated confess() will trigger them as
it serializes the stack. Modern versions of Carp are armored to reduce
these problems but it's pretty hard to totally eliminate them.
We have work happening right now to refcount the stack but that probably
won't help you for some time.
The general pattern for these issues is pushing a container variable of
some form onto the stack along with something it contains, and then
emptying or replacing the contents of the container. Eg something like this
untested code:
my @x = ([]);
***@***.***,$x[0]);
sub F { FF(@_) }
sub FF {
my ($ar) = @_;
@$ar = ();
confess "boom"
}
Often when we see reports about this it relates to globs, or global vars,
but it can happen with any container type var.
btw, the test environment has about 21 stack levels and would fail after
stack 20.
Presumably stack level 20 pushes the container and its contents and layer
21 empties the container and then does a confess.
Yves
… |
I had made a
|
Do you have a script/one-liner I can run to test this out? (your above snippet didnt cause SEGV). Im my testing environment there is an I'm trying to get a small snippet to simulate this bug. |
On Fri, 11 Aug 2023 at 11:09, poti1 ***@***.***> wrote:
Do you have a script/one-liner I can run to test this out? (your above
snippet didnt cause SEGV).
Im my testing environment there is an eval around the function that runs
confess.
I'm trying to get a small snippet to simulate this bug.
Sorry no, I don't have a script/one-liner to help test this out. I suspect
that more than one thing needs to go wrong at once. The code that I gave
you already is the root cause: premature freeing of a value on the stack.
However if you are on a sufficiently modern perl confess should be armored
against the scenario you showed with the REFCOUNT = 0 dump.
…--
perl -Mre=debug -e "/just|another|perl|hacker/"
|
On Fri, 11 Aug 2023 at 17:08, demerphq ***@***.***> wrote:
On Fri, 11 Aug 2023 at 11:09, poti1 ***@***.***> wrote:
> Do you have a script/one-liner I can run to test this out? (your above
> snippet didnt cause SEGV).
>
> Im my testing environment there is an eval around the function that runs
> confess.
>
> I'm trying to get a small snippet to simulate this bug.
>
Sorry no, I don't have a script/one-liner to help test this out. I
suspect that more than one thing needs to go wrong at once. The code that
I gave you already is the root cause: premature freeing of a value on the
stack. However if you are on a sufficiently modern perl confess should be
armored against the scenario you showed with the REFCOUNT = 0 dump.
Sorry, premature send, I meant to paste in the result you got earlier.
SV = UNKNOWN(0xff) (0x5609f15806e8) at 0x5609eac59660
REFCNT = 0
FLAGS = ()
I believe that confess should treat that part as equivalent to an
undef but also throw a warning. I *think* that to turn it into a SEGV
something else needs to happen, possibly Perl needs to re-use this SV
slot for a data structure that is not serializable by confess, which
then causes the SEGV. When I say "SV slot" what I mean is that Perl
allocates SV's (actually their constituent parts) in "slabs", and
reuses previously allocated SV's when they have been freed. So i
think the sequence is something like this:
1. Code pushes a container and its content into the stack at the
same time. (Not necessarily the same code.)
2. The container is emptied, resulting in the contents being freed.
3. More SV's are allocated, reusing the now freed SV. Possibly the
SV types that aren't Perl visible (for instance an INVLIST SV comes to
mind, but that is a totally off the top of my head guess.)
4. Code does a confess, which tries to the stringify the arguments,
and hits a type that can't be stringified, causing the segfault.
Assuming this is the cause of the segfault, and it is not certain,
there are other possible causes I could think of (historically if
Carp::confess ends up calling confess you can also segfault), and
while this probably /is/ deterministic, coming up with a script that
recreates the segfault is harder than coming up with a script that
replicates the premature free. The segfault will be very sensitive to
order of operations, and may even require XS code to be involved.
OTOH, In older perls we handled this special case differently during
confess, and we would segfault every time (iirc). I feel your pain on
debugging this, it used to be a regular issue until we patched confess
to be hardened against this problem.
cheers,
Yves
…--
perl -Mre=debug -e "/just|another|perl|hacker/"
|
Thanks for that description: it helps to better understand how things work. |
In my specific scenario, a bug was fixed by localizing a global cache:
But that one statement somehow, 20 layers or so further down would cause the |
So to figure out where it's breaking, I should see in the call frame what used to be the input (20 frames up) which now is Then somehow find what removed the data. |
On Sat, 12 Aug 2023, 11:57 poti1, ***@***.***> wrote:
So to figure out where it's breaking, I should see in the call frame what
used to be the input (20 frames up) which now is REFCNT = 0.
Then somehow find what removed the data.
Yeah, this is going to be tricky. One thing you might try is to replace
confess with your own stack dumper based on Devel::Peek, and then see what
it says, force triggering it one frame before the frame that currently
dies, and then compare the two. If my suspicions are correct then I think
there is at least some chance that Devel::Peek does not segfault where
confess does.
Btw do you have any sig handlers in place here?
Yves
… |
On Fri, Aug 11, 2023 at 08:22:28AM -0700, Yves Orton wrote:
Sorry, premature send, I meant to paste in the result you got earlier.
SV = UNKNOWN(0xff) (0x5609f15806e8) at 0x5609eac59660
REFCNT = 0
FLAGS = ()
I believe that confess should treat that part as equivalent to an
undef but also throw a warning.
Some of the changes in my soon-to-be-merged rc_stack branch affect
@db::args handling, even on default builds. In particular, SVs with a
refcnt of 0 are no longer aliased into @db::args, they become undef
instead.
…--
No matter how many dust sheets you use, you will get paint on the carpet.
|
Migrated from rt.perl.org#131046 (status was 'open')
Searchable as RT131046$
The text was updated successfully, but these errors were encountered: