Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

null ptr deref, segfault in Perl_pp_split () at pp.c:5738 #15576

Closed
p5pRT opened this issue Sep 1, 2016 · 9 comments
Closed

null ptr deref, segfault in Perl_pp_split () at pp.c:5738 #15576

p5pRT opened this issue Sep 1, 2016 · 9 comments

Comments

@p5pRT
Copy link

p5pRT commented Sep 1, 2016

Migrated from rt.perl.org#129158 (status was 'resolved')

Searchable as RT129158$

@p5pRT
Copy link
Author

p5pRT commented Sep 1, 2016

From @geeknik

Perl v5.25.5 (v5.25.4-25-g109ac34*), found with AFL + ASAN. A non-instrumented build of
v5.25.4-5-g92d73bf returns the valgrind output at the end.

hexdump -C over727
00000000 6d 61 70 7b 73 5b 5d 5b 5d 6f 3e 73 70 6c 69 74 |map{s[][]o>split|
00000010 00 30 2c 24 30 5b 73 70 6c 69 74 2f 28 30 29 2f |.0,$0[split/(0)/|
00000020 3e 30 5d 3e 6d 61 70 7b 30 3f 7b 7d 3a 30 7d 30 |>0]>map{0?{}​:0}0|
00000030 7d 3c 44 41 54 41 3e 5f 5f 45 4e 44 5f 5f 0a 0a |}<DATA>__END__..|
00000040 30 |0|
00000041

ASAN​:SIGSEGV

==8284==ERROR​: AddressSanitizer​: SEGV on unknown address 0x000000000000 (pc 0x000000a0c999 bp 0x7fffa83f31a0 sp 0x7fffa83f3000 T0)
  #0 0xa0c998 in Perl_pp_split /root/perl/pp.c​:5738​:5
  #1 0x7f26a3 in Perl_runops_debug /root/perl/dump.c​:2234​:23
  #2 0x5a10c6 in S_run_body /root/perl/perl.c​:2525​:2
  #3 0x5a10c6 in perl_run /root/perl/perl.c​:2448
  #4 0x4de6cd in main /root/perl/perlmain.c​:123​:9
  #5 0x7f5b16991b44 in __libc_start_main /build/glibc-uPj9cH/glibc-2.19/csu/libc-start.c​:287
  #6 0x4de33c in _start (/root/perl/perl+0x4de33c)

AddressSanitizer can not provide additional info.
SUMMARY​: AddressSanitizer​: SEGV /root/perl/pp.c​:5738 Perl_pp_split
==8284==ABORTING

Program received signal SIGSEGV, Segmentation fault.
0x0000000000a0c999 in Perl_pp_split () at pp.c​:5738
5738 rx = PM_GETRE(pm);
(gdb) bt
#0 0x0000000000a0c999 in Perl_pp_split () at pp.c​:5738
#1 0x00000000007f26a4 in Perl_runops_debug () at dump.c​:2234
#2 0x00000000005a10c7 in S_run_body (oldscope=<optimized out>) at perl.c​:2525
#3 perl_run (my_perl=<optimized out>) at perl.c​:2448
#4 0x00000000004de6ce in main (argc=<optimized out>, argv=<optimized out>, env=<optimized out>) at perlmain.c​:123

Non-instrumented v5.25.4-5-g92d73bf Valgrind output​:
==27188== Conditional jump or move depends on uninitialised value(s)
==27188== at 0x55E88C​: Perl_pp_split (pp.c​:5736)
==27188== by 0x4D6261​: Perl_runops_debug (dump.c​:2234)
==27188== by 0x452E96​: S_run_body (perl.c​:2525)
==27188== by 0x452E96​: perl_run (perl.c​:2448)
==27188== by 0x421834​: main (perlmain.c​:123)
==27188==
==27188== Conditional jump or move depends on uninitialised value(s)
==27188== at 0x4D8312​: Perl_die (util.c​:1719)
==27188== by 0x5604C1​: Perl_pp_split (pp.c​:5737)
==27188== by 0x4D6261​: Perl_runops_debug (dump.c​:2234)
==27188== by 0x452E96​: S_run_body (perl.c​:2525)
==27188== by 0x452E96​: perl_run (perl.c​:2448)
==27188== by 0x421834​: main (perlmain.c​:123)
==27188==
panic​: pp_split, pm=0, s=6def41 at over727 line 1, <DATA> line 2.

@p5pRT
Copy link
Author

p5pRT commented Sep 1, 2016

From @geeknik

over727.gz

@p5pRT
Copy link
Author

p5pRT commented Sep 12, 2016

From @cpansprout

On Thu Sep 01 01​:49​:12 2016, brian.carpenter@​gmail.com wrote​:

Perl v5.25.5 (v5.25.4-25-g109ac34*), found with AFL + ASAN. A non-
instrumented build of
v5.25.4-5-g92d73bf returns the valgrind output at the end.

I can reproduce it on dromedary, but not locally. On dromedary I don’t have a functional gdb, so it’s a little hard to debug.

I tried bisecting, but got perl-5.6.0-4727-g4cddb5c, which seems like a red herring.

I managed to reduce it to this​:

$ cat foo
map{s///o > split 0,split /0/>0}<DATA>__END__

$

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Sep 12, 2016

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Sep 12, 2016

From @iabyn

On Sun, Sep 11, 2016 at 10​:17​:39PM -0700, Father Chrysostomos via RT wrote​:

On Thu Sep 01 01​:49​:12 2016, brian.carpenter@​gmail.com wrote​:

Perl v5.25.5 (v5.25.4-25-g109ac34*), found with AFL + ASAN. A non-
instrumented build of
v5.25.4-5-g92d73bf returns the valgrind output at the end.

I can reproduce it on dromedary, but not locally. On dromedary I don’t have a functional gdb, so it’s a little hard to debug.

I tried bisecting, but got perl-5.6.0-4727-g4cddb5c, which seems like a red herring.

I managed to reduce it to this​:

$ cat foo
map{s///o > split 0,split /0/>0}<DATA>__END__

It's only an issue on non-threaded builds. It started failing sometime
between 5.16.0 and 5.18.0.
I'm looking into it.

--
"Procrastination grows to fill the available time"
  -- Mitchell's corollary to Parkinson's Law

@p5pRT
Copy link
Author

p5pRT commented Sep 26, 2016

From @iabyn

On Mon, Sep 12, 2016 at 03​:23​:07PM +0100, Dave Mitchell wrote​:

On Sun, Sep 11, 2016 at 10​:17​:39PM -0700, Father Chrysostomos via RT wrote​:

On Thu Sep 01 01​:49​:12 2016, brian.carpenter@​gmail.com wrote​:

Perl v5.25.5 (v5.25.4-25-g109ac34*), found with AFL + ASAN. A non-
instrumented build of
v5.25.4-5-g92d73bf returns the valgrind output at the end.

I can reproduce it on dromedary, but not locally. On dromedary I don’t have a functional gdb, so it’s a little hard to debug.

I tried bisecting, but got perl-5.6.0-4727-g4cddb5c, which seems like a red herring.

I managed to reduce it to this​:

$ cat foo
map{s///o > split 0,split /0/>0}<DATA>__END__

It's only an issue on non-threaded builds. It started failing sometime
between 5.16.0 and 5.18.0.
I'm looking into it.

It turns out that this is another variant of RT #124368​: on non-threaded
builds, the s///o combines​: modifying the op tree under /o to avoid
recompilation on subsequent iterations; and m// using the last successful
pattern. The combination of the two makes /o end up modifying some random
other part of the op tree.

I haven't fixed this yet, but while investigating what was going on, I
decided I would finally sort out the horrible OP_SPLIT/OP_PUSHRE mess,
which I have now done and am smoking as smoke-me/davem/pushre2.

The most significant commit from that branch is the following, which
I intend to merge into blead in few days time. Note this it is *not* a fix
for this ticket.

commit e505c6f
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Thu Sep 15 10​:59​:37 2016 +0100
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Sun Sep 25 07​:57​:49 2016 +0100

  make OP_SPLIT a PMOP, and eliminate OP_PUSHRE
 
  Most ops that execute a regex, such as match and subst, are of type PMOP.
  A PMOP allows the actual regex to be attached directly to that op, due
  to its extra fields.
 
  OP_SPLIT is different; it is just a plain LISTOP, but it always has an
  OP_PUSHRE as its first child, which *is* a PMOP and which has the regex
  attached.
 
  At runtime, pp_pushre()'s only job is to push itself (i.e. the current
  PL_op) onto the stack. Later pp_split() pops this to get access to the
  regex it wants to execute.
 
  This is a bit unpleasant, because we're pushing an OP* onto the stack,
  which is supposed to be an array of SV*'s. As a bit of a hack, on
  DEBUGGING builds we push a PVLV with the PL_op address embedded instead,
  but this still isn't very satisfactory.
 
  Now that regexes are first-class SVs, we could push a REGEXP onto the
  stack rather than PL_op. However, there is an optimisation of @​array =
  split which eliminates the assign and embeds the array's GV/padix directly
  in the PUSHRE op. So split still needs access to that op. But the pushre
  op will always be splitop->op_first anyway, so one possibility is to just
  skip executing the pushre altogether, and make pp_split just directly
  access op_first instead to get the regex and @​array info.
 
  But if we're doing that, then why not just go the full hog and make
  OP_SPLIT into a PMOP, and eliminate the OP_PUSHRE op entirely​: with the
  data that was spread across the two ops now combined into just the one
  split op.
 
  That is exactly what this commit does.
 
  For a simple compile-time pattern like split(/foo/, $s, 1), the optree
  looks like​:
 
  before​:
  <@​> split[t2] lK
  </> pushre(/"foo"/) s/RTIME
  <0> padsv[$s​:1,2] s
  <$> const(IV 1) s
 
  after​:
  </> split(/"foo"/)[t2] lK/RTIME
  <0> padsv[$s​:1,2] s
  <$> const[IV 1] s
 
  while for a run-time expression like split(/$pat/, $s, 1),
 
  before​:
  <@​> split[t3] lK
  </> pushre() sK/RTIME
  <|> regcomp(other->8) sK
  <0> padsv[$pat​:2,3] s
  <0> padsv[$s​:1,3] s
  <$> const(IV 1)s
 
  after​:
  </> split()[t3] lK/RTIME
  <|> regcomp(other->8) sK
  <0> padsv[$pat​:2,3] s
  <0> padsv[$s​:1,3] s
  <$> const[IV 1] s
 
  This makes the code faster and simpler.
 
  At the same time, two new private flags have been added for OP_SPLIT -
  OPpSPLIT_ASSIGN and OPpSPLIT_LEX - which make it explicit that the
  assign op has been optimised away, and if so, whether the array is
  lexical.
 
  Also, deparsing of split has been improved, to the extent that
 
  perl TEST -deparse op/split.t
 
  now passes.
 
  Also, a couple of panic messages in pp_split() have been replaced with
  asserts().

--
Modern art​:
  "That's easy, I could have done that!"
  "Ah, but you didn't!"

@p5pRT
Copy link
Author

p5pRT commented Oct 28, 2017

From @jkeenan

On Mon, 26 Sep 2016 11​:58​:56 GMT, davem wrote​:

On Mon, Sep 12, 2016 at 03​:23​:07PM +0100, Dave Mitchell wrote​:

On Sun, Sep 11, 2016 at 10​:17​:39PM -0700, Father Chrysostomos via RT
wrote​:

On Thu Sep 01 01​:49​:12 2016, brian.carpenter@​gmail.com wrote​:

Perl v5.25.5 (v5.25.4-25-g109ac34*), found with AFL + ASAN. A
non-
instrumented build of
v5.25.4-5-g92d73bf returns the valgrind output at the end.

I can reproduce it on dromedary, but not locally. On dromedary I
don’t have a functional gdb, so it’s a little hard to debug.

I tried bisecting, but got perl-5.6.0-4727-g4cddb5c, which seems
like a red herring.

I managed to reduce it to this​:

$ cat foo
map{s///o > split 0,split /0/>0}<DATA>__END__

It's only an issue on non-threaded builds. It started failing
sometime
between 5.16.0 and 5.18.0.
I'm looking into it.

It turns out that this is another variant of RT #124368​: on non-
threaded
builds, the s///o combines​: modifying the op tree under /o to avoid
recompilation on subsequent iterations; and m// using the last
successful
pattern. The combination of the two makes /o end up modifying some
random
other part of the op tree.

I haven't fixed this yet, but while investigating what was going on, I
decided I would finally sort out the horrible OP_SPLIT/OP_PUSHRE mess,
which I have now done and am smoking as smoke-me/davem/pushre2.

The most significant commit from that branch is the following, which
I intend to merge into blead in few days time. Note this it is *not* a
fix
for this ticket.

Dave, Father C, et al.,

Could we get an update on the status of this ticket?

commit e505c6f
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Thu Sep 15 10​:59​:37 2016 +0100
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Sun Sep 25 07​:57​:49 2016 +0100

make OP_SPLIT a PMOP, and eliminate OP_PUSHRE

Most ops that execute a regex, such as match and subst, are of type
PMOP.
A PMOP allows the actual regex to be attached directly to that op, due
to its extra fields.

OP_SPLIT is different; it is just a plain LISTOP, but it always has an
OP_PUSHRE as its first child, which *is* a PMOP and which has the
regex
attached.

At runtime, pp_pushre()'s only job is to push itself (i.e. the current
PL_op) onto the stack. Later pp_split() pops this to get access to the
regex it wants to execute.

This is a bit unpleasant, because we're pushing an OP* onto the stack,
which is supposed to be an array of SV*'s. As a bit of a hack, on
DEBUGGING builds we push a PVLV with the PL_op address embedded
instead,
but this still isn't very satisfactory.

Now that regexes are first-class SVs, we could push a REGEXP onto the
stack rather than PL_op. However, there is an optimisation of @​array =
split which eliminates the assign and embeds the array's GV/padix
directly
in the PUSHRE op. So split still needs access to that op. But the
pushre
op will always be splitop->op_first anyway, so one possibility is to
just
skip executing the pushre altogether, and make pp_split just directly
access op_first instead to get the regex and @​array info.

But if we're doing that, then why not just go the full hog and make
OP_SPLIT into a PMOP, and eliminate the OP_PUSHRE op entirely​: with
the
data that was spread across the two ops now combined into just the one
split op.

That is exactly what this commit does.

For a simple compile-time pattern like split(/foo/, $s, 1), the
optree
looks like​:

before​:
<@​> split[t2] lK
</> pushre(/"foo"/) s/RTIME
<0> padsv[$s​:1,2] s
<$> const(IV 1) s

after​:
</> split(/"foo"/)[t2] lK/RTIME
<0> padsv[$s​:1,2] s
<$> const[IV 1] s

while for a run-time expression like split(/$pat/, $s, 1),

before​:
<@​> split[t3] lK
</> pushre() sK/RTIME
<|> regcomp(other->8) sK
<0> padsv[$pat​:2,3] s
<0> padsv[$s​:1,3] s
<$> const(IV 1)s

after​:
</> split()[t3] lK/RTIME
<|> regcomp(other->8) sK
<0> padsv[$pat​:2,3] s
<0> padsv[$s​:1,3] s
<$> const[IV 1] s

This makes the code faster and simpler.

At the same time, two new private flags have been added for OP_SPLIT -
OPpSPLIT_ASSIGN and OPpSPLIT_LEX - which make it explicit that the
assign op has been optimised away, and if so, whether the array is
lexical.

Also, deparsing of split has been improved, to the extent that

perl TEST -deparse op/split.t

now passes.

Also, a couple of panic messages in pp_split() have been replaced with
asserts().

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Nov 28, 2017

From @iabyn

On Sat, Oct 28, 2017 at 06​:30​:31AM -0700, James E Keenan via RT wrote​:

On Mon, 26 Sep 2016 11​:58​:56 GMT, davem wrote​:

On Mon, Sep 12, 2016 at 03​:23​:07PM +0100, Dave Mitchell wrote​:

On Sun, Sep 11, 2016 at 10​:17​:39PM -0700, Father Chrysostomos via RT
wrote​:

On Thu Sep 01 01​:49​:12 2016, brian.carpenter@​gmail.com wrote​:

Perl v5.25.5 (v5.25.4-25-g109ac34*), found with AFL + ASAN. A
non-
instrumented build of
v5.25.4-5-g92d73bf returns the valgrind output at the end.

I can reproduce it on dromedary, but not locally. On dromedary I
don’t have a functional gdb, so it’s a little hard to debug.

I tried bisecting, but got perl-5.6.0-4727-g4cddb5c, which seems
like a red herring.

I managed to reduce it to this​:

$ cat foo
map{s///o > split 0,split /0/>0}<DATA>__END__

It's only an issue on non-threaded builds. It started failing
sometime
between 5.16.0 and 5.18.0.
I'm looking into it.

It turns out that this is another variant of RT #124368​: on non-
threaded
builds, the s///o combines​: modifying the op tree under /o to avoid
recompilation on subsequent iterations; and m// using the last
successful
pattern. The combination of the two makes /o end up modifying some
random
other part of the op tree.

I haven't fixed this yet, but while investigating what was going on, I
decided I would finally sort out the horrible OP_SPLIT/OP_PUSHRE mess,
which I have now done and am smoking as smoke-me/davem/pushre2.

The most significant commit from that branch is the following, which
I intend to merge into blead in few days time. Note this it is *not* a
fix
for this ticket.

Dave, Father C, et al.,

Could we get an update on the status of this ticket?

It appears to have been fixed by v5.27.2-138-g3cb4cde.

--
Nothing ventured, nothing lost.

@p5pRT
Copy link
Author

p5pRT commented Nov 28, 2017

@iabyn - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant