Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

invlist_inline.h:40: S__invlist_len: Assertion `((svtype)((invlist)->sv_flags & 0xff)) == SVt_INVLIST' faile #15467

Closed
p5pRT opened this issue Jul 21, 2016 · 8 comments

Comments

@p5pRT
Copy link

p5pRT commented Jul 21, 2016

Migrated from rt.perl.org#128686 (status was 'resolved')

Searchable as RT128686$

@p5pRT
Copy link
Author

p5pRT commented Jul 21, 2016

From @geeknik

Perl v5.25.3 (v5.25.2-160-gcd478ec) plus the attached test case cause this assertion failure​:

invlist_inline.h​:40​: S__invlist_len​: Assertion `((svtype)((invlist)->sv_flags & 0xff)) == SVt_INVLIST' failed

@p5pRT
Copy link
Author

p5pRT commented Jul 21, 2016

From @geeknik

test17

@p5pRT
Copy link
Author

p5pRT commented Jul 21, 2016

From @dcollinsn

Bisects specifically to​:

02517e3 is the first new commit
commit 02517e3
Author​: Karl Williamson <khw@​cpan.org>
Date​: Tue Jul 12 21​:15​:07 2016 -0600

  regcomp.c​: Refactor code dealing with m/[...]/d

  This consolidates some code that deals with bracketed character classes
  under /d. As a result, some throw-away steps can be omitted, and things
  aren't scattered about. The earlier version skipped doing some things
  if the class is to be inverted. The reason turns out to not be because
  it was necessary, but that the dump of the compiled pattern was unclear.
  Previous commits have fixed that, so this now handles inverted character
  classes.

:100644 100644 2003d3d443759c2a286ad93f304334aad11e3551 a008b1a51bcb1bbdd81c431229367e907bb32581 M regcomp.c
:040000 040000 aca585ceb3854b8f9338d7cadeedd3a9c5b86c30 5908f11822278eb5a1755b3d9b99ca9c8f739667 M t

Backtrace​:

(gdb) bt
#0 0x00007ffff6cf9458 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff6cfa8da in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ffff6cf2387 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007ffff6cf2432 in __assert_fail ()
  from /lib/x86_64-linux-gnu/libc.so.6
#4 0x000000000048db2a in S__invlist_len (invlist=<optimized out>)
  at invlist_inline.h​:40
#5 0x00000000004957b4 in Perl__invlist_intersection_maybe_complement_2nd (
  a=0x96ad40, b=0x96ad58, complement_b=<optimized out>, i=0x7fffffffd668)
  at regcomp.c​:9292
#6 0x00000000004a5635 in S_regclass (
  pRExC_state=pRExC_state@​entry=0x7fffffffdbe0,
  flagp=flagp@​entry=0x7fffffffd924, depth=depth@​entry=5,
  stop_at_1=stop_at_1@​entry=false, allow_multi_folds=<optimized out>,
  allow_multi_folds@​entry=true,
  silence_non_portable=silence_non_portable@​entry=false,
  strict=<optimized out>, optimizable=<optimized out>,
  ret_invlist=<optimized out>, return_posix_warnings=<optimized out>)
  at regcomp.c​:17276
#7 0x00000000004aa840 in S_regatom (
  pRExC_state=pRExC_state@​entry=0x7fffffffdbe0,
  flagp=flagp@​entry=0x7fffffffd924, depth=depth@​entry=4) at regcomp.c​:12330
#8 0x00000000004ad6ea in S_regpiece (depth=3, flagp=<synthetic pointer>,
---Type <return> to continue, or q <return> to quit---
  pRExC_state=0x7fffffffdbe0) at regcomp.c​:11452
#9 S_regbranch (pRExC_state=pRExC_state@​entry=0x7fffffffdbe0,
  flagp=flagp@​entry=0x7fffffffd9bc, first=first@​entry=1,
  depth=depth@​entry=2) at regcomp.c​:11377
#10 0x00000000004b4177 in S_reg (
  pRExC_state=pRExC_state@​entry=0x7fffffffdbe0,
  flagp=flagp@​entry=0x7fffffffdaac, depth=1, paren=0) at regcomp.c​:11115
#11 0x00000000004c181f in Perl_re_op_compile (patternp=<optimized out>,
  pat_count=<optimized out>, expr=<optimized out>,
  eng=0x6a3a80 <PL_core_reg_engine>, old_re=0x0,
  is_bare_re=<optimized out>, orig_rx_flags=4, pm_flags=4) at regcomp.c​:7248
#12 0x000000000042d6e9 in Perl_pmruntime (o=0x974228, expr=0x9741e8,
  repl=0x0, isreg=isreg@​entry=true, floor=<optimized out>) at op.c​:5724
#13 0x0000000000481007 in Perl_yyparse (gramtype=gramtype@​entry=258)
  at perly.y​:1028
#14 0x000000000043e0ce in S_parse_body (xsinit=0x630230 <xs_init>, env=0x0)
  at perl.c​:2369
#15 perl_parse (my_perl=<optimized out>,
  xsinit=xsinit@​entry=0x630230 <xs_init>, argc=<optimized out>,
  argv=<optimized out>, env=env@​entry=0x0) at perl.c​:1685
#16 0x0000000000406950 in main (argc=<optimized out>, argv=<optimized out>,
  env=<optimized out>) at miniperlmain.c​:127

I think the problem is this code, with a sort of use-after-free of nonascii_but_latin1_properties. Changing the order of these blocks resolves

  /* And add them to the final list of such characters. */
  if (has_upper_latin1_only_utf8_matches) {
  _invlist_union(has_upper_latin1_only_utf8_matches,
  nonascii_but_latin1_properties,
  &has_upper_latin1_only_utf8_matches);
  SvREFCNT_dec_NN(nonascii_but_latin1_properties);
  }
  else {
  has_upper_latin1_only_utf8_matches
  = nonascii_but_latin1_properties;
  }

  /* Remove them from what now becomes the unconditional list */
  _invlist_subtract(posixes, nonascii_but_latin1_properties,
  &posixes);

Make test is clean. Changing the order shouldn't change the behavior - the subtract modifies posixes, which isn't used in either case of the if block.

--
Respectfully,
Dan Collins

@p5pRT
Copy link
Author

p5pRT commented Jul 21, 2016

From @dcollinsn

0001-RT-128686-Fix-a-use-after-free-in-S_regclass.patch
From 84a5123020f4b392c0d3f3dc2ea46f5bbdcb0526 Mon Sep 17 00:00:00 2001
From: Dan Collins <dcollinsn@gmail.com>
Date: Thu, 21 Jul 2016 00:02:58 -0400
Subject: [PATCH] [RT #128686] Fix a use-after-free in S_regclass

---
 regcomp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/regcomp.c b/regcomp.c
index 7f6d5ee..738a21a 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -17474,6 +17474,10 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, U32 depth,
                 _invlist_intersection(posixes, PL_UpperLatin1,
                                       &nonascii_but_latin1_properties);
 
+                /* Remove them from what now becomes the unconditional list */
+                _invlist_subtract(posixes, nonascii_but_latin1_properties,
+                                  &posixes);
+
                 /* And add them to the final list of such characters. */
                 if (has_upper_latin1_only_utf8_matches) {
                     _invlist_union(has_upper_latin1_only_utf8_matches,
@@ -17486,10 +17490,6 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, U32 depth,
                                                 = nonascii_but_latin1_properties;
                 }
 
-                /* Remove them from what now becomes the unconditional list */
-                _invlist_subtract(posixes, nonascii_but_latin1_properties,
-                                  &posixes);
-
                 /* And the remainder are the unconditional ones */
                 if (cp_list) {
                     _invlist_union(cp_list, posixes, &cp_list);
-- 
2.8.1

@p5pRT
Copy link
Author

p5pRT commented Jul 21, 2016

From [Unknown Contact. See original ticket]

Bisects specifically to​:

02517e3 is the first new commit
commit 02517e3
Author​: Karl Williamson <khw@​cpan.org>
Date​: Tue Jul 12 21​:15​:07 2016 -0600

  regcomp.c​: Refactor code dealing with m/[...]/d

  This consolidates some code that deals with bracketed character classes
  under /d. As a result, some throw-away steps can be omitted, and things
  aren't scattered about. The earlier version skipped doing some things
  if the class is to be inverted. The reason turns out to not be because
  it was necessary, but that the dump of the compiled pattern was unclear.
  Previous commits have fixed that, so this now handles inverted character
  classes.

:100644 100644 2003d3d443759c2a286ad93f304334aad11e3551 a008b1a51bcb1bbdd81c431229367e907bb32581 M regcomp.c
:040000 040000 aca585ceb3854b8f9338d7cadeedd3a9c5b86c30 5908f11822278eb5a1755b3d9b99ca9c8f739667 M t

Backtrace​:

(gdb) bt
#0 0x00007ffff6cf9458 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff6cfa8da in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ffff6cf2387 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007ffff6cf2432 in __assert_fail ()
  from /lib/x86_64-linux-gnu/libc.so.6
#4 0x000000000048db2a in S__invlist_len (invlist=<optimized out>)
  at invlist_inline.h​:40
#5 0x00000000004957b4 in Perl__invlist_intersection_maybe_complement_2nd (
  a=0x96ad40, b=0x96ad58, complement_b=<optimized out>, i=0x7fffffffd668)
  at regcomp.c​:9292
#6 0x00000000004a5635 in S_regclass (
  pRExC_state=pRExC_state@​entry=0x7fffffffdbe0,
  flagp=flagp@​entry=0x7fffffffd924, depth=depth@​entry=5,
  stop_at_1=stop_at_1@​entry=false, allow_multi_folds=<optimized out>,
  allow_multi_folds@​entry=true,
  silence_non_portable=silence_non_portable@​entry=false,
  strict=<optimized out>, optimizable=<optimized out>,
  ret_invlist=<optimized out>, return_posix_warnings=<optimized out>)
  at regcomp.c​:17276
#7 0x00000000004aa840 in S_regatom (
  pRExC_state=pRExC_state@​entry=0x7fffffffdbe0,
  flagp=flagp@​entry=0x7fffffffd924, depth=depth@​entry=4) at regcomp.c​:12330
#8 0x00000000004ad6ea in S_regpiece (depth=3, flagp=<synthetic pointer>,
---Type <return> to continue, or q <return> to quit---
  pRExC_state=0x7fffffffdbe0) at regcomp.c​:11452
#9 S_regbranch (pRExC_state=pRExC_state@​entry=0x7fffffffdbe0,
  flagp=flagp@​entry=0x7fffffffd9bc, first=first@​entry=1,
  depth=depth@​entry=2) at regcomp.c​:11377
#10 0x00000000004b4177 in S_reg (
  pRExC_state=pRExC_state@​entry=0x7fffffffdbe0,
  flagp=flagp@​entry=0x7fffffffdaac, depth=1, paren=0) at regcomp.c​:11115
#11 0x00000000004c181f in Perl_re_op_compile (patternp=<optimized out>,
  pat_count=<optimized out>, expr=<optimized out>,
  eng=0x6a3a80 <PL_core_reg_engine>, old_re=0x0,
  is_bare_re=<optimized out>, orig_rx_flags=4, pm_flags=4) at regcomp.c​:7248
#12 0x000000000042d6e9 in Perl_pmruntime (o=0x974228, expr=0x9741e8,
  repl=0x0, isreg=isreg@​entry=true, floor=<optimized out>) at op.c​:5724
#13 0x0000000000481007 in Perl_yyparse (gramtype=gramtype@​entry=258)
  at perly.y​:1028
#14 0x000000000043e0ce in S_parse_body (xsinit=0x630230 <xs_init>, env=0x0)
  at perl.c​:2369
#15 perl_parse (my_perl=<optimized out>,
  xsinit=xsinit@​entry=0x630230 <xs_init>, argc=<optimized out>,
  argv=<optimized out>, env=env@​entry=0x0) at perl.c​:1685
#16 0x0000000000406950 in main (argc=<optimized out>, argv=<optimized out>,
  env=<optimized out>) at miniperlmain.c​:127

I think the problem is this code, with a sort of use-after-free of nonascii_but_latin1_properties. Changing the order of these blocks resolves

  /* And add them to the final list of such characters. */
  if (has_upper_latin1_only_utf8_matches) {
  _invlist_union(has_upper_latin1_only_utf8_matches,
  nonascii_but_latin1_properties,
  &has_upper_latin1_only_utf8_matches);
  SvREFCNT_dec_NN(nonascii_but_latin1_properties);
  }
  else {
  has_upper_latin1_only_utf8_matches
  = nonascii_but_latin1_properties;
  }

  /* Remove them from what now becomes the unconditional list */
  _invlist_subtract(posixes, nonascii_but_latin1_properties,
  &posixes);

Make test is clean. Changing the order shouldn't change the behavior - the subtract modifies posixes, which isn't used in either case of the if block.

--
Respectfully,
Dan Collins

@p5pRT
Copy link
Author

p5pRT commented Jul 21, 2016

From @khwilliamson

Thanks for finding this. I freed a variable before its final use
Fixed by 33bc5d3
--
Karl Williamson

@p5pRT
Copy link
Author

p5pRT commented Jul 21, 2016

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Jul 21, 2016

@khwilliamson - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant