Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compcv refcount (again) #14553

Open
p5pRT opened this issue Mar 2, 2015 · 8 comments
Open

compcv refcount (again) #14553

p5pRT opened this issue Mar 2, 2015 · 8 comments

Comments

@p5pRT
Copy link

p5pRT commented Mar 2, 2015

Migrated from rt.perl.org#123963 (status was 'open')

Searchable as RT123963$

@p5pRT
Copy link
Author

p5pRT commented Mar 2, 2015

From @hvds

They're getting harder to find, or at least to minimize. :)

AFL (<http​://lcamtuf.coredump.cx/afl/>) finds this​:

% perl -e 'print "m\x{0}\x{1}\x{0}0000@​\x{0}\x{13}\x{ff}000000\x{1}\x{0}"' | ./miniperl -c
miniperl​: sv.c​:6537​: Perl_sv_clear​: Assertion `((svtype)((sv)->sv_flags & 0xff)) != (svtype)0xff' failed.
Aborted (core dumped)
%

This looks like another compcv refcount issue​:

(gdb) where
#0 0x00007ffff72edbb9 in __GI_raise (sig=sig@​entry=6)
  at ../nptl/sysdeps/unix/sysv/linux/raise.c​:56
#1 0x00007ffff72f0fc8 in __GI_abort () at abort.c​:89
#2 0x00007ffff72e6a76 in __assert_fail_base (
  fmt=0x7ffff7438370 "%s%s%s​:%u​: %s%sAssertion `%s' failed.\n%n",
  assertion=assertion@​entry=0x7b4980 "((svtype)((sv)->sv_flags & 0xff)) != (svtype)0xff", file=file@​entry=0x7a1134 "sv.c", line=line@​entry=6537,
  function=function@​entry=0x7bdb15 <__PRETTY_FUNCTION__.17333> "Perl_sv_clear") at assert.c​:92
#3 0x00007ffff72e6b22 in __GI___assert_fail (
  assertion=0x7b4980 "((svtype)((sv)->sv_flags & 0xff)) != (svtype)0xff",
  file=0x7a1134 "sv.c", line=6537,
  function=0x7bdb15 <__PRETTY_FUNCTION__.17333> "Perl_sv_clear")
  at assert.c​:101
#4 0x00000000005c2c4d in Perl_sv_clear (orig_sv=0xa262c8) at sv.c​:6537
#5 0x00000000005c5af6 in Perl_sv_free2 (sv=0xa262c8, rc=1) at sv.c​:7031
#6 0x00000000004b343c in S_SvREFCNT_dec (sv=0xa262c8) at inline.h​:166
#7 0x00000000004b82f3 in Perl_yyparse (gramtype=258) at perly.c​:423
#8 0x0000000000408c19 in S_parse_body (env=0x0, xsinit=0x4509d7 <xs_init>)
  at perl.c​:2277
#9 0x00000000004078b5 in perl_parse (my_perl=0xa24010,
  xsinit=0x4509d7 <xs_init>, argc=3, argv=0x7fffffffe638, env=0x0)
  at perl.c​:1611
#10 0x0000000000450939 in main (argc=3, argv=0x7fffffffe638,
  env=0x7fffffffe658) at miniperlmain.c​:120
(gdb)

Hugo

@p5pRT
Copy link
Author

p5pRT commented Mar 2, 2015

From @cpansprout

On Mon Mar 02 00​:50​:54 2015, hv wrote​:

They're getting harder to find, or at least to minimize. :)

AFL (<http​://lcamtuf.coredump.cx/afl/>) finds this​:

% perl -e 'print
"m\x{0}\x{1}\x{0}0000@​\x{0}\x{13}\x{ff}000000\x{1}\x{0}"' | ./miniperl
-c
miniperl​: sv.c​:6537​: Perl_sv_clear​: Assertion `((svtype)((sv)-

sv_flags & 0xff)) != (svtype)0xff' failed.
Aborted (core dumped)
%

$ perl -CS -e 'print "use utf8; m|\@​\x{ff13}|"' | ./miniperl -Ilib -c
Assertion failed​: (SvTYPE(sv) != (svtype)SVTYPEMASK), function Perl_sv_clear, file sv.c, line 6537.
Abort trap​: 6

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Mar 2, 2015

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Mar 3, 2015

From @hvds

I've restarted AFL with a new build of blead at 923ed58, which is running way cleaner since (I guess) the [perl #123802] issues were nailed.

It has reported only one crash so far, and I suspect it's another instance of the issue in this ticket​:

% perl -e 'print chr(hex($_)) for qw{2f 00 40 00 40 10 64 a0 2f 00 40 80 65 a0 a1 0a}' | ./miniperl -c
miniperl​: sv.c​:6537​: Perl_sv_clear​: Assertion `((svtype)((sv)->sv_flags & 0xff)) != (svtype)0xff' failed.
Aborted (core dumped)
%

.. with a similar backtrace.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Mar 3, 2015

From @cpansprout

On Mon Mar 02 16​:05​:03 2015, hv wrote​:

I've restarted AFL with a new build of blead at 923ed58, which is
running way cleaner since (I guess) the [perl #123802] issues were
nailed.

It has reported only one crash so far, and I suspect it's another
instance of the issue in this ticket​:

% perl -e 'print chr(hex($_)) for qw{2f 00 40 00 40 10 64 a0 2f 00 40
80 65 a0 a1 0a}' | ./miniperl -c
miniperl​: sv.c​:6537​: Perl_sv_clear​: Assertion `((svtype)((sv)-

sv_flags & 0xff)) != (svtype)0xff' failed.
Aborted (core dumped)
%

.. with a similar backtrace.

Hugo

In both cases we have a regular expression containing an @​ followed by a non-ASCII digit.

A bisect points to f25ce84.

f25ce84 is the first bad commit
commit f25ce84
Author​: Karl Williamson <public@​khwilliamson.com>
Date​: Mon Jan 6 12​:22​:02 2014 -0700

  isWORDCHAR_uni(), isDIGIT_utf8() etc no longer go out to disk
 
  Previous commits in this series have caused all the POSIX classes to be
  completely specified at C compile time. This allows us to revise the
  base function used by all these macros to use these definitions,
  avoiding reading them in from disk.

-DT output shows that the lexer emits @​ , THING instead of @​ WORD.

The resulting failure mode is similar to #123802 (with LEX_INTERPSTART this time), though I don’t know that there is a way to trigger it that does not depend on this Unicode bug.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Mar 3, 2015

From @cpansprout

On Mon Mar 02 17​:50​:47 2015, sprout wrote​:

On Mon Mar 02 16​:05​:03 2015, hv wrote​:

I've restarted AFL with a new build of blead at 923ed58, which is
running way cleaner since (I guess) the [perl #123802] issues were
nailed.

It has reported only one crash so far, and I suspect it's another
instance of the issue in this ticket​:

% perl -e 'print chr(hex($_)) for qw{2f 00 40 00 40 10 64 a0 2f 00 40
80 65 a0 a1 0a}' | ./miniperl -c
miniperl​: sv.c​:6537​: Perl_sv_clear​: Assertion `((svtype)((sv)-

sv_flags & 0xff)) != (svtype)0xff' failed.
Aborted (core dumped)
%

.. with a similar backtrace.

Hugo

In both cases we have a regular expression containing an @​ followed by
a non-ASCII digit.

A bisect points to f25ce84.

f25ce84 is the first bad commit
commit f25ce84
Author​: Karl Williamson <public@​khwilliamson.com>
Date​: Mon Jan 6 12​:22​:02 2014 -0700

isWORDCHAR_uni(), isDIGIT_utf8() etc no longer go out to disk

Previous commits in this series have caused all the POSIX classes to
be
completely specified at C compile time. This allows us to revise the
base function used by all these macros to use these definitions,
avoiding reading them in from disk.

-DT output shows that the lexer emits @​ , THING instead of @​ WORD.

The resulting failure mode is similar to #123802 (with LEX_INTERPSTART
this time), though I don’t know that there is a way to trigger it that
does not depend on this Unicode bug.

scan_const sees what looks like an @​ followed by a valid identifier, so it stops before the @​. Shortly thereafter, parse_ident rejects the fullwidth 3 (\x{ff13}) as not being valid at the beginning of an identifier. So the lexer is not consistent with itself.

scan_const is using isWORDCHAR_lazy_if. parse_ident is using isIDFIRST_utf8. Did f25ce84 indirectly change the behaviour of either of those?

Since this is a regression, I think we ought to address it before 5.22.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Mar 6, 2015

From @khwilliamson

On 03/02/2015 10​:50 PM, Father Chrysostomos via RT wrote​:

On Mon Mar 02 17​:50​:47 2015, sprout wrote​:

On Mon Mar 02 16​:05​:03 2015, hv wrote​:

I've restarted AFL with a new build of blead at 923ed58, which is
running way cleaner since (I guess) the [perl #123802] issues were
nailed.

It has reported only one crash so far, and I suspect it's another
instance of the issue in this ticket​:

% perl -e 'print chr(hex($_)) for qw{2f 00 40 00 40 10 64 a0 2f 00 40
80 65 a0 a1 0a}' | ./miniperl -c
miniperl​: sv.c​:6537​: Perl_sv_clear​: Assertion `((svtype)((sv)-

sv_flags & 0xff)) != (svtype)0xff' failed.
Aborted (core dumped)
%

.. with a similar backtrace.

Hugo

In both cases we have a regular expression containing an @​ followed by
a non-ASCII digit.

A bisect points to f25ce84.

f25ce84 is the first bad commit
commit f25ce84
Author​: Karl Williamson <public@​khwilliamson.com>
Date​: Mon Jan 6 12​:22​:02 2014 -0700

isWORDCHAR_uni(), isDIGIT_utf8() etc no longer go out to disk

Previous commits in this series have caused all the POSIX classes to
be
completely specified at C compile time. This allows us to revise the
base function used by all these macros to use these definitions,
avoiding reading them in from disk.

-DT output shows that the lexer emits @​ , THING instead of @​ WORD.

The resulting failure mode is similar to #123802 (with LEX_INTERPSTART
this time), though I don’t know that there is a way to trigger it that
does not depend on this Unicode bug.

scan_const sees what looks like an @​ followed by a valid identifier, so it stops before the @​. Shortly thereafter, parse_ident rejects the fullwidth 3 (\x{ff13}) as not being valid at the beginning of an identifier. So the lexer is not consistent with itself.

I think we should do an audit of the uses of these. In this case, WORD
is inappropriate. My guess is that there are other instances of the
same, which only bite us on unaccustomed Unicode characters.

scan_const is using isWORDCHAR_lazy_if. parse_ident is using isIDFIRST_utf8. Did f25ce84 indirectly change the behaviour of either of those?

Looking at the commit, the behavior change was inadvertent

Since this is a regression, I think we ought to address it before 5.22.

Agreed, and I see you have already added it to the blockers.

@p5pRT
Copy link
Author

p5pRT commented Mar 11, 2015

From @cpansprout

On Fri Mar 06 10​:38​:31 2015, public@​khwilliamson.com wrote​:

On 03/02/2015 10​:50 PM, Father Chrysostomos via RT wrote​:

scan_const sees what looks like an @​ followed by a valid identifier,
so it stops before the @​. Shortly thereafter, parse_ident rejects
the fullwidth 3 (\x{ff13}) as not being valid at the beginning of an
identifier. So the lexer is not consistent with itself.

I think we should do an audit of the uses of these. In this case,
WORD
is inappropriate. My guess is that there are other instances of the
same, which only bite us on unaccustomed Unicode characters.

I have fixed this particular case in 9d58dbc. I have not done the audit, so this ticket should stay open, but I am removing it from the blockers’ list.

scan_const is using isWORDCHAR_lazy_if. parse_ident is using
isIDFIRST_utf8. Did f25ce84 indirectly change the behaviour of
either of those?

Looking at the commit, the behavior change was inadvertent

Since this is a regression, I think we ought to address it before
5.22.

Agreed, and I see you have already added it to the blockers.

--

Father Chrysostomos

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants