Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multi method cache causes Base64 regression #5428

Open
p6rt opened this issue Jul 6, 2016 · 8 comments
Open

multi method cache causes Base64 regression #5428

p6rt opened this issue Jul 6, 2016 · 8 comments
Labels
SEGV Segmentation fault, bus error, etc. testneeded

Comments

@p6rt
Copy link

p6rt commented Jul 6, 2016

Migrated from rt.perl.org#128553 (status was 'open')

Searchable as RT128553$

@p6rt
Copy link
Author

p6rt commented Jul 6, 2016

From @ugexe

The failures are only for decoding. Encoding still works.

Failures​:
https://travis-ci.org/ugexe/zef/jobs/142611302#L1178

Relevant code​:
https://github.com/ugexe/Perl6-Base64/blob/master/lib/Base64.pm6#L41

The first multi dispatch cache commit only caused failures sometimes. The commit afterwards (d4ac15c) to fix a different multi dispatch cache problem causes it to always fail, and often segfault.

@p6rt
Copy link
Author

p6rt commented Jul 6, 2016

From @ugexe

Note that the final decode-base64 candidate shows the correct results when debugging statements are added

This gist also shows a small change that makes it produce the correct values but it still segfaults more often than not
https://gist.github.com/ugexe/baa168a641894a0731595c812724f76d

@p6rt
Copy link
Author

p6rt commented Jul 9, 2016

From @jnthn

On Tue Jul 05 17​:51​:46 2016, ugexe@​cpan.org wrote​:

Note that the final decode-base64 candidate shows the correct results
when debugging statements are added

This gist also shows a small change that makes it produce the correct
values but it still segfaults more often than not
https://gist.github.com/ugexe/baa168a641894a0731595c812724f76d

Having dug into it a little, I'm not sure that the new multi-caching stuff has caused this, so much as uncovered it (perhaps by moving GC collection points). The valgrind output is as follows​:

  ok 1 - decoding the empty string
  ok 2 - decoding "A"
  not ok 3 - decoding "Ab"
 
# Failed test 'decoding "Ab"'
# at t/01-basic.t line 29
  # expected​: 'Ab'
  # got​: 'A'
  not ok 4 - decoding "Abc"
 
# Failed test 'decoding "Abc"'
# at t/01-basic.t line 30
  # expected​: 'Abc'
  # got​: 'A'
==19699== Invalid read of size 8
==19699== at 0x4FF2B3E​: gc_mark (MVMCallCapture.c​:51)
==19699== by 0x4FC6BF5​: process_worklist (collect.c​:313)
==19699== by 0x4FC71A0​: MVM_gc_collect (collect.c​:129)
==19699== by 0x4FC2735​: run_gc (orchestrate.c​:304)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator (orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8BEB​: MVM_frame_capturelex (frame.c​:1035)
==19699== by 0x4F98D8F​: MVM_interp_run (interp.c​:1070)
==19699== by 0x505E3F8​: MVM_vm_run_file (moar.c​:304)
==19699== by 0x401036​: main (main.c​:191)
==19699== Address 0x658f9d0 is 0 bytes inside a block of size 32 free'd
==19699== at 0x4C2BDEC​: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19699== by 0x4FF2CDA​: MVM_free (alloc.h​:29)
==19699== by 0x4FF2CDA​: gc_free (MVMCallCapture.c​:78)
==19699== by 0x4FC72D3​: MVM_gc_collect_free_nursery_uncopied (collect.c​:580)
==19699== by 0x4FC2976​: run_gc (orchestrate.c​:328)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator (orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8D4C​: MVM_frame_takeclosure (frame.c​:1051)
==19699== by 0xEA66105​: ???
==19699== by 0x50342B8​: MVM_jit_enter_code (compile.c​:136)
==19699== by 0x4F938F0​: MVM_interp_run (interp.c​:5456)
==19699==
==19699== Invalid read of size 1
==19699== at 0x4FF2A7D​: gc_mark (MVMCallCapture.c​:55)
==19699== by 0x4FC6BF5​: process_worklist (collect.c​:313)
==19699== by 0x4FC71A0​: MVM_gc_collect (collect.c​:129)
==19699== by 0x4FC2735​: run_gc (orchestrate.c​:304)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator (orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8BEB​: MVM_frame_capturelex (frame.c​:1035)
==19699== by 0x4F98D8F​: MVM_interp_run (interp.c​:1070)
==19699== by 0x505E3F8​: MVM_vm_run_file (moar.c​:304)
==19699== by 0x401036​: main (main.c​:191)
==19699== Address 0xb36cc70 is 0 bytes inside a block of size 1 free'd
==19699== at 0x4C2BDEC​: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19699== by 0x4FF2CD1​: MVM_free (alloc.h​:29)
==19699== by 0x4FF2CD1​: gc_free (MVMCallCapture.c​:77)
==19699== by 0x4FC72D3​: MVM_gc_collect_free_nursery_uncopied (collect.c​:580)
==19699== by 0x4FC2976​: run_gc (orchestrate.c​:328)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator (orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8D4C​: MVM_frame_takeclosure (frame.c​:1051)
==19699== by 0xEA66105​: ???
==19699== by 0x50342B8​: MVM_jit_enter_code (compile.c​:136)
==19699== by 0x4F938F0​: MVM_interp_run (interp.c​:5456)
==19699==
  not ok 5 - decoding "Abcd"

Which is suggestive of some kind of memory corruption (seemingly due to over-sharing or a premature free) inside of the argument capture. Still trying to figure out exactly what leads to this.

@p6rt
Copy link
Author

p6rt commented Jul 9, 2016

The RT System itself - Status changed from 'new' to 'open'

@p6rt
Copy link
Author

p6rt commented Jul 17, 2016

From @ugexe

On Sat Jul 09 07​:13​:48 2016, jnthn@​jnthn.net wrote​:

On Tue Jul 05 17​:51​:46 2016, ugexe@​cpan.org wrote​:

Note that the final decode-base64 candidate shows the correct results
when debugging statements are added

This gist also shows a small change that makes it produce the correct
values but it still segfaults more often than not
https://gist.github.com/ugexe/baa168a641894a0731595c812724f76d

Having dug into it a little, I'm not sure that the new multi-caching
stuff has caused this, so much as uncovered it (perhaps by moving GC
collection points

You are right. I golfed it down to the following gist which does not use multi methods​:
https://gist.github.com/ugexe/7ce08d3ad4cfd36f275c437ba9713471

In the gist comments there is also a link to the bisect output, but for brevity​:
bisect output​: https://gist.github.com/Whateverable/e0a28162b19f601616aa8776f24dd027
rakudo commit​: rakudo/rakudo@5581b24

@p6rt
Copy link
Author

p6rt commented Jul 28, 2016

From @LLFourn

I've been getting segfaults in this area recently. The trace is a bit
different but I guess it's related. It seems that flag_map in gc_mark is no
longer allocated so I get segfault.

(lldb) r
There is a running process, kill it and restart?​: [Y/n] y
Process 75673 exited with status = 9 (0x00000009)
Process 75691 launched​: '/Users/llfourn/src/rakudo/install/bin/moar'
(x86_64)
Process 75691 stopped
* thread #​1​: tid = 0x23cfc68, 0x0000000100063ca4
libmoar.dylib`gc_mark(tc=0x0000000100500290, st=0x0000000101004f40,
data=<unavailable>, worklist=0x000000010d27c1a0) + 84 at
MVMCallCapture.c​:55, queue = 'com.apple.main-thread', stop reason =
EXC_BAD_ACCESS (code=1, address=0x10a72533)
  frame #​0​: 0x0000000100063ca4
libmoar.dylib`gc_mark(tc=0x0000000100500290, st=0x0000000101004f40,
data=<unavailable>, worklist=0x000000010d27c1a0) + 84 at MVMCallCapture.c​:55
  52 MVMuint16 count = ctx->arg_count;
  53 MVMuint16 i, flag;
  54 for (i = 0, flag = 0; i < count; i++, flag++) {
-> 55 if (flag_map[flag] & MVM_CALLSITE_ARG_NAMED) {
  56 /* Current position is name, then next is value. */
  57 MVM_gc_worklist_add(tc, worklist, &ctx->args[i].s);
  58 i++;
(lldb) bt
* thread #​1​: tid = 0x23cfc68, 0x0000000100063ca4
libmoar.dylib`gc_mark(tc=0x0000000100500290, st=0x0000000101004f40,
data=<unavailable>, worklist=0x000000010d27c1a0) + 84 at
MVMCallCapture.c​:55, queue = 'com.apple.main-thread', stop reason =
EXC_BAD_ACCESS (code=1, address=0x10a72533)
  * frame #​0​: 0x0000000100063ca4
libmoar.dylib`gc_mark(tc=0x0000000100500290, st=0x0000000101004f40,
data=<unavailable>, worklist=0x000000010d27c1a0) + 84 at MVMCallCapture.c​:55
  frame #​1​: 0x000000010003f002
libmoar.dylib`MVM_gc_mark_collectable(tc=<unavailable>,
worklist=<unavailable>, new_addr=<unavailable>) + 1714 at collect.c​:399
  frame #​2​: 0x000000010003e883
libmoar.dylib`process_worklist(tc=0x0000000100500290,
worklist=0x000000010d27c1a0, wtp=0x00007fff5fbff468, gen='\0') + 899 at
collect.c​:313
  frame #​3​: 0x000000010003e324
libmoar.dylib`MVM_gc_collect(tc=0x0000000100500290,
what_to_do=<unavailable>, gen=<unavailable>) + 820 at collect.c​:129
  frame #​4​: 0x000000010003b703
libmoar.dylib`run_gc(tc=0x0000000100500290, what_to_do='\0') + 131 at
orchestrate.c​:304
  frame #​5​: 0x000000010003b4c8
libmoar.dylib`MVM_gc_enter_from_allocator(tc=0x0000000100500290) + 984 at
orchestrate.c​:438
  frame #​6​: 0x000000010003bca8 libmoar.dylib`MVM_gc_allocate_zeroed
[inlined] MVM_gc_allocate_nursery(tc=<unavailable>, size=<unavailable>) +
52 at allocation.c​:32
  frame #​7​: 0x000000010003bc74 libmoar.dylib`MVM_gc_allocate_zeroed
[inlined] MVM_gc_allocate(tc=<unavailable>) + 23 at allocation.h​:13
  frame #​8​: 0x000000010003bc5d
libmoar.dylib`MVM_gc_allocate_zeroed(tc=0x0000000100500290, size=40) + 13
at allocation.c​:49
  frame #​9​: 0x000000010003bf44
libmoar.dylib`MVM_gc_allocate_object(tc=0x0000000100500290,
st=<unavailable>) + 84 at allocation.c​:86
  frame #​10​: 0x000000010005813f libmoar.dylib`allocate(tc=<unavailable>,
st=<unavailable>) + 15 at P6opaque.c​:60
  frame #​11​: 0x0000000100050d27 libmoar.dylib`MVM_repr_box_str [inlined]
MVM_repr_alloc_init(tc=0x0000000100500290, type=0x0000000101626fc0) + 14 at
reprconv.c​:13
  frame #​12​: 0x0000000100050d19
libmoar.dylib`MVM_repr_box_str(tc=0x0000000100500290,
type=0x0000000101626fc0, val=<unavailable>) + 73 at reprconv.c​:586
  frame #​13​: 0x0000000100009c07
libmoar.dylib`MVM_exception_backtrace(tc=0x0000000100500290,
ex_obj=<unavailable>) + 1111 at exceptions.c​:407
  frame #​14​: 0x0000000100017488
libmoar.dylib`MVM_interp_run(tc=0x0000000100500290,
initial_invoke=<unavailable>, invoke_data=<unavailable>) + 47320 at
interp.c​:3830
  frame #​15​: 0x00000001000c8305
libmoar.dylib`MVM_vm_run_file(instance=<unavailable>,
filename=<unavailable>) + 181 at moar.c​:304
  frame #​16​: 0x000000010000170a moar`main(argc=<unavailable>,
argv=0x00007fff5fbff8e8) + 666 at main.c​:191
  frame #​17​: 0x00007fff900c05ad libdyld.dylib`start + 1
  frame #​18​: 0x00007fff900c05ad libdyld.dylib`start + 1
(lldb) fr v
(MVMThreadContext *) tc = 0x0000000100500290
(MVMSTable *) st = 0x0000000101004f40
(void *) data = <variable not available>

(MVMGCWorklist *) worklist = 0x000000010d27c1a0
(MVMCallCaptureBody *) body = <no location, value may have been optimized
out>

(MVMArgProcContext *) ctx = 0x000000010a6f3c00
(MVMuint16) flag = 0
(MVMuint16) i = 0
(MVMuint16) count = <variable not available>

(MVMuint8 *) flag_map = 0x0000000010a72533 ""
(lldb) p flag_map
(MVMuint8 *) $6 = 0x0000000010a72533 ""
(lldb) p flag_map[0]
error​: Couldn't apply expression side effects : Couldn't dematerialize a
result variable​: couldn't read its memory

On Sun, Jul 10, 2016 at 12​:19 AM jnthn@​jnthn.net via RT <
perl6-bugs-followup@​perl.org> wrote​:

On Tue Jul 05 17​:51​:46 2016, ugexe@​cpan.org wrote​:

Note that the final decode-base64 candidate shows the correct results
when debugging statements are added

This gist also shows a small change that makes it produce the correct
values but it still segfaults more often than not
https://gist.github.com/ugexe/baa168a641894a0731595c812724f76d

Having dug into it a little, I'm not sure that the new multi-caching stuff
has caused this, so much as uncovered it (perhaps by moving GC collection
points). The valgrind output is as follows​:

ok 1 \- decoding the empty string
ok 2 \- decoding "A"
not ok 3 \- decoding "Ab"

# Failed test 'decoding "Ab"'
# at t/01-basic.t line 29
# expected​: 'Ab'
# got​: 'A'
not ok 4 - decoding "Abc"

# Failed test 'decoding "Abc"'
# at t/01-basic.t line 30
# expected​: 'Abc'
# got​: 'A'
==19699== Invalid read of size 8
==19699== at 0x4FF2B3E​: gc_mark (MVMCallCapture.c​:51)
==19699== by 0x4FC6BF5​: process_worklist (collect.c​:313)
==19699== by 0x4FC71A0​: MVM_gc_collect (collect.c​:129)
==19699== by 0x4FC2735​: run_gc (orchestrate.c​:304)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator (orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8BEB​: MVM_frame_capturelex (frame.c​:1035)
==19699== by 0x4F98D8F​: MVM_interp_run (interp.c​:1070)
==19699== by 0x505E3F8​: MVM_vm_run_file (moar.c​:304)
==19699== by 0x401036​: main (main.c​:191)
==19699== Address 0x658f9d0 is 0 bytes inside a block of size 32 free'd
==19699== at 0x4C2BDEC​: free (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19699== by 0x4FF2CDA​: MVM_free (alloc.h​:29)
==19699== by 0x4FF2CDA​: gc_free (MVMCallCapture.c​:78)
==19699== by 0x4FC72D3​: MVM_gc_collect_free_nursery_uncopied
(collect.c​:580)
==19699== by 0x4FC2976​: run_gc (orchestrate.c​:328)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator (orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8D4C​: MVM_frame_takeclosure (frame.c​:1051)
==19699== by 0xEA66105​: ???
==19699== by 0x50342B8​: MVM_jit_enter_code (compile.c​:136)
==19699== by 0x4F938F0​: MVM_interp_run (interp.c​:5456)
==19699==
==19699== Invalid read of size 1
==19699== at 0x4FF2A7D​: gc_mark (MVMCallCapture.c​:55)
==19699== by 0x4FC6BF5​: process_worklist (collect.c​:313)
==19699== by 0x4FC71A0​: MVM_gc_collect (collect.c​:129)
==19699== by 0x4FC2735​: run_gc (orchestrate.c​:304)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator (orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8BEB​: MVM_frame_capturelex (frame.c​:1035)
==19699== by 0x4F98D8F​: MVM_interp_run (interp.c​:1070)
==19699== by 0x505E3F8​: MVM_vm_run_file (moar.c​:304)
==19699== by 0x401036​: main (main.c​:191)
==19699== Address 0xb36cc70 is 0 bytes inside a block of size 1 free'd
==19699== at 0x4C2BDEC​: free (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19699== by 0x4FF2CD1​: MVM_free (alloc.h​:29)
==19699== by 0x4FF2CD1​: gc_free (MVMCallCapture.c​:77)
==19699== by 0x4FC72D3​: MVM_gc_collect_free_nursery_uncopied
(collect.c​:580)
==19699== by 0x4FC2976​: run_gc (orchestrate.c​:328)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator (orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8D4C​: MVM_frame_takeclosure (frame.c​:1051)
==19699== by 0xEA66105​: ???
==19699== by 0x50342B8​: MVM_jit_enter_code (compile.c​:136)
==19699== by 0x4F938F0​: MVM_interp_run (interp.c​:5456)
==19699==
not ok 5 - decoding "Abcd"

Which is suggestive of some kind of memory corruption (seemingly due to
over-sharing or a premature free) inside of the argument capture. Still
trying to figure out exactly what leads to this.

@p6rt
Copy link
Author

p6rt commented Jul 28, 2016

From @LLFourn

By flag_map I mean ctx->callsite->arg_flags​:

(lldb) p ctx->arg_flags
(MVMCallsiteEntry *) $8 = 0x0000000000000000
(lldb) p ctx->callsite->arg_flags
(MVMCallsiteEntry *) $9 = 0x0000000010a72533 ""

On Mon, Jul 11, 2016 at 3​:59 AM Lloyd Fournier <lloyd.fourn@​gmail.com>
wrote​:

I've been getting segfaults in this area recently. The trace is a bit
different but I guess it's related. It seems that flag_map in gc_mark is no
longer allocated so I get segfault.

(lldb) r
There is a running process, kill it and restart?​: [Y/n] y
Process 75673 exited with status = 9 (0x00000009)
Process 75691 launched​: '/Users/llfourn/src/rakudo/install/bin/moar'
(x86_64)
Process 75691 stopped
* thread #​1​: tid = 0x23cfc68, 0x0000000100063ca4
libmoar.dylib`gc_mark(tc=0x0000000100500290, st=0x0000000101004f40,
data=<unavailable>, worklist=0x000000010d27c1a0) + 84 at
MVMCallCapture.c​:55, queue = 'com.apple.main-thread', stop reason =
EXC_BAD_ACCESS (code=1, address=0x10a72533)
frame #​0​: 0x0000000100063ca4
libmoar.dylib`gc_mark(tc=0x0000000100500290, st=0x0000000101004f40,
data=<unavailable>, worklist=0x000000010d27c1a0) + 84 at MVMCallCapture.c​:55
52 MVMuint16 count = ctx->arg_count;
53 MVMuint16 i, flag;
54 for (i = 0, flag = 0; i < count; i++, flag++) {
-> 55 if (flag_map[flag] & MVM_CALLSITE_ARG_NAMED) {
56 /* Current position is name, then next is value. */
57 MVM_gc_worklist_add(tc, worklist, &ctx->args[i].s);
58 i++;
(lldb) bt
* thread #​1​: tid = 0x23cfc68, 0x0000000100063ca4
libmoar.dylib`gc_mark(tc=0x0000000100500290, st=0x0000000101004f40,
data=<unavailable>, worklist=0x000000010d27c1a0) + 84 at
MVMCallCapture.c​:55, queue = 'com.apple.main-thread', stop reason =
EXC_BAD_ACCESS (code=1, address=0x10a72533)
* frame #​0​: 0x0000000100063ca4
libmoar.dylib`gc_mark(tc=0x0000000100500290, st=0x0000000101004f40,
data=<unavailable>, worklist=0x000000010d27c1a0) + 84 at MVMCallCapture.c​:55
frame #​1​: 0x000000010003f002
libmoar.dylib`MVM_gc_mark_collectable(tc=<unavailable>,
worklist=<unavailable>, new_addr=<unavailable>) + 1714 at collect.c​:399
frame #​2​: 0x000000010003e883
libmoar.dylib`process_worklist(tc=0x0000000100500290,
worklist=0x000000010d27c1a0, wtp=0x00007fff5fbff468, gen='\0') + 899 at
collect.c​:313
frame #​3​: 0x000000010003e324
libmoar.dylib`MVM_gc_collect(tc=0x0000000100500290,
what_to_do=<unavailable>, gen=<unavailable>) + 820 at collect.c​:129
frame #​4​: 0x000000010003b703
libmoar.dylib`run_gc(tc=0x0000000100500290, what_to_do='\0') + 131 at
orchestrate.c​:304
frame #​5​: 0x000000010003b4c8
libmoar.dylib`MVM_gc_enter_from_allocator(tc=0x0000000100500290) + 984 at
orchestrate.c​:438
frame #​6​: 0x000000010003bca8 libmoar.dylib`MVM_gc_allocate_zeroed
[inlined] MVM_gc_allocate_nursery(tc=<unavailable>, size=<unavailable>) +
52 at allocation.c​:32
frame #​7​: 0x000000010003bc74 libmoar.dylib`MVM_gc_allocate_zeroed
[inlined] MVM_gc_allocate(tc=<unavailable>) + 23 at allocation.h​:13
frame #​8​: 0x000000010003bc5d
libmoar.dylib`MVM_gc_allocate_zeroed(tc=0x0000000100500290, size=40) + 13
at allocation.c​:49
frame #​9​: 0x000000010003bf44
libmoar.dylib`MVM_gc_allocate_object(tc=0x0000000100500290,
st=<unavailable>) + 84 at allocation.c​:86
frame #​10​: 0x000000010005813f libmoar.dylib`allocate(tc=<unavailable>,
st=<unavailable>) + 15 at P6opaque.c​:60
frame #​11​: 0x0000000100050d27 libmoar.dylib`MVM_repr_box_str [inlined]
MVM_repr_alloc_init(tc=0x0000000100500290, type=0x0000000101626fc0) + 14 at
reprconv.c​:13
frame #​12​: 0x0000000100050d19
libmoar.dylib`MVM_repr_box_str(tc=0x0000000100500290,
type=0x0000000101626fc0, val=<unavailable>) + 73 at reprconv.c​:586
frame #​13​: 0x0000000100009c07
libmoar.dylib`MVM_exception_backtrace(tc=0x0000000100500290,
ex_obj=<unavailable>) + 1111 at exceptions.c​:407
frame #​14​: 0x0000000100017488
libmoar.dylib`MVM_interp_run(tc=0x0000000100500290,
initial_invoke=<unavailable>, invoke_data=<unavailable>) + 47320 at
interp.c​:3830
frame #​15​: 0x00000001000c8305
libmoar.dylib`MVM_vm_run_file(instance=<unavailable>,
filename=<unavailable>) + 181 at moar.c​:304
frame #​16​: 0x000000010000170a moar`main(argc=<unavailable>,
argv=0x00007fff5fbff8e8) + 666 at main.c​:191
frame #​17​: 0x00007fff900c05ad libdyld.dylib`start + 1
frame #​18​: 0x00007fff900c05ad libdyld.dylib`start + 1
(lldb) fr v
(MVMThreadContext *) tc = 0x0000000100500290
(MVMSTable *) st = 0x0000000101004f40
(void *) data = <variable not available>

(MVMGCWorklist *) worklist = 0x000000010d27c1a0
(MVMCallCaptureBody *) body = <no location, value may have been optimized
out>

(MVMArgProcContext *) ctx = 0x000000010a6f3c00
(MVMuint16) flag = 0
(MVMuint16) i = 0
(MVMuint16) count = <variable not available>

(MVMuint8 *) flag_map = 0x0000000010a72533 ""
(lldb) p flag_map
(MVMuint8 *) $6 = 0x0000000010a72533 ""
(lldb) p flag_map[0]
error​: Couldn't apply expression side effects : Couldn't dematerialize a
result variable​: couldn't read its memory

On Sun, Jul 10, 2016 at 12​:19 AM jnthn@​jnthn.net via RT <
perl6-bugs-followup@​perl.org> wrote​:

On Tue Jul 05 17​:51​:46 2016, ugexe@​cpan.org wrote​:

Note that the final decode-base64 candidate shows the correct results
when debugging statements are added

This gist also shows a small change that makes it produce the correct
values but it still segfaults more often than not
https://gist.github.com/ugexe/baa168a641894a0731595c812724f76d

Having dug into it a little, I'm not sure that the new multi-caching
stuff has caused this, so much as uncovered it (perhaps by moving GC
collection points). The valgrind output is as follows​:

ok 1 \- decoding the empty string
ok 2 \- decoding "A"
not ok 3 \- decoding "Ab"

# Failed test 'decoding "Ab"'
# at t/01-basic.t line 29
# expected​: 'Ab'
# got​: 'A'
not ok 4 - decoding "Abc"

# Failed test 'decoding "Abc"'
# at t/01-basic.t line 30
# expected​: 'Abc'
# got​: 'A'
==19699== Invalid read of size 8
==19699== at 0x4FF2B3E​: gc_mark (MVMCallCapture.c​:51)
==19699== by 0x4FC6BF5​: process_worklist (collect.c​:313)
==19699== by 0x4FC71A0​: MVM_gc_collect (collect.c​:129)
==19699== by 0x4FC2735​: run_gc (orchestrate.c​:304)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator (orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8BEB​: MVM_frame_capturelex (frame.c​:1035)
==19699== by 0x4F98D8F​: MVM_interp_run (interp.c​:1070)
==19699== by 0x505E3F8​: MVM_vm_run_file (moar.c​:304)
==19699== by 0x401036​: main (main.c​:191)
==19699== Address 0x658f9d0 is 0 bytes inside a block of size 32 free'd
==19699== at 0x4C2BDEC​: free (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19699== by 0x4FF2CDA​: MVM_free (alloc.h​:29)
==19699== by 0x4FF2CDA​: gc_free (MVMCallCapture.c​:78)
==19699== by 0x4FC72D3​: MVM_gc_collect_free_nursery_uncopied
(collect.c​:580)
==19699== by 0x4FC2976​: run_gc (orchestrate.c​:328)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator (orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8D4C​: MVM_frame_takeclosure (frame.c​:1051)
==19699== by 0xEA66105​: ???
==19699== by 0x50342B8​: MVM_jit_enter_code (compile.c​:136)
==19699== by 0x4F938F0​: MVM_interp_run (interp.c​:5456)
==19699==
==19699== Invalid read of size 1
==19699== at 0x4FF2A7D​: gc_mark (MVMCallCapture.c​:55)
==19699== by 0x4FC6BF5​: process_worklist (collect.c​:313)
==19699== by 0x4FC71A0​: MVM_gc_collect (collect.c​:129)
==19699== by 0x4FC2735​: run_gc (orchestrate.c​:304)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator (orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8BEB​: MVM_frame_capturelex (frame.c​:1035)
==19699== by 0x4F98D8F​: MVM_interp_run (interp.c​:1070)
==19699== by 0x505E3F8​: MVM_vm_run_file (moar.c​:304)
==19699== by 0x401036​: main (main.c​:191)
==19699== Address 0xb36cc70 is 0 bytes inside a block of size 1 free'd
==19699== at 0x4C2BDEC​: free (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19699== by 0x4FF2CD1​: MVM_free (alloc.h​:29)
==19699== by 0x4FF2CD1​: gc_free (MVMCallCapture.c​:77)
==19699== by 0x4FC72D3​: MVM_gc_collect_free_nursery_uncopied
(collect.c​:580)
==19699== by 0x4FC2976​: run_gc (orchestrate.c​:328)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator (orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8D4C​: MVM_frame_takeclosure (frame.c​:1051)
==19699== by 0xEA66105​: ???
==19699== by 0x50342B8​: MVM_jit_enter_code (compile.c​:136)
==19699== by 0x4F938F0​: MVM_interp_run (interp.c​:5456)
==19699==
not ok 5 - decoding "Abcd"

Which is suggestive of some kind of memory corruption (seemingly due to
over-sharing or a premature free) inside of the argument capture. Still
trying to figure out exactly what leads to this.

@p6rt
Copy link
Author

p6rt commented Mar 9, 2018

From @dogbert17

On Wed, 27 Jul 2016 17​:43​:48 -0700, lloyd.fourn@​gmail.com wrote​:

By flag_map I mean ctx->callsite->arg_flags​:

(lldb) p ctx->arg_flags
(MVMCallsiteEntry *) $8 = 0x0000000000000000
(lldb) p ctx->callsite->arg_flags
(MVMCallsiteEntry *) $9 = 0x0000000010a72533 ""

On Mon, Jul 11, 2016 at 3​:59 AM Lloyd Fournier <lloyd.fourn@​gmail.com>
wrote​:

I've been getting segfaults in this area recently. The trace is a bit
different but I guess it's related. It seems that flag_map in gc_mark
is no
longer allocated so I get segfault.

(lldb) r
There is a running process, kill it and restart?​: [Y/n] y
Process 75673 exited with status = 9 (0x00000009)
Process 75691 launched​: '/Users/llfourn/src/rakudo/install/bin/moar'
(x86_64)
Process 75691 stopped
* thread #​1​: tid = 0x23cfc68, 0x0000000100063ca4
libmoar.dylib`gc_mark(tc=0x0000000100500290, st=0x0000000101004f40,
data=<unavailable>, worklist=0x000000010d27c1a0) + 84 at
MVMCallCapture.c​:55, queue = 'com.apple.main-thread', stop reason =
EXC_BAD_ACCESS (code=1, address=0x10a72533)
frame #​0​: 0x0000000100063ca4
libmoar.dylib`gc_mark(tc=0x0000000100500290, st=0x0000000101004f40,
data=<unavailable>, worklist=0x000000010d27c1a0) + 84 at
MVMCallCapture.c​:55
52 MVMuint16 count = ctx->arg_count;
53 MVMuint16 i, flag;
54 for (i = 0, flag = 0; i < count; i++, flag++) {
-> 55 if (flag_map[flag] & MVM_CALLSITE_ARG_NAMED) {
56 /* Current position is name, then next is
value. */
57 MVM_gc_worklist_add(tc, worklist, &ctx-

args[i].s);
58 i++;
(lldb) bt
* thread #​1​: tid = 0x23cfc68, 0x0000000100063ca4
libmoar.dylib`gc_mark(tc=0x0000000100500290, st=0x0000000101004f40,
data=<unavailable>, worklist=0x000000010d27c1a0) + 84 at
MVMCallCapture.c​:55, queue = 'com.apple.main-thread', stop reason =
EXC_BAD_ACCESS (code=1, address=0x10a72533)
* frame #​0​: 0x0000000100063ca4
libmoar.dylib`gc_mark(tc=0x0000000100500290, st=0x0000000101004f40,
data=<unavailable>, worklist=0x000000010d27c1a0) + 84 at
MVMCallCapture.c​:55
frame #​1​: 0x000000010003f002
libmoar.dylib`MVM_gc_mark_collectable(tc=<unavailable>,
worklist=<unavailable>, new_addr=<unavailable>) + 1714 at
collect.c​:399
frame #​2​: 0x000000010003e883
libmoar.dylib`process_worklist(tc=0x0000000100500290,
worklist=0x000000010d27c1a0, wtp=0x00007fff5fbff468, gen='\0') + 899
at
collect.c​:313
frame #​3​: 0x000000010003e324
libmoar.dylib`MVM_gc_collect(tc=0x0000000100500290,
what_to_do=<unavailable>, gen=<unavailable>) + 820 at collect.c​:129
frame #​4​: 0x000000010003b703
libmoar.dylib`run_gc(tc=0x0000000100500290, what_to_do='\0') + 131 at
orchestrate.c​:304
frame #​5​: 0x000000010003b4c8
libmoar.dylib`MVM_gc_enter_from_allocator(tc=0x0000000100500290) +
984 at
orchestrate.c​:438
frame #​6​: 0x000000010003bca8 libmoar.dylib`MVM_gc_allocate_zeroed
[inlined] MVM_gc_allocate_nursery(tc=<unavailable>,
size=<unavailable>) +
52 at allocation.c​:32
frame #​7​: 0x000000010003bc74 libmoar.dylib`MVM_gc_allocate_zeroed
[inlined] MVM_gc_allocate(tc=<unavailable>) + 23 at allocation.h​:13
frame #​8​: 0x000000010003bc5d
libmoar.dylib`MVM_gc_allocate_zeroed(tc=0x0000000100500290, size=40)
+ 13
at allocation.c​:49
frame #​9​: 0x000000010003bf44
libmoar.dylib`MVM_gc_allocate_object(tc=0x0000000100500290,
st=<unavailable>) + 84 at allocation.c​:86
frame #​10​: 0x000000010005813f
libmoar.dylib`allocate(tc=<unavailable>,
st=<unavailable>) + 15 at P6opaque.c​:60
frame #​11​: 0x0000000100050d27 libmoar.dylib`MVM_repr_box_str
[inlined]
MVM_repr_alloc_init(tc=0x0000000100500290, type=0x0000000101626fc0) +
14 at
reprconv.c​:13
frame #​12​: 0x0000000100050d19
libmoar.dylib`MVM_repr_box_str(tc=0x0000000100500290,
type=0x0000000101626fc0, val=<unavailable>) + 73 at reprconv.c​:586
frame #​13​: 0x0000000100009c07
libmoar.dylib`MVM_exception_backtrace(tc=0x0000000100500290,
ex_obj=<unavailable>) + 1111 at exceptions.c​:407
frame #​14​: 0x0000000100017488
libmoar.dylib`MVM_interp_run(tc=0x0000000100500290,
initial_invoke=<unavailable>, invoke_data=<unavailable>) + 47320 at
interp.c​:3830
frame #​15​: 0x00000001000c8305
libmoar.dylib`MVM_vm_run_file(instance=<unavailable>,
filename=<unavailable>) + 181 at moar.c​:304
frame #​16​: 0x000000010000170a moar`main(argc=<unavailable>,
argv=0x00007fff5fbff8e8) + 666 at main.c​:191
frame #​17​: 0x00007fff900c05ad libdyld.dylib`start + 1
frame #​18​: 0x00007fff900c05ad libdyld.dylib`start + 1
(lldb) fr v
(MVMThreadContext *) tc = 0x0000000100500290
(MVMSTable *) st = 0x0000000101004f40
(void *) data = <variable not available>

(MVMGCWorklist *) worklist = 0x000000010d27c1a0
(MVMCallCaptureBody *) body = <no location, value may have been
optimized
out>

(MVMArgProcContext *) ctx = 0x000000010a6f3c00
(MVMuint16) flag = 0
(MVMuint16) i = 0
(MVMuint16) count = <variable not available>

(MVMuint8 *) flag_map = 0x0000000010a72533 ""
(lldb) p flag_map
(MVMuint8 *) $6 = 0x0000000010a72533 ""
(lldb) p flag_map[0]
error​: Couldn't apply expression side effects : Couldn't
dematerialize a
result variable​: couldn't read its memory

On Sun, Jul 10, 2016 at 12​:19 AM jnthn@​jnthn.net via RT <
perl6-bugs-followup@​perl.org> wrote​:

On Tue Jul 05 17​:51​:46 2016, ugexe@​cpan.org wrote​:

Note that the final decode-base64 candidate shows the correct
results
when debugging statements are added

This gist also shows a small change that makes it produce the
correct
values but it still segfaults more often than not
https://gist.github.com/ugexe/baa168a641894a0731595c812724f76d

Having dug into it a little, I'm not sure that the new multi-caching
stuff has caused this, so much as uncovered it (perhaps by moving GC
collection points). The valgrind output is as follows​:

ok 1 - decoding the empty string
ok 2 - decoding "A"
not ok 3 - decoding "Ab"

# Failed test 'decoding "Ab"'
# at t/01-basic.t line 29
# expected​: 'Ab'
# got​: 'A'
not ok 4 - decoding "Abc"

# Failed test 'decoding "Abc"'
# at t/01-basic.t line 30
# expected​: 'Abc'
# got​: 'A'
==19699== Invalid read of size 8
==19699== at 0x4FF2B3E​: gc_mark (MVMCallCapture.c​:51)
==19699== by 0x4FC6BF5​: process_worklist (collect.c​:313)
==19699== by 0x4FC71A0​: MVM_gc_collect (collect.c​:129)
==19699== by 0x4FC2735​: run_gc (orchestrate.c​:304)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator
(orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8BEB​: MVM_frame_capturelex (frame.c​:1035)
==19699== by 0x4F98D8F​: MVM_interp_run (interp.c​:1070)
==19699== by 0x505E3F8​: MVM_vm_run_file (moar.c​:304)
==19699== by 0x401036​: main (main.c​:191)
==19699== Address 0x658f9d0 is 0 bytes inside a block of size 32
free'd
==19699== at 0x4C2BDEC​: free (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19699== by 0x4FF2CDA​: MVM_free (alloc.h​:29)
==19699== by 0x4FF2CDA​: gc_free (MVMCallCapture.c​:78)
==19699== by 0x4FC72D3​: MVM_gc_collect_free_nursery_uncopied
(collect.c​:580)
==19699== by 0x4FC2976​: run_gc (orchestrate.c​:328)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator
(orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8D4C​: MVM_frame_takeclosure (frame.c​:1051)
==19699== by 0xEA66105​: ???
==19699== by 0x50342B8​: MVM_jit_enter_code (compile.c​:136)
==19699== by 0x4F938F0​: MVM_interp_run (interp.c​:5456)
==19699==
==19699== Invalid read of size 1
==19699== at 0x4FF2A7D​: gc_mark (MVMCallCapture.c​:55)
==19699== by 0x4FC6BF5​: process_worklist (collect.c​:313)
==19699== by 0x4FC71A0​: MVM_gc_collect (collect.c​:129)
==19699== by 0x4FC2735​: run_gc (orchestrate.c​:304)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator
(orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8BEB​: MVM_frame_capturelex (frame.c​:1035)
==19699== by 0x4F98D8F​: MVM_interp_run (interp.c​:1070)
==19699== by 0x505E3F8​: MVM_vm_run_file (moar.c​:304)
==19699== by 0x401036​: main (main.c​:191)
==19699== Address 0xb36cc70 is 0 bytes inside a block of size 1
free'd
==19699== at 0x4C2BDEC​: free (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19699== by 0x4FF2CD1​: MVM_free (alloc.h​:29)
==19699== by 0x4FF2CD1​: gc_free (MVMCallCapture.c​:77)
==19699== by 0x4FC72D3​: MVM_gc_collect_free_nursery_uncopied
(collect.c​:580)
==19699== by 0x4FC2976​: run_gc (orchestrate.c​:328)
==19699== by 0x4FC320D​: MVM_gc_enter_from_allocator
(orchestrate.c​:438)
==19699== by 0x4FC3627​: MVM_gc_allocate_nursery (allocation.c​:32)
==19699== by 0x4FC39ED​: MVM_gc_allocate_frame (allocation.c​:99)
==19699== by 0x4FA749F​: MVM_frame_force_to_heap (frame.c​:657)
==19699== by 0x4FA8D4C​: MVM_frame_takeclosure (frame.c​:1051)
==19699== by 0xEA66105​: ???
==19699== by 0x50342B8​: MVM_jit_enter_code (compile.c​:136)
==19699== by 0x4F938F0​: MVM_interp_run (interp.c​:5456)
==19699==
not ok 5 - decoding "Abcd"

Which is suggestive of some kind of memory corruption (seemingly due
to
over-sharing or a premature free) inside of the argument capture.
Still
trying to figure out exactly what leads to this.

Fixed with commit (2017-09-30) rakudo/rakudo@41896b7.
Test needed.

@p6rt p6rt added SEGV Segmentation fault, bus error, etc. testneeded labels Jan 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
SEGV Segmentation fault, bus error, etc. testneeded
Projects
None yet
Development

No branches or pull requests

1 participant