GC related crash on repeated sleepsort (using Channel) #5255

Closed
p6rt opened this issue Apr 22, 2016 · 9 comments

@p6rt

p6rt commented Apr 22, 2016

Migrated from rt.perl.org#127960 (status was 'resolved')

Searchable as RT127960$

@p6rt

p6rt commented Apr 22, 2016

From marcus.ramberg@usit.uio.no

From #perl6

<marcusramberg> Listening to Damian talk about concurrency in Oslo, and I tried one of his examples, but it seems to crash perl6 when you try to run it for some iterations - https://gist.github.com/marcusramberg/f789306f4f580c6cf1270ca12a333391
<marcusramberg> gives me [1] 33874 abort perl6 channel.p6
[17:11] <bartolin> timotimo: how did you force-reset it? from the command line?
[17:11] <RabidGravy> marcusramberg, so it does
[17:11] <timotimo> i use virt-manager, because i have no clue about the commandline
[17:12] <timotimo> but i'm using it via x11 forwarding because i didn't yet put my ssh key of my laptop on there yet
[17:12] <lizmat> marcusramberg: crash confirmed :-(
[17:12] * ajoe joined #perl6.
[17:12] <marcusramberg> :-/
[17:13] <bartolin> but you have a root login for the hypervisor? could you try "virsh list" there?
[17:13] <lizmat> marcusramberg: the good news is that it fails consistently after a fixed number of iterations
[17:13] <RabidGravy> lizmat, marcusramberg it appears to be associated with garbage collection looking at the backtrace
[17:13] <lizmat> marcusramberg: and that number actually differs with different settings of MVM_SPESH_DISABLE and --optimize

@p6rt

p6rt commented Apr 22, 2016

From @jonathanstowe

The code:

sub sleep_sort (*@list where .all >= 0) {

  my $channel = Channel.new;

  await @list.map: -> $delay {
    Promise.start({
      sleep $delay / 1000;
      $channel.send($delay);
    });
  };

  $channel.close;

  return $channel.list;
}

say sleep_sort(3,2,1,5,4) for (1 ... 10000);

And gives rise to this back trace:
#0 0x00007ffff73e8a98 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007ffff73ea69a in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007ffff79ac867 in uv_mutex_destroy ()
  from /home/jonathan/.rakudobrew/moar-nom/install/lib/libmoar.so
No symbol table info available.
#3 0x00007ffff7930c9d in gc_free ()
  from /home/jonathan/.rakudobrew/moar-nom/install/lib/libmoar.so
No symbol table info available.
#4 0x00007ffff78fe8a1 in MVM_gc_collect_free_nursery_uncopied ()
  from /home/jonathan/.rakudobrew/moar-nom/install/lib/libmoar.so
No symbol table info available.
#5 0x00007ffff78fab4a in run_gc ()
  from /home/jonathan/.rakudobrew/moar-nom/install/lib/libmoar.so
No symbol table info available.
#6 0x00007ffff78fb312 in MVM_gc_enter_from_allocator ()
  from /home/jonathan/.rakudobrew/moar-nom/install/lib/libmoar.so
No symbol table info available.
#7 0x00007ffff78fb448 in MVM_gc_allocate_nursery ()
  from /home/jonathan/.rakudobrew/moar-nom/install/lib/libmoar.so
No symbol table info available.
#8 0x00007ffff78fb68a in MVM_gc_allocate_object ()

This is Rakudo version 2016.03-135-gbaf8ac3 built on MoarVM version 2016.03-108-gca1a21a

@p6rt

p6rt commented Apr 22, 2016

The RT System itself - Status changed from 'new' to 'open'

@p6rt

p6rt commented Apr 23, 2016

From @lizmat

[2016-04-22T16:08:20+0100] <marcusramberg> Listening to Damian talk about concurrency in Oslo, and I tried one of his examples, but it seems to crash perl6 when you try to run it for some iterations - https://gist.github.com/marcusramberg/f789306f4f580c6cf1270ca12a333391
[2016-04-22T16:08:30+0100] <marcusramberg> gives me [1] 33874 abort perl6 channel.p6
[2016-04-22T16:11:44+0100] <RabidGravy> marcusramberg, so it does
[2016-04-22T16:12:23+0100] <lizmat> marcusramberg: crash confirmed :-(
[2016-04-22T16:12:34+0100] <marcusramberg> :-/
[2016-04-22T16:13:29+0100] <lizmat> marcusramberg: the good news is that it fails consistently after a fixed number of iterations
[2016-04-22T16:13:31+0100] <perlpilot> marcusramberg: Did Damian actually run it ?
[2016-04-22T16:13:41+0100] <marcusramberg> perlpilot: only once
[2016-04-22T16:13:50+0100] <RabidGravy> lizmat, marcusramberg it appears to be associated with garbage collection looking at the backtrace
[2016-04-22T16:13:55+0100] <lizmat> marcusramberg: and that number actually differs with different settings of MVM_SPESH_DISABLE and --optimize
[2016-04-22T16:13:59+0100] <perlpilot> marcusramberg: do you happen to know what rakudo version he was using?
[2016-04-22T16:14:53+0100] <lizmat> if I add an "nqp::force_gc" to the sleep_sort sub, it doesn't fail
[2016-04-22T16:15:11+0100] <marcusramberg> perlpilot: no. he's still talking :)
[2016-04-22T16:15:29+0100] <lizmat> RabidGravy: so indeed GC related
[2016-04-22T16:15:41+0100] <perlpilot> marcusramberg: well ask him about it! See if he admits to shenanigans ;)
[2016-04-22T16:18:26+0100] <lizmat> marcusramberg: care to submit a rakudobug ?
[2016-04-22T16:18:57+0100] <marcusramberg> lizmat: I can try
[2016-04-22T16:19:18+0100] <lizmat> copy this discussion to an email and send it to rakudobug@perl.org

The code in question:

sub sleep_sort (*@list where .all >= 0) {

  my $channel = Channel.new;

  await @list.map: -> $delay {
    Promise.start({
      sleep $delay / 1000;
      $channel.send($delay);
    });
  };

  $channel.close;

  return $channel.list;
}

say sleep_sort(3,2,1,5,4) for (1 ... 10000);
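
For reference, the nqp::force_gc workaround mentioned in the log could look roughly like the sketch below. The log does not say where in the sub the call was placed, so putting it at the top is an assumption, as is the name sleep_sort_forced_gc. The `use nqp` line is what current Rakudo builds need before nqp ops can be called directly.

```raku
use nqp;  # needed on current Rakudo to call nqp:: ops directly

# Hypothetical name; same body as sleep_sort plus a forced GC run.
sub sleep_sort_forced_gc (*@list where .all >= 0) {

    # Assumption: the log does not state where the call was added.
    nqp::force_gc();

    my $channel = Channel.new;

    # One promise per value; each sends its value after sleeping that many ms.
    await @list.map: -> $delay {
        Promise.start({
            sleep $delay / 1000;
            $channel.send($delay);
        });
    };

    $channel.close;

    return $channel.list;
}

# The report used 10000 iterations; a smaller count keeps this quick to try.
say sleep_sort_forced_gc(3, 2, 1, 5, 4) for 1 .. 10;
```

Forcing a collection on every call only changes when the GC runs, so it is useful for confirming the diagnosis rather than as a fix.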

@p6rt

p6rt commented Apr 26, 2016

From @jonathanstowe

Looks like this isn't failing with

This is Rakudo version 2016.04-36-gce5dc00 built on MoarVM version 2016.04

Probably some test to close.

@p6rt

p6rt commented Apr 26, 2016

From @lizmat

On 26 Apr 2016, at 19:24, Jonathan Stowe via RT <perl6-bugs-followup@perl.org> wrote:

Looks like this isn't failing with

This is Rakudo version 2016.04-36-gce5dc00 built on MoarVM version 2016.04

Probably some test to close.

No, please keep this open, as this was only fixed by essentially a workaround:

Channel.Supply.list still fails in the same way.

Liz
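
One way to exercise the Channel.Supply.list path mentioned above is to drain the channel through its Supply instead of calling .list on the channel directly. This is only a guess at the variant meant here; the name sleep_sort_via_supply is hypothetical and the iteration count is reduced from the report's 10000.

```raku
# Hypothetical variant of sleep_sort that reads the results through
# Channel.Supply.list instead of Channel.list.
sub sleep_sort_via_supply (*@list where .all >= 0) {

    my $channel = Channel.new;

    await @list.map: -> $delay {
        Promise.start({
            sleep $delay / 1000;
            $channel.send($delay);
        });
    };

    $channel.close;

    # Assumption: this is the code path that "still fails in the same way".
    return $channel.Supply.list;
}

# Raise the range to 10000 to match the original report.
say sleep_sort_via_supply(3, 2, 1, 5, 4) for 1 .. 10;
```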

@p6rt

p6rt commented Apr 26, 2016

From marcus.ramberg@usit.uio.no

On 26 Apr 2016, at 19:24, Jonathan Stowe via RT <perl6-bugs-followup@perl.org> wrote:

Looks like this isn't failing with

This is Rakudo version 2016.04-36-gce5dc00 built on MoarVM version 2016.04

Probably some test to close.

yay

@p6rt

p6rt commented Nov 2, 2016

From @jnthn

On Fri Apr 22 08:55:46 2016, jns+bc@gellyfish.co.uk wrote:

The code:

sub sleep_sort (*@list where .all >= 0) {

  my $channel = Channel.new;

  await @list.map: -> $delay {
    Promise.start({
      sleep $delay / 1000;
      $channel.send($delay);
    });
  };

  $channel.close;

  return $channel.list;
}

say sleep_sort(3,2,1,5,4) for (1 ... 10000);

And gives rise to this back trace:
#0 0x00007ffff73e8a98 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007ffff73ea69a in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007ffff79ac867 in uv_mutex_destroy ()
  from /home/jonathan/.rakudobrew/moar-nom/install/lib/libmoar.so
No symbol table info available.
#3 0x00007ffff7930c9d in gc_free ()
  from /home/jonathan/.rakudobrew/moar-nom/install/lib/libmoar.so
No symbol table info available.

Looks like the mutex missing unlock bug that I fixed in Rakudo commit 48c2af6d059. Also, I've run the code in question (all 10,000 iterations of it) a bunch of times and not seen any failures. For good measure, I've also added it as a stress test in S17-channel/stress.t.

/jnthn
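
A stress test along these lines might look like the sketch below. It is not the contents of S17-channel/stress.t; the iteration count, the failure counter and the test description are all assumptions made for illustration. The output is sorted before comparison so that scheduling jitter between the short sleeps cannot cause a false failure.

```raku
use Test;

plan 1;

sub sleep_sort (*@list where .all >= 0) {

    my $channel = Channel.new;

    await @list.map: -> $delay {
        Promise.start({
            sleep $delay / 1000;
            $channel.send($delay);
        });
    };

    $channel.close;

    return $channel.list;
}

# Hypothetical count; the report looped 10000 times.
my $iterations = 200;
my $failures   = 0;

for ^$iterations {
    # Every value must come back exactly once; the arrival order is not checked.
    $failures++ unless sleep_sort(3, 2, 1, 5, 4).sort.join(',') eq '1,2,3,4,5';
}

is $failures, 0, "sleep_sort survived $iterations iterations without losing values";
```

A crash like the one in the back trace would abort the process before the final assertion is reached, so the test fails either way.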

@p6rt

p6rt commented Nov 2, 2016

@jnthn - Status changed from 'open' to 'resolved'
