
Broken Text::CSV tests and possibly other ecosystem fallout (buffering) #6493

Closed
p6rt opened this issue Sep 5, 2017 · 17 comments
Labels: regression (issue did not exist previously), testneeded

Comments


p6rt commented Sep 5, 2017

Migrated from rt.perl.org#132030 (status was 'resolved')

Searchable as RT132030$


p6rt commented Sep 5, 2017

From @AlexDaniel

<[Tux]> CSV tests started to fail
<[Tux]> # at t/90_csv.t line 260
<[Tux]> # expected: $[["1", "2", "3"], ["2", "a b", ""]]
<[Tux]> # got: $[["1", "2", "3"],]
<[Tux]> # Failed test 'AOH parse out'

Bisectable points to rakudo/rakudo@4b02b8a

bisect log: https://gist.github.com/62a876b09bfecc9aa305061e384f43ce
double-check with committable: https://gist.github.com/50bd16934e6bc93ad49a1659cf31bf06

I suspect the Whateverable issues I'm seeing right now are also associated with this.


p6rt commented Sep 5, 2017

From @AlexDaniel

The issue is likely in Text::CSV, so it's not fully a Rakudo bug.

However, we probably have a bunch of broken modules in the ecosystem because of this. I guess we need a Toaster report and a bunch of PRs adding the missing .close calls to fix everything…
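The PRs in question are mostly mechanical: make sure the handle is closed on every code path. A minimal sketch of the pattern in Python (used only because the semantics carry over; the affected modules are Perl 6, where the fix is an explicit .close or a LEAVE phaser, and `path` here is a throwaway temp file):

```python
import os
import tempfile

# The shape of the missing-.close fixes: Python's with-statement flushes
# and closes the handle even if an exception is raised inside the block,
# much as a Perl 6 LEAVE phaser would.
path = os.path.join(tempfile.mkdtemp(), "out.csv")
with open(path, "w") as fh:
    fh.write("1,2,3\n")
# By this point the handle is closed and the data is guaranteed on disk.
```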


p6rt commented Sep 5, 2017

From @jnthn


Text::CSV is failing to close an output handle on all code paths in one of its methods, and can be fixed by a patch that does so:

https://gist.github.com/jnthn/a999df1f89d24cdc6ffac42ca55be806

Failing to close output handles has been clearly documented (and yes, documented well before the recent buffering change) as something that can cause data loss. Default output buffering just makes it quite a lot more likely to show up.
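The failure mode is easy to reproduce. A minimal sketch in Python (used here only because the buffering semantics are analogous; the thread's code is Perl 6), mirroring the Text::CSV case of writing a file and reopening it for reading without closing the writer:

```python
import os
import tempfile

# A small write stays in the userspace buffer (block-buffered, non-TTY).
path = os.path.join(tempfile.mkdtemp(), "out.csv")
writer = open(path, "w")
writer.write("1,2,3\n2,a b,\n")

# BUG (mirrors Text::CSV): reopen for reading before closing the writer.
before = open(path).read()   # buffered data has not reached the file yet

writer.close()               # close implies flush; now the data is on disk
after = open(path).read()
```

With unbuffered output the first read would have worked by luck; buffering just makes the latent bug visible.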

While there will be some ecosystem fallout like this, unfortunately I don't think it's avoidable. If we back out the patch that turns on output buffering by default for non-TTYs for this release, then when will we include it? The longer we leave it, the more painful it will be, because more code will be written that is careless with handles.

I don't think "leave it off by default" is a good option either, otherwise we get to spend the next decade hearing "Perl 6 I/O is slow" because it'd be one of the only languages that doesn't buffer its output without an explicit flag being passed to enable that (which nearly nobody doing quick benchmarks will know to use).

In summary, if we had a time machine, going back and implementing this when the ecosystem was smaller would be a good use of it. But we don't, and we won't in the future, so I suspect the best thing to do is take the hit now.

Will leave this ticket open a little longer for further comments/discussion, but my intention is to reject it.

/jnthn


p6rt commented Sep 5, 2017

The RT System itself - Status changed from 'new' to 'open'


p6rt commented Sep 5, 2017

From 1parrota@gmail.com

Perl 5 programmers are used to being casual about closing file handles. Obviously, Perl 6 requires a change of habits; a cultural shift, if that's not too pretentious a term. Perhaps it needs to be mentioned in large, friendly letters somewhere in the documentation?



p6rt commented Sep 5, 2017

From @jnthn


It's called out in the open docs here:

https://docs.perl6.org/routine/open#(IO::Handle)_method_open

But it would be worth a mention here also:

https://docs.perl6.org/language/5to6-perlfunc#open

I've filed a docs issue about that:

Raku/doc#1536


p6rt commented Sep 5, 2017

From @geekosaur


Are we missing something to flush/close handles at exit? Leaving it to a GC
that may not finalize before exit is not really an option.



p6rt commented Sep 5, 2017

From @AlexDaniel

“Will leave this ticket open a little longer for further comments/discussion, but my intention is to reject it.”

Nope, that doesn't sound like a good plan.

We'll run Toaster and fix all modules that require fixing (or create appropriate issues if fixing something turns out to be too hard). This is exactly what I did for the Grammar.parse change before the previous release (even though we reverted it anyway), and it's what I'm planning to do for this change as well.

You are 100% right about the change, but flipping the ecosystem upside down is not exactly a good strategy even if you're right.



p6rt commented Sep 6, 2017

From @jnthn


To recap the IRC discussion yesterday: no, we haven't had this so far (except for stdout/stderr), and have gotten away with it due to the lack of output buffering. At present, we can choose between:

1) Keeping a list of open files and closing them at exit (flushing is already part of closing). This can be done at the Perl 6 level, in the same place we make sure to run END blocks.

2) Leaving unclosed handles eligible for GC, and closing them if/when they get GC'd.

Today we do #2. We could switch to #1. We can't currently do both, because the moment we start keeping a list of open handles they can no longer be GC'd, so #2 can't happen.

My initial inclination was to preserve behavior #2, though others have pointed out that behavior #1 is more useful for debugging, in that it ensures log files, for example, will be written in the event of a crash; and a program relying on behavior #2 could already run out of handles today anyway if it were less lucky with GC timing. This is a fair argument, and the automatic close at exit might be softer on the ecosystem too (but it would have done nothing for the Text::CSV case, the original subject of this ticket, because it wrote a file, didn't close it, then separately opened it for reading).

There's probably enough consensus to switch to option #1, and lizmat++ mentioned maybe looking into a patch to do that.

In the longer run we can have both, but that depends on implementing weak references. In terms of backend support, the JVM has them, and it seems there's an npm package [1] exposing v8 weak refs, so a Node.js backend could support them too. I'm OK with adding them to MoarVM in the future, but both doing that and exposing weak references at the Perl 6 level would be a non-small, and certainly non-trivial, task.


p6rt commented Sep 6, 2017

From @lizmat


FWIW, I have a patch, but it causes t/spec/S32-io/lock.t to hang in test #6. Trying to find out what is going on there right now.

Liz


p6rt commented Sep 11, 2017

From @lizmat

Fixed with 3c9cfdba88287e23e0ced8 (and further refined by later commits); tests needed.



p6rt commented Sep 14, 2017

From @AlexDaniel

FWIW there is still some ecosystem fallout (possibly very minor). I'll be linking issues here so that we have everything in one place.

IO::MiddleMan and Lumberjack: raku-community-modules/IO-MiddleMan#5


p6rt commented Sep 14, 2017

From @AlexDaniel

NCurses, Term::Choose, Term::Choose::Util, and Term::TablePrint: azawawi/raku-ncurses#16

This one is weird because it seems to work unless you run it under prove. Perhaps the module is alright and it's the test that is broken.



p6rt commented Sep 14, 2017

From @AlexDaniel

CSS::Specification and CSS::Module: css-raku/CSS-Specification-raku#2

That's basically it. We won't see any other affected modules in the toaster data.


p6rt commented Sep 17, 2017

From @AlexDaniel

Two of the mentioned modules have pending pull requests that add the missing .close calls in tests. The NCurses module is trickier, but arguably the failing test is slightly wrong. See this discussion for more info: azawawi/raku-ncurses#16

In other words, the ecosystem is fine. I'm still seeing some weirdness when running 「prove」, but hopefully that is unrelated.
On 2017-09-13 23​:06​:58, alex.jakimenko@​gmail.com wrote​:

CSS​::Specification and CSS​::Module​:
css-raku/CSS-Specification-raku#2

That's basically it. We won't see any other affected modules from
toaster data.
On 2017-09-13 18​:54​:42, alex.jakimenko@​gmail.com wrote​:

NCurses, Term​::Choose, Term​::Choose​::Util, and Term​::TablePrint​:
azawawi/raku-ncurses#16

This one is weird because it seems to be working unless you run it
under prove.
Perhaps the module is alright but it's the test that is broken.

On 2017-09-13 18​:16​:56, alex.jakimenko@​gmail.com wrote​:

FWIW there is still some ecosystem fallout (possibly very minor).
I'll be
linking issues here so that we have all things in one place.

IO​::MiddleMan and Lumberjack​:
raku-community-modules/IO-MiddleMan#5
On 2017-09-11 04​:18​:39, elizabeth wrote​:

Fixed with 3c9cfdba88287e23e0ced8 (and further refined by later
commits), tests needed.

On 6 Sep 2017, at 15:38, jnthn@jnthn.net via RT <perl6-bugs-followup@perl.org> wrote:

On Tue, 05 Sep 2017 09:11:19 -0700, allbery.b@gmail.com wrote:

On Tue, Sep 5, 2017 at 5:40 AM, jnthn@jnthn.net via RT <perl6-bugs-followup@perl.org> wrote:

Failing to close output handles has been clearly documented (and yes, documented well before the recent buffering change) as something that can cause data loss. Default output buffering just makes it quite a lot more likely to show up.

While there will be some ecosystem fallout like this, unfortunately I don't think it's avoidable. If we whip out the patch that turns output buffering on by default for non-TTYs for this release, then when will we include it? The longer we leave it, the more painful it will be, because more code will be written that is careless with handles.

I don't think "leave it off by default" is a good option either, otherwise we get to spend the next decade hearing "Perl 6 I/O is slow" because it'd be one of the only languages that doesn't buffer its output without an explicit flag being passed to enable that (which nearly nobody doing quick benchmarks will know to use).

Are we missing something to flush/close handles at exit? Leaving it to a GC that may not finalize before exit is not really an option.

To recap the IRC discussion yesterday: no, we haven't had this so far (except for stdout/stderr), and have gotten away with it due to the lack of output buffering. At present, we can either choose between:

1) Start keeping a list of open files, and at exit close them (flushing is already part of closing). This can be done at Perl 6 level, in the same place we make sure to run END blocks.

2) Having unclosed handles possible to GC, and closing them if/when they get GC'd.

Today we are doing #2. We could switch to doing #1. We can't currently do both, because the moment we start keeping a list of open handles then they can't be GC'd, and so #2 can't happen.

My initial inclination was to preserve behavior #2, though others have pointed out that behavior #1 is more useful for debugging in that it ensures log files, for example, will be written in the event of a crash, and a program relying on behavior #2 could already run out of handles today anyway if it were less lucky with GC timing. This is a fair argument, and the automatic close at exit might be softer on the ecosystem too (but would have done nothing for the Text::CSV case, which is the original subject of this ticket, because it wrote a file, didn't close it, then separately opened it for reading).

There's probably enough consensus to switch to option #1, and lizmat++ mentioned maybe looking into a patch to do that.

In the longer run, we can have both, but it depends on implementing weak references. In terms of backend support, the JVM does have them, and it seems there's an npm package [1] exposing v8 weak refs so a Node.js backend could support that also. I'm OK with adding them to MoarVM in the future, but both doing that and exposing weak references at Perl 6 level would be a non-small, and certainly non-trivial, task.
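Option #1 above can be approximated at user level with an END phaser. This is a hedged sketch only — 「tracked-open」 is a hypothetical helper, not Rakudo's actual mechanism, which registers handles internally:

```raku
# Hypothetical helper: remember every handle we open, and close them
# all (closing flushes) when the program exits.
my @open-handles;

sub tracked-open(|c) {
    my $fh = open |c;      # pass arguments straight through to open
    @open-handles.push: $fh;
    $fh
}

# END phasers run at program exit — the same place option #1 proposes
# hooking an automatic close of still-open handles.
END .close for @open-handles;
```

With this in place, 「my $fh = tracked-open('out.log', :w)」 behaves like a plain 「open」, but the handle gets flushed and closed at exit even if the code forgets to close it. It also illustrates jnthn's point about the #1/#2 conflict: once 「@open-handles」 holds a reference, the handle can no longer be GC'd.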

@p6rt
Author

p6rt commented Apr 7, 2018

From @AlexDaniel

Tests added in Raku/roast@40edf6d

I think we can close this.
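The roast coverage exercises flush-at-exit. A reduced sketch of that kind of check (file name hypothetical, not the literal roast test):

```raku
# A child raku process writes without closing its handle; if unclosed
# handles are flushed at exit, the parent sees the data on disk.
my $file = 'flush-test.out';
run ~$*EXECUTABLE, '-e',
    "my \$fh = open '$file', :w; \$fh.print: \"hello\\n\";";
print $file.IO.slurp;
```

Running the child in a separate process is what makes the test meaningful: it observes only what actually reached the file system by the time the child exited.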


@p6rt
Author

p6rt commented Apr 7, 2018

@AlexDaniel - Status changed from 'open' to 'resolved'

@p6rt p6rt closed this as completed Apr 7, 2018
@p6rt p6rt added regression Issue did not exist previously testneeded labels Jan 5, 2020