file tests and Failure do not interact as expected #6559

p6rt · 2017-09-29T21:43:22Z

Migrated from rt.perl.org#132185 (status was 'open')

Searchable as RT132185$

p6rt · 2017-09-29T21:43:22Z

From @geekosaur

This turns out to be fairly complex, and has implications that may go well
beyond file tests. (Again! It only caused a syntax rethink and the redesign
of smartmatching when I poked file test issues in Pugs in 2007....)

The original problem is that a fairly obvious (from shells, Perl 5, etc.)
test for whether a file (as such) exists or not can yield surprises:

pyanfar Z$ 6 '".profileX".IO.f.say'

Failed to find '/home/allbery/.profileX' while trying to do '.f'
in block <unit> at -e line 1

Naïvely, I expect this to output False, not throw.

The reason for this is fairly obvious: if I use it in a Bool context then
the Failure gets coerced as I expect. But if I'm not aware that this relies
on Failure getting disarmed when coerced to Bool, using it with something
that accepts Any (like say) will throw instead of giving me False.
Accordingly, it works as expected if I force coercion to Bool.

pyanfar Z$ 6 'say ?".profileX".IO.f'
False
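For readers unfamiliar with the mechanism, here is a toy model of the behaviour described above, written in Python purely as a neutral illustration. The `Failure` class, its `handled` flag, and the string returned after handling are assumptions of this sketch, not Rakudo's implementation.

```python
class Failure:
    """Toy model of a soft failure: an unthrown exception wrapped in a value."""

    def __init__(self, exception):
        self.exception = exception
        self.handled = False

    def __bool__(self):
        # Boolean coercion marks the failure as handled and answers False.
        self.handled = True
        return False

    def __str__(self):
        # Using an unhandled failure as an ordinary value throws instead.
        if not self.handled:
            raise self.exception
        return f"handled failure: {self.exception}"


# Hypothetical result of a file test on a missing path.
result = Failure(FileNotFoundError("Failed to find '.profileX' while trying to do '.f'"))

try:
    print(result)        # like `.say` on the raw result: throws
except FileNotFoundError as err:
    print(f"threw: {err}")

print(bool(result))      # like prefix `?`: disarms and gives False
```

The point of the sketch is only that the same value behaves differently depending on whether the consumer asks for a boolean or for the value itself.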

So, the first problem is that you have to be aware of the special behavior
of Failure and how it interacts with a method which is documented as
producing Bool.

If it stopped there, this might not even be worth a bug report except
possibly for documentation. But if you dig a little farther, things start
getting more complex:

pyanfar Z$ 6 '".profileX".IO.e.say'
False

The .e method behaves differently, and how I expected .f to behave!

Again, there is a rational explanation: it is, logically, a different
operation. In lower level terms, .e just checks whether stat() succeeded,
whereas .f needs to also look at the result and gives me Failure if the
stat() failed. But you need to know that this difference exists, because
it's not immediately clear from the documentation.
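A rough sketch of that difference at the stat() level, in Python as a stand-in (the function names are hypothetical, and `None` merely plays the role of a soft failure here):

```python
import os
import stat


def exists(path):
    """Like `.e`: only asks whether stat() succeeded, so the answer is a bool."""
    try:
        os.stat(path)
    except FileNotFoundError:
        return False
    return True


def is_regular_file(path):
    """Like `.f`: must inspect the stat() result, so there are three outcomes.

    Returns True or False for an existing path, and None (standing in for a
    soft failure) when there is no stat() result to inspect.
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return stat.S_ISREG(st.st_mode)
```

Note that both functions let every other stat() error propagate unchanged; only "not found" is treated specially in this sketch.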

Perl 5 had a variant of this, and "leaked" a hint of it with its magic _
parameter. As a way of exposing the difference between just calling stat()
and using its result, though, it's kinda the worst of all possible worlds.
(Not to mention the questions of thread safety, etc. that come up when you
start tossing such magic around.)

Things get deeper yet, though. Which kinds of failures of stat() result in
Failure, and which if any produce harder exceptions? An EIO return from
stat() is a much more fundamental failure than an ENOENT return. (tl;dr: EIO
means the filesystem is hosed. For a remote filesystem it may mean the
connection to the server has been lost; for a local one, it could mean
someone unplugged the USB hard drive or it could mean you need to
immediately shut down, fsck, and possibly dig out the backups. In all
cases, it's a deeper issue than a file simply not existing.) Is this a
situation where we might actually want a harder kind of Failure that
doesn't get disarmed on coercion to Bool, but does if tested with .defined?
Or does this justify a hard exception? And, there are likely to be
intermediate cases where the right answer is even less clear.
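One possible split, sketched in Python for illustration only: errno values that mean "nothing there to test" become a soft, falsy failure, while everything else (EIO included) is re-raised as a hard exception. The `SOFT_ERRNOS` set, the `SoftFailure` class, and the injectable `stat_fn` parameter are all assumptions of this sketch; where the line should really be drawn is exactly the open question.

```python
import errno
import os
import stat

# Assumption for this sketch: errno values meaning "nothing there to test".
SOFT_ERRNOS = {errno.ENOENT, errno.ENOTDIR}


class SoftFailure:
    """Falsy wrapper that keeps the original OSError for later inspection."""

    def __init__(self, error):
        self.error = error

    def __bool__(self):
        return False


def file_test(path, stat_fn=os.stat):
    """Return True/False for an existing path, SoftFailure for a missing one.

    Any other stat() error (EIO, for instance) is re-raised as a hard exception.
    `stat_fn` is injectable only so the hard path can be exercised in a demo.
    """
    try:
        st = stat_fn(path)
    except OSError as err:
        if err.errno in SOFT_ERRNOS:
            return SoftFailure(err)
        raise
    return stat.S_ISREG(st.st_mode)


def broken_disk(path):
    # Simulates a filesystem-level failure for demonstration purposes.
    raise OSError(errno.EIO, "Input/output error", path)
```

Calling `file_test("anything", stat_fn=broken_disk)` raises, whereas a merely missing path yields a value that is false in a boolean test and still carries the underlying error.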

If you go back and look at the difference between .e and .f, you also get
other questions. Notably, if you decide that .f should behave like .e, do
you do this explicitly (and for each operation), or do you arrange for it
to be part of the signature, or do you perhaps handle any Failure return
through a declared Bool return type by coercing it to Bool? All of these
answers are unappealing, some moreso than others (unconditional coercion
might actually be right in the general case, but it scares me --- and
interacts strongly with the preceding question).

Making the return type of the file tests a coercion type to express the
notion that, here, Failure should coerce to Bool (but maybe not always?
again, see previous section) is tempting, but (a) I have no idea what the
syntax would be, (b) currently that information is ignored, or possibly
throws at compile time, and (c) the existing coercion machinery operates
inbound to a function, not outbound for its result. And is this situation
actually common enough to justify such a mechanism, especially considering
that it probably makes a relatively hot path more expensive?

So there's actually a fair amount to think about here. And, depending on
your early answers, this could potentially be three tickets or maybe even
more.

--
brandon s allbery kf8nh sine nomine associates
allbery.b@gmail.com ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

p6rt · 2017-09-29T22:45:07Z

From @zoffixznet

On Fri, 29 Sep 2017 14:43:22 -0700, allbery.b@gmail.com wrote:

> So, the first problem is that you have to be aware of the special behavior
> of Failure and how it interacts with a method which is documented as
> producing Bool.

That documentation also lists the conditions when the method `fail`s.
There's no interaction with any methods involved. The method simply returns
a Failure object on failure and failures are pretty ubiquitous in the
language and especially so in IO part of it.

> The .e method behaves differently, and how I expected .f to behave!
>
> Again, there is a rational explanation: it is, logically, a different
> operation.

From my POV, it's that here it can reliably answer a True or a False to
whether a file exists, whereas .f has three answers to give:
True (the path is a file), False (the path is not a file), or Failure (the
path points to nothing). The behaviour is consistent; it's just that .e never `fail`s.

> Is this a situation where we might actually want a harder kind of Failure that
> doesn't get disarmed on coercion to Bool, but does if tested with .defined?

I'd rather we not invent any special cases for a small part of the language.
If harder errors need to be expressed, we should just throw an exception.

> if you decide that .f should behave like .e, do
> you do this explicitly (and for each operation), or do you arrange for it
> to be part of the signature, or do you perhaps handle any Failure return
> through a declared Bool return type by coercing it to Bool? All of these
> answers are unappealing, some moreso than others (unconditional coercion
> might actually be right in the general case, but it scares me --- and
> interacts strongly with the preceding question).

I'm not fully following all the coercion talk here. The file test methods
return a Failure object. The .Bool and .defined methods on Failures disarm
them and return False. There are no coercers involved. Also, the user has
more than one option to disarm Failures, by smartmatching against Pair
objects and having the smartmatch disarm Failures:

say "z".IO ~~ :f
False

Or just using the .so method:

"x".IO.f.so.say
False

".bashrc".IO.f.so.say
True

> I have no idea what the syntax would be [...]
> So there's actually a fair amount to think about here.

I'm probably biased since I raked through this stuff during the IO Grant, but
TBH I'm failing to see any problems so far. Certainly don't see anything
that'd involve inventing new syntax or special Failure types. If a person
picked up the language an hour ago, Failures might be a new concept to them
to learn, but they're as common as regexes, so inventing something extra to learn
just compounds the original problem instead of solving it.

p6rt · 2017-09-29T22:45:07Z

The RT System itself - Status changed from 'new' to 'open'

p6rt · 2017-09-30T20:35:02Z

From @smls

I agree with Zoffix that this seems to be fine as is.

Generally speaking, IO operations that logically require an existing path will return a Failure if the path does not in fact exist:

Slurp its content? Failure.
Rename/move/copy it? Failure.
Check its size? Failure.
Check if it is of type "directory"? Failure.
Check if it is of type "file"? Failure.

Whereas `.e`, i.e. checking if a path exists, is by necessity *not* an operation that assumes an existing path.

The only thing that might be debatable is whether `.f` should mean:

a) Check if it is of type "file".

b) Check if it exists, and if so, if it is of type "file".

The current behavior (a) seems more natural and useful to me though.

Perl 5 does (a) as well, in the sense that it too distinguishes "failure" from "no" in the return value:

- "yes, it is of type 'file'": a defined truthy value
- "no, it is not of type 'file'": a defined falsy value
- "failure, it does not exist so its type can not be checked": `undef`

Perl 6 merely improves on that by promoting the failure condition from `undef` to `Failure`, which both carries more information and provides more safety by default.

To the extent that you're basing your expectations on the fact that a Perl 5 `undef` can be used in ways that a Perl 6 `Failure` cannot (without blowing up), well, that's just a matter of having to unlearn Perl 5 (or other programming languages) while learning Perl 6... :)

p6rt · 2017-09-30T20:47:22Z

From @geekosaur

On Sat, Sep 30, 2017 at 4:35 PM, Sam S. via RT <perl6-bugs-followup@perl.org> wrote:

> To the extent that you're basing your expectations on the fact that a Perl
> 5 `undef` can be used in ways that a Perl 6 `Failure` cannot (without
> blowing up), well, that's just a matter of having to unlearn Perl 5 (or
> other programming languages) while learning Perl 6... :)

So I included at least one discussion in the ticket that was utterly
pointless and left unread. At this point I'm just going to assume only
half-exposing the underlying mechanism is considered a feature and I need
to use a different language when it's not.

--
brandon s allbery kf8nh sine nomine associates
allbery.b@gmail.com ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

p6rt · 2017-09-30T22:00:18Z

From @smls

On Sat, 30 Sep 2017 13:47:22 -0700, allbery.b@gmail.com wrote:

> So I included at least one discussion in the ticket that was utterly
> pointless and left unread. At this point I'm just going to assume only
> half-exposing the underlying mechanism is considered a feature and I need
> to use a different language when it's not.

You mean the part about exposing different stat() error conditions?

A `Failure` wraps a typed exception, which you can get at by calling the `.exception` method on it:

my $is-file = do given $path.f -> $result {
    with $result.?exception {
        when X::IO::DoesNotExist { ... }
        when ... { ... }
        when ... { ... }
        default { ... }
    }
    $result.so
}

(...or by letting it throw and then using a CATCH block.)

Exceptions also have type-specific attributes which can hold further details to differentiate similar but different error conditions.

If file tests should be made to expose more fine-grained error states (probably a good idea), they can use those two mechanisms without the need to change anything about Failure or add new syntax.

PS: At the end of the day, "only half-exposing the underlying mechanism" of low-level operating-system APIs is to be expected though, for a high-level language that wants to be cross-platform and user-friendly. Getting full access to specific operating-system APIs is what third-party modules such as [1] are for. Thankfully, Perl 6's NativeCall interface makes them relatively easy to write.

[1] https://github.com/cspencer/perl6-posix

p6rt added the LTA (Less Than Awesome; typically an error message that could be better) label on Jan 5, 2020
