-T STDIN hangs #11867

Closed
p5pRT opened this issue Jan 15, 2012 · 15 comments

@p5pRT

p5pRT commented Jan 15, 2012

Migrated from rt.perl.org#108278 (status was 'rejected')

Searchable as RT108278$

@p5pRT

p5pRT commented Jan 15, 2012

From @cpansprout

If STDIN is a tty, -T STDIN just hangs waiting for input.

Is there such a thing as a non-blocking getc that we can use to solve this?


Flags​:
  category=core
  severity=low


Site configuration information for perl 5.15.6​:

Configured by sprout at Sat Dec 31 10​:12​:16 PST 2011.

Summary of my perl5 (revision 5 version 15 subversion 6) configuration​:
  Local Commit​: b2635083831c8935c437465bbeb03aec8b599c01
  Ancestor​: 407287f
  Platform​:
  osname=darwin, osvers=10.5.0, archname=darwin-thread-multi-2level
  uname='darwin pint.local 10.5.0 darwin kernel version 10.5.0​: fri nov 5 23​:20​:39 pdt 2010; root​:xnu-1504.9.17~1release_i386 i386 '
  config_args='-de -Dusedevel -Duseithreads -DDEBUGGING -Dmad'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=define, usemultiplicity=define
  useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
  use64bitint=undef, use64bitall=undef, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='cc', ccflags ='-fno-common -DPERL_DARWIN -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
  optimize='-O3 -g',
  cppflags='-fno-common -DPERL_DARWIN -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.2.1 (Apple Inc. build 5664)', gccosandvers=''
  intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries​:
  ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /usr/lib
  libs=-ldbm -ldl -lm -lutil -lc
  perllibs=-ldl -lm -lutil -lc
  libc=, so=dylib, useshrplib=false, libperl=libperl.a
  gnulibc_version=''
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
  cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib -fstack-protector'

Locally applied patches​:
 


@​INC for perl 5.15.6​:
  /usr/local/lib/perl5/site_perl/5.15.6/darwin-thread-multi-2level
  /usr/local/lib/perl5/site_perl/5.15.6
  /usr/local/lib/perl5/5.15.6/darwin-thread-multi-2level
  /usr/local/lib/perl5/5.15.6
  /usr/local/lib/perl5/site_perl
  .


Environment for perl 5.15.6​:
  DYLD_LIBRARY_PATH (unset)
  HOME=/Users/sprout
  LANG=en_US.UTF-8
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)
  PATH=/usr/bin​:/bin​:/usr/sbin​:/sbin​:/usr/local/bin​:/usr/X11/bin​:/usr/local/bin
  PERL_BADLANG (unset)
  SHELL=/bin/bash

@p5pRT

p5pRT commented Jan 16, 2012

From zefram@fysh.org

Father Chrysostomos wrote​:

If STDIN is a tty, -T STDIN just hangs waiting for input.

That sounds like correct behaviour. -T is defined to examine file content.

-zefram

@p5pRT

p5pRT commented Jan 16, 2012

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

p5pRT commented Jan 16, 2012

From @craigberry

On Mon, Jan 16, 2012 at 7​:02 AM, Zefram <zefram@​fysh.org> wrote​:

Father Chrysostomos wrote​:

If STDIN is a tty, -T STDIN just hangs waiting for input.

That sounds like correct behaviour.  -T is defined to examine file content.

But if something is not a file, wouldn't it be fairly safe to say it's
also not a file containing a high proportion of ASCII characters in
the first 512 bytes? Why does pp_fttext even bother with filehandles
where S_ISREG isn't true?

@p5pRT

p5pRT commented Jan 17, 2012

From tchrist@perl.com

"Craig A. Berry" <craig.a.berry@​gmail.com> wrote
  on Mon, 16 Jan 2012 17​:25​:50 CST​:

On Mon, Jan 16, 2012 at 7​:02 AM, Zefram <zefram@​fysh.org> wrote​:

Father Chrysostomos wrote​:

If STDIN is a tty, -T STDIN just hangs waiting for input.

Is there such a thing as a non-blocking getc that we can use to solve this?

Yes and no.

==> Yes​: If isatty(0) and you're in canonical mode, then that is really
  going to get in your way. But it's not hard to deal with--on Unix.

==> No​: You can't do it portably without an incredible amount of pain.
  That's why people install Term​::ReadKey and such.
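For the Unix side of that answer, a minimal sketch of a non-blocking one-byte read might look like this. It assumes a POSIX-like system where `IO::Handle`'s `blocking` method works on the handle. `try_getc` is a hypothetical helper name, not an existing Perl function, and because it uses `sysread` it bypasses PerlIO's buffer, so it is not a drop-in replacement for what `-T` does internally.

```perl
use strict;
use warnings;
use IO::Handle;                      # provides the ->blocking method
use Errno qw(EAGAIN EWOULDBLOCK);

# Hypothetical helper: try to read one byte without blocking.
# Returns the byte, '' at end of file, or undef if nothing is available yet.
sub try_getc {
    my ($fh) = @_;
    my $was_blocking = $fh->blocking(0);   # go non-blocking, remember old mode
    my $n     = sysread($fh, my $byte, 1);
    my $errno = $! + 0;                    # save errno before anything resets it
    $fh->blocking($was_blocking);          # restore the previous mode
    return $byte if $n;                    # one byte was read
    return ''    if defined $n;            # zero bytes read: end of file
    return undef if $errno == EAGAIN || $errno == EWOULDBLOCK;
    die "read failed (errno $errno)";
}
```

On a tty in canonical mode this would keep returning undef until a whole line has been entered, which is the part that "gets in your way".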

That sounds like correct behaviour.  -T is defined to examine file content.

Known issue. Way known.

Ever since back in 1988 when -T first appeared, I've told people a googolplex
times that it is disastrously unsafe to blindly call -T on anything you
haven't already determined is a "plain" file, and that I could think of no
realistic situation when your (-T arg) shouldn't always be preceded by a
guard test, as in probably (-f arg && -T arg). Maybe there is one; I dunno.

But if something is not a file, wouldn't it be fairly safe to say it's
also not a file containing a high proportion of ASCII characters in
the first 512 bytes?

That sounds anything but safe to me. Perl shouldn't make things up.

Just because I haven't thought of why there oughtn't be a -f guard
built into it doesn't mean that there isn't one that I just haven't
thought of yet. After all...

  "Unix was not designed to stop its users from doing stupid things,
  as that would also stop them from doing clever things. --Doug Gwyn"
  =~ s/Unix/Perl/r;

Why does pp_fttext even bother with filehandles where S_ISREG isn't true?

So it can look at the current buffer if it's already got data read into that,
perhaps? Notice that the function only does a getc/ungetc if the buffer is
empty, so it could use what's already there. Per perlfunc​:

If "-T" or "-B" is used on a filehandle, the current IO buffer
is examined rather than the first block.

The problem is what to do if you don't have any of them, since pipes/sockets
and character devices (and who knows what else) can block indefinitely. All I
can say is that is a *very* well-known/documented issue; again, per perlfunc​:

  Because you have to read a file to do the "-T" test, on most
  occasions you want to use a "-f" against the file first, as in
  "next unless -f $file && -T $file".

So it's not as though they haven't been warned. Repeatedly. For ages.
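The guard that perlfunc recommends can be wrapped in a small helper. This is only a sketch: `is_plain_text_file` is a hypothetical name, and it checks by file name, exactly as the documented idiom does.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if $path names a plain file whose
# first block looks like text. -f runs first, so -T never gets a
# chance to read from a tty, pipe, or other special file.
sub is_plain_text_file {
    my ($path) = @_;
    return (-f $path && -T $path) ? 1 : 0;
}

# Example use: keep only the plain text files named on the command line.
my @text_files = grep { is_plain_text_file($_) } @ARGV;
```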

Yes, you can argue that that also means there's a problem. Perhaps
that's what this

--tom

@p5pRT

p5pRT commented Jan 28, 2013

From @jkeenan

On Mon Jan 16 16​:02​:32 2012, tom christiansen wrote​:

"Craig A. Berry" <craig.a.berry@​gmail.com> wrote
on Mon, 16 Jan 2012 17​:25​:50 CST​:

On Mon, Jan 16, 2012 at 7​:02 AM, Zefram <zefram@​fysh.org> wrote​:

Father Chrysostomos wrote​:

If STDIN is a tty, -T STDIN just hangs waiting for input.

Is there such a thing as a non-blocking getc that we can use to
solve this?

Yes and no.

==> Yes: If isatty(0) and you're in canonical mode, then that is really
  going to get in your way. But it's not hard to deal with--on Unix.

==> No: You can't do it portably without an incredible amount of pain.
  That's why people install Term::ReadKey and such.

That sounds like correct behaviour. -T is defined to examine file content.

Known issue. Way known.

Ever since back in 1988 when -T first appeared, I've told people a googolplex
times that it is disastrously unsafe to blindly call -T on anything you
haven't already determined is a "plain" file, and that I could think of no
realistic situation when your (-T arg) shouldn't always be preceded by a
guard test, as in probably (-f arg && -T arg). Maybe there is one; I dunno.

But if something is not a file, wouldn't it be fairly safe to say it's
also not a file containing a high proportion of ASCII characters in
the first 512 bytes?

That sounds anything but safe to me. Perl shouldn't make things up.

Just because I haven't thought of why there oughtn't be a -f guard
built into it doesn't mean that there isn't one that I just haven't
thought of yet. After all...

  "Unix was not designed to stop its users from doing stupid things,
  as that would also stop them from doing clever things. --Doug Gwyn"
  =~ s/Unix/Perl/r;

Why does pp_fttext even bother with filehandles where S_ISREG isn't true?

So it can look at the current buffer if it's already got data read into that,
perhaps? Notice that the function only does a getc/ungetc if the buffer is
empty, so it could use what's already there. Per perlfunc:

If "-T" or "-B" is used on a filehandle, the current IO buffer
is examined rather than the first block.

The problem is what to do if you don't have any of them, since pipes/sockets
and character devices (and who knows what else) can block indefinitely. All I
can say is that is a *very* well-known/documented issue; again, per perlfunc:

Because you have to read a file to do the "-T" test, on most
occasions you want to use a "-f" against the file first, as in
"next unless -f $file && -T $file".

So it's not as though they haven't been warned. Repeatedly. For ages.

Yes, you can argue that that also means there's a problem. Perhaps
that's what this

--tom

Reviewing the discussion in this older ticket this evening, I believe
that Tom is correct in saying that we have duly warned people of the use
of '-T' not preceded by '-f'. Specifically, in 'perlfunc' we have this:

#####
Because you have to read a file to do the "-T" test, on most occasions
you want to use a "-f" against the file first, as in "next unless -f
$file && -T $file".
#####

So I don't see that we have a real bug here. If anyone disagrees,
please speak up now, as I am taking this ticket for the purpose of
closing it in seven days.

Thank you very much.
Jim Keenan

@p5pRT

p5pRT commented Jan 28, 2013

From @cpansprout

On Jan 27, 2013, at 5​:31 PM, James E Keenan via RT wrote​:

Reviewing the discussion in this older ticket this evening, I believe
that Tom is correct in saying that we have duly warned people of the use
of '-T' not preceded by '-f'. Specifically, in 'perlfunc' we have this:

#####
Because you have to read a file to do the "-T" test, on most occasions
you want to use a "-f" against the file first, as in "next unless -f
$file && -T $file".
#####

So I don't see that we have a real bug here. If anyone disagrees,
please speak up now, as I am taking this ticket for the purpose of
closing it in seven days.

If it is not a bug, then it is a wishlist item. Perl has the feature whereby it will look inside the buffer for -T on something that is not a plain file. If this cannot be relied upon (because one does not know whether the buffer holds anything), then what is the point of having Perl do that? This is a gotcha that *could* be fixed (via a non-blocking read), so I think this ticket should stay open.

@p5pRT

p5pRT commented Jan 28, 2013

From @epa

James E Keenan via RT <perlbug-followup <at> perl.org> writes​:

Because you have to read a file to do the "-T" test, on most
occasions you want to use a "-f" against the file first, as in
"next unless -f $file && -T $file".

But the file with name $file could change in between the two tests. You still
have the possibility of a hang subject to a race condition. Would it be
safe instead to use the following?

  -f $file && -T _

--
Ed Avis <eda@​waniasset.com>

@p5pRT

p5pRT commented Jan 28, 2013

From @Leont

On Mon, Jan 28, 2013 at 7​:02 AM, Father Chrysostomos <sprout@​cpan.org> wrote​:

If it is not a bug, then it is a wishlist item. Perl has the feature whereby it will look inside the buffer for -T on something that is not a plain file. If this cannot be relied upon (because one does not know whether the buffer holds anything), then what is the point of having Perl do that? This is a gotcha that *could* be fixed (via a non-blocking read), so I think this ticket should stay open.

I'm not sure non-blocking reads can easily be done portably. Not sure
we can do something sensible here​: what we would need is tribool
logic.
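To make the "tribool" idea concrete, here is a rough sketch of a three-valued test for an already-open handle. It is an illustration only, not a proposal for how `-T` itself should behave: `text_or_unknown` is a hypothetical name, and polling with `IO::Select` is assumed to work on the handle, which holds for pipes and ttys on Unix but not on every platform. The poll looks at the underlying descriptor, so data already sitting in PerlIO's buffer may still be reported as "unknown".

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical three-valued text test for an open handle:
#   1      looks like text
#   0      looks like binary
#   undef  unknown (answering would need a read that might block)
sub text_or_unknown {
    my ($fh) = @_;
    unless (-f $fh) {
        # Not a regular file: only probe if data (or EOF) is ready right now.
        my @ready = IO::Select->new($fh)->can_read(0);   # timeout 0 = poll
        return undef unless @ready;
    }
    return (-T $fh) ? 1 : 0;
}
```

A caller has to test the result with `defined` before treating it as a boolean, which is exactly the extra state a plain true/false operator cannot express.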

Leon

@p5pRT

p5pRT commented Jan 28, 2013

From @Leont

On Mon, Jan 28, 2013 at 12​:19 PM, Ed Avis <eda@​waniasset.com> wrote​:

But the file with name $file could change in between the two tests. You still
have the possibility of a hang subject to a race condition. Would it be
safe instead to use the following?

  -f $file && -T _

_ is the old stat buffer, which isn't enough for -T. -f $fh && -T $fh
is what you're looking for, but it still has the issues Tom mentioned
above.

Leon

@p5pRT

p5pRT commented Jan 28, 2013

From @epa

Leon Timmermans <fawaka <at> gmail.com> writes​:

But the file with name $file could change in between the two tests

  -f $file && -T _

_ is the old stat buffer, which isn't enough for -T. -f $fh && -T $fh
is what you're looking for, but it still has the issues Tom mentioned
above.

So then the safe way to do it is to stat an open filehandle?

  open my $fh, '<', $file or die "cannot open $file: $!";
  my $is_text = (-f $fh && -T $fh);  # -T is true for text, so name the result accordingly
  close $fh or die "cannot close $file: $!";

Perhaps -T should do this internally so that it cannot hang?

--
Ed Avis <eda@​waniasset.com>

@p5pRT

p5pRT commented Jan 29, 2013

From @jkeenan

On Sun Jan 27 17​:31​:08 2013, jkeenan wrote​:

So I don't see that we have a real bug here. If anyone disagrees,
please speak up now, as I am taking this ticket for the purpose of
closing it in seven days.

Since there has been further discussion, meaning that the ticket cannot
simply be closed, I am un-Take-ing the ticket.

@p5pRT

p5pRT commented Nov 14, 2017

From zefram@fysh.org

A couple of points have been missed in this discussion. Firstly, the
willingness to read content from a non-regular file is not specific to
performing the check on an open filehandle. Perl is perfectly willing
to read content from a non-regular file specified by name​:

$ perl -lwe 'print -T "/dev/null" || 0; print -T "/dev/zero" || 0'
1
0

Secondly, there's been a bit of the mistaken idea that a non-regular
file is somehow "not a file". Looking at the documentation of the
file test operators, it's clearly been written using Unix terminology,
in which anything in the filesystem is a file​:

# -f File is a plain file.
# -d File is a directory.
# -b File is a block special file.

Given that understanding of what "file" means in this document, we can
better interpret the documentation for -T and -B​:

# -T File is an ASCII or UTF-8 text file (heuristic guess).
# -B File is a "binary" file (opposite of -T).
...
# The "-T" and "-B" switches work as follows. The first block or
# so of the file is examined to see if it is valid UTF-8 that
# includes non-ASCII characters. If, so it's a "-T" file.

It speaks of examining the content of "the file". Doesn't say anything
about it being a regular file. Just like the documentation for -w et
al doesn't say anything about the file being regular. -w, -f, and -T
are mutually orthogonal.

Clearly, reading from non-regular files is intentional behaviour,
implemented consistently, and documented. The choice of such a predicate
for this short spelling may be a questionable language design decision,
but it's too late to change that. There isn't any viable case for
changing the existing behaviour. If you want a regular-text-file
predicate, by all means write your own and stick it on CPAN.
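As a sketch of what such a regular-text-file predicate could look like (the name `is_regular_text_file` is hypothetical, and `O_NONBLOCK` is assumed to be available, as it is on Unix-like systems):

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY O_NONBLOCK);

# Hypothetical predicate: true only for a regular file whose content
# looks like text according to -T's heuristic.
sub is_regular_text_file {
    my ($path) = @_;
    # O_NONBLOCK keeps the open itself from waiting, for example on a
    # FIFO that has no writer yet.
    sysopen(my $fh, $path, O_RDONLY | O_NONBLOCK) or return 0;
    # Both tests run against the same open handle, so the name cannot be
    # pointed at a different file between the two checks.
    my $ok = (-f $fh && -T $fh) ? 1 : 0;
    close $fh;
    return $ok;
}
```

Since `-f` fails for anything that is not a regular file, `-T` is only ever reached for handles where reading cannot block indefinitely on a tty, pipe, or socket.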

This ticket should be closed.

-zefram

@p5pRT

p5pRT commented Dec 28, 2017

From @jkeenan

On Tue, 14 Nov 2017 21​:42​:02 GMT, zefram@​fysh.org wrote​:

A couple of points have been missed in this discussion. Firstly, the
willingness to read content from a non-regular file is not specific to
performing the check on an open filehandle. Perl is perfectly willing
to read content from a non-regular file specified by name​:

$ perl -lwe 'print -T "/dev/null" || 0; print -T "/dev/zero" || 0'
1
0

Secondly, there's been a bit of the mistaken idea that a non-regular
file is somehow "not a file". Looking at the documentation of the
file test operators, it's clearly been written using Unix terminology,
in which anything in the filesystem is a file​:

# -f File is a plain file.
# -d File is a directory.
# -b File is a block special file.

Given that understanding of what "file" means in this document, we can
better interpret the documentation for -T and -B​:

# -T File is an ASCII or UTF-8 text file (heuristic guess).
# -B File is a "binary" file (opposite of -T).
...
# The "-T" and "-B" switches work as follows. The first block or
# so of the file is examined to see if it is valid UTF-8 that
# includes non-ASCII characters. If, so it's a "-T" file.

It speaks of examining the content of "the file". Doesn't say anything
about it being a regular file. Just like the documentation for -w et
al doesn't say anything about the file being regular. -w, -f, and -T
are mutually orthogonal.

Clearly, reading from non-regular files is intentional behaviour,
implemented consistently, and documented. The choice of such a predicate
for this short spelling may be a questionable language design decision,
but it's too late to change that. There isn't any viable case for
changing the existing behaviour. If you want a regular-text-file
predicate, by all means write your own and stick it on CPAN.

This ticket should be closed.

-zefram

I attempted to wrap up discussion on this ticket and close it almost five years ago -- and failed!

So I'll take that recommendation and try to get it done for real now. Closing.

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT

p5pRT commented Dec 28, 2017

@jkeenan - Status changed from 'open' to 'rejected'
