magic open of ARGV #1566

Closed
p5pRT opened this issue Mar 28, 2000 · 153 comments

Comments

@p5pRT

p5pRT commented Mar 28, 2000

Migrated from rt.perl.org#2783 (status was 'rejected')

Searchable as RT2783$

@p5pRT
Author

p5pRT commented Mar 28, 2000

From thospel@mail.dma.be

In article <E12V7Jw-0002PL-00@​ursa.cus.cam.ac.uk>,
  "M.J.T. Guy" <mjtg@​cus.cam.ac.uk> writes​:

No, I'm *not* trying to restart this flame war. But it was a "security"
issue, and security seems to be in fashion at the moment, and it *was*
left in a somewhat unsatisfactory state.

The story so far, for the benefit of younger readers:
[ with the usual IIRC caveats - go to the archives if you want the
real facts
]
There's a booby trap when magic open (i.e. initial/final special
characters like < > |) is used in conjunction with <>. Suppose
some devious person has left around a file such as "| rm -rf *;".
Then root's cron job comes along and does

       my_scan_command *

and ... Boom! Here's a more innocent demonstration​:

$ cat >'| echo Bwahahahaha'
hkgfjhgfhgf
$ perl -wne '' *
Bwahahahaha
$

Note that the Perl script is obviously "so simple it can't have any
security holes".

There were two proposals for fixing this​: a maximal one which would
have banned all magic in association with <>, and a minimal one
(championed by Tom C) which would have made the open non-magic iff
a file of that name existed. So the minimal proposal is essentially
backwards compatible, and loses no functionality apart from active
malice.

In fact, there was a little known third proposal by yours truly (hi !)​:
Turn off magic <> if the perl command line contains an explicit --
Otherwise you are still hacked. Observe​:

mkdir /tmp/a
cd /tmp/a
echo > '-e;print("Bwahaha\n")'
echo foo > bar
perl -wne '' *

Will also give you the dreaded​:
Bwahaha

So, since a security aware person has to do

perl -wne '' -- *

anyways, let that remove the magicness

@p5pRT
Author

p5pRT commented Sep 28, 2005

From @smpeters

[thospel@​mail.dma.be - Tue Mar 28 03​:56​:10 2000]​:

In article <E12V7Jw-0002PL-00@​ursa.cus.cam.ac.uk>,
"M.J.T. Guy" <mjtg@​cus.cam.ac.uk> writes​:

No, I'm *not* trying to restart this flame war. But it was a
"security"
issue, and security seems to be in fashion at the moment, and it *was*
left in a somewhat unsatisfactory state.

The story so far, for the benefit of younger readers:
[ with the usual IIRC caveats - go to the archives if you want the
real facts
]
There's a booby trap when magic open (i.e. initial/final special
characters like < > |) is used in conjunction with <>. Suppose
some devious person has left around a file such as "| rm -rf *;".
Then root's cron job comes along and does

       my_scan_command *

and ... Boom! Here's a more innocent demonstration​:

$ cat >'| echo Bwahahahaha'
hkgfjhgfhgf
$ perl -wne '' *
Bwahahahaha
$

Note that the Perl script is obviously "so simple it can't have any
security holes".

There were two proposals for fixing this​: a maximal one which would
have banned all magic in association with <>, and a minimal one
(championed by Tom C) which would have made the open non-magic iff
a file of that name existed. So the minimal proposal is essentially
backwards compatible, and loses no functionality apart from active
malice.

In fact, there was a little known third proposal by yours truly (hi !)​:
Turn off magic <> if the perl command line contains an explicit --
Otherwise you are still hacked. Observe​:

mkdir /tmp/a
cd /tmp/a
echo > '-e;print("Bwahaha\n")'
echo foo > bar
perl -wne '' *

Will also give you the dreaded​:
Bwahaha

So, since a security aware person has to do

perl -wne '' -- *

anyways, let that remove the magicness

The flaw Ton describes just above seems to have been fixed.

steve@​kirk​:~/perl-current$ mkdir /tmp/a
steve@​kirk​:~/perl-current$ cd /tmp/a
steve@​kirk​:/tmp/a$ echo > '-e;print("Bwahaha\n")'
steve@​kirk​:/tmp/a$ echo foo > bar
steve@​kirk​:/tmp/a$ perl -wne '' *
steve@​kirk​:/tmp/a$ ls -ltr
total 8
-rw-r--r-- 1 steve steve 1 2005-09-27 23​:03 -e;print("Bwahaha\n")
-rw-r--r-- 1 steve steve 4 2005-09-27 23​:03 bar

The original flaw that started this ticket still exists, however.

@p5pRT
Author

p5pRT commented Sep 28, 2005

From perl5-porters@ton.iguana.be

In article <rt-3.0.11-2783-121717.9.08824524474802@​perl.org>,
  "Steve Peters via RT" <perlbug-followup@​perl.org> writes​:

The flaw Ton describes just above seems to have been fixed.

steve@​kirk​:~/perl-current$ mkdir /tmp/a
steve@​kirk​:~/perl-current$ cd /tmp/a
steve@​kirk​:/tmp/a$ echo > '-e;print("Bwahaha\n")'
steve@​kirk​:/tmp/a$ echo foo > bar
steve@​kirk​:/tmp/a$ perl -wne '' *
steve@​kirk​:/tmp/a$ ls -ltr
total 8
-rw-r--r-- 1 steve steve 1 2005-09-27 23​:03 -e;print("Bwahaha\n")
-rw-r--r-- 1 steve steve 4 2005-09-27 23​:03 bar

The original flaw that started this ticket still exists, however.

Just downloaded and tried a bleadperl. Still works for me.
Nor do I think it CAN be solved without the user doing something
like adding the -- (well, another way would be to not accept a second
-e or even a first one after a non-option argument). It's the shell
that expands the *, so perl never sees anything different from

  perl -wne '' '-e;print("Bwahaha\n")'

which is *supposed* to work.

Maybe it's your shell that refuses to expand the file with an option
in the name ?
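
A hedged sketch of the point about switch parsing. The file names are passed explicitly rather than through `*`, so the result does not depend on how a particular shell or locale sorts the glob; the directory and variable names are assumptions made for illustration:

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Two files: one whose name looks like a -e switch, one ordinary.
echo > '-e;print("Bwahaha\n")'
echo foo > bar

# Without --, perl parses the first name as a second -e switch and
# runs its code once for each input line read from "bar".
without_dashes=$(perl -ne '' '-e;print("Bwahaha\n")' bar)

# With --, switch parsing stops and both names are treated as input files.
with_dashes=$(perl -ne '' -- '-e;print("Bwahaha\n")' bar)

printf 'without --: [%s]\nwith --:    [%s]\n' "$without_dashes" "$with_dashes"
```

The second command prints nothing because the program text is empty and both arguments are simply read as files.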

@p5pRT
Author

p5pRT commented Sep 29, 2005

From prev a.r.ferreira@gmail.com

On 9/28/05, Ton Hospel <perl5-porters@​ton.iguana.be> wrote​:

In article <rt-3.0.11-2783-121717.9.08824524474802@​perl.org>,
"Steve Peters via RT" <perlbug-followup@​perl.org> writes​:

The flaw Ton describes just above seems to have been fixed.

steve@​kirk​:~/perl-current$ mkdir /tmp/a
steve@​kirk​:~/perl-current$ cd /tmp/a
steve@​kirk​:/tmp/a$ echo > '-e;print("Bwahaha\n")'
steve@​kirk​:/tmp/a$ echo foo > bar
steve@​kirk​:/tmp/a$ perl -wne '' *
steve@​kirk​:/tmp/a$ ls -ltr
total 8
-rw-r--r-- 1 steve steve 1 2005-09-27 23​:03 -e;print("Bwahaha\n")
-rw-r--r-- 1 steve steve 4 2005-09-27 23​:03 bar

The original flaw that started this ticket still exists, however.

Just downloaded and tried a bleadperl. Still works for me.
Nor do I think it CAN be solved without the user doing something
like adding the -- (well, another way would be to not accept a second
-e or even a first one after a non-option argument). It's the shell
that expands the *, so perl never sees anything different from

perl -wne '' '-e;print("Bwahaha\n")'

which is *supposed* to work.

Maybe it's your shell that refuses to expand the file with an option
in the name ?

If I am not overlooking some point in the current discussion, the
"magic open of ARGV" is no different from the magic of C<open>, which
supports the tricky but powerful pipe forms (including '|
print("Bwahaha\n")' or '| rm -rf *;'). So that's not a bug, but a
feature. This is all documented in C<perldoc -f open> and C<perldoc
perlopentut>. In the section "Dispelling the Dweomer" (perldoc
perlopentut) we read

  If you want to use "<ARGV>" processing in a totally boring and
  non-magical way, you could do this first:

      # "Sam sat on the ground and put his head in his hands.
      # 'I wish I had never come here, and I don't want to see
      # no more magic,' he said, and fell silent."
      for (@ARGV) {
          s#^([^./])#./$1#;
          $_ .= "\0";
      }
      while (<>) {
          # now process $_
      }

So if you don't trust your script users, your code must be more robust
than a naked <>. At the very least, some preprocessing of @ARGV should
happen before calling <>. For what Jarkko calls "hysterical raisins",
too much code relies on the superpowers of C<open>, which can be used to
run files through shell utilities before input (for example,
'gunzip -c dat.gz |') and the like. This can't be changed globally, but
everyone is welcome to add code that makes it safer in particular
applications, or even to write a reusable module that does so.

About the possibility to introduce an extra (potentially malicious)
C<-e> option via

perl -wne '' *

and files with weird names like '-e;print("Bwahaha\n")', this won't
work when perl is called with a script:

perl script.pl *

would receive the -e name as an ordinary argument. So don't use
one-liners with arguments like * if you care about security.

Maybe the documentation could include one or two phrases on the
potential for security breaches with open and <>, maybe not.

@p5pRT
Author

p5pRT commented Sep 29, 2005

From @tamias

On Thu, Sep 29, 2005 at 09​:02​:27AM -0300, Adriano Ferreira wrote​:

About the possibility to introduce an extra (potentially malicious)
C<-e> option via

perl -wne '' *

and files with weird name like '-e;print("Bwahaha\n")', this won't
work calling perl with a script​:

perl script.pl *

would get the -e as an argument. So don't use one-liners with
arguments like * if you think about security.

Putting -- before the argument list should avoid that problem.

perl -wne '' -- *

Ronald

@p5pRT
Author

p5pRT commented Jul 2, 2008

From @pjf

Raising this bug from the dead so we can lay it to rest at last.

The original bug report reads​:

There's a booby trap when magic open (i.e. initial/final special
characters like < > |) is used in conjunction with <>. Suppose
some devious person has left around a file such as "| rm -rf *;".

Yes, <> using 2-argument open does contain a nasty surprise. I don't
like it either. However I believe it's considered a feature, and I've
certainly seen a few tutorials, as well as working code that delights in
the ability to write​:

  myprog.pl log.0 log.1 'gunzip -c log.2.gz |'

and have <> work its magic.

This means I don't think we'll see <> changing to using 3-argument open
any time soon. Even if it did, all the existing code out there using
older Perls would still be vulnerable _anyway_, as well as the potential
for some existing code that uses this "feature" to break when Perl is
upgraded.

Luckily, there's a reasonably good work-around, and that's to use taint
mode. Because command-line arguments are always tainted, but Perl
doesn't check for taint when opening a file for *reading* (but it does
for writing and for pipes), starting Perl in taint mode practically
eliminates the problem of code injection attacks via command-line
arguments and <>.

If the program didn't intend to execute external commands to begin with,
then there should be no changes when the program uses taint. If it
*did* intend to execute external commands, but we're in an environment
where the filesystem itself may be considered hostile, then we
definitely want to be using taint anyway. ;)

One can still potentially use the arcane invocation '<&=0' to dup STDIN
(or another filehandle) without taint checks, but that's much less
serious than executing arbitrary code.

As such, I'm resolving this ticket and marking it as not-a-bug.

Cheerio,

  Paul

--
Paul Fenwick <pjf@​perltraining.com.au> | http​://perltraining.com.au/
Director of Training | Ph​: +61 3 9354 6001
Perl Training Australia | Fax​: +61 3 9354 2681

@p5pRT
Author

p5pRT commented Jul 2, 2008

@pjf - Status changed from 'open' to 'rejected'

@p5pRT p5pRT closed this as completed Jul 2, 2008
@p5pRT
Author

p5pRT commented Jul 14, 2008

From @jbenjore

On Wed, Jul 2, 2008 at 12​:33 AM, Paul Fenwick via RT
<perlbug-followup@​perl.org> wrote​:

...

This means I don't think we'll see <> changing to using 3-argument open
any time soon. Even if it did, all the existing code out there using
older Perls would still be vulnerable _anyway_, as well as the potential
for some existing code that uses this "feature" to break when Perl is
upgraded.

I'm of the opinion that working code should break in new versions of
perl if it is hitting this and people should thank us for it. Anyone
desiring to not have this break should continue to use old perl. Any
tutorial teaching this has always been broken. This has never been a
good feature and just because some people use it doesn't contradict
that.

I find it incredibly aggravating that my one-liners do the wrong thing
when they attempt to read my files <.st and >.st. Or rather - that I
must be careful to never let any one-liners touch some files.

As such, I'm resolving this ticket and marking it as not-a-bug.

No.

Cheerio,

No.

Josh

@p5pRT
Author

p5pRT commented Jul 15, 2008

From @ysth

On Mon, July 14, 2008 3​:55 pm, Joshua ben Jore wrote​:

On Wed, Jul 2, 2008 at 12​:33 AM, Paul Fenwick via RT wrote​:

This means I don't think we'll see <> changing to using 3-argument open
any time soon. Even if it did, all the existing code out there using
older Perls would still be vulnerable _anyway_, as well as the
potential for some existing code that uses this "feature" to break when
Perl is upgraded.

I'm of the opinion that working code should break in new versions of
perl if it is hitting this and people should thank us for it. Anyone
desiring to not have this break should continue to use old perl.

No.

Any
tutorial teaching this has always been broken. This has never been a good
feature and just because some people use it doesn't contradict that.

I'd be fine with breaking the feature in 5.12 by default, but
would like a pragma to re-enable it.

Note that changing from 2-arg to 3-arg open breaks the following idiom
(command line args are files of filenames to read)​:

  @ARGV = <>;
  while (<>) { ... }

because trailing newlines on the filenames are no longer ignored.

@p5pRT
Author

p5pRT commented Jul 15, 2008

From @ap

* Yitzchak Scott-Thoennes <sthoenna@​efn.org> [2008-07-15 03​:10]​:

Note that changing from 2-arg to 3-arg open breaks the
following idiom (command line args are files of filenames to
read)​:

@ARGV = <>;
while (<>) { ... }

because trailing newlines on the filenames are no longer
ignored.

Writing

  chomp(@ARGV = <>);

is no great hardship.

(Note that at this time I am not taking either side in the
greater debate; the above is not an argument either way.)

--
*AUTOLOAD=*_;sub _{s/(.*)::(.*)/print$2,(",$\/"," ")[defined wantarray]/e;$1}
&Just->another->Perl->hack;
#Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT
Author

p5pRT commented Jul 15, 2008

From ben@morrow.me.uk

Quoth twists@​gmail.com ("Joshua ben Jore")​:

On Wed, Jul 2, 2008 at 12​:33 AM, Paul Fenwick via RT
<perlbug-followup@​perl.org> wrote​:

...

This means I don't think we'll see <> changing to using 3-argument open
any time soon. Even if it did, all the existing code out there using
older Perls would still be vulnerable _anyway_, as well as the potential
for some existing code that uses this "feature" to break when Perl is
upgraded.

I'm of the opinion that working code should break in new versions of
perl if it is hitting this and people should thank us for it. Anyone
desiring to not have this break should continue to use old perl. Any
tutorial teaching this has always been broken. This has never been a
good feature and just because some people use it doesn't contradict
that.

I find it incredibly aggravating that my one-liners do the wrong thing
when they attempt to read my files <.st and >.st. Or rather - that I
must be careful to never let any one-liners touch some files.

How about making the implicit open done by <> use either main::open (if
defined) or CORE::GLOBAL::open (if defined), so that it's possible to
write a SafeOpen.pm that overrides one of these to map
 
  open my $FH, '<foo';

to

  open my $FH, '<', '<foo';

?

Ben

--
Like all men in Babylon I have been a proconsul; like all, a slave ... During
one lunar year, I have been declared invisible; I shrieked and was not heard,
I stole my bread and was not decapitated.
~ ben@​morrow.me.uk ~ Jorge Luis Borges, 'The Babylon Lottery'

@p5pRT
Author

p5pRT commented Jul 15, 2008

From @jbenjore

On Tue, Jul 15, 2008 at 6​:10 AM, Ben Morrow <ben@​morrow.me.uk> wrote​:

How about making the implicit open done by <> use either main::open (if
defined) or CORE::GLOBAL::open (if defined), so that it's possible to
write a SafeOpen.pm that overrides one of these to map

Ok, but name it UnsafeOpen.pm because the default should work
properly. That is, 5.12 should out of the box not do anything weird
when given a file with any of the <, >, or | characters anywhere in
it. It should just read it. I'm ok with writing 5.10 off. I didn't
want to but if that's what it takes to get this change, ok.

Josh

@p5pRT
Author

p5pRT commented Jul 17, 2008

From @epa

Joshua ben Jore <twists <at> gmail.com> writes​:

How about making the implicit open done by <> use either main::open (if
defined) or CORE::GLOBAL::open (if defined), so that it's possible to
write a SafeOpen.pm that overrides one of these to map

Ok, but name it UnsafeOpen.pm because the default should work
properly. That is, 5.12 should out of the box not do anything weird
when given a file with any of the <, >, or | characters anywhere in
it. It should just read it.

FWIW, I completely agree with this. In my opinion it is much, much too
dangerous to have a common construct - one which is taught to beginners in every
Perl tutorial and looks innocuous - be tripped up so easily as by a file called
'|x' or anything else containing magic characters.

Yes, taint mode does prevent this, but unless taint mode is on by default for
5.12 it doesn't address the problem. The simple, default code should be 100%
safe.

Perl's motto is that easy things should be easy​: surely reading some files
specified on the command line, without barfing or worse on special characters,
is one of those easy things. Hard things should be possible, and magical open()
is certainly a useful feature in some situations, but magic can be dangerous.
By all means have it if you ask for it but the default, simplest code must be
safe for all situations.

Please could I ask the perl5 core team to have another look at this bug report.

--
Ed Avis <eda@​waniasset.com>

@p5pRT
Author

p5pRT commented Jul 20, 2008

From tchrist@perl.com

I'm of the opinion that working code should break in new versions
of perl if it is hitting this and people should thank us for it.
Anyone desiring to not have this break should continue to use old
perl. Any tutorial teaching this has always been broken. This has
never been a good feature and just because some people use it
doesn't contradict that.

It's not obvious where in your text you've made the transition
from mere opinion to trying to make people believe that such
is fact; but that you've sneakily done so, there's no doubt.

I find it incredibly aggravating that my one-liners do the wrong thing
when they attempt to read my files <.st and >.st. Or rather - that I
must be careful to never let any one-liners touch some files.

People who create files with aggravating names are aggravating people;
just say no. Don't you remember

  chdir "/tmp";
  mkdir "etc\n";
  chdir("/etc\n");
  (naughty games with passwd I shan't detail)

Then there's the protection measure of "touch ./-i"; think about it.

If it takes a syswide find to locate unpleasant filenames and send
whingeing or abusive or threatening mail (interpretation rests on the
receiver) about the inadvisability of files with whitespace or brackets,
or question-marks or stars, (and for perl, also minuses, ampersands,
equals, or pipes), then that's a local administration issue. Look to
the plank in thine own eye...

And the day that I can no longer rely upon the overlying system to
automatically understand that "-" is STDIN for input or STDOUT for output
is the day that I will fork a parallel copy of Perl that maintains
traditional and expected behavior. However, I trust that will never need
to occur, for no pumpking has ever been so blindly cavalier--which can
probably be read as "foolish" if you're of that bent.

What you don't seem to understand is that taking a homogeneous approach to
argument processing is not a bug, but a feature--a BIG FEATURE. Perhaps
you're too much of a youngster to remember the days that shells didn't glob
command-line arguments for you, and each program had to parse and process
its own command line string. But I am not. Those were dark days of
unpredictability. It's the wrong way to (not) go about things.

When you make every program encode the logic of treating "-" as special,
you are making a very big mistake. Some programs shall, others shan't.
And therein lies the PITA.

Those ignorant of Unix are doomed to reinvent it--poorly.

Don't go there.

*DON'T!*

You would ask this sort of idiocy​:

  % perl -le 'open(IT, ">", "-")||die; print IT "stuff"; close(IT)||die'
  % cat ./-
  stuff

Just say no. I could put it more strongly, but if you haven't figured out
the serious flaw in your "thinking" by now, strong talk won't help that.

As such, I'm resolving this ticket and marking it as not-a-bug.

No.

Not merely wrong, but exceptionally so.

--tom

@p5pRT
Author

p5pRT commented Jul 20, 2008

From mark@mark.mielke.cc

Tom Christiansen wrote​:

[ in response to a no doubt interesting thread ... ]
And the day that I can no longer rely upon the overlying system to
automatically understand that "-" is STDIN for input or STDOUT for output
is the day that I will fork a parallel copy of Perl that maintains
traditional and expected behavior. However, I trust that will never need
to occur, for no pumpking has ever been so blindly cavalier--which can
probably be read as "foolish" if you're of that bent.

What you don't seem to understand is that taking a homogeneous approach to
argument processing is not a bug, but a feature--a BIG FEATURE. Perhaps
you're too much of a youngster to remember the days that shells didn't glob
command-line arguments for you, and each program had to parse and process
its own command line string. But I am not. Those were dark days of
unpredictability. It's the wrong way to (not) go about things.

That open(handle, arg) doesn't translate 1​:1 as a system call open()
request *is* a feature - a feature that could provide more value than it
does today.

I often find myself wishing Perl accepted many MORE magical open
syntaxes than at present. I think the following replacement for a simple
'wget -O-' would be very cool:

  perl -pe 1 http​://www.cpan.org/index.html

For applications with security concerns, the argument to open() still
needs to be checked whether 2-arg or 3-arg. I don't see the logic behind
an argument that would suggest the existing behaviour should be changed.
If a person doesn't like 2-arg - don't use it?

Cheers,
mark

--
Mark Mielke <mark@​mielke.cc>

@p5pRT
Author

p5pRT commented Jul 21, 2008

From @arc

Tom Christiansen writes​:

taking a homogeneous approach to argument processing is not a bug, but
a feature--a BIG FEATURE. Perhaps you're too much of a youngster to
remember the days that shells didn't glob command-line arguments for
you, and each program had to parse and process its own command line
string. But I am not. Those were dark days of unpredictability.
It's the wrong way to (not) go about things.

When you make every program encode the logic of treating "-" as
special, you are making a very big mistake. Some programs shall,
others shan't.

And yet, in the context of Unix as a whole, every program _already
does_ have to treat "-" as special. In 7th Edition, cat, cmp, comm,
diff, join, sort, and split all handle an argument of "-" to mean
standard input, while egrep, fgrep, grep, od, sum, tail, and uniq
don't. Perl programs using two-argument C<open> handle "-". Those
using three-argument C<open>, or written in other languages, don't,
unless they include specific code to accomplish that. Furthermore,
while this situation is clearly suboptimal, it doesn't seem to be
intolerable​: we might be annoyed when program behaviour is at odds
with our expectations, but we can work around it.

My strong suspicion is that a large part of what to consider correct
behaviour in this area comes down to personal preference. I always
want my command-line programs to treat "-" as a reference to standard
input, but I never want leading or trailing pointies or pipes to mean
anything other than the files with the names as given. You clearly
differ on that, and I don't think your preferences are invalid.

I'm certainly not suggesting that Perl's long-standing behaviour here
should be changed, but I don't think the situation is as simple as is
implied by declaring that "taking a homogeneous approach to argument
processing is [...] a feature".

I also note that some of the most interesting uses of magical
two-argument C<open> -- things like

  perl -lne '...' 'command1 | filter1 |' 'command2 | filter2 |'

-- can in modern shells be handled with process substitution​:

  perl -lne '...' <(command1 | filter1) <(command2 | filter2)

I'm not convinced that that really constitutes an argument one way or
the other, though.

--
Aaron Crane ** http​://aaroncrane.co.uk/

@p5pRT
Author

p5pRT commented Jul 24, 2008

From zefram@fysh.org

Aaron Crane wrote​:

[...] I always want my command-line programs to treat "-" as a
reference to standard input,

I think this convention is obsolete. If you want to refer to standard
input in a filename context, you can use /dev/stdin. It's a true
filename, and is available to all programs without any of them having
to do anything special.

-zefram

@p5pRT
Author

p5pRT commented Jul 24, 2008

From @arc

Zefram writes​:

Aaron Crane wrote​:

"-" as a reference to standard input

I think this convention is obsolete. If you want to refer to
standard input in a filename context, you can use /dev/stdin. It's
a true filename, and is available to all programs without any of
them having to do anything special.

True, but it's a lot more work to type.

Also, it doesn't behave quite the same way on all operating systems.

  $ uname
  Linux
  $ head -1 /usr/share/dict/words

  $ tail -c +52 /usr/share/dict/words | head -1
  Aaron
  $ cat stdin.pl
  seek STDIN, 51, 0 or die "seek: $!\n";
  open my $fh, $ARGV[0] or die "open: $!\n";
  print tell $fh, "\n";
  print scalar <$fh>;
  $ perl -w stdin.pl /dev/stdin < /usr/share/dict/words
  0

  $ perl -w stdin.pl - < /usr/share/dict/words
  51
  Aaron

On some OSes (including BSD), opening /dev/stdin is equivalent to
calling dup(2) on fd 0, so that (for seekable file descriptors) the
new file descriptor shares a file offset pointer with stdin.

Linux differs; opening /dev/stdin gives you a file descriptor open on
the same file as fd 0, but the two descriptors are distinct, and do
not share a file offset pointer. (And since /dev/stdin is actually a
symlink to /proc/self/fd/0, an equivalent statement applies to opening
anything under /proc/*/fd.)

As it happens, Perl's magical open of "-" is neither dup(0) nor
open("/dev/stdin", O_RDONLY). Instead, it gives you a new stdio-ish
filehandle open on file descriptor 0; that's why the behaviour under
Linux differs from opening /dev/stdin.

--
Aaron Crane ** http​://aaroncrane.co.uk/

@p5pRT
Author

p5pRT commented Jul 25, 2008

From @epa

Tom Christiansen <tchrist <at> perl.com> writes​:

I find it incredibly aggravating that my one-liners do the wrong thing
when they attempt to read my files <.st and >.st.

People who create files with aggravating names are aggravating people;
just say no.

If it takes a syswide find to locate unpleasant filenames and send
whingeing or abusive or threatening mail (interpretation rests on the
receiver) about the inadvisability of files with whitespace or brackets,
or question-marks or stars, (and for perl, also minuses, ampersands,
equals, or pipes), then that's a local administration issue.

You seem to have in mind the comfortable Unix environment most of us remember
from university days (or elsewhere), when your local bearded sysadmin would keep
a watchful eye on the students and make sure that the local collection of
eclectic, flaky but nonetheless lovable administrative shell scripts kept
working smoothly. I am not saying that is a bad ideal at all. But it's not the
issue here.

You can't run a system-wide check for 'unpleasant' filenames before every
invocation of your perl program. And although someone deliberately making a
file '| rm -rf /' is the example we all use (and a good enough example in its
own right to make this worth fixing, IMHO), in practice a successful attack to
get control of a computer often uses several stages. The first stage might be
to trick a slightly dopey CGI script into making a filename containing a '|'
character in its temporary directory - not a security hole in itself, right? and
then wait for an administrative Perl script run by root to 'while (<>)' in that
directory. Or indeed the same CGI script might use the 'while (<>)' construct -
which is surely just reading some files and should be safe, right?

Remember that here we're not talking about some obscure construct that only
experts know, where people can be expected to read about the security risks and
gotchas before using it. This is pretty much in the first chapter of every
beginner's Perl tutorial - yet few of them have any warning that it might do
unexpected things if a file in the current directory is 'unpleasant'. Surely we
have a duty to make sure that the code recommended to novices is totally safe to
use.

But security holes are just one specific class of bugs. In principle, the issue
is that asking perl to read some files should work 100% of the time. Not some
of the time depending on what filenames exist.

And the day that I can no longer rely upon the overlying system to
automatically understand that "-" is STDIN for input or STDOUT for output
is the day that I will fork a parallel copy of Perl that maintains
traditional and expected behavior.

If I might go rhetorical for a moment​: to rely upon the overlying system? That
seems like a very good idea. So the C library itself should interpret '-' to
mean stdin or stdout, it shouldn't be implemented afresh in every application
like perl, grep, tar and so on.

So why doesn't the C library automatically treat '-' as stdin or stdout?

What you don't seem to understand is that taking a homogeneous approach to
argument processing is not a bug, but a feature--a BIG FEATURE. Perhaps
you're too much of a youngster to remember the days that shells didn't glob
command-line arguments for you, and each program had to parse and process
its own command line string. But I am not. Those were dark days of
unpredictability. It's the wrong way to (not) go about things.

Absolutely. It would be crazy for perl or any other program to start doing
argument processing that better belongs in the shell. Imagine if shells didn't
support commands like

% cat <(echo hello; echo goodbye)

and cat, and every other program, had to have special code to recognize '<' in
its arguments. It would be a terrible mess - and you'd need another layer of
mess to handle the case when '<' really is in a filename. Much better to let
the shell do it.

When you make every program encode the logic of treating "-" as special,
you are making a very big mistake. Some programs shall, others shan't.

You would ask this sort of idiocy​:

% perl -le 'open(IT, ">", "-")||die; print IT "stuff"; close(IT)||die'
% cat ./-
stuff

That's exactly the behaviour I see with perl 5.10. What result do you get?

--
Ed Avis <eda@​waniasset.com>

@p5pRT
Author

p5pRT commented Jul 25, 2008

From @epa

Mark Mielke <mark <at> mark.mielke.cc> writes​:

For applications with security concerns, the argument to open() still
needs to be checked whether 2-arg or 3-arg. I don't see the logic behind
an argument that would suggest the existing behaviour should be changed.
If a person doesn't like 2-arg - don't use it?

The issue here is that

  while (<>) { ... }

is using the magical 2-argument open, with all the implications of running
external commands depending on what filenames happen to be in the current
directory, yet this code is taught to everyone as the standard way to open
the files in the command line.

The un-magic alternative requires a lot more code. Surely this is backwards​:
the simple short code should be safe to use under all circumstances, and if
you want the more dangerous (though certainly very useful) behaviour you
should have to ask for it.
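To make the size of that gap concrete, here is a minimal sketch of what the un-magic alternative might look like. The helper name `read_files_literally` is hypothetical and not from this thread. Note what the sketch deliberately leaves out compared with `while (<>)`: it does not treat '-' as standard input, and it does not fall back to standard input when no arguments are given.

```perl
use strict;
use warnings;

# Hypothetical helper: return every line from the named files, opening
# each one with three-argument open so the name is taken literally.
sub read_files_literally {
    my @names = @_;
    my @lines;
    for my $name (@names) {
        my $fh;
        # The mode '<' is a separate argument, so '|', '<' and '>' inside
        # $name carry no special meaning.
        if (!open($fh, '<', $name)) {
            warn "Can't open $name: $!\n";
            next;
        }
        push @lines, <$fh>;
        close $fh;
    }
    return @lines;
}

print read_files_literally(@ARGV);
```

Run as `my_program *`, a file named '| echo Bwahahahaha' is simply read like any other file.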

As I see it, then, the choices are

- Change while (<>) and while (<ARGV>) to use 3-argument open, perhaps with
an exception for '-' to mean stdin, since reading from stdin is not normally
dangerous.

- Or change every Perl tutorial starting with Perl's own documentation to note
that while (<>) can do funny things and is not to be used unless you trust
everyone and every program that could have created a file in the current
directory.

- Or introduce a new language construct 'the safe way to read command line
arguments' and change all the tutorials to recommend that instead.

--
Ed Avis <eda@​waniasset.com>

@p5pRT

p5pRT commented Jul 25, 2008

From @Abigail

On Fri, Jul 25, 2008 at 03​:13​:02PM +0000, Ed Avis wrote​:

Mark Mielke <mark <at> mark.mielke.cc> writes​:

For applications with security concerns, the argument to open() still
needs to be checked whether 2-arg or 3-arg. I don't see the logic behind
an argument that would suggest the existing behaviour should be changed.
If a person doesn't like 2-arg - don't use it?

The issue here is that

  while (<>) { ... }

is using the magical 2-argument open, with all the implications of running
external commands depending on what filenames happen to be in the current
directory, yet this code is taught to everyone as the standard way to open
the files in the command line.

The un-magic alternative requires a lot more code. Surely this is backwards​:
the simple short code should be safe to use under all circumstances, and if
you want the more dangerous (though certainly very useful) behaviour you
should have to ask for it.

As I see it, then, the choices are

- Change while (<>) and while (<ARGV>) to use 3-argument open, perhaps with
an exception for '-' to mean stdin, since reading from stdin is not normally
dangerous.

- Or change every Perl tutorial starting with Perl's own documentation to note
that while (<>) can do funny things and is not to be used unless you trust
everyone and every program that could have created a file in the current
directory.

What does the current directory have to do with while (<>)?

while (<>) reads from filenames from @​ARGV, not from the current directory.

- Or introduce a new language construct 'the safe way to read command line
arguments' and change all the tutorials to recommend that instead.

- Teach people not to use "funny" characters in the filenames lightly,
  since whatever one may or may not do to Perl, it *will* bite them
  if they treat them carelessly; after all, the set of characters
  that are special to Perl are (with the exception of -) a subset of
  the characters that are special to most shells anyway.

Abigail

@p5pRT

p5pRT commented Jul 25, 2008

From @Abigail

On Fri, Jul 25, 2008 at 03​:05​:30PM +0000, Ed Avis wrote​:

Tom Christiansen <tchrist <at> perl.com> writes​:

I find it incredibly aggravating that my one-liners do the wrong thing
when they attempt to read my files <.st and >.st.

People who create files with aggravating names are aggravating people;
just say no.

If it takes a syswide find to locate unpleasant filenames and send
whingeing or abusive or threatening mail (interpretation rests on the
receiver) about the inadvisability of files with whitespace or brackets,
or question-marks or stars, (and for perl, also minuses, ampersands,
equals, or pipes), then that's a local administration issue.

You seem to have in mind the comfortable Unix environment most of us remember
from university days (or elsewhere), when your local bearded sysadmin would keep
a watchful eye on the students and make sure that the local collection of
eclectic, flaky but nonetheless lovable administrative shell scripts kept
working smoothly. I am not saying that is a bad ideal at all. But it's not the
issue here.

You can't run a system-wide check for 'unpleasant' filenames before every
invocation of your perl program. And although someone deliberately making a
file '| rm -rf /' is the example we all use (and a good enough example in its
own right to make this worth fixing, IMHO), in practice a successful attack to
get control of a computer often uses several stages. The first stage might be
to trick a slightly dopey CGI script into making a filename containing a '|'
character in its temporary directory - not a security hole in itself, right? and
then wait for an administrative Perl script run by root to 'while (<>)' in that
directory. Or indeed the same CGI script might use the 'while (<>)' construct
which is surely just reading some files and should be safe, right?

Oh, come on. This is a solved problem. The answer is -T​:

  $ perl -wTE 'while (<>) {print}' '> foo'
  Insecure dependency in open while running with -T switch at -e line 1.
  $

-T is recommended both for CGI programs, and programs running as root.

Abigail

@p5pRT

p5pRT commented Jul 25, 2008

From @Abigail

On Fri, Jul 25, 2008 at 04​:36​:56PM +0100, Ed Avis wrote​:

Abigail asked​:

What does the current directory have to do with while (<>)?

Sorry, I was thinking of the common usage of

% my_program *

- Teach people not to use "funny" characters in the filenames lightly,
since whatever one may or may not do to Perl, it *will* bite them
if they treat them carelessly; after all, the set of characters
that are special to Perl are (with the exception of -) a subset of
the characters that are special to most shells anyway.

I agree that using funny characters deliberately is not a good idea. However,

If you are running "my_program *" as root in a directory where black hat people
could have created files, you have a problem anyway.

For instance, 'rm -i *' won't ask for each file whether it should be deleted
or not if someone creates a file '-f' in said directory.

The answer here is to teach people to not blindly assume their insecure
environment is safe. And Perl can help you there: it's called -T. Which
not only prevents the "while (<>) {}" problem, but a host of other problems
as well.

Abigail

@p5pRT

p5pRT commented Jul 25, 2008

From @epa

Oh, come on. This is a solved problem. The answer is -T​:

$ perl -wTE 'while (<>) {print}' '> foo'
Insecure dependency in open while running with -T switch at -e line 1.

That deals with a lot of it, but the -T flag is not the default. By default if you write the innocuous-looking perl program

  use warnings;
  use strict;
  use 5.010;
  while (<>) { print }

you will get a program with all the magical, dangerous behaviour discussed earlier in this thread. It would be better if safety were the default, with some special command line flag or 'use' to turn on the unsafe behaviour.

To get some real-world data​: if you pick the first page of results from
<http​://www.google.com/codesearch?as_q=while+\(%3C%3E&btnG=Search+Code&hl=en&as_lang=perl&as_license_restrict=i&as_license=&as_package=&as_filename=&as_case=>,
which should be close to a random sample, you see that none of these programs uses the -T flag, nor does any of them document that it will do odd things if passed a filename that contains funny characters, and so is unsafe (in the general case) to use with shell wildcards. And why should they? They were just writing the normal perl code to read some files specified on the command line.

I suppose that 'use warnings' could print out a warning on 'while (<>)' saying 'this construct is not safe unless you use -T', but that seems daft to me. Just make it safe to use, no ifs and no buts.

Even with -T, the program will abort when the '>foo' filename is found, which is not great for something that's purportedly meant to just read some files.

The issue is not that expert programmers should be able to turn on some particular flag or use some particular incantation to read files given on the command line in a safe way. The issue is that the simple, small, innocuous-looking code is dangerous. IMHO, the simple code 'while (<>)' should be safe for all uses, and the flags, bells and whistles can be added to turn on the magical, risky behaviour if wanted.

--
Ed Avis <eda@​waniasset.com>

______________________________________________________________________
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http​://www.messagelabs.com/email
______________________________________________________________________

@p5pRT

p5pRT commented Jul 25, 2008

From @epa

Abigail asked​:

What does the current directory have to do with while (<>)?

Sorry, I was thinking of the common usage of

% my_program *

- Teach people not to use "funny" characters in the filenames lightly,
since whatever one may or may not do to Perl, it *will* bite them
if they treat them carelessly; after all, the set of characters
that are special to Perl are (with the exception of -) a subset of
the characters that are special to most shells anyway.

I agree that using funny characters deliberately is not a good idea. However, they do sometimes appear by accident, and they can be made to appear by anyone who has write access to the directory where 'my_program *' is run. If my_program uses 'while (<>)', then if 'my_program *' is run by root in a certain directory, you are effectively granting root command access to anyone who can create a file in that directory.

Even in less drastic cases than the above (which is the worst case, but certainly not impossible) a similar bug can be used as part of a multi-step exploit; perhaps the web server has a bug that means an attacker can cause it to write a zero-length file in a certain logfile directory; then a log grepper script might get caught. Again this is just an example.

Even if there are no security implications because only one person uses the computer and there are no external-facing daemons to be compromised, it is still a peculiar behaviour for the program to go off and try to overwrite the file 'x' when the filename '>x' was given on its command line. No standard Unix utility does this:

% grep hello '>x'
grep​: >x​: No such file or directory

although many standard Unix programs take an argument of '-' to mean read from standard input, which is not usually dangerous.

--
Ed Avis <eda@​waniasset.com>

@p5pRT

p5pRT commented Jul 25, 2008

From @epa

Abigail wrote​:

If you are running "my_program *" as root in a directory where
black hat people could have created files, you have a problem anyway.

For instance, 'rm -i *' won't ask for each file whether it
should be deleted or not if someone creates a file '-f' in
said directory.

That is true. But I wouldn't expect 'grep *' to go off running random external programs. Not even if grep were implemented in perl.

The answer here is to teach people to not blindly assume their
insecure environment is safe.

That is a good thing to teach, and the first lesson would be 'do not use while (<>) unless you also use -T'. However that is not mentioned in any perl tutorial I know of.

If you have a tool which is potentially dangerous, one alternative to giving this kind of warning is to provide beginners with a safer (if somewhat blunter) tool for everyday use. They can graduate to the more dangerous one when they are ready for it, and understand the risks.

That's why I think that 'while (<>)' would better be the safe kind of 'read all the files', and the more dangerous kind should have a syntax that marks it as such and cautions you not to blindly assume it will be safe.

--
Ed Avis <eda@​waniasset.com>

@p5pRT

p5pRT commented Jul 25, 2008

From @davidnicol

On Fri, Jul 25, 2008 at 11​:14 AM, Ed Avis <eda@​waniasset.com> wrote​:

That's why I think that 'while (<>)' would better be the safe kind of 'read all the files', and the more dangerous kind should have a syntax that marks it as such and cautions you not to blindly assume it will be safe.

perhaps, 'while(<<>>)' could be the current 2-arg semantics from 5.11
on, and while(<>) would do three-arg opens?

--
Cheer up, sad person

@p5pRT

p5pRT commented Jul 25, 2008

From @Abigail

On Fri, Jul 25, 2008 at 02​:39​:16PM -0500, David Nicol wrote​:

On Fri, Jul 25, 2008 at 11​:14 AM, Ed Avis <eda@​waniasset.com> wrote​:

That's why I think that 'while (<>)' would better be the safe kind of 'read all the files', and the more dangerous kind should have a syntax that marks it as such and cautions you not to blindly assume it will be safe.

perhaps, 'while(<<>>)' could be the current 2-arg semantics from 5.11
on, and while(<>) would do three-arg opens?

So, you are willing to break programs that currently use the fact <>
is 2-arg open and work as intended in order that some dimwit that isn't
using -T doesn't run into problems this morning, but this afternoon?

The problem Ed is painting here isn't 2-arg open; it's people not
considering file names may have characters that are special. And if they
won't get into trouble by <>, they'll get into problems by the shell.
Or some other program. Or because they use 2-arg open in their programs.

(You know, not all Perl tutorials rewrote themselves the moment 3-arg
open became available. Nor did all Perl programs. Perhaps we should make
it that using 2-arg is a compile time error. Of course, the people that
Ed is going to save won't be saved until they upgrade their perl.)

Now, the idea of having both '<>' and '<<>>', and have one of them do 2-arg
open, and the other 3-arg open is interesting. But I'd prefer not to break
existing programs, and would rather see 'while (<<>>)' do 3-arg open, while
leaving while (<>) as is.

Abigail

@p5pRT

p5pRT commented Jul 25, 2008

From perl@nevcal.com

On approximately 7/25/2008 12​:53 PM, came the following characters from
the keyboard of Abigail​:

On Fri, Jul 25, 2008 at 02​:39​:16PM -0500, David Nicol wrote​:

On Fri, Jul 25, 2008 at 11​:14 AM, Ed Avis <eda@​waniasset.com> wrote​:

That's why I think that 'while (<>)' would better be the safe kind of 'read all the files', and the more dangerous kind should have a syntax that marks it as such and cautions you not to blindly assume it will be safe.
perhaps, 'while(<<>>)' could be the current 2-arg semantics from 5.11
on, and while(<>) would do three-arg opens?

So, you are willing to break programs that currently use the fact <>
is 2-arg open and work as intended in order that some dimwit that isn't
using -T doesn't run into problems this morning, but this afternoon?

One could speculate about "while(<>)" using 2-arg open if -T is set and
3-arg open otherwise, with "while(<<>>)" or "use magical while;" causing
2-arg open to be used even without -T.

But to answer your question, yes.

How else can we encourage the dimwits to continue using perl, after they
get burned by stuff like this, if we don't improve the language?

So open docs do say​:

One should conscientiously choose between the magic and 3-arguments form of open():

    open IN, $ARGV[0];

will allow the user to specify an argument of the form "rsh cat file |", but will not work on a filename which happens to have a trailing space, while

    open IN, '<', $ARGV[0];

will have exactly the opposite restrictions.
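The trade-off described in that passage can be checked directly. The following is a minimal sketch, assuming a Unix-like system where an `echo` command is on the PATH. The file names are made up for illustration and live in a temporary directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Create a file whose name ends in a space, using the literal 3-arg form.
my $spaced = "$dir/data ";
open(my $out, '>', $spaced) or die "Can't create '$spaced': $!";
print $out "payload\n";
close $out;

# 2-arg open strips the trailing space and looks for "$dir/data" instead.
my $two_arg_ok = open(my $fh2, $spaced) ? 1 : 0;

# 3-arg open uses the name exactly as given.
my $three_arg_ok = open(my $fh3, '<', $spaced) ? 1 : 0;

# A trailing '|' makes 2-arg open run a command and read its output ...
my $cmd = 'echo hello |';
open(my $pipe, $cmd) or die "Can't run '$cmd': $!";
my $from_pipe = <$pipe>;
close $pipe;

# ... while 3-arg open looks for a file literally named 'echo hello |'.
my $literal_ok = open(my $fh4, '<', "$dir/$cmd") ? 1 : 0;

print "2-arg, trailing space: ", ($two_arg_ok   ? "opened" : "failed"), "\n";
print "3-arg, trailing space: ", ($three_arg_ok ? "opened" : "failed"), "\n";
print "2-arg, trailing pipe read: $from_pipe";
print "3-arg, trailing pipe: ", ($literal_ok    ? "opened" : "failed"), "\n";
```

Each form succeeds exactly where the other fails, which is the "opposite restrictions" the documentation refers to.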

But there is no warning of all that under control flow statements in
perlsyn where while(<>) is discussed, nor in perlintro where while(<>)
is discussed.

I've long since learned not to use certain characters in filenames, and
especially not at the beginning.

I, myself, have had to agree to use Python for a current project,
because Perl doesn't seem to have a Unicode-supporting, cross-platform,
cross-platform-printing-capable GUI environment (the last point being
the sticker). That's bad enough, having quite esoteric requirements,
but for simple things to have such complex gotchas buried within is
really bad.

--
Glenn -- http​://nevcal.com/

A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

@p5pRT

p5pRT commented Jul 29, 2008

From @moritz

Abigail wrote​:

If security is an issue, I think the safest way is to tell people *NOW*.
Patch the documentation if you think it's not clear enough. Write articles
on Perlmonks. Send errata to book publishers. Speak at a conference.
Surely that would beat waiting for everyone to upgrade to 5.12.

I think they don't conflict​: patch the docs for 5.8.9 and 5.10.1, and
the code for 5.12.

The problem with informing the people out there is that you don't reach
the bulk of perl programmers. Most aren't involved in the community at
all, don't read perlmonks, don't read use.perl.org, don't attend
conferences. That's easy to forget for somebody who is active in the
community and meets all those people who are active as well.

Hell, if they know about basic use of <> already they won't read the
documentation again, even when they upgrade to the next perl version.

(I programmed in perl for about 3 or 4 years before having any contact
to the community. And I didn't know about ARGV's magic. During $work I
made contact with a few other perl programmers in similar circumstances).

Moritz

@p5pRT

p5pRT commented Jul 29, 2008

From @Abigail

On Tue, Jul 29, 2008 at 11​:52​:01AM +0100, Ed Avis wrote​:

Abigail wrote​:

You'd be better off to make it a feature; using the feature will
prevent the code from running on older perls.

Is it possible to backport features via a CPAN module?

'say' existed as a CPAN module long before 5.10 was there.
(Perl6​::Say, IIRC).

I thought by 'feature' you meant as in 'use feature'. Or is that the wrong way to do it?

I think we are misunderstanding each other. And now I no longer know what
you are asking.

Abigail

@p5pRT

p5pRT commented Jul 29, 2008

From @epa

I thought you were suggesting something like

  use feature 'diamond_is_safe';

if the existing default behaviour of <> is kept, or

  use feature 'diamond_is_magic';

if <> is changed to be safe by default.

'use'ing either of these features would prevent the code running on older perls.

I asked, is it possible for a CPAN module to provide a 'feature' for older perl versions? So that if the module is installed, 'use feature X' loads the module and turns on that feature, even though perl doesn't have it builtin?

--
Ed Avis <eda@​waniasset.com>

@p5pRT

p5pRT commented Jul 29, 2008

From @davidnicol

Whoever is compiling the module of handy preprocesses for ARGV may
wish to consider

BEGIN { @ARGV = map "< \Q$_\E", @ARGV }

as in

cat <<'EOM' > SafeMagicalFootball.pm
package SafeMagicalFootball;

sub import { @ARGV = map "<\Q$_\E", @ARGV }

1;
EOM

@p5pRT

p5pRT commented Jul 29, 2008

From hashin_p@thbs.com

What do \Q and \E stand for?

On Tue, Jul 29, 2008 at 2​:22 PM, Mark Mielke <mark@​mark.mielke.cc> wrote​:

David Nicol wrote​:

Whoever is compiling the module of handy preprocesses for ARGV may
wish to consider

BEGIN { @ARGV = map "< \Q$_\E", @ARGV }

as in

cat <<'EOM' > SafeMagicalFootball.pm
package SafeMagicalFootball;

sub import { @ARGV = map "<\Q$_\E", @ARGV }

1;
EOM

It's a good thought that those who are very concerned about <> can use an
ARGV preprocessor. Unfortunately, your suggestion doesn't work. :-)

Specifically, \Q...\E will prefix certain characters with '\', and this
will make it impossible to represent file names with the same special
characters (the theoretical filenames with '|', '<' or '>') but that are not
escaped in the file name.

Cheers,
mark

--
Mark Mielke <mark@​mielke.cc>

@p5pRT

p5pRT commented Jul 29, 2008

From @davidnicol

On Tue, Jul 29, 2008 at 8​:22 AM, Mark Mielke <mark@​mark.mielke.cc> wrote​:

Specifically, \Q...\E will prefix certain characters with '\', and this will
make it impossible to represent file names with the same special characters
(the theoretical filenames with '|', '<' or '>') but that are not escaped in
the file name.

phooey.

Prefixing < takes care of opening and reading files with pipes and so
on in them, but leading/trailing space seems impossible to represent
with a 2-arg open.

--
Cheer up, sad person

@p5pRT

p5pRT commented Jul 29, 2008

From @Abigail

On Tue, Jul 29, 2008 at 09​:20​:28AM -0500, David Nicol wrote​:

On Tue, Jul 29, 2008 at 8​:22 AM, Mark Mielke <mark@​mark.mielke.cc> wrote​:

Specifically, \Q...\E will prefix certain characters with '\', and this will
make it impossible to represent file names with the same special characters
(the theoretical filenames with '|', '<' or '>') but that are not escaped in
the file name.

phooey.

Prefixing < takes care of opening and reading files with pipes and so
on in them, but leading/trailing space seems impossible to represent
with a 2-arg open.

It has been documented for a long, long time how to solve this​:

  my $file = " hello ";
  open my $fh, "< ./$file\0";

Granted, "\0" is a bit of an oddity, but it is possible. And you don't
need sysopen.

I think the current documentation is in perlopentut.

Abigail

@p5pRT

p5pRT commented Jul 29, 2008

From @davidnicol

On Tue, Jul 29, 2008 at 9​:32 AM, Abigail <abigail@​abigail.be> wrote​:

It has been documented for a long, long time how to [handle leading/trailing space in 2-arg open]​:

my $file = " hello ";
open my $fh, "< ./$file\0";

Granted, "\0" is a bit of an oddity, but it is possible. And you don't
need sysopen.

so would

  $_ = (m|^/| ? "< $_\0" : "< ./$_\0") for @​ARGV;

work as a football safety device, inserted into the execution process
at the appropriate time (i.e. after getopts)?

@p5pRT

p5pRT commented Jul 29, 2008

From @ap

Hi Tom,

* Tom Christiansen <tchrist@​perl.com> [2008-07-29 05​:40]​:

The thought of updating triple-digit numbers of my happily
running scripts that certain individuals would just as well see
broken is really beyond the conscionable--or its promulgators,
conscientiousness.

do these scripts enable warnings?

* Abigail <abigail@​abigail.be> [2008-07-28 21​:30]​:

- Programs that wouldn't use while (<>) pre-5.12 (because they
might run in an environment where file names may start with
'|' or '>') will use 3-arg "safe" while (<>), will be,
silently, a security issue when run with a pre-5.12.

If you make "while (<<>>)" to be 3-arg open, then at least such
programs will fail to compile when run with a pre-5.12 perl.

Exactly. I want to highlight this again​: in my opinion, having
code that is safe under 5.12 (or 5.10.1 or whenever) not silently
become unsafe under 5.10.0 or earlier is an incontrovertible
argument for introducing a new safe diamond-like operator as
incompatible syntax.

We can discourage the unconsidered use of magic ARGV with a
warning. This would be the exact same strategy that C compilers
followed WRT `gets`, which it seems to me worked well for C. It
also seems to me that the people who are certain enough that they
want this feature are also people who won’t shy away from muting
a warning.

Regards,
--
Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT

p5pRT commented Jul 29, 2008

From tchrist@perl.com

In-Reply-To​: Message from Mark Mielke <mark@​mark.mielke.cc>
  of "Tue, 29 Jul 2008 01​:08​:21 EDT." <488EA5C5.9050500@​mark.mielke.cc>

Tom Christiansen wrote​:

[ how to support magical @​ARGV better than today ]
If you feed an @​ARGV or <STDIN> like this​:

  /etc
  -
  /etc/passwd
  /tmp/.X11-unix/X0
  foo.gz
  http://somewhere.com/path/to/foofile.gz
  /etc/motd
  /dev/tty
  /tmp/fifo
  ~/.exrc

Very cool. Obviously, I wouldn't use it in a place where @​ARGV
can be passed in via a web page form.

"Obvious", you say. And obvious I suppose it is--to me and thee.

But I'm increasingly suspicious that we two may *not* be common-case
examples for adjudging obviousness. That anyone would ever supply commands
with untested data coming from a potentially hostile entity isn't obvious.

Well, to me.

What instead seems obvious is people without competence in security
have been inappropriately tasked (or have tasked themselves) to create
code beyond their ability.

If you don't understand fundamental matters of security but are
writing security-related code, at least one tacit problem exists,
and probably more.

Perl arose in a culture where the security exploits most of us were worried
about were those involving trusted code run by untrusted users. Setuid
programs (and to a lesser extent, setgid ones) that could be subverted to
misbehave were real concerns. They were executing at a privilege level
distinct from (and higher than) that of the people running them.

Perl's dataflow-tracing that's engaged in taint mode was a real innovation
here. The simple maxim that data taken from outside the program couldn't
be used to affect the state of anything outside your program,
*transitively*, was and is a great boon. Per its design philosophy, Perl
keeps track of things even if you forget to. Combined with indirect
execution and automatic memory management, Perl *can* make for a much safer
place to write secure code in than C does.

That "*can*" is a big one. Just because it can, doesn't mean it does.

I once ran a code review for a company that did online Perl training. As part
of their course, they'd let the remote user type arbitrary Perl code into their
text input widgets, then blindly eval it. They'd never heard of taint mode,
Safe compartments, per-user chrooting to sandboxes via loopback read-only
mounts, nor anything else that they should have been expert in.

Incredible, you may say, but perfectly and terrifyingly true.

More than once upon showing users the rename/relink/pathedit
script, I've had to address loud security cries. You'll
remember that this program, at its most basic level, is no more
than this​:

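  $op = shift;
  for (@ARGV) {
      $was = $_;
      eval $op;             # apply the user-supplied expression to $_
      die $@ if $@;         # propagate any error raised by the eval
      next if $was eq $_;   # name unchanged, nothing to rename
      rename($was, $_) || die "can't rename $was to $_: $!";
  }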

Why were they screaming? Because of the eval. They say, but now
people can say

  % rename 'system("/bin/rm -rf / &")' *.c

or some similar mayhem.

I then ask them how likely it is that they are ever going
to type

  % /bin/rm -rf /

into their own shells, and of course they say, "Never, that would
be really stupid."

But as you see, there's a misconnect here, because it's no different.
Why would a user ever want to hurt himself?

If you are the author of the code that you're running, you can trust
yourself not to intentionally harm yourself. You don't need a sandbox
to protect you.

If you are not that code's author, merely its executor, then you
do not trust that code, and so you do.

If you write code that's run by untrustworthy agents,
use and understand taint mode.

If you run code that was written by untrustworthy agents,
use and understand Safe compartments--at least.

To my mind, it's a bug that while(<>) in taint mode doesn't
realize that a raw @​ARGV from the command line is unsafe.

  % perl -ne 'print ; exit' /etc/passwd
  root​:*​:0​:0​:Charlie &​:/root​:/bin/csh

  % perl -Tne 'print ; exit' /etc/passwd
  root​:*​:0​:0​:Charlie &​:/root​:/bin/csh

Since it knows that that's tainted data​:

  % perl -MDevel​::Peek -E '$s = shift; say Dump($s)' /etc/passwd
  SV = PV(0x3c029038) at 0x3c0398c0
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x3c022560 "/etc/passwd"\0
  CUR = 11
  LEN = 12

  % perl -MDevel​::Peek -TE '$s = shift; say Dump($s)' /etc/passwd
  SV = PVMG(0x3c035b80) at 0x3c03a8ac
  REFCNT = 1
  FLAGS = (GMG,SMG,pPOK)
  IV = 0
  NV = 0
  PV = 0x3c022210 "/etc/passwd"\0
  CUR = 11
  LEN = 12
  MAGIC = 0x3c03c460
  MG_VIRTUAL = &PL_vtbl_taint
  MG_TYPE = PERL_MAGIC_taint(t)
  MG_LEN = 1

And knows better than to let you use it in external commands​:

  % perl -MDevel​::Peek -TE '$ENV{PATH} = "/bin​:/usr/bin"; $s = shift; print Dump($s); system("head", -1, $s)' /etc/passwd
  SV = PVMG(0x3c035b80) at 0x3c03a938
  REFCNT = 1
  FLAGS = (GMG,SMG,pPOK)
  IV = 0
  NV = 0
  PV = 0x3c022210 "/etc/passwd"\0
  CUR = 11
  LEN = 12
  MAGIC = 0x3c03c460
  MG_VIRTUAL = &PL_vtbl_taint
  MG_TYPE = PERL_MAGIC_taint(t)
  MG_LEN = 1
  Insecure dependency in system while running with -T switch at -e line 1.
  Exit 19

--tom

@p5pRT

p5pRT commented Jul 29, 2008

From @rgs

2008/7/29 Aristotle Pagaltzis <pagaltzis@​gmx.de>​:

* Abigail <abigail@​abigail.be> [2008-07-28 21​:30]​:

- Programs that wouldn't use while (<>) pre-5.12 (because they
might run in an environment where file names may start with
'|' or '>') will use 3-arg "safe" while (<>), will be,
silently, a security issue when run with a pre-5.12.

If you make "while (<<>>)" to be 3-arg open, then at least such
programs will fail to compile when run with a pre-5.12 perl.

Exactly. I want to highlight this again​: in my opinion, having
code that is safe under 5.12 (or 5.10.1 or whenever) not silently
become unsafe under 5.10.0 or earlier is an incontrovertible
argument for introducing a new safe diamond-like operator as
incompatible syntax.

If I parse you well, that's indeed a compelling argument. Finding a
balance between security and compatibility isn't very easy.

We can discourage the unconsidered use of magic ARGV with a
warning. This would be the exact same strategy that C compilers
followed WRT `gets`, which it seems to me worked well for C. It
also seems to me that the people who are certain enough that they
want this feature are also people who won't shy away from muting
a warning.

Recapitulating what was proposed by you, we are getting to :
* not changing <>
* introducing new, safer <<>> (or «» if I may joke about the
utf8-cleanliness of the tokeniser)
* a feature or a pragma then becomes not useful
* a way to extend ARGV's magic would be nice, but needs not to be in the core
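For readers on later perls: a `<<>>` operator that opens each @ARGV entry with three-argument open was eventually added in perl 5.22. The following is a minimal sketch of how it behaves on an awkward name, assuming perl 5.22 or newer; on older perls the code fails to compile, which is the property argued for above. The file name is invented for illustration.

```perl
use v5.22;            # <<>> does not parse on older perls
use warnings;
use File::Temp qw(tempdir);

# Hypothetical setup: a file whose name ends in '|', which the magic
# two-argument open would treat as a command to run.
my $dir  = tempdir(CLEANUP => 1);
my $name = "$dir/notes |";
open(my $out, '>', $name) or die "Can't create '$name': $!";
print $out "first\nsecond\n";
close $out;

# Pretend the odd name arrived on the command line.
@ARGV = ($name);

# <<>> opens each name literally, so the trailing '|' is just a character.
my @lines;
while (defined(my $line = <<>>)) {
    push @lines, $line;
}
print @lines;
```

With plain `<>` in place of `<<>>`, the same name would be handed to the magic open and interpreted as a pipe from a command.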

@p5pRT

p5pRT commented Jul 29, 2008

From tchrist@perl.com

In-Reply-To​: Message from Abigail <abigail@​abigail.be>
  of "Tue, 29 Jul 2008 09​:29​:13 +0200." <20080729072913.GM30221@​almanda>

On Tue, Jul 29, 2008 at 01​:08​:21AM -0400, Mark Mielke wrote​:

If I want to write a secure application, I'm not sure I would choose
Perl. If I did use Perl, or any other language, I would expect to have
to put in effort and have a clue.

I wouldn't expect anyone to write a non-trivial secure application without
having to put in effort and having a clue. Regardless of the language.

It's amazing what people expect. If I've heard it once--and I have--then
I've heard it said a hundred times that, "We can't use {grep or ++ or
multiple inheritance or WHATEVER} in our Perl code because the Virtual
Basic programmers we've hired to maintain our code don't understand
{WHATEVER}." And nobody (but me) even ever laughs, despite how if you'd
just s/Perl/C++/, then I find that everyone *would* laugh.

These aren't competent professionals. They're programming strumpets just
doing what they're paid to do, and thinking doesn't appear part of the job
description (or of their managers').

But if you think this is an important issue, wouldn't it make much
more sense to teach people RIGHT NOW that they shouldn't rely on
while (<>) automatically opening files for them, than to wait a couple
of years before 5.12 is released, and then a few more years before
everyone has upgraded to 5.12?

Because while you can lead a horse to water, you can't make it drink.
And if you let it drink freely on its own, it may drink itself to death.

Which just shows that you can't win for trying.

I spent years on it--YEARS. They still just won't learn. Things like this
have been in the FAQ since BEFORE Camel-1 even; they've been in plenty of
the available documentation since then--a body of writing I note with
extreme dismay and contemptuous disdain, has been remarkably bowdlerized
with Pythonesque Right-Thinking-Only since last I visited it.

It used to be right there prominently in open in the perl 4.036 manpage​:

  The filename that is passed to open will have
  leading and trailing whitespace deleted. In order
  to open a file with arbitrary weird characters in
  it, it's necessary to protect any leading and
  trailing whitespace thusly​:

  $file =~ s#^(\s)#./$1#;
  open(FOO, "< $file\0");

But that is now hidden away at best. And it's no longer
even in the FAQ at all, which is where it belongs!

Certainly the 1997 version of perlfaq5 covered it. It once read​:

  How can I open a file with a leading ">" or trailing blanks?

  Normally perl ignores trailing blanks in filenames, and interprets
  certain leading characters (or a trailing "|") to mean something
  special. To avoid this, you might want to use a routine like this. It
  makes incomplete pathnames into explicit relative ones, and tacks a
  trailing null byte on the name to make perl leave it alone​:

  sub safe_filename {
      local $_ = shift;
      return m#^/#
          ? "$_\0"
          : "./$_\0";
  }

  $fn = safe_filename("<<<something really wicked ");
  open(FH, "> $fn") or "couldn't open $fn: $!";

  You could also use the sysopen function (see sysopen).
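
For readers following along, here is a minimal sketch of the sysopen route mentioned in that old FAQ entry. The helper names `write_literal` and `read_literal` are made up for illustration and appear nowhere in the FAQ; the sketch assumes a Unix-like filesystem that allows such names. The point is only that sysopen hands the name to the operating system unchanged, so neither leading mode characters nor trailing blanks are interpreted.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY O_WRONLY O_CREAT O_TRUNC);

# Hypothetical helper: create (or truncate) a file whose name is taken
# literally, then write $text to it. Every step is checked.
sub write_literal {
    my ($path, $text) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_TRUNC, 0644)
        or die "can't create '$path': $!";
    print {$fh} $text or die "can't write '$path': $!";
    close($fh)        or die "can't close '$path': $!";
    return 1;
}

# Hypothetical helper: read all lines from a file whose name is taken
# literally, with no magic applied to "<", ">", "|" or blanks.
sub read_literal {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDONLY) or die "can't open '$path': $!";
    my @lines = <$fh>;
    close($fh) or die "can't close '$path': $!";
    return @lines;
}
```

Unlike the `safe_filename` trick, this needs no "./" prefix and no trailing null byte, at the cost of spelling out the open flags.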

What do we get today? Bowdlerization!

  How can I open a file with a leading ">" or trailing blanks?

  (contributed by Brian McCauley)

  The special two argument form of Perl's open() function ignores
  trailing blanks in filenames and infers the mode from certain leading
  characters (or a trailing "|"). In older versions of Perl this was the
  only version of open() and so it is prevalent in old code and books.

  Unless you have a particular reason to use the two argument form you
  should use the three argument form of open() which does not treat any
  characters in the filename as special.

  open FILE, "<", " file "; # filename is " file "
  open FILE, ">", ">file"; # filename is ">file"

HELLO? What happened to the right answer that was there before?

But what should I expect? The perl faq is now a document that thinks
this is somehow clear and consistent code, and it's anything but​:

  # ...
  {
      local($^I, @ARGV) = ('.orig', glob("*.c"));
      while (<>) {
          if ($. == 1) {
              print "This line should appear at the top of each file\n";
          }
          s/\b(p)earl\b/${1}erl/i;  # Correct typos, preserving case
          print;
          close ARGV if eof;        # Reset $.
      }
  }
  # $^I and @​ARGV return to their old values here

It's *terribly* misleading! Notice that while loop's close curly. Oh
wait, got it wrong, didn't you? This is very poor style. Worse, it
doesn't protect the things that it should protect, and doesn't even do what
it claims to do! At the very least, it needs to be rewritten as​:

  # ...
  {
      local $^I = ".orig";
      local($_, *ARGV, *ARGVOUT);
      @ARGV = glob("*.c");
      while (<>) {
          if ($. == 1) {
              print "This line should appear at the top of each file\n";
          }
          s/\b(p)earl\b/${1}erl/i;  # Correct 1st typo; preserve *only* initial case
          print;                    # This will go to ARGVOUT, the temp file
          close ARGV if eof;        # Reset $.
      }                             # and rename files on each new $ARGV;
      # XXX: pity we didn't check the close, but neither did Perl
  }
  # Here $^I, $_, @​ARGV, $ARGV, *ARGV{IO}, and *ARGVOUT{IO}
  # all return to any previous values held before block entrance.

There are entries like that throughout. Makes me want to
pull my hair out for wasted effort.

Which brings me to my final complaint of the day. That code is
wrong because Perl itself is wrong. No really, it is.

Watch​:

  % df -h .
  Filesystem Size Used Avail Capacity Mounted on
  /dev/wd0a 124M 117M -5.0K 100% /

  % ls -l it*
  -rw-r--r-- 1 tchrist wheel 0 Jul 29 14​:06 it

  % perl -i.orig -pe 'print "this is more stuff\n"' it

  % echo $?
  0

  % ls -l it*
  0 -rw-r--r-- 1 tchrist wheel 0 Jul 29 15​:05 it
  0 -rw-r--r-- 1 tchrist wheel 0 Jul 29 14​:06 it.orig

To this day, Perl's implicit closing of files doesn't warn you of errors,
let alone exit nonzero. This makes it do the wrong thing and not even tell you
it did it wrong. This is a *true* problem, because checking for the
success of print() is neither necessary nor sufficient to detect the
success of print(). Yes, you read that correctly. It's because of
buffering, plus the persistence of the err flag on the file structure.

I've never convinced anybody this is important. Since *every*
program should do this for correctness, it has to be in the run-time
system to avoid it ever being forgotten. Sure, there's stuff
like this you can do​:

  END { close(STDOUT) || die "can't close stdout​: $!" }

But that sort of thing should happen on all implicitly closed things. And
it really must. Even IO​::Handle​::close doesn't bother. Perl knows what
handles it's flushing and closing during global destruction, at least if you
don't just blindly fflush(0).
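
Until something like that lives in the core, the check has to be written by hand. A minimal sketch, assuming nothing beyond core Perl; the helper name `append_checked` is made up for illustration. Because output is buffered, print can report success while nothing has reached the disk, so the return value of close is the one that cannot be skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: append lines to a file and die if any step fails.
# Buffered data may only be written out when the handle is closed, so a
# full disk typically shows up as a failed close, not a failed print.
sub append_checked {
    my ($path, @lines) = @_;
    open(my $fh, '>>', $path) or die "can't open $path: $!";
    print {$fh} @lines        or die "can't write to $path: $!";
    close($fh)                or die "can't close $path: $!";
    return 1;
}
```

On systems that provide /dev/full, pointing this helper at it is one way to watch the close fail even though the print succeeded.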

Watch again, starting from before the bogus, ill-reported rename attempt​:

  % df -h .
  Filesystem Size Used Avail Capacity Mounted on
  /dev/wd0a 124M 117M -5.0K 100% /

  % ls -l it
  -rw-r--r-- 1 tchrist wheel 0 Jul 29 14​:06 it

  % perl -e 'print "stuff\n"' >> it; echo $?
  0

  % perl -e 'open(my $fh, ">>it") || die "open $!"; print $fh "stuff\n"; print STDOUT "ok now\n"'
  ok now

  % echo $?
  0

  % ls -l it
  -rw-r--r-- 1 tchrist wheel 0 Jul 29 14​:06 it

This is all incorrect behavior on Perl's part. Even cat knows better!

  % echo foo | cat >> it; echo $?
  cat​: stdout​: No space left on device
  1

+-------------------------------------------------------+
| Notice how even my cat is smarter than your perl!? :( |
+-------------------------------------------------------+

What's that about, eh?

But I've been saying this for years, just like everything else.
Makes no difference. Depressing, eh?

BTW, this proves my point about checking for print's status
being a waste of time, neither necessary nor sufficient to
catch failed prints!

  % perl -e 'open(my $fh, ">>it") || die "open $!"; print $fh "stuff\n" or die "print $!"; print STDOUT "ok now\n"'; echo exit status was $?
  ok now
  exit status was 0

And you can't get the danged thing to detect its folly​:

  % perl -WE 'open(my $fh, ">>it") || die "open $!"; say $fh "stuff" or die "print $!"; say "ok now"' ; echo exit status was $?
  ok now
  exit status was 0

I firmly believe that this needs to be a part of *EVERY* Perl program,
which consequently means it should *not* be there, but in the core itself​:

  END { close(STDOUT) || die "can't close stdout​: $!" }

And I believe this *much* more than some here believe <ARGV> needs to
implicitly map {"< $_\0"} @​ARGV (since that breaks existing working code,
whereas mine fixes existing broken code). Furthermore, I also believe all
*IMPLICIT* closes that fail need to generate a mandatory io warning, but
that of STDOUT should be a fatal. I've believed all this for a long time;
said it often enough. Anything else is just plain wrong behavior. I'm not
quite sure whether ARGVOUT failure should be a warning or a fatal, but it
should be *something* suitably noisy.

And no, Abigail, *please* understand that I don't mean *you* personally in
any way in this. Why, I'd even bet a beverage you agree with me about this
fh silent-closing problem! Rather I'm referring to the overall groupthink
and practice here, whereby simple stuff that's wrong just *never* gets
addressed, let alone fixed, but esoteric stuff that's well-documented and
at best a fringe issue gets all the airspace. Meanwhile, the documentation
continues to--well, you know. Best be charitable and just say "ramify with
baroque embellishments", but you quite know what that means, I'm sure.

'Nuff of this​: time for sunshine.

--tom
--

  Quotes below are all from Dorothy Parker​:

  "You can lead a whore to culture, but you can't make her think."

  "You can't teach an old dogma new tricks."

  "If you want to know what God thinks of money, just look at
  the people he gave it to."

  "All I need is room enough to lay a hat and a few friends."

  "Brevity is the soul of lingerie."

  "The cure for boredom is curiosity. There is no cure for curiosity."

@p5pRT
Author

p5pRT commented Jul 29, 2008

From tchrist@perl.com

On Tue, Jul 29, 2008 at 01​:39​:59PM -0600, Tom Christiansen wrote​:

To my mind, it's a bug that while(<>) in taint mode doesn't
realize that a raw @​ARGV from the command line is unsafe.

It doesn't?

As I myself tried to show, it depends on what you're doing.

Apparently, there's some idea that merely opening a file
read-only is always a safe thing, no matter who's telling
you what file to open. [Haven't looked at C code]

I'm not 100% sure that's true. But building in -f tests etc
seems overboard, too, and possibly even the wrong thing.

--tom

@p5pRT
Author

p5pRT commented Jul 29, 2008

From @epa

Tom Christiansen <tchrist <at> perl.com> writes​:

To my mind, it's a bug that while(<>) in taint mode doesn't
realize that a raw @​ARGV from the command line is unsafe.

FWIW, I agree. Since currently <> uses unsafe open, taint should flag it. At
the moment you get no error, until one of the arguments happens to contain a
shell metacharacter, at which point the program dies with a taint error. It
would be better to die for all cases because then the programmer has a chance to
spot the problem sooner.

But, again, this might prompt you to ask why just 'reading the files' should
need taint checking. After all, there is no taint error for

  #!/usr/bin/perl -T
  use warnings;
  use strict;
  my $filename = <STDIN>;
  chomp $filename;
  open my $fh, '<', $filename or die $!;
  close $fh or die $!;

It executes just fine, and that is entirely correct. The open() call is safe,
in that no matter what filename it is passed it will do what it says on the tin
and try to open a file of that name for reading.

Taint checking and safe, predictable I/O commands are orthogonal. If you have a
command like 3-arg open which doesn't rely on magic characters interpolated into
a string to change its behaviour, then taint checking is not needed. Only
inherently unsafe (powerful, but potentially dangerous) operations like eval
"$code", /$regexp/, open($fh, "$magic_string") need the extra check.

--
Ed Avis <eda@​waniasset.com>

@p5pRT
Author

p5pRT commented Jul 29, 2008

From @ap

* Rafael Garcia-Suarez <rgarciasuarez@​gmail.com> [2008-07-29 22​:45]​:

2008/7/29 Aristotle Pagaltzis <pagaltzis@​gmx.de>​:

* Abigail <abigail@​abigail.be> [2008-07-28 21​:30]​:

- Programs that wouldn't use while (<>) pre-5.12 (because
they might run in an environment where file names may
start with '|' or '>') will use 3-arg "safe" while (<>),
will be, silently, a security issue when run with a
pre-5.12.

If you make "while (<<>>)" to be 3-arg open, then at least
such programs will fail to compile when run with a pre-5.12
perl.

Exactly. I want to highlight this again​: in my opinion, having
code that is safe under 5.12 (or 5.10.1 or whenever) not
silently become unsafe under 5.10.0 or earlier is an
incontrovertible argument for introducing a new safe
diamond-like operator as incompatible syntax.

If I parse you well, that's indeed a compelling argument.

I think you are. The argument in full length is​: if someone
writes `while (<>)` under 5.12, and this uses 3-arg open in 5.12,
then takes that code and runs it under 5.8, it will silently
change behaviour. Whereas if `while (<>)` stays the same, and
someone instead writes `while (<<>>)` in 5.12, then takes that
code and runs it under 5.8, the program won’t run at all. As well
it shouldn’t.

And we know the legions of ancient perls that are still deployed,
and that few people arm all their scripts with `require 5.whatev`,
so this is quite a likely scenario.

For that reason, changing the semantics of `while (<>)` is a bad
idea.

We can discourage the unconsidered use of magic ARGV with a
warning. This would be the exact same strategy that C
compilers followed WRT `gets`, which it seems to me worked
well for C. It also seems to me that the people who are
certain enough that they want this feature are also people who
won't shy away from muting a warning.

Recapitulating what was proposed by you, we are getting to :
* not changing <>

Apart from the warning, obviously, per the paragraph you quoted
right above your recapitulation.

* introducing new, safer <<>> (or «» if I may joke about the
utf8-cleanliness of the tokeniser)
* a feature or a pragma then becomes not useful
* a way to extend ARGV's magic would be nice, but needs not to
be in the core

Yes.

I want to note that I’m not enamoured with the choice of `<<>>`
as the operator’s glyph, but I have no better proposal and I’m
not overly invested in that bikeshed. If anyone feels they have
a better idea, pipe up (most specifically, I wish Larry would);
either way though, so long as it breaks loudly in existing perls,
it’s good enough.

Anyway, I *think* this approach satisfies everyone’s concerns.

Regards,
--
Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT
Author

p5pRT commented Jul 29, 2008

From @ap

* Tom Christiansen <tchrist@​perl.com> [2008-07-29 23​:25]​:

To this day, Perl's implicit closing of files doesn't warn you
of errors, let alone exit nonzero. This makes it do the wrong thing
and not even tell you it did it wrong. This is a *true*
problem, because checking for the success of print() is neither
necessary nor sufficient to detect the success of print(). Yes,
you read that correctly. It's because of buffering, plus the
persistence of the err flag on the file structure.

I've never convinced anybody this is important.

It absolutely is. I had no idea, and as far as I’m concerned it’s
broken obviously enough that it needs no supporting argument.

* Tom Christiansen <tchrist@​perl.com> [2008-07-29 21​:45]​:

To my mind, it's a bug that while(<>) in taint mode doesn't
realize that a raw @​ARGV from the command line is unsafe.

Yes.

Regards,
--
Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT
Author

p5pRT commented Jul 30, 2008

From tchrist@perl.com

In-Reply-To​: Your message of "Wed, 30 Jul 2008 01​:57​:14 +0200."
  <20080729235714.GN9326@​klangraum.plasmasturm.org>

* Tom Christiansen <tchrist@​perl.com> [2008-07-29 23​:25]​:

To this day, Perl's implicit closing of files doesn't warn you
of errors, let alone exit nonzero. This makes it do the wrong thing
and not even tell you it did it wrong. This is a *true*
problem, because checking for the success of print() is neither
necessary nor sufficient to detect the success of print(). Yes,
you read that correctly. It's because of buffering, plus the
persistence of the err flag on the file structure.

I've never convinced anybody this is important.

It absolutely is. I had no idea, and as far as I'm concerned it's
broken obviously enough that it needs no supporting argument.

Hello, Aristotle.

I'm quite glad to hear you say that.

Maybe one's grandparents are right, and all one has to do
is outlive one's detractors. :-) Then again, that's what
the warrior says, and that's not the same thing.

--tom

@p5pRT
Author

p5pRT commented Jul 30, 2008

From tchrist@perl.com

In-Reply-To​: Message from Aristotle Pagaltzis <pagaltzis@​gmx.de>
  of "Wed, 30 Jul 2008 01​:30​:30 +0200." <20080729233030.GL9326@​klangraum.plasmasturm.org>

I want to note that I'm not enamoured with the choice of <<>>
as the operator's glyph,

Indeed; nor am I.

but I have no better proposal and I'm not overly invested in that
bikeshed. If anyone feels they have a better idea, pipe up (most
specifically, I wish Larry would);

I wouldn't hold my breath; but you never know.

I do note how Larry seems never to have thought a warning merited
for sub-3-arg opens of any variety.

I opine that it's always been the position that hostile-environment
operations should be dealt with as exceptional ones not standard ones.
That means that they're for running with -T and/or Safe.

I don't mean real security problems are ever treated lightly.

I just find it hard to see that this is any more than another of many
other things that just fall out of the Unix environment, like fifos
and such.

either way though, so long as it breaks loudly in existing perls,
it's good enough.

Anyway, I *think* this approach satisfies everyone's concerns.

I'd like to think about it a bit more.

I still have a vague hunch that a module, or here even a pragma,
might be a good idea. It's vague, and undeveloped. I'd like
to give that time to grow.

In any event, I don't think that the alarmists' loudness should cause
anyone to make quick, undeliberated actions. I am especially reminded of
that annoying period in our history when we were all forced to write

  while (defined ($data = <FH>)) { ... }
  while (defined ($data = readline(*FH))) { ... }
  while (defined ($filename = readdir(DH))) { ... }
  if (!defined ($linkee = readlink($filename))) { ... }

To quiet the very, very annoying warnings that came from the risk of
getting back "0", which is false but defined, and couldn't be replaced
by the (notorious?) "0 but true". (See footnote)

It didn't really become too noticeable until Windows, which thought that
textfiles were CRLF-separated, not newline-terminated, sequences of lines.
Using chomp over chop fixed them, but not this.

So for a while, *EVERYBODY* had to change their programs. That alone
should have been enough to show something wasn't right. You can't demand
all users be smarter, because it will never happen. But you can make the
compiler smarter. Finally, the compiler got smart enough to insert an
implicit defined() when it recognized

  while ($var = readXXX()) { ...}

for XXX = {line,dir,link}, which was the *much* better solution, by far.
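
A small sketch of the case that fix addresses, for anyone who has not run into it. The helper name `count_lines` is made up for illustration. When a file's last line is a bare "0" with no trailing newline, that line reads as a false value, and only the implicit defined() keeps the loop from stopping one line early.

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines in a file using the plain
# assignment-in-while idiom. Perl treats the loop condition as
# defined(my $line = <$fh>), so a final line of "0" is still counted.
sub count_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!";
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;
    }
    close($fh) or die "can't close $path: $!";
    return $count;
}
```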

There are plenty of similarities here to that situation.

I'm afraid we may be heading down, if not break-their-programs, at
least the annoying-warning route, and that now as then there might
well be a cleaner and less noisily troubling solution. That's why
I don't think loudness of complaint should turn into quick action.
Without a dampening period of contemplation and consideration, the
feedback loop would whiplash the language too much, and thence its
users as well.

---tom

FN​: "0 but true" is exempt from numeric warnings, just like the very
  special form of "" returned by relationals, which is PL_sv_no.
  Sure, undef is exempt from them for ++ and += and .=, but the
  special "" (PL_sv_no) is exempt in all situations, just as "0
  but true" is.

  % perl -WE 'say "001"+"000"'
  1

  % perl -WE 'say 1+"3 blind mice"'
  Argument "3 blind mice" isn't numeric in addition (+) at -e line 1.
  4

  % perl -WE 'say 1+"0 but true"'
  1

  % perl -WE 'say 1+""'
  Argument "" isn't numeric in addition (+) at -e line 1.
  1

  % perl -WE 'say 1+(2==3)'
  1

Although​:

  % perl -Mbignum -WE 'say "000"+"001"'
  1
  % perl -Mbignum -WE 'say 1+(2==3)'
  NaN
  % perl -Mbignum -WE 'say 1+""'
  NaN
  % perl -Mbignum -WE 'say 1+"3 blind mice"'
  NaN

And​:

  % perl -WE 'say "Inf" + 0'
  0
  % perl -WE 'say "Inf" + 1'
  1
  % perl -WE 'say "Inf" + "-Inf"'
  0
  % perl -WE 'say "Inf" * "Inf"'
  0

vs

  % perl -Mbignum -WE 'say "Inf" + 0'
  NaN
  % perl -Mbignum -WE 'say "Inf" + 1'
  NaN
  % perl -Mbignum -WE 'say "Inf" + "-Inf"'
  0
  % perl -Mbignum -WE 'say "Inf" * "Inf"'
  0

@p5pRT
Author

p5pRT commented Jul 30, 2008

From @rgs

2008/7/30 Aristotle Pagaltzis <pagaltzis@​gmx.de>​:

* Tom Christiansen <tchrist@​perl.com> [2008-07-29 23​:25]​:

To this day, Perl's implicit closing of files doesn't warn you
of errors, let alone exit nonzero. This makes it do the wrong thing
and not even tell you it did it wrong. This is a *true*
problem, because checking for the success of print() is neither
necessary nor sufficient to detect the success of print(). Yes,
you read that correctly. It's because of buffering, plus the
persistence of the err flag on the file structure.

I've never convinced anybody this is important.

It absolutely is. I had no idea, and as far as I'm concerned it's
broken obviously enough that it needs no supporting argument.

I concur. A patch would help, but I know that PerlIO is not the simplest
thing to patch, and the problem is probably a bit complex now.

A proper bug report would help, too, because currently this one is deep
buried in another thread.

@p5pRT
Author

p5pRT commented Jul 30, 2008

From @epa

Roland Giersig <rgiersig <at> cpan.org> writes​:

How about this​:

in v5.12​:

* add the '<<>>' operator as the now-standard magic 2-arg-open (clone it
from '<>')

* issue a warning whenever '<>' invokes its magic, telling about the
coming change, i.e.

"You are using the magical behaviour of the <> operator with regards
to command pipes| or file redirections in ARGV. Please note that <> will
lose its magic in the next version. To keep the magical behaviour, use
the new <<>> operator instead."

Either have a warning or change the semantics; I don't think you need both.
Once people see the warning they will change their code to either explicit magic
or explicit boring-file-opening. Anyway, we can't plan as far ahead as 5.14.

* add a pragma, e.g. "use feature safe_diamond" or equivalent to already
switch '<>' over to use the 3-args-open. That way developers can already
use the new behaviour, avoiding those dreaded version-checks.

Pragmas are useful when you want to change the global behaviour of a program.
But typically <> is used in just one place, the main loop, and certainly in just
one source file (by an unfortunate accident of perl's implementation, you cannot
in general pass the ARGV filehandle to subroutines expecting a filehandle).

So I think a pragma is overkill here, better a way to explicitly say what you
want​: <SAFE_ARGV> or <MAGIC_ARGV>, with appropriate syntactic sugar to provide a
<<>> operator or whatever.

* "use v5.x" (for x < 12) of course should switch magical behaviour back
on for '<>'.

I think this is also getting a bit hairy and tangled.

Come to think of it, this argument is so strong that the
two-step-approach now seems overkill to me. Just making sure that "use
5.x" switches the magic back on for '<>' should be sufficient.

Um, I don't know, after all one of the main points is that people were and are
using perl-5.10, perl-5.8 and older versions and believing <> reads files given
on the command line. If you put 'use 5.6' in your program it means it will not
work with perl older than that, not 'preserve 5.6's bugs for all time'.
Otherwise I had better go through and remove 'use 5.10' from all my programs
lest I miss out on some bug present in 5.10 but fixed in later perls!

Let's not put extra bizarre stuff into 'use 5.xx', it is awkward enough already.

--
Ed Avis <eda@​waniasset.com>

@p5pRT
Author

p5pRT commented Jul 30, 2008

From @Abigail

On Wed, Jul 30, 2008 at 02​:49​:51PM +0200, Roland Giersig wrote​:

Rafael Garcia-Suarez wrote​:

2008/7/29 Aristotle Pagaltzis <pagaltzis@​gmx.de>​:

We can discourage the unconsidered use of magic ARGV with a
warning. This would be the exact same strategy that C compilers
followed WRT `gets`, which it seems to me worked well for C. It
also seems to me that the people who are certain enough that they
want this feature are also people who won't shy away from muting
a warning.

Recapitulating what was proposed by you, we are getting to :
* not changing <>
* introducing new, safer <<>> (or «» if I may joke about the
utf8-cleanliness of the tokeniser)
* a feature or a pragma then becomes not useful
* a way to extend ARGV's magic would be nice, but needs not to be in the core

Sounds good, but leaves the issue of fixing <> (which is important
IMHO). How about this​:

[ Proposal to change the meaning of <> ]

This approach means that people have plenty of time to adapt their apps
if they really rely on magical behaviour by either changing '<>' to
'<<>>' or adding 'use v5.x'. If they do nothing, their apps just become
a little bit safer.

Eh, the argument Aristotle and I used, and Rafael agrees with isn't that
programs will break with newer versions of Perl, it's that programs
written to be safe in 5.12 (or whenever <> defaults to 3-arg), become
unsafe when run with an older perl. But if you leave <> as is, and use
<<>> for 3-arg open, a program using safe opens won't run on a perl that
doesn't have the feature.

Surely you must agree that a program is safer if it refuses to run on a
perl that doesn't use 3-arg open than a program that silently uses 2-arg
open?

Abigail

@p5pRT
Author

p5pRT commented Jul 30, 2008

From zefram@fysh.org

Tom Christiansen wrote​:

What do we get today? Bowdlerization!

How can I open a file with a leading ">" or trailing blanks?
...
Unless you have a particular reason to use the two argument form you
should use the three argument form of open() which does not treat any
characters in the filename as special.

I don't see Bowdlerisation here. I see a better way to achieve the
objective.

HELLO? What happened to the right answer that was there before?

It was superseded by this simpler right answer, when the three-argument
form of open() became available.

But what should I expect? The perl faq is now a document that thinks
this is somehow clear and consistent code, and it's anything but​:

I agree with your problems with this one, however.

To this day, Perl's implicit closing of files doesn't warn you of errors,
let alone exit nonzero. This makes it do the wrong thing and not even tell you
it did it wrong.

I wholeheartedly agree with you in this gripe. I often (but, admittedly,
not always) end up writing wrappers along the lines of

  sub IO​::Handle​::safe_print { shift->print(@​_) or die $! }
  sub IO​::Handle​::safe_flush { shift->flush or die $! }

and I wish such dying versions of I/O functions were standardised and
the norm. Proper use of the exception mechanism makes it a lot easier
to write correct programs. Roll on autodie.
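
As a rough illustration of what dying I/O calls look like with autodie: a sketch that assumes the autodie module is installed (it ships with Perl from 5.10.1 on). The helper name `slurp_or_die` is made up for illustration. Note that autodie does not wrap print, so wrappers like the ones above are still needed for writes.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file, letting autodie turn a failed
# open or close into an exception instead of a silently ignored false
# return value. The pragma is lexically scoped to this sub.
sub slurp_or_die {
    my ($path) = @_;
    use autodie qw(open close);
    open(my $fh, '<', $path);
    my @lines = <$fh>;
    close($fh);
    return @lines;
}
```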

-zefram

@p5pRT
Author

p5pRT commented Jul 30, 2008

From @Abigail

On Wed, Jul 30, 2008 at 05​:01​:38PM +0200, Roland Giersig wrote​:

Abigail wrote​:

Eh, the argument Aristotle and I used, and Rafael agrees with isn't that
programs will break with newer versions of Perl, it's that programs
written to be safe in 5.12 (or whenever <> defaults to 3-arg), become
unsafe when run with an older perl. But if you leave <> as is, and use
<<>> for 3-arg open, a program using safe opens won't run on a perl that
doesn't have the feature.

Surely you must agree that a program is safer if it refuses to run on a
perl that doesn't use 3-arg open than a program that silently uses 2-arg
open?

This can also be accomplished with "use v5.12", no?

Do we assume that somebody who knows or learns about the 3-arg-diamond
also knows about the security-implications? I would say 'yes', so this
programmer will probably care enough to insert a 'use v5.12' to prevent
running on older perls.

A programmer who in your case uses the 3-arg-open '<<>>' already gives
up backward-compatibility, preventing usage of the script in an unsafe
environment.

Which he also could do in my case by using the '<>' changed to
3-arg-open and adding 'use v5.12'.

The case discussed here is not smart programmers. If we had programmers
that did the right thing, we wouldn't even have this discussion, as then
programmers wouldn't use while(<>) in environments where it's unsafe.

And as long as we have dumb programmers who should need protection,
I wouldn't count on them adding 'use v5.12' to their programs. Or someone
else taking their code, and removing the 'use 5.12' (because it runs on
5.10 anyway).

Abigail

@p5pRT
Author

p5pRT commented Jul 30, 2008

From @davidnicol

On Tue, Jul 29, 2008 at 6​:30 PM, Aristotle Pagaltzis <pagaltzis@​gmx.de> wrote​:

I want to note that I'm not enamoured with the choice of `<<>>`
as the operator's glyph, but I have no better proposal and I'm
not overly invested in that bikeshed.

As the overcaffeinated fool who suggested that operator, in the ensuing
days I have longed for the opportunity to revise my proposed bikeshed color
to '>>>' which hopefully will provide neat symmetry with the proposed
'<<<TOKEN' which might be just like '<<TOKEN' except strip leading
indentation (like ksh's '<<-').

anyway after learning of the possibility of altering the filenames as
tchrist has pointed out and which used to be in the FAQ before the 3-arg
open craze, I no longer think wide football is a good idea, under any
syntax. (has it been a straw man all along?)

@p5pRT
Author

p5pRT commented Jul 30, 2008

From rick@bort.ca

On Jul 30 2008, Aristotle Pagaltzis wrote​:

* Tom Christiansen <tchrist@​perl.com> [2008-07-29 21​:45]​:

To my mind, it's a bug that while(<>) in taint mode doesn't
realize that a raw @​ARGV from the command line is unsafe.

Yes.

Please submit a (more descriptive) bug report if you are serious. I,
for one, am more inclined to fix an actual bug in tainting than I am to
implement a new feature, pragma, or operator for anyone who laments
Perl's crappy security but can't be bothered to stick -T on their
shebang line.

--
Rick Delaney
rick@​bort.ca

@p5pRT
Author

p5pRT commented Jul 31, 2008

From @sciurius

Ed Avis <eda@​waniasset.com> writes​:

Pragmas are useful when you want to change the global behaviour of a
program.

Now that pragmas are lexically scoped, this assumption no longer
holds.

-- Johan

@p5pRT
Author

p5pRT commented Jul 31, 2008

From @sciurius

Tom Christiansen <tchrist@​perl.com> writes​:

I still have a vague hunch like a module, or here even a pragma,
might be a good idea.

I'd go for a nice iterator class instead of <<<<>>>> weirdness.

I don't mind typing a few more characters, especially since (as
pointed out several times now) it is functionality that often occurs
only once in a program -- if at all.
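
To make the idea concrete, a rough sketch of what such an iterator might look like. The class name `LiteralFileIterator` and its methods are invented for illustration and are not an existing module. It walks a list of names, opens each with 3-arg open so nothing in the name is treated as magic, and hands back one line at a time.

```perl
package LiteralFileIterator;   # hypothetical class name
use strict;
use warnings;

# Build an iterator over the given file names (for example, a copy of @ARGV).
sub new {
    my ($class, @names) = @_;
    return bless { names => [@names], fh => undef, current => undef }, $class;
}

# Return the next line from the current file, moving on to the next name
# when a file is exhausted. Returns undef once every file has been read.
sub next_line {
    my ($self) = @_;
    while (1) {
        if (!$self->{fh}) {
            return unless @{ $self->{names} };
            my $name = shift @{ $self->{names} };
            # 3-arg open: "|", "<", ">" and blanks in $name are literal.
            open(my $fh, '<', $name)
                or do { warn "can't open '$name': $!\n"; next };
            @$self{qw(fh current)} = ($fh, $name);
        }
        my $line = readline($self->{fh});
        return $line if defined $line;
        close($self->{fh}) or warn "can't close '$self->{current}': $!\n";
        $self->{fh} = undef;
    }
}

# Name of the file most recently opened, in the spirit of $ARGV.
sub current_name { return $_[0]{current} }

package main;
```

Usage would be along the lines of `my $it = LiteralFileIterator->new(@ARGV); while (defined(my $line = $it->next_line)) { ... }`, which is a few more characters than the diamond but leaves no doubt about how the names are opened.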

-- Johan

@ayyangigi

From thospel@mail.dma.be

In article <E12V7Jw-0002PL-00@​ursa.cus.cam.ac.uk>,   "M.J.T. Guy" <mjtg@​cus.cam.ac.uk> writes​:

No, I'm not trying to restart this flame war. But it was a "security"
issue, and security seems to be in fashion at the moment, and it was
left in a somewhat unsatisfactory state.
The story so far, for the benefit of younger readers:
[ with the usual IIRC caveats - go to the archives if you want the
real facts
]
There's a booby trap when magic open (i.e. initial/final special
characters like < > |) is used in conjunction with <>. Suppose
some devious person has left around a file such as "| rm -rf *;".
Then root's cron job comes along and does

       my_scan_command *

and ... Boom! Here's a more innocent demonstration​:
$ cat >'| echo Bwahahahaha'
hkgfjhgfhgf
$ perl -wne '' *
Bwahahahaha
$
Note that the Perl script is obviously "so simple it can't have any
security holes".
There were two proposals for fixing this​: a maximal one which would
have banned all magic in association with <>, and a minimal one
(championed by Tom C) which would have made the open non-magic iff
a file of that name existed. So the minimal proposal is essentially
backwards compatible, and loses no functionality apart from active
malice.

In fact, there was a little known third proposal by yours truly (hi !):
Turn off magic <> if the perl command line contains an explicit --
Otherwise you are still hacked. Observe:

mkdir /tmp/a
cd /tmp/a
echo > '-e;print("Bwahaha\n")'
echo foo > bar
perl -wne '' *

Will also give you the dreaded:
Bwahaha

So, since a security aware person has to do

perl -wne '' -- *

anyways, let that remove the magicness

Closed 🔐
