Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File::Glob issue #14967

Open
p5pRT opened this issue Oct 5, 2015 · 15 comments
Open

File::Glob issue #14967

p5pRT opened this issue Oct 5, 2015 · 15 comments

Comments

@p5pRT
Copy link

p5pRT commented Oct 5, 2015

Migrated from rt.perl.org#126271 (status was 'open')

Searchable as RT126271$

@p5pRT
Copy link
Author

p5pRT commented Oct 5, 2015

From @khwilliamson

These 2 lines​:

my @​b = glob(qq{$path/mp_[0123456789]*.dat
  $path/md_[0123456789]*.dat});

from https://rt-archive.perl.org/perl5/Ticket/Display.html?id=114984 (and now in
ext/File-Glob/t/rt114984.t) now work fine on Linux unless it is compiled
with -DPERL_EXTERNAL_GLOB. In that case, a shell is run instead of F​:G,
and the shell thinks the 2nd line is a shell command instead of part of
what to glob.

I am not conversant in all the nuances of what bsd vs csh vs other
shells accept, but I believe this means that F​:G is not an accurate
reproducer of what most shells would do with this, and I think it is
meant to be. Zefram on IRC pointed out that shell meta characters are
probably also handled differently than in a real shell.

I'm not sure what to do. The .t can be made to pass on
-DPERL_EXTERNAL_GLOB systems simply by making the \n into a space.
(This problem was found in testing on z/OS, which is a
PERL_EXTERNAL_GLOB system.)

@p5pRT
Copy link
Author

p5pRT commented Oct 5, 2015

From j.imrie1@virginmedia.com

On 05/10/2015 19​:01, karl williamson (via RT) wrote​:

# New Ticket Created by karl williamson
# Please include the string​: [perl #126271]
# in the subject line of all future correspondence about this issue.
# <URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=126271 >

These 2 lines​:

my @​b = glob(qq{$path/mp_[0123456789]*.dat
$path/md_[0123456789]*.dat});

What happens if you try this

my @​b = glob(qq{$path/mp_[0123456789]*.dat\
  $path/md_[0123456789]*.dat});

with -DPERL_EXTERNAL_GLOB

John

@p5pRT
Copy link
Author

p5pRT commented Oct 5, 2015

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Oct 5, 2015

From @epa

karl williamson <perlbug-followup <at> perl.org> writes​:

my @​b = glob(qq{$path/mp_[0123456789]*.dat
$path/md_[0123456789]*.dat});

from https://rt-archive.perl.org/perl5/Ticket/Display.html?id=114984 (and now in
ext/File-Glob/t/rt114984.t) now work fine on Linux unless it is compiled
with -DPERL_EXTERNAL_GLOB. In that case, a shell is run instead of F​:G,
and the shell thinks the 2nd line is a shell command instead of part of
what to glob.

That sounds like a bug in the PERL_EXTERNAL_GLOB case. Having your glob
kick off a random shell command cannot be the intended behaviour. I suggest
that PERL_EXTERNAL_GLOB should convert \n to space or otherwise sanitize
the glob pattern slightly before invoking the shell. (It may never be
possible to completely make the shell invocation safe while keeping
compatibility with the current behaviour, but certainly this case can be
fixed.)

I am not conversant in all the nuances of what bsd vs csh vs other
shells accept, but I believe this means that F​:G is not an accurate
reproducer of what most shells would do with this, and I think it is
meant to be.

Well... not in that way IMHO. It should provide the same kind of glob
expansion as shells do. It doesn't and shouldn't try to reimplement all
the interesting quoting bugs and code injection bugs that running an
external shell is prone to. (To do that it would indeed need to fork a
shell and see what happens, and then there would be no point having F​::G.)

--
Ed Avis <eda@​waniasset.com>

@p5pRT
Copy link
Author

p5pRT commented Oct 5, 2015

From @khwilliamson

On 10/05/2015 12​:58 PM, John Imrie wrote​:

On 05/10/2015 19​:01, karl williamson (via RT) wrote​:

# New Ticket Created by karl williamson
# Please include the string​: [perl #126271]
# in the subject line of all future correspondence about this issue.
# <URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=126271 >

These 2 lines​:

my @​b = glob(qq{$path/mp_[0123456789]*.dat
$path/md_[0123456789]*.dat});

What happens if you try this

my @​b = glob(qq{$path/mp_[0123456789]*.dat\
$path/md_[0123456789]*.dat});

with -DPERL_EXTERNAL_GLOB

sh​: 2​: /var/tmp/Alv2wNyeph/md_1.dat​: Permission denied

However, this works

@​b = glob(qq{$path/mp_[0123456789]*.dat \\
$path/md_[0123456789]*.dat});

Both the space and extra backslash are required

@p5pRT
Copy link
Author

p5pRT commented Oct 6, 2015

From @khwilliamson

On 10/05/2015 01​:59 PM, Ed Avis wrote​:

karl williamson <perlbug-followup <at> perl.org> writes​:

my @​b = glob(qq{$path/mp_[0123456789]*.dat
$path/md_[0123456789]*.dat});

from https://rt-archive.perl.org/perl5/Ticket/Display.html?id=114984 (and now in
ext/File-Glob/t/rt114984.t) now work fine on Linux unless it is compiled
with -DPERL_EXTERNAL_GLOB. In that case, a shell is run instead of F​:G,
and the shell thinks the 2nd line is a shell command instead of part of
what to glob.

That sounds like a bug in the PERL_EXTERNAL_GLOB case. Having your glob
kick off a random shell command cannot be the intended behaviour. I suggest
that PERL_EXTERNAL_GLOB should convert \n to space or otherwise sanitize
the glob pattern slightly before invoking the shell. (It may never be
possible to completely make the shell invocation safe while keeping
compatibility with the current behaviour, but certainly this case can be
fixed.)

I am not conversant in all the nuances of what bsd vs csh vs other
shells accept, but I believe this means that F​:G is not an accurate
reproducer of what most shells would do with this, and I think it is
meant to be.

Well... not in that way IMHO. It should provide the same kind of glob
expansion as shells do. It doesn't and shouldn't try to reimplement all
the interesting quoting bugs and code injection bugs that running an
external shell is prone to. (To do that it would indeed need to fork a
shell and see what happens, and then there would be no point having F​::G.)

I have pushed a fix to the .t file that was failing; I simply joined the
two lines to make one.

It turns out that porting/maintainers.t also fails with
PERL_EXTERNAL_GLOB on Linux. I haven't looked at this.

It is my supposition that the core converting to use File​::Glob was for
performance reasons. (Perhaps it was to also get more uniform handling
across platforms.) Since it's undocumented, perhaps someone could
enlighten me. One solution I thought of (that Zefram) doesn't like is
for F​:G to fork a shell if and only if it finds a shell metacharacter.
That way the performance wouldn't suffer except in edge cases.

@p5pRT
Copy link
Author

p5pRT commented Oct 6, 2015

From zefram@fysh.org

Karl Williamson wrote​:

                One solution I thought of \(that Zefram\) doesn't

like is for F​:G to fork a shell if and only if it finds a shell
metacharacter.

That would mean that it would acquire the newline misbehaviour,
and equivalent misbehaviour on semicolon and other metacharacters.
That is not a desirable feature of the glob-via-shell mechanism, it's
a problem to be avoided. We should prefer to bring the two globbing
mechanisms into synchrony by tweaking the glob-via-shell code to escape
non-globbing metacharacters.

A problem with this is that the globbing mechanism might be intentionally
used to get types of shell expansion other than globbing, such as
parameter expansion and command substitution, which don't interfere
with the glob-via-shell framework. In the general case these are
shell-specific, and can contain arbitrary embedded shell commands,
so it's impossible to distinguish metacharacters that are part of
these acceptable constructions from metacharacters that will break the
framework. But these uses have already been broken by the File​::Glob
implementation, so it should be fine to break them in the glob-via-shell
implementation too.

-zefram

@p5pRT
Copy link
Author

p5pRT commented Oct 6, 2015

From @epa

Out of interest why does glob-via-shell still exist? It would seem that
the pure Perl implementation in File​::Glob would be more portable, more
correct and faster in all cases. Truly wacky platforms may need special
globbing rules, but they're not likely to have /bin/csh either.

--
Ed Avis <eda@​waniasset.com>

@p5pRT
Copy link
Author

p5pRT commented Oct 6, 2015

From zefram@fysh.org

Ed Avis wrote​:

Out of interest why does glob-via-shell still exist?

Historical. I think it was meant to be the lazy way to get globbing,
but given that Perl already has a serviceable pattern-matching engine,
and the embuggerance involved in getting the shell thing to work, it's
probably not saving code any more.

I think it would be a good idea to move to always performing the
globbing ourselves. The only trickiness in doing so is concerned
with having the code located in File​::Glob​: miniperl needs to be able
to glob without loading XS. The critical code from File​::Glob would
have to be present in the core proper, and there's then a choice to be
made as to whether File​::Glob (to implement its distinctive interfaces)
should have a duplicate of that code or attempt to use the core's copy
(presumably via a new core API function).

-zefram

@p5pRT
Copy link
Author

p5pRT commented Oct 13, 2015

From @ap

* Karl Williamson <public@​khwilliamson.com> [2015-10-06 18​:05]​:

One solution I thought of (that Zefram doesn't like) is for F​:G to
fork a shell if and only if it finds a shell metacharacter. That way
the performance wouldn't suffer except in edge cases.

You brought this up multiple times. I don’t understand the logic behind
that. What makes you think that the fact that unusual inputs to glob()
cause the execution of shell commands under PERL_EXTERNAL_GLOB is really
a feature that needs to be replicated in its other incarnations? Is your
premise hat the purpose of glob() is not just to perform globbing, but
also to provide an undocumented poor-usability alternative to system()?

I’m at a loss. It doesn’t seem plausible that you thought this through
particularly carefully, but maybe there’s some other consideration that
drives you in this direction? That would be unlikely (to say the least)
to change my conclusion – but even so, it would be interesting to know,
and so far you haven’t stated your premises.

(I consider this a security hole in PERL_EXTERNAL_GLOB, though the low
use of that feature means there is only a low-grade vulnerability here.
Nevertheless it should be closed; security breaches usually result from
compositions of (mostly) individually-low-grade vulnerabilities.)

Regards,
--
Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT
Copy link
Author

p5pRT commented Oct 13, 2015

From @khwilliamson

On 10/13/2015 03​:13 AM, Aristotle Pagaltzis wrote​:

* Karl Williamson <public@​khwilliamson.com> [2015-10-06 18​:05]​:

One solution I thought of (that Zefram doesn't like) is for F​:G to
fork a shell if and only if it finds a shell metacharacter. That way
the performance wouldn't suffer except in edge cases.

You brought this up multiple times.

Huh!? Unless I'm losing my mind, this thread is the only time I've ever
posted on this. And unless I'm conflating this with something else, the
only other time I've mentioned this at all was shortly before the
original post, when I asked a question about it on #irc, and Zefram and
I quickly concluded it was best handled via email; hence this thread.

I don’t understand the logic behind

that. What makes you think that the fact that unusual inputs to glob()
cause the execution of shell commands under PERL_EXTERNAL_GLOB is really
a feature that needs to be replicated in its other incarnations? Is your
premise hat the purpose of glob() is not just to perform globbing, but
also to provide an undocumented poor-usability alternative to system()?

I’m at a loss. It doesn’t seem plausible that you thought this through
particularly carefully, but maybe there’s some other consideration that
drives you in this direction? That would be unlikely (to say the least)
to change my conclusion – but even so, it would be interesting to know,
and so far you haven’t stated your premises.

(I consider this a security hole in PERL_EXTERNAL_GLOB, though the low
use of that feature means there is only a low-grade vulnerability here.
Nevertheless it should be closed; security breaches usually result from
compositions of (mostly) individually-low-grade vulnerabilities.)

And I in turn am at a loss to understand where you're coming from. My
original post was because a test was failing on a -DPERL_EXTERNAL_GLOB
system. (And there is another test in the suite that also fails) And I
didn't know how to proceed. This feature, like too many others, is
undocumented.

More complete context of what I said that you quoted above is this

"It is my supposition that the core converting to use File​::Glob was for
performance reasons. (Perhaps it was to also get more uniform handling
across platforms.) Since it's undocumented, perhaps someone could
enlighten me. One solution I thought of (that Zefram) doesn't like is
for F​:G to fork a shell if and only if it finds a shell metacharacter.
That way the performance wouldn't suffer except in edge cases."

So of course I haven't thought this through. I said I was asking for
guidance. And I did state my premise there. If the purpose of using
F​:G is performance, then use it for inputs where all shells agree, and
do the fork to get the accustomed behavior only for the other cases.
And if the purpose isn't for performance, why have -DPERL_EXTERNAL_GLOB
at all. It's only use seems to me to get the local behavior instead of
Perl's.

I do not believe I'm advocating for anything here; I admittedly have too
little knowledge in the matter to do so. And I'm certainly not
persistently advocating for anything.

@p5pRT
Copy link
Author

p5pRT commented Oct 13, 2015

From @ap

* Karl Williamson <public@​khwilliamson.com> [2015-10-13 19​:25]​:

On 10/13/2015 03​:13 AM, Aristotle Pagaltzis wrote​:

You brought this up multiple times.

Huh!? Unless I'm losing my mind, this thread is the only time I've
ever posted on this. And unless I'm conflating this with something
else, the only other time I've mentioned this at all was shortly
before the original post, when I asked a question about it on #irc,
and Zefram and I quickly concluded it was best handled via email;
hence this thread.

Sorry. I thought I had seen you propose this twice on this thread. You
mentioned it on IRC as well; possibly you also mentioned it only once
there, and my impression that you had suggested it more than once there
too was equally mistaken. Too lazy to check. :-) So it felt to me like
you were re-proposing this repeatedly in the face of disagreement from
multiple sides, and as it seemed an evidently bad idea to me as well,
I wondered what on Earth could be compelling you.

But you are right​: on second look, none of that took place.

My bad.

So of course I haven't thought this through. I said I was asking for
guidance. And I did state my premise there.

Well you didn’t state your premise for doing what you weren’t actually
doing. ;-)

Anyway, on the subject itself, as I said, the fact that it’s possible to
get shell command invocation out of glob() at all is – inarguably, to my
mind – a security vulnerability. And the reason PERL_EXTERNAL_GLOB even
exists seems to be hysterical raisins. Maybe it’s time to revisit that
and get rid of it.

And I'm certainly not persistently advocating for anything.

Yes; sorry about that.

Regards,
--
Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT
Copy link
Author

p5pRT commented Oct 13, 2015

From @craigberry

On Tue, Oct 6, 2015 at 11​:01 AM, Karl Williamson
<public@​khwilliamson.com> wrote​:

It is my supposition that the core converting to use File​::Glob was for
performance reasons. (Perhaps it was to also get more uniform handling
across platforms.) Since it's undocumented, perhaps someone could enlighten
me.

I have some memory of being told it was about "eating our own dog
food," i.e., it's doing something Perl should be good at, so let's use
Perl to do it. It initially caused various bootstrapping problems and
always seemed like something of a layer violation to me. That could
be dealt with by bringing the relevant bits deeper into core as
described elsewhere in this thread by Zefram.

Commit 52bb067 doesn't say much except in its perldelta
addition mentions, "avoids using an external csh process and the
problems associated with it." There was likely some discussion of
what those problems are on-list, but I couldn't find anything relevant
quickly.

One solution I thought of (that Zefram) doesn't like is for F​:G to fork
a shell if and only if it finds a shell metacharacter. That way the
performance wouldn't suffer except in edge cases.

Do all shells have the same metacharacters? What if the CLI from
which Perl was invoked was not a Unix shell at all?

@p5pRT
Copy link
Author

p5pRT commented Oct 13, 2015

From @pjcj

On Tue, Oct 06, 2015 at 09​:53​:45PM +0100, Zefram wrote​:

Ed Avis wrote​:

Out of interest why does glob-via-shell still exist?

Historical. I think it was meant to be the lazy way to get globbing,
but given that Perl already has a serviceable pattern-matching engine,
and the embuggerance involved in getting the shell thing to work, it's
probably not saving code any more.

I think it would be a good idea to move to always performing the
globbing ourselves. The only trickiness in doing so is concerned
with having the code located in File​::Glob​: miniperl needs to be able
to glob without loading XS. The critical code from File​::Glob would
have to be present in the core proper, and there's then a choice to be
made as to whether File​::Glob (to implement its distinctive interfaces)
should have a duplicate of that code or attempt to use the core's copy
(presumably via a new core API function).

A long time ago (probably last millennium) someone (probably Chip)
discussed the possibility of nicking the zsh globbing code and using
that, along with the extended globbing syntax.

Not sure how feasible that is, but I do wish it were possible every time
I write something horrible like​:

  my @​files = split " ", `zsh -c 'echo **/*.pm'`;

--
Paul Johnson - paul@​pjcj.net
http​://www.pjcj.net

@p5pRT
Copy link
Author

p5pRT commented Oct 14, 2015

From @epa

Craig A. Berry <craig.a.berry <at> gmail.com> writes​:

Do all shells have the same metacharacters? What if the CLI from
which Perl was invoked was not a Unix shell at all?

It doesn't matter where perl was run from or what the user's preferred shell
is; globbing always follows 'the csh rules', and where it still works by
running an external process, that will be csh or some variant.

But anyway, I think everyone is now in agreement that doing something wacky
on glob patterns that contain shell metacharacters unrelated to globbing is
to be considered a bug not a feature.

--
Ed Avis <eda@​waniasset.com>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants