various glob() bugs #1496

Open
p5pRT opened this issue Mar 24, 2000 · 3 comments

Comments


p5pRT commented Mar 24, 2000

Migrated from rt.perl.org#2707 (status was 'open')

Searchable as RT2707$


p5pRT commented Mar 24, 2000

From tchrist@chthon.perl.com

Summary​:

  1. Prototype change breaks documented examples.
  2. :globally problems
  3. Incompatible breakage
  4. Missing docs

+--------------------------------------------------+
| 1. Prototype change breaks documented examples. |
+--------------------------------------------------+

I seem to recall that this is my own fault​:

[ 5163] By​: gsar on 2000/02/20 16​:34​:33
  Log​: glob() takes one or no user arguments and a non-user-visible second
  hidden argument, fix its prototype-checking accordingly
  Branch​: perl

That made this​:

  glob glob ck_glob t@​ S? S?
become this
  glob glob ck_glob t@​ S?

And that breaks this​:

  use File::Glob ':glob';
  @list = glob('*.[ch]');
  $homedir = glob('~gnat', GLOB_TILDE | GLOB_ERR);
  if (GLOB_ERROR) {
      print "can't glob ~gnat: $!\n";
  } else {
      print "Gnat lives in $homedir\n";
  }

Which used to say

  Gnat lives in /home/gnat

But now says

  Too many arguments for glob at - line 3, near "GLOB_ERR)"

So even though you import it, and even though it goes into <*> fileglobs,
you have to say this​:

  use File::Glob ':glob';
  @list = glob('*.[ch]');
  $homedir = &glob('~gnat', GLOB_TILDE | GLOB_ERR);
  if (GLOB_ERROR) {
      print "can't glob ~gnat: $!\n";
  } else {
      print "Gnat lives in $homedir\n";
  }

Yes, you have to use &glob to give another argument.

I don't know a perfect solution here, but right now, there's
a problem. Well, yes, I do know a perfect solution​: if one
could (effectively) frob the opcode.pl output so that

  *CORE::GLOBAL::glob = \&File::Glob::csh_glob;

would make the parser tolerate 0 or 1 arguments, but with

  *CORE::GLOBAL::glob = \&File::Glob::glob;

it would tolerate 0, 1, or 2 arguments.
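
Until the parser can do that, one workaround is to skip the glob() built-in and call the two-argument function directly. This is only a sketch. It assumes a File::Glob new enough to export bsd_glob(), the name the flag-taking function goes by in later releases. The helper name expand_with_flags is made up for illustration and is not part of File::Glob.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob GLOB_TILDE GLOB_ERR GLOB_ERROR);

# Hypothetical helper (not part of File::Glob): expand one pattern with
# explicit flags. bsd_glob() is an ordinary function, so the prototype
# check applied to the glob() built-in does not come into play.
sub expand_with_flags {
    my ($pattern, $flags) = @_;
    my @matches = bsd_glob($pattern, $flags);
    # GLOB_ERROR() is non-zero if the last bsd_glob() call hit an error.
    return GLOB_ERROR() ? () : @matches;
}

# Expand the current user's home directory, reporting failure explicitly.
my ($homedir) = expand_with_flags('~', GLOB_TILDE | GLOB_ERR);
print defined($homedir) ? "Home is $homedir\n" : "can't expand ~\n";
```

Because the flags are passed per call, nothing here depends on which import tags are in effect or on what CORE::GLOBAL::glob points at.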

+------------------------+
| 2. :globally problems |
+------------------------+

There are other problems with this module. The import

  use File::Glob ':globally';

is a silent no-op on systems compiled with -DPERL_EXTERNAL_GLOB.
That's because all it does is

  *CORE::GLOBAL::glob = \&File::Glob::csh_glob;

but that's what you have already. If you want to have space-sensitive
globbing, then you use

  use File::Glob ':glob';

But that's just that package. You can't use ':globally'
to do a

  *CORE::GLOBAL::glob = \&File::Glob::glob;

So you have to do it yourself, which feels sleazy. Here's the demo.
You can't get qw/:glob :globally/ or qw/:globally :glob/ to do
what you need done. (Yeah, I know, CORE::GLOBAL is "evil".)

  #!/bin/sh -x
  rm -rf /tmp/fred "/tmp/fred stuff"
  mkdir "/tmp/fred stuff"
  touch "/tmp/fred stuff/a"
  touch "/tmp/fred stuff/b"
  perl -We '
      use File::Glob qw/:glob :globally/;
      #use File::Glob qw/:globally :glob/;
      # BEGIN { *CORE::GLOBAL::glob = \&File::Glob::glob };
      package bad;
      @a = </tmp/fred stu*>;
      print "File Glob globbed @a\n"
  ';

I think I prefer ":everywhere" to ":globally", which is far, far too
close to ":glob" and means something completely different.

It would also be nice to get at File​::Glob​::glob without overriding the
built-in. But then you don't get the flags and such you need. Too
bad it's not "POSIX​::glob" -- less typing.

I think you should be able to say whether

  This package uses POSIX glob for the Perl fileglob operator.
  This package uses csh glob for the Perl fileglob operator.
  All packages use POSIX glob for the Perl fileglob operator.
  All packages use csh glob for the Perl fileglob operator.

+----------------------------+
| 3. Incompatible breakage |
+----------------------------+

  % perl5.004 -le 'print glob("*.[^x]")'

That gets all the files that end in a dot followed by anything not
an x. But it is silently broken now​:

  % perl -le 'print glob("*.[^x]")'

Because of this *.[!x] thing. And there's no way to get back
what it used to do.

+-----------------+
| 4. Missing docs |
+-----------------+

I notice in passing that csh_glob is not documented. And when it is,
the space bug should be explained. Also, this $^O oddity isn't
explained, either. Iff you call glob with only one argument, then
iff you're on a case-screwed system (happy unicode, mate!), then
you get the default weirdness.

All of these things need doc'ing, including the various breakages.
What you import to get what to happen is highly unclear. It could
really use some work.

--tom


p5pRT commented Mar 24, 2000

From [Unknown Contact. See original ticket]

Here's more.

  % touch foo.x foo.X

  % perl -le 'use File::Glob qw/:nocase glob/; print join(" ", glob("*.x"))'
  foo.x

  % perl -le 'use File::Glob qw/:nocase glob/; print join(" ", &glob("*.x"))'
  foo.x foo.X

It is hard to see how that is an expected feature.

  % perl -le 'use File::Glob qw/:nocase :globally/; print join(" ", glob("*.x"))'
  foo.x foo.X

  % perl -le 'use File::Glob qw/:nocase :globally glob/; print join(" ", glob("*.x"))'
  foo.x

  % perl -le '{package XXX; use File::Glob qw/:nocase :globally/} print join(" ", glob("*.x"))'
  foo.x foo.X
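
For contrast, here is a sketch of asking for case-insensitive matching per call, so the result does not hinge on which combination of import tags happens to be in effect. It assumes a File::Glob that exports bsd_glob() and GLOB_NOCASE. The scratch directory and file name are invented for the example.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob GLOB_NOCASE);
use File::Temp qw(tempdir);

# Scratch directory holding a single file whose extension is upper-case.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/foo.X" or die "can't create $dir/foo.X: $!";
close $fh;

# The flag is passed explicitly, so the match is case-insensitive no
# matter what :nocase or :globally did (or did not do) at import time.
my @insensitive = bsd_glob("$dir/*.x", GLOB_NOCASE);
print "GLOB_NOCASE matched: @insensitive\n";
```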

--tom


p5pRT commented Dec 3, 2005

From @smpeters

This is an old bug, but much of it appears to be fixed. My comments are
interspersed within what's below.

[tchrist@​chthon.perl.com - Fri Mar 24 09​:13​:34 2000]​:

Summary​:

  1. Prototype change breaks documented examples.
  2. :globally problems
  3. Incompatible breakage
  4. Missing docs

+--------------------------------------------------+
| 1. Prototype change breaks documented examples. |
+--------------------------------------------------+

I seem to recall that this is my own fault​:

[ 5163] By: gsar on 2000/02/20 16:34:33
  Log: glob() takes one or no user arguments and a non-user-visible second
  hidden argument, fix its prototype-checking accordingly
  Branch: perl

That made this​:

  glob glob ck_glob t@ S? S?
become this
  glob glob ck_glob t@ S?

And that breaks this​:

  use File::Glob ':glob';
  @list = glob('*.[ch]');
  $homedir = glob('~gnat', GLOB_TILDE | GLOB_ERR);
  if (GLOB_ERROR) {
      print "can't glob ~gnat: $!\n";
  } else {
      print "Gnat lives in $homedir\n";
  }

Which used to say

Gnat lives in /home/gnat

But now says

  Too many arguments for glob at - line 3, near "GLOB_ERR)"

So even though you import it, and even though it goes into <*> fileglobs,
you have to say this:

  use File::Glob ':glob';
  @list = glob('*.[ch]');
  $homedir = &glob('~gnat', GLOB_TILDE | GLOB_ERR);
  if (GLOB_ERROR) {
      print "can't glob ~gnat: $!\n";
  } else {
      print "Gnat lives in $homedir\n";
  }

Yes, you have to use &glob to give another argument.

I don't know a perfect solution here, but right now, there's
a problem. Well, yes, I do know a perfect solution​: if one
could (effectively) frob the opcode.pl output so that

  *CORE::GLOBAL::glob = \&File::Glob::csh_glob;

would make the parser tolerate 0 or 1 arguments, but with

  *CORE::GLOBAL::glob = \&File::Glob::glob;

it would tolerate 0, 1, or 2 arguments.

Magic seems to have been performed somewhere within the bowels. With
Perl-5.8.6, I get...

perl rt_2707_part1.pl
Steve lives in /home/steve

+------------------------+
| 2. :globally problems |
+------------------------+

There are other problems with this module. The import

  use File::Glob ':globally';

is a silent no-op on systems compiled with -DPERL_EXTERNAL_GLOB.
That's because all it does is

  *CORE::GLOBAL::glob = \&File::Glob::csh_glob;

but that's what you have already. If you want to have space-sensitive
globbing, then you use

  use File::Glob ':glob';

But that's just that package. You can't use ':globally'
to do a

  *CORE::GLOBAL::glob = \&File::Glob::glob;

So you have to do it yourself, which feels sleazy. Here's the demo.
You can't get qw/:glob :globally/ or qw/:globally :glob/ to do
what you need done. (Yeah, I know, CORE::GLOBAL is "evil".)

  #!/bin/sh -x
  rm -rf /tmp/fred "/tmp/fred stuff"
  mkdir "/tmp/fred stuff"
  touch "/tmp/fred stuff/a"
  touch "/tmp/fred stuff/b"
  perl -We '
      use File::Glob qw/:glob :globally/;
      #use File::Glob qw/:globally :glob/;
      # BEGIN { *CORE::GLOBAL::glob = \&File::Glob::glob };
      package bad;
      @a = </tmp/fred stu*>;
      print "File Glob globbed @a\n"
  ';

I think I prefer ":everywhere" to ":globally", which is far, far too
close to ":glob" and means something completely different.

It would also be nice to get at File::Glob::glob without overriding the
built-in. But then you don't get the flags and such you need. Too
bad it's not "POSIX::glob" -- less typing.

I think you should be able to say whether

  This package uses POSIX glob for the Perl fileglob operator.
  This package uses csh glob for the Perl fileglob operator.
  All packages use POSIX glob for the Perl fileglob operator.
  All packages use csh glob for the Perl fileglob operator.

File::Glob, funkiness and all, has been embedded within Perl long enough
that changing it would break backwards compatibility. Maybe
one solution would be to implement a pragma or hint to clean it up. So,

  use glob 'POSIX'; # Explicitly take the POSIX behavior, or
  use glob 'csh'; # Explicitly take the csh behavior

For the global usage (although I personally believe this could break
things)...

  use glob qw(POSIX everywhere); # Explicitly take the POSIX behavior globally, or
  use glob qw(csh everywhere); # Explicitly take the csh behavior globally

This is mostly an incomplete thought, but I'd be happy to listen to
suggestions, criticisms, complaints, etc. But, this looks like even
less typing :)
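
For comparison, the per-package half of this idea can be sketched with the `:bsd_glob` import tag. This assumes a File::Glob that provides that tag, which the releases discussed in this ticket do not. It is not an implementation of the `use glob` pragma proposed above. The package name Demo::Posixish and the scratch layout are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch layout mirroring the "/tmp/fred stuff" demo from the report.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/fred stuff" or die "mkdir failed: $!";

package Demo::Posixish {    # hypothetical package name
    # Assumption: this File::Glob supports the :bsd_glob tag, which
    # overrides glob() for the importing package only.
    use File::Glob ':bsd_glob';

    # Inside this package, glob() treats the space as part of the name.
    sub expand { my ($pattern) = @_; return glob($pattern) }
}

my @found = Demo::Posixish::expand("$root/fred stu*");
print "Package-scoped glob found: @found\n";
```

Code outside Demo::Posixish keeps whatever glob() behavior it had, which is the "this package" case rather than the "all packages" case.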

+----------------------------+
| 3. Incompatible breakage |
+----------------------------+

  % perl5.004 -le 'print glob("*.[^x]")'

That gets all the files that end in a dot followed by anything not
an x. But it is silently broken now:

  % perl -le 'print glob("*.[^x]")'

Because of this *.[!x] thing. And there's no way to get back
what it used to do.

That's due to a POSIX change (see
http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_13_01
for an explanation). Maybe we could support both spellings and have them do
the same thing, but since it's been this way for five years now, do we need to?
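
A minimal sketch of the POSIX spelling, assuming a File::Glob that exports bsd_glob(); the scratch files are invented for the example. Only the `[!x]` form is exercised here, since the behavior of `[^x]` is the thing in dispute.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Three scratch files; only foo.x should be excluded by the pattern.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(foo.c foo.h foo.x)) {
    open my $fh, '>', "$dir/$name" or die "can't create $dir/$name: $!";
    close $fh;
}

# POSIX spells bracket negation with "!": "[!x]" matches any single
# character other than "x".
my @not_x = map { basename($_) } bsd_glob("$dir/*.[!x]");
print "Matched: @not_x\n";
```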

+-----------------+
| 4. Missing docs |
+-----------------+

I notice in passing that csh_glob is not documented. And when it is,
the space bug should be explained. Also, this $^O oddity isn't
explained, either. Iff you call glob with only one argument, then
iff you're on a case-screwed system (happy unicode, mate!), then
you get the default weirdness.

All of these things need doc'ing, including the various breakages.
What you import to get what to happen is highly unclear. It could
really use some work.

The current description contains the following.

"The glob angle-bracket operator <> is a pathname generator that
implements the rules for file name pattern matching used by Unix-like
shells such as the Bourne shell or C shell.

File​::Glob​::bsd_glob() implements the FreeBSD glob(3) routine, which is
a superset of the POSIX glob() (described in IEEE Std 1003.2 "POSIX.2").
bsd_glob() takes a mandatory pattern argument, and an optional flags
argument, and returns a list of filenames matching the pattern, with
interpretation of the pattern modified by the flags variable.

Since v5.6.0, Perl's CORE​::glob() is implemented in terms of bsd_glob().
Note that they don't share the same prototype--CORE​::glob() only accepts
a single argument. Due to historical reasons, CORE​::glob() will also
split its argument on whitespace, treating it as multiple patterns,
whereas bsd_glob() considers them as one pattern."
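
The whitespace difference described in that last paragraph can be seen with a short sketch. The scratch directory is invented for the example, and the code assumes the built-in glob() has not been overridden in the current package.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

my $root = tempdir(CLEANUP => 1);
mkdir "$root/fred stuff" or die "mkdir failed: $!";

my $pattern = "$root/fred stu*";

# bsd_glob() keeps the space: one pattern, matching "fred stuff".
my @one_pattern = bsd_glob($pattern);

# The built-in splits on the space and expands "$root/fred" and "stu*"
# as two separate patterns.
my @two_patterns = glob($pattern);

print "bsd_glob: @one_pattern\n";
print "glob:     @two_patterns\n";
```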

As for documenting csh_glob, it's documented internally with the
following comment.

  "csh_glob() should not be used directly, unless you know what you're
  doing."

I'm not sure why this is, but it makes it difficult for me to suggest
documenting it. There is also a note that a nice thing to do would be
to create a flag to avoid the default space handling behavior (reading
this, though, is giving me what I need to close one more bug). That
will take a bit of thought on how best to implement it (a hint, perhaps?
Sorry, I wrote this before what I wrote above, but it could be a
separate tweak all on its own). Overall, I think I'll take a deeper
look into the docs to see if more clarity is needed. Perhaps a quick
example in the synopsis would help.

The problems encountered by Mac OS Classic users have been documented, as
well as issues for Win32 users. Mac OS X is also case-insensitive by
default. A small note that essentially says "we do what your libc glob
wants us to do" would probably be nice.

Overall, though, things seem better than the world of File​::Glob at the
time you opened this ticket, but I agree that some things still need to
be looked into further.
