Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib.pm module pollutes @INC with duplicates under mod_perl #460

Closed
p5pRT opened this issue Aug 31, 1999 · 17 comments
Closed

lib.pm module pollutes @INC with duplicates under mod_perl #460

p5pRT opened this issue Aug 31, 1999 · 17 comments

Comments

@p5pRT
Copy link

p5pRT commented Aug 31, 1999

Migrated from rt.perl.org#1301 (status was 'resolved')

Searchable as RT1301$

@p5pRT
Copy link
Author

p5pRT commented Aug 31, 1999

From irwin@stockmaster.com

A minor correction I've made to the module easily resolves the problem
and I think it worthwhile to share with the world.

Modify the import() method as follows (my changes shown with a leading
"|"​:

sub import {
  shift;
| my %inc;

| map(++$inc{$_}, @​INC);

  foreach (reverse @​_) {
  ## Ignore this if not defined.
  next unless defined($_);
  if ($_ eq '') {
  require Carp;
  Carp​::carp("Empty compile time value given to use lib");
  # at foo.pl line
...
  }
| next if exists $inc{$_};
  unshift(@​INC, $_);
  # Put a corresponding archlib directory infront of $_ if it
  # looks like $_ has an archlib directory below it.
  if (-d "$_/$archname") {
  unshift(@​INC, "$_/$archname") if -d "$_/$archname/auto";
  unshift(@​INC, "$_/$archname/$]") if -d
"$_/$archname/$]/auto";
  }
  }
}

--
Tod Irwin
Principal Engineer
Stockmaster.com
(408) 861-3425

@p5pRT
Copy link
Author

p5pRT commented Oct 9, 1999

From @gsar

On Tue, 31 Aug 1999 12​:30​:49 PDT, Tod Irwin wrote​:

The lib.pm module delivered with the core Perl package does not check
for duplicate entries when it appends paths to the @​INC array. This is
normally not a problem but becomes annoying when used in a mod_perl
environment where hundreds/thousands of entries can pile up.

I've changed your fix slightly to keep any new entries at the
beginning of @​INC.

Sarathy
gsar@​activestate.com

Inline Patch
-----------------------------------8<-----------------------------------
Change 4327 by gsar@auger on 1999/10/10 02:23:52

	avoid duplicates in @INC, they cause leaks in mod_perl etc
	(suggested by Tod Irwin <irwin@stockmaster.com>)

Affected files ...

... //depot/perl/lib/lib.pm#7 edit

Differences ...

==== //depot/perl/lib/lib.pm#7 (text) ====
Index: perl/lib/lib.pm
--- perl/lib/lib.pm.~1~	Sat Oct  9 19:25:43 1999
+++ perl/lib/lib.pm	Sat Oct  9 19:25:43 1999
@@ -10,13 +10,12 @@
 
 sub import {
     shift;
+
+    my %names;
     foreach (reverse @_) {
-	## Ignore this if not defined.
-	next unless defined($_);
 	if ($_ eq '') {
 	    require Carp;
 	    Carp::carp("Empty compile time value given to use lib");
-							# at foo.pl line ...
 	}
 	if (-e && ! -d _) {
 	    require Carp;
@@ -29,27 +28,23 @@
 	    unshift(@INC, "$_/$archname")    if -d "$_/$archname/auto";
 	    unshift(@INC, "$_/$archname/$]") if -d "$_/$archname/$]/auto";
 	}
+	# remove trailing duplicates
+	@INC = grep { ++$names{$_} == 1 } @INC;
     }
 }
 
 
 sub unimport {
     shift;
-    my $mode = shift if $_[0] =~ m/^:[A-Z]+/;
 
     my %names;
-    foreach(@_) {
+    foreach (@_) {
 	++$names{$_};
 	++$names{"$_/$archname"} if -d "$_/$archname/auto";
     }
 
-    if ($mode and $mode eq ':ALL') {
-	# Remove ALL instances of each named directory.
-	@INC = grep { !exists $names{$_} } @INC;
-    } else {
-	# Remove INITIAL instance(s) of each named directory.
-	@INC = grep { --$names{$_} < 0   } @INC;
-    }
+    # Remove ALL instances of each named directory.
+    @INC = grep { !exists $names{$_} } @INC;
 }
 
 1;
@@ -74,7 +69,7 @@
 that later C<use> or C<require> statements will find modules which are
 not located on perl's default search path.
 
-=head2 ADDING DIRECTORIES TO @INC
+=head2 Adding directories to @INC
 
 The parameters to C<use lib> are added to the start of the perl search
 path. Saying
@@ -90,10 +85,10 @@
 If so the $dir/$archname directory is assumed to be a corresponding
 architecture specific directory and is added to @INC in front of $dir.
 
-If LIST includes both $dir and $dir/$archname then $dir/$archname will
-be added to @INC twice (if $dir/$archname/auto exists).
+To avoid memory leaks, all trailing duplicate entries in @INC are
+removed.
 
-=head2 DELETING DIRECTORIES FROM @INC
+=head2 Deleting directories from @INC
 
 You should normally only add directories to @INC.  If you need to
 delete directories from @INC take care to only delete those which you
@@ -101,25 +96,16 @@
 in your script.  Other modules may have added directories which they
 need for correct operation.
 
-By default the C<no lib> statement deletes the I<first> instance of
-each named directory from @INC.  To delete multiple instances of the
-same name from @INC you can specify the name multiple times.
+The C<no lib> statement deletes all instances of each named directory
+from @INC.
 
-To delete I<all> instances of I<all> the specified names from @INC you can
-specify ':ALL' as the first parameter of C<no lib>. For example:
-
-    no lib qw(:ALL .);
-
 For each directory in LIST (called $dir here) the lib module also
 checks to see if a directory called $dir/$archname/auto exists.
 If so the $dir/$archname directory is assumed to be a corresponding
 architecture specific directory and is also deleted from @INC.
 
-If LIST includes both $dir and $dir/$archname then $dir/$archname will
-be deleted from @INC twice (if $dir/$archname/auto exists).
+=head2 Restoring original @INC
 
-=head2 RESTORING ORIGINAL @INC
-
 When the lib module is first loaded it records the current value of @INC
 in an array C<@lib::ORIG_INC>. To restore @INC to that value you
 can say
@@ -136,4 +122,3 @@
 Tim Bunce, 2nd June 1995.
 
 =cut
-
End of Patch.

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From [Unknown Contact. See original ticket]

Gurusamy Sarathy <gsar@​ActiveState.com> wrote

On Tue, 31 Aug 1999 12​:30​:49 PDT, Tod Irwin wrote​:

The lib.pm module delivered with the core Perl package does not check
for duplicate entries when it appends paths to the @​INC array. This is
normally not a problem but becomes annoying when used in a mod_perl
environment where hundreds/thousands of entries can pile up.

I've changed your fix slightly to keep any new entries at the
beginning of @​INC.

This change needs to be considered carefully. It removes a
deliberately designed feature. It's not a feature I've ever felt
the need for, but it *is* explicitly documented.

And surely mod_perl should be cleaning up @​INC anyway. What happens
if various broken user libraries are added to @​INC ?

Mike Guy

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From @gsar

On Mon, 11 Oct 1999 17​:46​:00 BST, "M.J.T. Guy" wrote​:

Gurusamy Sarathy <gsar@​ActiveState.com> wrote

I've changed your fix slightly to keep any new entries at the
beginning of @​INC.

This change needs to be considered carefully. It removes a
deliberately designed feature. It's not a feature I've ever felt
the need for, but it *is* explicitly documented.

I did consider it very carefully, and came to the conclusion that
that "feature" is impossible to use correctly, unless you happen
to know what order the C<use/no lib> declarations happen to be
executed at compile time.

Sarathy
gsar@​activestate.com

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From [Unknown Contact. See original ticket]

Gurusamy Sarathy <gsar@​ActiveState.com> wrote

I did consider it very carefully,

OK - sorry to have implied otherwise. But I would normally
expect that a patch changing a deliberate feature (as opposed to
cleaning up an accidental mess) would have been clearly flagged as
such.

                              and came to the conclusion that

that "feature" is impossible to use correctly, unless you happen
to know what order the C<use/no lib> declarations happen to be
executed at compile time.

What's so difficult about it? I always assumed the intended style
to be

  use lib "my/private/library/for/this/module/only";
  use Mymodule1;
  use Mymodule2;
  no lib "my/private/library/for/this/module/only";

Where's the difficulty in ordering?

But, as I said, I've never felt tempted to use this feature. So I
won't cry too much over its passing.

Mike Guy

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From [Unknown Contact. See original ticket]

Gurusamy Sarathy <gsar@​ActiveState.com> writes​:

On Mon, 11 Oct 1999 17​:46​:00 BST, "M.J.T. Guy" wrote​:

Gurusamy Sarathy <gsar@​ActiveState.com> wrote

I've changed your fix slightly to keep any new entries at the
beginning of @​INC.

This change needs to be considered carefully. It removes a
deliberately designed feature. It's not a feature I've ever felt
the need for, but it *is* explicitly documented.

I did consider it very carefully, and came to the conclusion that
that "feature" is impossible to use correctly, unless you happen
to know what order the C<use/no lib> declarations happen to be
executed at compile time.

Why is that so hard? In almost all cases use lib only occurs in the top
level script - and by necessity in the 1st few lines before any other use-s.
Modules which inject things into @​INC are highly suspect beasts - its like
lacing the fruit juice with vodka.

I have more then once done​:

use lib "/somepath/in/inc/already";

Just to make it be searched before it would otherwise have been.
The most common case being​:
use lib ".";
in simple hackery which makes (say) ./Mail/Internet.pm override
the normal one.

This "trick" is particularly handy in Makefile.PLs - e.g. Tk's does
(IRRC)

use lib '.';
use Tk​::MMutil;

To grab ./Tk/MMutil.pm

So I don't mind duplicates being pruned but the ordering should
be as documented. i.e. add to front, drop any occurances in the rest of the
list.

--
Nick Ing-Simmons

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From [Unknown Contact. See original ticket]

"Mike" == M J T Guy <mjtg@​cus.cam.ac.uk> writes​:

  Mike> Gurusamy Sarathy <gsar@​ActiveState.com> wrote
  >> I did consider it very carefully,

  Mike> OK - sorry to have implied otherwise. But I would normally
  Mike> expect that a patch changing a deliberate feature (as
  Mike> opposed to cleaning up an accidental mess) would have been
  Mike> clearly flagged as such.

  >> and came to the conclusion that that "feature" is impossible to
  >> use correctly, unless you happen to know what order the
  >> C<use/no lib> declarations happen to be executed at compile
  >> time.

  Mike> What's so difficult about it? I always assumed the intended
  Mike> style to be

  Mike> use lib "my/private/library/for/this/module/only";
  Mike> use Mymodule1;
  Mike> use Mymodule2;
  Mike> no lib "my/private/library/for/this/module/only";

  Mike> Where's the difficulty in ordering?

  Mike> But, as I said, I've never felt tempted to use this feature.
  Mike> So I won't cry too much over its passing.

Actually, I've used this feature, and have been tearing my head out
over other aspects of lib.

I might suggest that what are needed are new functions within the lib
module which 1.) prepend the given paths, removing any duplicates;
2.) simply ensure the existence of the given paths, prepending only if
necessary. The original behavior is still supported by default.

Unfortunately, the structure and usage of lib.pm argue against having
extra functions, so what about​:

use lib​::ensure;
## any use lib from now isn't done if the dir is already there

no lib​::ensure;
use lib​::unique;
## any use lib from now is done uniquely (i.e. duplicates are removed,
## and the library is prepended)

use lib​::append;
## any use lib from now is done uniqely and appended

Do note that I've got a first hack this coded up, and will be happy to
send to anyone who wishes it.

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From @gsar

On Mon, 11 Oct 1999 18​:59​:14 BST, "M.J.T. Guy" wrote​:

                              and came to the conclusion that

that "feature" is impossible to use correctly, unless you happen
to know what order the C<use/no lib> declarations happen to be
executed at compile time.

What's so difficult about it? I always assumed the intended style
to be

  use lib "my/private/library/for/this/module/only";
  use Mymodule1;
  use Mymodule2;
  no lib "my/private/library/for/this/module/only";

This still works after the change, assuming Mymodule* don't
have C<no lib "my/private/library/for/this/module/only">.

Where's the difficulty in ordering?

But, as I said, I've never felt tempted to use this feature. So I
won't cry too much over its passing.

I was referring to behavior that depended on the order in which
multiple imports were done. C<no lib "foo"> is only ever useful
when you can be sure that every module that said C<use lib "foo">
also said C<no lib "foo">. Given that the pod actually discourages
the use of the unimport, C<no lib "foo"> is likely to be pretty
useless anyway.

Sarathy
gsar@​activestate.com

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From @gsar

On Mon, 11 Oct 1999 19​:29​:05 BST, Nick Ing-Simmons wrote​:

So I don't mind duplicates being pruned but the ordering should
be as documented. i.e. add to front, drop any occurances in the rest of the
list.

Yes, that's exactly what happens.

Sarathy
gsar@​activestate.com

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From @gsar

On Mon, 11 Oct 1999 19​:29​:05 BST, Nick Ing-Simmons wrote​:

Modules which inject things into @​INC are highly suspect beasts - its like
lacing the fruit juice with vodka.

FWIW, I consider it poor style to write *modules* that contain
hard-wired paths in them. The use of C<use lib> should be discouraged
in anything but the top level script, IMO.

Sarathy
gsar@​activestate.com

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From [Unknown Contact. See original ticket]

Brent B. Powers <p5p@​b2pi.com> writes​:

Unfortunately, the structure and usage of lib.pm argue against having
extra functions, so what about​:

use lib​::ensure;
## any use lib from now isn't done if the dir is already there

Rather than messing with lib.pm with these 'pragmas' why not just
make them do what you want​:

use lib​::ensure '/foo/bar';
use lib​::append '/tmp/wurble';

--
Nick Ing-Simmons

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From [Unknown Contact. See original ticket]

"Nick" == Nick Ing-Simmons <nick@​ing-simmons.net> writes​:

  Nick> Brent B. Powers <p5p@​b2pi.com> writes​:
  >> Unfortunately, the structure and usage of lib.pm argue against
  >> having extra functions, so what about​:
  >>
  >> use lib​::ensure; ## any use lib from now isn't done if the dir
  >> is already there

  Nick> Rather than messing with lib.pm with these 'pragmas' why not
  Nick> just make them do what you want​:

  Nick> use lib​::ensure '/foo/bar';
  Nick> use lib​::append '/tmp/wurble';

The combinations get exasperating. (Since ensure implies unique, there
are 5, I think, possible combinations, but then there's order)

use lib​::unique​::append '/foo/bar'; # ??
use lib​::append​::unique '/foo/bar'; # ??

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From [Unknown Contact. See original ticket]

Brent B . Powers <bpowers@​ms.com> writes​:

Nick> use lib​::ensure '/foo/bar';
Nick> use lib​::append '/tmp/wurble';

The combinations get exasperating. (Since ensure implies unique,

says you ;-)

there
are 5, I think, possible combinations, but then there's order)

use lib​::unique​::append '/foo/bar'; # ??
use lib​::append​::unique '/foo/bar'; # ??

Hmm. The unique bit can be done after-the-fact on another call​:

Just because lib does not allow for extensions does not mean you cannot
have an extensible extension - how about operators in the list :

use lib​::hackery push => '/foo/bar', unshift => '/tmp/wurble', unique;

Hey - what about allowing perl code :

BEGIN
{
  push(@​INC,'/foo/bar');
  unshift(@​INC,'/tmp/wurble');
  ...
}

--
Nick Ing-Simmons

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From [Unknown Contact. See original ticket]

"Nick" == Nick Ing-Simmons <nick@​ing-simmons.net> writes​:

Brent B . Powers <bpowers@​ms.com> writes​:

Nick> use lib​::ensure '/foo/bar';
Nick> use lib​::append '/tmp/wurble';

The combinations get exasperating. (Since ensure implies unique,

  Nick> says you ;-)

Well, me, and the demons telling me to kill the president (They happen
to agree with me) :)

there
are 5, I think, possible combinations, but then there's order)

use lib​::unique​::append '/foo/bar'; # ??
use lib​::append​::unique '/foo/bar'; # ??

  Nick> Hmm. The unique bit can be done after-the-fact on another call​:

  Nick> Just because lib does not allow for extensions does not mean
  Nick> you cannot have an extensible extension - how about
  Nick> operators in the list :

  Nick> use lib​::hackery push => '/foo/bar', unshift => '/tmp/wurble', unique;

Ummmm, ughh. Certainly someone else could write lib​::hackery. I was
interested primarily in using as little code as possible. I've
attached my code below.

  Nick> Hey - what about allowing perl code :

  Nick> BEGIN
  Nick> {
  Nick> push(@​INC,'/foo/bar');
  Nick> unshift(@​INC,'/tmp/wurble');
  Nick> ...
  Nick> }

May I presume that that's a joke?

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From [Unknown Contact. See original ticket]

lib.tar.gz

@p5pRT
Copy link
Author

p5pRT commented Oct 11, 1999

From @vanstyn

In <199910111905.MAA05725@​activestate.com>, Gurusamy Sarathy writes​:
:On Mon, 11 Oct 1999 19​:29​:05 BST, Nick Ing-Simmons wrote​:
:>Modules which inject things into @​INC are highly suspect beasts - its like
:>lacing the fruit juice with vodka.
:
:FWIW, I consider it poor style to write *modules* that contain
:hard-wired paths in them. The use of C<use lib> should be discouraged
:in anything but the top level script, IMO.

I'm not particularly defending the practice, but in my current work all
but a couple of security modules reside out of the main perl library
paths - as well as implementing the security policy, these modules take
it upon themselves to put the appropriate additional paths into @​INC.

This is very useful in this particular environment, not least because
it helps to discourage programmers from bypassing the security policy.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Oct 13, 1999

From [Unknown Contact. See original ticket]

Good afternoon Sarthy​:

Since you are contemplating changes (or have already done so),
consider this interaction.

The environment variable PERL5LIB also appends paths to
@​INC. Consider the interaction of 'use lib' and this variable.

Do you want the 'use lib' statement to take precedence over the
environment variable setting?

I have seen/heard of two uses of the PERL5LIB variable​:

1. Set path to modules to be used by a set of scripts.
2. Set path to modules to override released path to scripts.

The existing functionality really only supports #1 when
'use lib' is combined. You can't obtain the functionality of #2
when also using 'use lib' because the paths are placed in front
of the environment variables.

One can't always install modules in the site_perl directories
when there is a need to maintain controlled tool sets.
The 'use lib' functionality hence is used to refer to these
controlled library sets.

Given this type of environment, it would be nice if the documented
PERL5LIB variable would always be arranged at the front of @​INC.

Could we at least write up the intentions of the interaction between
PERL5LIB and 'use lib'.

  Adam Krolnik
  Verification Mgr.
  LSI Logic, Inc.
  krolnik@​lsil.com

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant