Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

find2perl -ls very slow on some LDAP based getpwent systems #8678

Open
p5pRT opened this issue Nov 14, 2006 · 8 comments
Open

find2perl -ls very slow on some LDAP based getpwent systems #8678

p5pRT opened this issue Nov 14, 2006 · 8 comments

Comments

@p5pRT
Copy link

p5pRT commented Nov 14, 2006

Migrated from rt.perl.org#40867 (status was 'open')

Searchable as RT40867$

@p5pRT
Copy link
Author

p5pRT commented Nov 14, 2006

From david.dyck@fluke.com

This is a bug report for perl from david.dyck@​fluke.com,
generated with the help of perlbug 1.35 running under perl v5.8.8.


[Please enter your report here]

find2perl . -ls
generates a little stub to pre-collect the mapping of uid to
user names for ls to use later.

while (($name, $pw, $uid) = getpwent) {
  $user{$uid} = $name unless $user{$uid};
}

I just ran into a system where this loop takes many minutes to
complete. It seems that find2perl assumes that this
loop will make the -ls code run faster, but there
needs to be a way to defer (and cache) the lookups
on computers like this SuSE 10.1 with LDAP accessed passwords

I did a test where I patched the OUTPUT of find2perl
and saw the desired results, but I wanted to hear
from other p5p users before applying this kind
of a change to the sources of find2perl, since a cleaner
solution would require more extensive changes scattered
throughout the find2perl code.

What cautions would others have
about this idea of no longer pre-initializing the %user and %uid
tables using getwpent?

How could this be made configurable for the find2perl users,
since the system "find" command does not suffer from this
performance hit. (In fact, the system find command was faster
even with my optimization to not pre-read in the whole password
table - which turned out to be due to the cost of reading in the
group file through ldap -- the group file was cached locally
but, due to its length, and its ldap library access there
were over 30,000 extra stat calls, so any optimization should
look at replacing both the getpwent/getgrent calls with
getpwuid/getgrgid)

Here's my 2n'd patch where I got rid of the preloading of
the password and group files.

Inline Patch
--- f2p.ls	2006-11-13 11:12:03.000021000 -0800
+++ f2p.ls-optg	2006-11-13 16:14:32.000080000 -0800
@@ -5,13 +5,7 @@
 @rwx = ('---','--x','-w-','-wx','r--','r-x','rw-','rwx');
 @moname = (Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec);
 
-while (($name, $pw, $uid) = getpwent) {
-    $user{$uid} = $name unless $user{$uid};
-}
 
-while (($name, $pw, $gid) = getgrent) {
-    $group{$gid} = $name unless $group{$gid};
-}
 
 require "find.pl";
 
@@ -59,6 +53,15 @@
     substr($tmp,8,1) =~ tr/-x/Tt/ if -k _;
     $perms .= $tmp;
 
+    unless (exists $user{$uid}) {
+        $user{$uid} = getpwuid($uid);
+        warn "looked up $uid => $user{$uid}\n";
+    }
+
+    unless (exists $group{$gid}) {
+        $group{$gid} = getgrgid($gid);
+        warn "looked up group $gid => $group{$gid}\n";
+    }
     $user = $user{$uid} || $uid;
     $group = $group{$gid} || $gid;
 

This message (including any attachments) contains confidential and/or proprietary information intended only for the addressee\. Any unauthorized disclosure\, copying\, distribution or reliance on the contents of this information is strictly prohibited and may constitute a violation of law\. If you are not the intended recipient\, please notify the sender immediately by responding to this e\-mail\, and delete the message from your system\. If you have any questions about this e\-mail please notify the sender immediately\.

@p5pRT
Copy link
Author

p5pRT commented Nov 23, 2006

From @rgarcia

On 14/11/06, via RT David Dyck <perlbug-followup@​perl.org> wrote​:

find2perl . -ls
generates a little stub to pre-collect the mapping of uid to
user names for ls to use later.
...
I just ran into a system where this loop takes many minutes to
complete. It seems that find2perl assumes that this
loop will make the -ls code run faster, but there
needs to be a way to defer (and cache) the lookups
on computers like this SuSE 10.1 with LDAP accessed passwords

I did a test where I patched the OUTPUT of find2perl
and saw the desired results, but I wanted to hear
from other p5p users before applying this kind
of a change to the sources of find2perl, since a cleaner
solution would require more extensive changes scattered
throughout the find2perl code.

I think that your (future) patch is acceptable since every used user
id is looked up only once. Also, this is used only for options -ls,
-tar and -user, so code generated for other invocations of find2perl
will not be modified.

@p5pRT
Copy link
Author

p5pRT commented Nov 23, 2006

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Feb 2, 2014

From @jkeenan

On Mon Nov 13 17​:57​:52 2006, dcd wrote​:

This is a bug report for perl from david.dyck@​fluke.com,
generated with the help of perlbug 1.35 running under perl v5.8.8.

-----------------------------------------------------------------
[Please enter your report here]

find2perl . -ls
generates a little stub to pre-collect the mapping of uid to
user names for ls to use later.

while (($name, $pw, $uid) = getpwent) {
$user{$uid} = $name unless $user{$uid};
}

I just ran into a system where this loop takes many minutes to
complete. It seems that find2perl assumes that this
loop will make the -ls code run faster, but there
needs to be a way to defer (and cache) the lookups
on computers like this SuSE 10.1 with LDAP accessed passwords

I did a test where I patched the OUTPUT of find2perl
and saw the desired results, but I wanted to hear
from other p5p users before applying this kind
of a change to the sources of find2perl, since a cleaner
solution would require more extensive changes scattered
throughout the find2perl code.

What cautions would others have
about this idea of no longer pre-initializing the %user and %uid
tables using getwpent?

How could this be made configurable for the find2perl users,
since the system "find" command does not suffer from this
performance hit. (In fact, the system find command was faster
even with my optimization to not pre-read in the whole password
table - which turned out to be due to the cost of reading in the
group file through ldap -- the group file was cached locally
but, due to its length, and its ldap library access there
were over 30,000 extra stat calls, so any optimization should
look at replacing both the getpwent/getgrent calls with
getpwuid/getgrgid)

Here's my 2n'd patch where I got rid of the preloading of
the password and group files.

--- f2p.ls 2006-11-13 11​:12​:03.000021000 -0800
+++ f2p.ls-optg 2006-11-13 16​:14​:32.000080000 -0800
@​@​ -5,13 +5,7 @​@​
@​rwx = ('---','--x','-w-','-wx','r--','r-x','rw-','rwx');
@​moname = (Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec);

-while (($name, $pw, $uid) = getpwent) {
- $user{$uid} = $name unless $user{$uid};
-}

-while (($name, $pw, $gid) = getgrent) {
- $group{$gid} = $name unless $group{$gid};
-}

require "find.pl";

@​@​ -59,6 +53,15 @​@​
substr($tmp,8,1) =~ tr/-x/Tt/ if -k _;
$perms .= $tmp;

+ unless (exists $user{$uid}) {
+ $user{$uid} = getpwuid($uid);
+ warn "looked up $uid => $user{$uid}\n";
+ }
+
+ unless (exists $group{$gid}) {
+ $group{$gid} = getgrgid($gid);
+ warn "looked up group $gid => $group{$gid}\n";
+ }
$user = $user{$uid} || $uid;
$group = $group{$gid} || $gid;

Discussion in this RT petered out in 2006. AFAICT, the patch proposed was never applied to find2perl.

Is there anyone familiar with find2perl who could evaluate whether this patch would be beneficial?

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Feb 2, 2014

From @Leont

On Sun, Feb 2, 2014 at 2​:22 AM, James E Keenan via RT <
perlbug-followup@​perl.org> wrote​:

Discussion in this RT petered out in 2006. AFAICT, the patch proposed was
never applied to find2perl.

Is there anyone familiar with find2perl who could evaluate whether this
patch would be beneficial?

The bug report looks sensible to me. The patch was against find2perl
output, not find2perl itself, but that seems easy enough to fix.

Leon

@p5pRT
Copy link
Author

p5pRT commented Feb 4, 2014

From @jkeenan

On Sun Feb 02 01​:39​:59 2014, LeonT wrote​:

On Sun, Feb 2, 2014 at 2​:22 AM, James E Keenan via RT <
perlbug-followup@​perl.org> wrote​:

Discussion in this RT petered out in 2006. AFAICT, the patch proposed was
never applied to find2perl.

Is there anyone familiar with find2perl who could evaluate whether this
patch would be beneficial?

The bug report looks sensible to me. The patch was against find2perl
output, not find2perl itself, but that seems easy enough to fix.

Leon

At TonyC's request, tonight I explored the possibility of adapting this patch to be drawn against x2p/find2perl.PL in blead.

Unfortunately, I don't think it's adaptable to the current x2p/find2perl.PL. Here's what I think is the relevant code in that file​:

#####
if (exists $init{user} || exists $init{ls} || exists $init{tar}) {
  print "my (%uid, %user);\n";
  print "while (my (\$name, \$pw, \$uid) = getpwent) {\n";
  print ' $uid{$name} = $uid{$uid} = $uid;', "\n"
  if exists $init{user};
  print ' $user{$uid} = $name unless exists $user{$uid};', "\n"
  if exists $init{ls} || exists $init{tar};
  print "}\n\n";
}

if (exists $init{group} || exists $init{ls} || exists $init{tar}) {
  print "my (%gid, %group);\n";
  print "while (my (\$name, \$pw, \$gid) = getgrent) {\n";
  print ' $gid{$name} = $gid{$gid} = $gid;', "\n"
  if exists $init{group};
  print ' $group{$gid} = $name unless exists $group{$gid};', "\n"
  if exists $init{ls} || exists $init{tar};
  print "}\n\n";
}
#####

Now, all those 'print' statements are really confusing. However, if we squint to see what's really happening, I *think* it's this​:

#####
my (%uid, %user);
while (my ($name, $pw, $uid) = getpwent) {
  $uid{$name} = $uid{$uid} = $uid;
  if exists $init{user};
  $user{$uid} = $name unless exists $user{$uid};
  if exists $init{ls} || exists $init{tar};
}

my (%gid, %group);
while (my ($name, $pw, $gid) = getgrent) {
  $gid{$name} = $gid{$gid} = $gid;
  if exists $init{group};
  $group{$gid} = $name unless exists $group{$gid};
  if exists $init{ls} || exists $init{tar};
}
#####
... modulo some 'if' blocks. Note that we are assigning to the following hash elements​:

#####
$uid{$name}
$uid{$uid}
$user{$uid}
$gid{$name}
$gid{$gid}
$group{$gid}
#####

AFAICT, the OP's patch only assigns to​:

#####
$user{$uid}
$group{$gid}
#####

That leaves 4 hash elements unassigned to. So I can't figure out how to rewrite this patch against blead.

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Feb 4, 2014

From @tonycoz

On Mon Feb 03 18​:08​:00 2014, jkeenan wrote​:

AFAICT, the OP's patch only assigns to​:

#####
$user{$uid}
$group{$gid}
#####

That leaves 4 hash elements unassigned to. So I can't figure out how
to rewrite this patch against blead.

Taking a closer look at it, it would break a few other things that use %uid, %user, %gid and %group.

The -name implementation uses %uid.

The -group implementation uses %gid.

The -tar implementation uses %user and %group.

So the code generation for those would also need to be updated.

Tony

@p5pRT
Copy link
Author

p5pRT commented Feb 4, 2014

From @Leont

On Tue, Feb 4, 2014 at 5​:36 AM, Tony Cook via RT
<perlbug-followup@​perl.org>wrote​:

On Mon Feb 03 18​:08​:00 2014, jkeenan wrote​:

AFAICT, the OP's patch only assigns to​:

#####
$user{$uid}
$group{$gid}
#####

That leaves 4 hash elements unassigned to. So I can't figure out how
to rewrite this patch against blead.

Taking a closer look at it, it would break a few other things that use
%uid, %user, %gid and %group.

The -name implementation uses %uid.

The -group implementation uses %gid.

The -tar implementation uses %user and %group.

So the code generation for those would also need to be updated.

I also looked into fixing it and came to similar conclusions. My strategy
is to replace %uid with get_uid() and so on. The lack of tests for all
involved arguments kept me from continuing though.

Leon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants