SDBM_File/GDBM_File errors #12957

Closed
p5pRT opened this issue May 10, 2013 · 18 comments

p5pRT commented May 10, 2013

Migrated from rt.perl.org#117953 (status was 'open')

Searchable as RT117953$

p5pRT commented May 10, 2013

From cowboyatheart@gmail.com

Created by cowboyatheart@gmail.com

With GDBM_File and SDBM_File, iteration ends early when each is used to
walk a tied hash while deleting keys.

Test script:

use strict;
use warnings;
use Fcntl;
use SDBM_File;
use GDBM_File;
use DB_File;

# plain old hash
{
  my %hash;

  # add 100 keys
  my $x = 1;
  while ($x <= 100) {
    $hash{$x} = $x;
    ++$x;
  }

  print "plain old hash has " . scalar(keys %hash) . " keys\n";

  # iterate/delete
  my $cnt = 0;
  while (my ($key,$value) = each %hash) {
    $cnt++;
    delete $hash{$key};
  }
  print "deleted $cnt keys in plain old hash\n";
}
# sdbm
{
  my %hash;
  tie(%hash, 'SDBM_File', 'error.sdbm', O_RDWR|O_CREAT, 0666) or die $!;

  # add 100 keys
  my $x = 1;
  while ($x <= 100) {
    $hash{$x} = $x;
    ++$x;
  }

  print "sdbm has " . scalar(keys %hash) . " keys\n";

  # iterate/delete
  my $cnt = 0;
  while (my ($key,$value) = each %hash) {
    $cnt++;
    delete $hash{$key};
  }
  print "deleted $cnt keys in sdbm\n";
}
# gdbm
{
  my %hash;
  tie(%hash, 'GDBM_File', 'error.gdbm', O_RDWR|O_CREAT, 0666) or die $!;

  # add 100 keys
  my $x = 1;
  while ($x <= 100) {
    $hash{$x} = $x;
    ++$x;
  }
  print "gdbm has " . scalar(keys %hash) . " keys\n";

  # iterate/delete
  my $cnt = 0;
  while (my ($key,$value) = each %hash) {
    $cnt++;
    delete $hash{$key};
  }
  print "deleted $cnt keys in gdbm\n";

}
# bdb
{
  my %hash;
  tie(%hash, 'DB_File', 'error.bdb', O_RDWR|O_CREAT, 0666) or die $!;

  # add 100 keys
  my $x = 1;
  while ($x <= 100) {
    $hash{$x} = $x;
    ++$x;
  }

  print "bdb has " . scalar(keys %hash) . " keys\n";

  # iterate/delete
  my $cnt = 0;
  while (my ($key,$value) = each %hash) {
    $cnt++;
    delete $hash{$key};
  }
  print "deleted $cnt keys in bdb\n";
}

Output:

plain old hash has 100 keys
deleted 100 keys in plain old hash
sdbm has 100 keys
deleted 50 keys in sdbm
gdbm has 100 keys
deleted 1 keys in gdbm
bdb has 100 keys
deleted 100 keys in bdb

Perl Info

Flags:
    category=library
    severity=medium
    module=SDBM_File

Site configuration information for perl 5.14.2:

Configured by Debian Project at Sat Feb  9 13:59:28 UTC 2013.

Summary of my perl5 (revision 5 version 14 subversion 2) configuration:

  Platform:
    osname=linux, osvers=3.2.0-4-amd64,
archname=x86_64-linux-gnu-thread-multi
    uname='linux madeleine 3.2.0-4-amd64 #1 smp debian 3.2.32-1 x86_64
gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN
-D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4
-Wformat -Werror=format-security -Dldflags= -Wl,-z,relro
-Dlddlflags=-shared -Wl,-z,relro -Dcccdlflags=-fPIC
-Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.14
-Darchlib=/usr/lib/perl/5.14 -Dvendorprefix=/usr
-Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5
-Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.14.2
-Dsitearch=/usr/local/lib/perl/5.14.2 -Dman1dir=/usr/share/man/man1
-Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1
-Dsiteman3dir=/usr/local/man/man3 -Duse64bitint -Dman1ext=1 -Dman3ext=3perl
-Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm
-Ui_libutil -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib
-Dlibperl=libperl.so.5.14.2 -des'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN
-fstack-protector -fno-strict-aliasing -pipe -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -g',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fstack-protector
-fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='4.7.2', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib/x86_64-linux-gnu /lib/../lib
/usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/lib
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=, so=so, useshrplib=true, libperl=libperl.so.5.14.2
    gnulibc_version='2.13'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib
-fstack-protector'

Locally applied patches:



@INC for perl 5.14.2:
    /etc/perl
    /usr/local/lib/perl/5.14.2
    /usr/local/share/perl/5.14.2
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.14
    /usr/share/perl/5.14
    /usr/local/lib/site_perl
    .


Environment for perl 5.14.2:
    HOME=/home/jeremy
    LANG=en_US.UTF-8
    LANGUAGE=en_CA:en
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)

PATH=/home/jeremy/devel/software/java/bin:/home/jeremy/devel/edate_third_party/ant/apache-ant-1.8.0/bin:/usr/local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
    PERL_BADLANG (unset)
    SHELL=/bin/bash

p5pRT commented May 11, 2013

From @jkeenan

On Fri May 10 13:22:59 2013, cowboyatheart@gmail.com wrote:

[snip]

Can anyone make a determination as to whether this is the same bug as RT
#74984?

p5pRT commented May 11, 2013

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Nov 20, 2018

From @jkeenan

On Fri, 10 May 2013 20:22:59 GMT, cowboyatheart@gmail.com wrote:

[snip]

This persists in perl-5.28.0 and blead. See attachments.

--
James E Keenan (jkeenan@cpan.org)

p5pRT commented Nov 20, 2018

From @jkeenan

#!/usr/bin/env perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;
use GDBM_File;
use DB_File;
use Test::More;

my $expected;

{
  note("plain old hash");
  my %hash;

  # add 100 keys
  my $x = 1;
  while ($x <= 100) {
    $hash{$x} = $x;
    ++$x;
  }

  $expected = 100;
  is(scalar(keys %hash), $expected, "plain old hash has $expected keys");

  # iterate/delete
  my $cnt = 0;
  while (my ($key,$value) = each %hash) {
    $cnt++;
    delete $hash{$key};
  }
  is($cnt, $expected, "deleted $cnt keys in plain old hash");
}

{
  note("SDBM_File");
  my %hash;
  tie(%hash, 'SDBM_File', 'error.sdbm', O_RDWR|O_CREAT, 0666) or die $!;

  # add 100 keys
  my $x = 1;
  while ($x <= 100) {
    $hash{$x} = $x;
    ++$x;
  }

  $expected = 100;
  is(scalar(keys %hash), $expected, "sdbm has $expected keys");

  # iterate/delete
  my $cnt = 0;
  while (my ($key,$value) = each %hash) {
    $cnt++;
    delete $hash{$key};
  }
  is($cnt, $expected, "deleted $cnt keys in sdbm");
}

{
  note("GDBM_File");
  my %hash;
  tie(%hash, 'GDBM_File', 'error.gdbm', O_RDWR|O_CREAT, 0666) or die $!;

  # add 100 keys
  my $x = 1;
  while ($x <= 100) {
    $hash{$x} = $x;
    ++$x;
  }
  $expected = 100;
  is(scalar(keys %hash), $expected, "gdbm has $expected keys");

  # iterate/delete
  my $cnt = 0;
  while (my ($key,$value) = each %hash) {
    $cnt++;
    delete $hash{$key};
  }
  is($cnt, $expected, "deleted $cnt keys in gdbm");
}

{
  note("DB_File (bdb)");
  my %hash;
  tie(%hash, 'DB_File', 'error.bdb', O_RDWR|O_CREAT, 0666) or die $!;

  # add 100 keys
  my $x = 1;
  while ($x <= 100) {
    $hash{$x} = $x;
    ++$x;
  }

  $expected = 100;
  is(scalar(keys %hash), $expected, "bdb has $expected keys");

  # iterate/delete
  my $cnt = 0;
  while (my ($key,$value) = each %hash) {
    $cnt++;
    delete $hash{$key};
  }
  is($cnt, $expected, "deleted $cnt keys in bdb");
}

done_testing;

p5pRT commented Nov 20, 2018

From @jkeenan

# Failed test 'deleted 50 keys in sdbm'
# at 117953-sdbm-gdbm-file.t line 56.
# got: '50'
# expected: '100'
# Failed test 'deleted 1 keys in gdbm'
# at 117953-sdbm-gdbm-file.t line 79.
# got: '1'
# expected: '100'
# Looks like you failed 2 tests of 8.
117953-sdbm-gdbm-file.t ..
# plain old hash
ok 1 - plain old hash has 100 keys
ok 2 - deleted 100 keys in plain old hash
# SDBM_File
ok 3 - sdbm has 100 keys
not ok 4 - deleted 50 keys in sdbm
# GDBM_File
ok 5 - gdbm has 100 keys
not ok 6 - deleted 1 keys in gdbm
# DB_File (bdb)
ok 7 - bdb has 100 keys
ok 8 - deleted 100 keys in bdb
1..8
Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/8 subtests

Test Summary Report
-------------------
117953-sdbm-gdbm-file.t (Wstat: 512 Tests: 8 Failed: 2)
  Failed tests: 4, 6
  Non-zero exit status: 2
Files=1, Tests=8, 0 wallclock secs ( 0.02 usr 0.00 sys + 0.05 cusr 0.00 csys = 0.07 CPU)
Result: FAIL

p5pRT commented Nov 21, 2018

From @iabyn

On Tue, Nov 20, 2018 at 12:39:26PM -0800, James E Keenan via RT wrote:

GDBM_File, and SDBM_File both end early when using each to iterate
over a
tied hash and deleting keys.

This persists in perl-5.28.0 and blead. See attachments.

Zefram added this documentation to GDBM_File.pm with
v5.27.6-268-g3752113a31:

  Unlike Perl's built-in hashes, it is not safe to C<delete> the current
  item from a GDBM_File tied hash while iterating over it with C<each>.
  This is a limitation of the gdbm library.

I don't know whether a similar restriction applies to the other *DBM
libraries, but I would expect so.

--
You never really learn to swear until you learn to drive.

@graygnuorg
Contributor

In gdbm, you cannot delete elements from the database while iterating over it. It is stated both in the GDBM Manual:

Don’t use gdbm_delete or gdbm_store in such a loop. File visiting is based on a hash table. The gdbm_delete function re-arranges the hash table to make sure that any collisions in the table do not leave some item un-findable. The original key order is not guaranteed to remain unchanged in all instances. So it is possible that some key will not be visited ...

and in the documentation to GDBM_File:

Unlike Perl's built-in hashes, it is not safe to delete the current item from a GDBM_File tied hash while iterating over it with each.
This is a limitation of the gdbm library.

I suspect the same is true for SDBM_File as well.
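
For anyone who lands here looking for a workaround, one common pattern is to snapshot the keys before deleting anything, so that no delete happens while an each iterator is live. The following is only a minimal sketch, assuming SDBM_File and a scratch directory from File::Temp; the file name "safe" and the variable names are made up for illustration, and a full key snapshot may be impractical for very large databases.

use strict;
use warnings;
use Fcntl;
use SDBM_File;
use File::Temp qw(tempdir);

# Scratch directory (illustrative) so the .pag/.dir files are removed on exit.
my $dir = tempdir(CLEANUP => 1);

tie(my %hash, 'SDBM_File', "$dir/safe", O_RDWR|O_CREAT, 0666)
    or die "Cannot tie SDBM file: $!";

# add 100 keys, as in the original report
$hash{$_} = $_ for 1 .. 100;

# Walk the whole database once and keep the keys in memory.
# No deletion happens while this iteration is in progress.
my @keys = keys %hash;

# Delete from the snapshot; the database iterator is no longer involved.
my $deleted = 0;
for my $key (@keys) {
    delete $hash{$key};
    $deleted++;
}

my $remaining = scalar(keys %hash);
print "deleted $deleted keys, $remaining remaining\n";

untie %hash;

The same shape should apply to a GDBM_File tie, since the deletes no longer overlap with the iteration, but that is an assumption and was not run as part of this ticket.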

jkeenan commented Oct 9, 2021

[snip]

Thanks for looking into this. Given what you say, should we infer that the original poster's complaint back in 2013 should be rejected?

Should we update the documentation to reflect this?

jkeenan commented Oct 9, 2021

[snip]
Should we update the documentation to reflect this?

Relevant to the above: perldoc GDBM_File says:

BUGS
       The available functions and the gdbm/perl interface need to be
       documented.

       The GDBM error number and error message interface needs to be added.

@graygnuorg
Contributor

Given what you say, should we infer that the original poster's complaint back in 2013 should be rejected?

Yes, I think so.

Should we update the documentation to reflect this?

The GDBM_File documentation already says so clearly, as I've shown above. I'm not sure about SDBM_File, though.

@graygnuorg
Contributor

Relevant to the above: perldoc GDBM_File says:

Sorry, but it actually does not. That statement was there until commit e03e7cdb6e3, which took care of this.

jkeenan commented Oct 10, 2021

Relevant to the above: perldoc GDBM_File says:

Sorry, but it actually does not. That statement was there until commit e03e7cdb6e3, which took care of this.

Sorry, I was probably looking at the documentation from the 5.32 vendor perl on FreeBSD-12.

tonycoz commented Oct 11, 2021

From looking at the sdbm source, it doesn't support deleting the current key and then continuing from the next key; I'd expect it to skip a key.

If a key is deleted, modified or inserted on some other page of the file, it would end up returning some random key, since the iteration depends on the page buffer not being modified.

Since we ship the SDBM source, which we've extensively modified, we could fix that, but I'm not sure it's worth the trouble. Such a change wouldn't be completely trivial.
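
To illustrate the "skip a key" effect, here is a toy model in plain Perl. It is not sdbm's actual code: it simply assumes an iterator that remembers a position within a page, and a delete that compacts the page so later entries shift down. The function name count_visits_with_delete is made up for this sketch.

use strict;
use warnings;

# Toy model only: an array stands in for one page of keys, and the
# iterator state is a plain index into that page.
sub count_visits_with_delete {
    my ($n) = @_;
    my @page    = (1 .. $n);
    my $cursor  = 0;
    my $visited = 0;
    while ($cursor < @page) {
        $visited++;
        splice(@page, $cursor, 1);  # delete the current entry; later entries shift down
        $cursor++;                  # step past the entry that just slid into this slot
    }
    return $visited;
}

print count_visits_with_delete(100), "\n";

With 100 entries this model visits 50 of them, which happens to match the "deleted 50 keys in sdbm" result in the report. That agreement is suggestive only; the real page layout and iteration logic in sdbm may differ from this model.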

@graygnuorg
Contributor

I believe it would be a great feature. However, in hashed databases, ensuring that each key is visited (and visited exactly once) in the presence of concurrent deletions or insertions is far from trivial. A tentative implementation of this for gdbm required a considerable number of changes and the introduction of an additional API call (gdbm_lastkey) to mark the end of iteration. It is still at an experimental stage, and far from finished.

jkeenan commented Oct 11, 2021

My assessment upon reading the last several posts in this ticket is that the original poster's expectations (back in 2013!) were incorrect and that attempting to delete keys while iterating over a hash tied to these two kinds of non-SQL databases is a Bad Thing.

Hence, we can close this ticket. If someone wants to suggest feature or documentation improvements, we can do that in a new ticket.

Thank you very much.
Jim Keenan

@jkeenan jkeenan closed this as completed Oct 11, 2021

graygnuorg commented Oct 12, 2021 via email

tonycoz commented Oct 13, 2021

Can the tracker be configured so that I receive notifications when a new GDBM_File ticket is opened? This could help avoid such long delays.

I don't see a way to do that with github.
