TIED $x = \$y{z}; delete $y{z} -- behaves badly #7161

Open
p5pRT opened this issue Mar 9, 2004 · 9 comments

Comments

@p5pRT

p5pRT commented Mar 9, 2004

Migrated from rt.perl.org#27555 (status was 'open')

Searchable as RT27555$

@p5pRT

p5pRT commented Mar 9, 2004

From @muir

Created by @muir

Several related bugs...

1. After deleting a reference to a tied hash value, an assignment
through the reference will re-create the tied hash key and reconnect
the reference to the hash value.

If there is more than one such reference, then when one reference
reconnects, the others disconnect.

2. With untied hashes, creating a reference to a hash value that doesn't
exist will create the hash key. With tied hashes this does not happen.
The tie interface probably needs some new methods.

3. In most of perl, two references to the same object are themselves
the same. refaddr($ref1) == refaddr($ref2) iff $ref1 and $ref2 refer
to the same thing. With tied hashes, this is not the case.

I haven't checked the behavior of tied ARRAYs with respect to these
things but I wouldn't be surprised if they had the same problems.

I'll upload a regression test.
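
For comparison, the following minimal sketch contrasts points 2 and 3 on a plain hash and on a tied one. It is not part of the original report: it assumes the stock `Tie::StdHash` class (shipped in `Tie::Hash`) as the tied implementation, and the variable names are illustrative only.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);
use Tie::Hash;    # provides Tie::StdHash (an assumption; any tie class would do)

# Plain hash: taking a reference autovivifies the key, and two
# references to the same element share one address.
my %plain;
my $p1 = \$plain{q};
my $p2 = \$plain{q};
my $plain_created = exists $plain{q} ? 1 : 0;
my $plain_same    = refaddr($p1) == refaddr($p2) ? 1 : 0;

# Tied hash: each \$tied{q} yields its own proxy scalar, and no
# STORE is issued when the reference is taken.
tie my %tied, 'Tie::StdHash';
my $t1 = \$tied{q};
my $t2 = \$tied{q};
my $tied_created = exists $tied{q} ? 1 : 0;
my $tied_same    = refaddr($t1) == refaddr($t2) ? 1 : 0;

print "plain: created=$plain_created same_address=$plain_same\n";
print "tied:  created=$tied_created same_address=$tied_same\n";
```

Both references to the tied element are kept alive at the same time here, so their addresses can be compared without worrying about a freed address being reused.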

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl v5.8.3:

Configured by Debian Project at Sun Feb 15 17:22:09 EST 2004.

Summary of my perl5 (revision 5.0 version 8 subversion 3) configuration:
  Platform:
    osname=linux, osvers=2.4.22-xfs+ti1211, archname=i386-linux-thread-multi
    uname='linux kosh 2.4.22-xfs+ti1211 #1 sat oct 25 10:11:37 est 2003 i686 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.3 -Dsitearch=/usr/local/lib/perl/5.8.3 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.3 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O3',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='3.3.3 20040125 (prerelease) (Debian)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libperl.so.5.8.3
    gnulibc_version='2.3.2'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    


@INC for perl v5.8.3:
    /etc/perl
    /usr/local/lib/perl/5.8.3
    /usr/local/share/perl/5.8.3
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.8
    /usr/share/perl/5.8
    /usr/local/lib/site_perl
    /usr/local/lib/perl/5.8.2
    /usr/local/share/perl/5.8.2
    .


Environment for perl v5.8.3:
    HOME=/home/muir
    LANG=C
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=.:/home/muir/bin/charm:/home/muir/bin:/home/muir/bin/share:/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/shbin:/usr/local/sbin:/usr/local/bin:/usr/local/ptybin:/usr/X11R6/bin:/usr/bin/X11:/usr/local/tex/bin:/usr/ucb:/usr/bin:/bin:/etc:/usr/etc:/usr/games:/lib:/usr/lib:/usr/local/java/bin:/usr/lib/uucp:/usr/openwin/bin:/usr/openwin/bin/xview:/usr/openwin/demo:/usr/adm:/home/muir/tmp
    PERL_BADLANG (unset)
    SHELL=/bin/tcsh

@p5pRT

p5pRT commented Mar 9, 2004

From @muir

#!/usr/bin/perl

use warnings;
use strict;

use Scalar::Util qw(refaddr reftype blessed);
use Test::More tests => 82;

#
# References to tied hash values are all unique. They each have
# their own address. References remain even when the hash key
# is deleted.
#
# Sometimes the references can become disconnected from the underlying
# hash. They'll reconnect on assignment.
#
# When a reference reconnects after assignment, any other references
# disconnect. (not shown)
#

print "# block at ".__LINE__."\n";

{
  my %x;
  tie %x, 'Hash1', {};

  $x{y} = 7;
  my $a = \$x{y};
  delete $x{y};
  $x{y} = 9;
  my $b = \$x{y};
  my $c = \$x{y};

  ok($$a == 7,
  "The \$a reference should be disconnected");
  ok(refaddr($b) eq refaddr($c),
  "References to the same thing should be the same");

  delete $x{y};
  $$c = 17;
  ok($$b != 17,
  "Post-delete, references should be disconnected");
  ok($x{y} != 17,
  "Post-delete, references should be disconnected");

  my $d = \$x{y};
  $$a = 12;
  ok($x{y} != 12,
  "Post-disconnect, reconnect shouldn't happen");

  my $q = \$x{q};
  ok(exists($x{q}),
  "creating a reference creates a key");
}

package Hash1;

sub TIEHASH
{
  my $pkg = shift;
  return bless [ @_ ], $pkg;
}

sub FETCH
{
  my $self = shift;
  my $key = shift;
  my ($underlying) = @$self;
  return $underlying->{$key};
}

sub STORE
{
  my $self = shift;
  my $key = shift;
  my $value = shift;
  my ($underlying) = @$self;
  return ($underlying->{$key} = $value);
}

sub DELETE
{
  my ($self, $key) = @_;
  my ($underlying) = @$self;
  return delete($underlying->{$key});
}

sub CLEAR
{
  my $self = shift;
  my ($underlying) = @$self;
  %$underlying = ();
}

sub EXISTS
{
  my $self = shift;
  my $key = shift;
  my ($underlying) = @$self;
  return exists $underlying->{$key};
}

sub FIRSTKEY
{
  my $self = shift;
  my ($underlying) = @$self;
  keys %$underlying;
  return each %$underlying;
}

sub NEXTKEY
{
  my $self = shift;
  my ($underlying) = @$self;
  return each %$underlying;
}

@p5pRT

p5pRT commented Mar 24, 2004

From @iabyn

On Tue, Mar 09, 2004 at 09:58:08PM -0000, David Muir Sharnoff wrote:

> Several related bugs...
>
> 1. After deleting a reference to a tied hash value, an assignment
> through the reference will re-create the tied hash key and reconnect
> the reference to the hash value.
>
> If there is more than one such reference, then when one reference
> reconnects, the others disconnect.
>
> 2. With untied hashes, creating a reference to a hash value that doesn't
> exist will create the hash key. With tied hashes this does not happen.
> The tie interface probably needs some new methods.
>
> 3. In most of perl, two references to the same object are themselves
> the same. refaddr($ref1) == refaddr($ref2) iff $ref1 and $ref2 refer
> to the same thing. With tied hashes, this is not the case.
>
> I haven't checked the behavior of tied ARRAYs with respect to these
> things but I wouldn't be surprised if they had the same problems.

What you are seeing here is various user-level manifestations of the fact
that the expression $tied_hash{$key} actually creates a "proxy" lvalue
object, which when accessed calls FETCH, and when assigned to, calls
STORE. When you create a reference to a tied element, you are really
creating a reference to this proxy object. This allows the following to
work:

  $ref = \$tied_hash{foo};
  $$ref = 1; # does $obj->STORE('foo', 1);
  $$ref = 2; # does $obj->STORE('foo', 2);
  $$ref = 3; # does $obj->STORE('foo', 3);
  $x = $$ref; # does $obj->FETCH('foo');
  $y = $$ref; # does $obj->FETCH('foo');

There's no way to achieve these semantics without the rough edges showing
slightly. If you think about it, it's amazing that taking a reference works
in the first place, since there isn't a real value whose reference can
actually be taken.
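
The proxy behaviour can be observed from user code with a tie class that records its method calls. The following is an illustrative sketch only; the `LoggingHash` package and its `@LOG` array are made-up names, not part of perl or of this report.

```perl
use strict;
use warnings;

package LoggingHash;    # hypothetical class, for illustration only

our @LOG;               # every FETCH/STORE call is appended here

sub TIEHASH { return bless {}, shift }

sub FETCH {
    my ($self, $key) = @_;
    push @LOG, "FETCH $key";
    return $self->{$key};
}

sub STORE {
    my ($self, $key, $value) = @_;
    push @LOG, "STORE $key=$value";
    return $self->{$key} = $value;
}

package main;

tie my %tied_hash, 'LoggingHash';

my $ref = \$tied_hash{foo};    # reference to the proxy, not to a stored value
$$ref = 1;                     # each assignment goes through STORE
$$ref = 2;
$$ref = 3;
my $x = $$ref;                 # reading goes through FETCH

print "$_\n" for @LoggingHash::LOG;
print "x = $x\n";
```

The sketch deliberately makes no claim about how many FETCH calls appear in the log, since that has varied with caching behaviour across perl versions.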

Dave.

--
"The GPL violates the U.S. Constitution, together with copyright,
antitrust and export control laws"
  -- SCO smoking crack again.

@p5pRT

p5pRT commented Mar 24, 2004

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

p5pRT commented Mar 25, 2004

From @muir

I know this stuff isn't easy. I'm impressed it works as well as
it does. In investigating this, I started reading perl's source
code for the first time and I was really impressed at how small it
is. The density is amazingly high. Totally inscrutable without
spending a few weeks learning the pseudo-language it's written in
though. I do think this bug can be fixed.

The simplest problem, that binding to a value doesn't cause it to
exist, can be fixed by doing:

  $obj->STORE('foo', undef) unless $obj->EXISTS('foo');
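
At user level, the same check could be wrapped in a small helper that is called instead of writing `\$hash{key}` directly. This is a sketch under assumptions: `ref_to_element` is a hypothetical name, and `Tie::StdHash` merely stands in for a real tie class that implements STORE and EXISTS.

```perl
use strict;
use warnings;
use Tie::Hash;    # provides Tie::StdHash, used here only as a stand-in

# Hypothetical helper: make sure the key exists before handing out a
# reference, mimicking the autovivification of untied hashes.
sub ref_to_element {
    my ($hash, $key) = @_;
    my $obj = tied %$hash;    # undef when the hash is not tied
    if ($obj) {
        $obj->STORE($key, undef) unless $obj->EXISTS($key);
    }
    return \$hash->{$key};
}

tie my %h, 'Tie::StdHash';
$h{bar} = 5;

my $foo_ref = ref_to_element(\%h, 'foo');    # creates the missing key
my $bar_ref = ref_to_element(\%h, 'bar');    # leaves the existing value alone

print exists $h{foo} ? "foo exists\n" : "foo missing\n";
print "bar = $h{bar}\n";
```

This only addresses key creation; it does nothing about references becoming disconnected after a delete.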

For the harder issue that $obj->DELETE('foo') should disconnect all
existing references, additional bookkeeping is required. What I did
to work around this in my persistence module is that I register all
such references with the tied hash object (which knows about this
problem) and then whenever $obj->DELETE is called, the tied hash
object notifies all the dangling references to disconnect from the
hash. The disconnect routine connects dangling references to the
same key together. I can't do this completely transparently because
I can't prevent users from doing $ref1 = $ref2. What I require
at that point is that they call workaround27555($ref1);

Perl could do this automatically by automatically registering such
hashes somewhere and then intercepting tied DELETE and tied CLEAR
to deal with it.

I'm not sure how to fix the refaddr() problem. I work around it but
my workaround requires traversing all available structures.

Not pretty and a lot of work for an odd case. I think that work would
only be worth it in the context of the larger goal of making it possible
for tied objects to truly emulate untied objects. This is a worthy
goal and it requires more than just fixing my bugs to achieve it.

-Dave

@p5pRT

p5pRT commented Mar 25, 2004

From @hvds

David Muir Sharnoff <muir@idiom.com> wrote:
:For the harder issue that $obj->DELETE('foo') should disconnect all
:existing references, additional bookkeeping is required. What I did
:to work around this in my persistence module is that I register all
:such references with the tied hash object (which knows about this
:problem) and then whenever $obj->DELETE is called, the tied hash
:object notifies all the dangling references to disconnect from the
:hash. The disconnect routine connects dangling references to the
:same key together. I can't do this completely transparently because
:I can't prevent users from doing $ref1 = $ref2. What I require
:at that point is that they call workaround27555($ref1);
:
:Perl could do this automatically by automatically registering such
:hashes somewhere and then intercepting tied DELETE and tied CLEAR
:to deal with it.

Without thinking about it too deeply, it feels like this is something
you should be able to solve by ensuring that the refs taken rely on
reading through a weak reference, and that DELETE frees the last
concrete reference. See Scalar::Util::weaken().
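
One possible reading of this suggestion, as a rough sketch rather than anything proposed in the thread: the tie class keeps the only strong reference to a per-key storage cell, and hands out handles that hold a weakened reference to it. The `SlotHash` package, its `handle` method and the `slot` field are all hypothetical names.

```perl
use strict;
use warnings;

package SlotHash;    # hypothetical class, for illustration only
use Scalar::Util qw(weaken);

sub TIEHASH { return bless { slots => {} }, shift }

sub STORE {
    my ($self, $key, $value) = @_;
    my $slot = $self->{slots}{$key};
    unless ($slot) {
        # The tie object holds the only strong reference to the cell.
        my $cell;
        $slot = $self->{slots}{$key} = \$cell;
    }
    return $$slot = $value;
}

sub FETCH {
    my ($self, $key) = @_;
    my $slot = $self->{slots}{$key};
    return $slot ? $$slot : undef;
}

sub EXISTS {
    my ($self, $key) = @_;
    return exists $self->{slots}{$key};
}

sub DELETE {
    my ($self, $key) = @_;
    # Dropping the strong reference frees the cell once this sub returns,
    # which turns every weakened handle slot into undef.
    my $slot = delete $self->{slots}{$key};
    return $slot ? $$slot : undef;
}

# Hypothetical method: a handle whose 'slot' field is a weak reference.
sub handle {
    my ($self, $key) = @_;
    my $handle = { slot => $self->{slots}{$key} };
    weaken($handle->{slot}) if ref $handle->{slot};
    return $handle;
}

package main;

tie my %x, 'SlotHash';
$x{y} = 7;
my $h      = tied(%x)->handle('y');
my $before = ${ $h->{slot} };

delete $x{y};
$x{y} = 9;    # a fresh cell; the old handle stays disconnected
my $after = defined $h->{slot} ? 'connected' : 'disconnected';

print "before delete: $before, after delete: $after\n";
```

Unlike a reference into an untied hash, a disconnected handle here loses the old value rather than keeping it alive, and callers have to go through `handle` instead of the `\$x{y}` syntax, so this is only a partial emulation.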

Hugo

@p5pRT

p5pRT commented Apr 4, 2004

From @jlokier

Created by @jlokier

It is possible to take a reference to an element of a tied hash, like this:

  my $ref = \$tied_hash{key};

Taking the reference like that creates a special kind of
pseudo-reference which remembers the key that was used. It doesn't
call FETCH or STORE or any other method.

Reading from the reference (C<my $value = $$ref>) calls the tied
hash's FETCH method the _first_ time it's read, and then caches the
result.

Writing to the reference (C<$$ref = ...>) calls the tied hash's STORE
method and invalidates the cached result, so that FETCH will be called
next time the reference is read.

This is very clever and sensible and useful.

Confusion comes about when the hash is written to directly, or via
another pseudo-reference using the same key.

This code prints "new_value":

  $tied_hash{key} = "old_value";
  my $ref = \$tied_hash{key};
  $tied_hash{key} = "new_value";
  print $$ref;

Whereas this code prints "old_value" _if_ C<%tied_hash> is tied:

  $tied_hash{key} = "old_value";
  my $ref = \$tied_hash{key};
  "$$ref"; # Force non-void context.
  $tied_hash{key} = "new_value";
  print $$ref;

This also prints "old_value" if C<%tied_hash> is tied:

  $tied_hash{key} = "old_value";
  my $ref1 = \$tied_hash{key};
  my $ref2 = \$tied_hash{key};
  "$$ref1"; # Force non-void context.
  $$ref2 = "new_value";
  print $$ref1;

My complaint is that this caching of fetched values in a
pseudo-reference is slightly too aggressive: it makes references to
tied hash elements behave too differently from references to ordinary
hash elements. The caching isn't necessary, and it wouldn't be
difficult to fix.

Ideally, any write to that key of the tied hash should invalidate the
cached pseudo-reference value. (Really ideally, all pseudo-references
using the same key would be equal, too.)

It may well be too much state to keep track of, remembering the
pseudo-references of each key individually. So instead, I suggest any
write to a tied hash should invalidate _all_ cached pseudo-references
which are associated with that hash. That wouldn't be much state to
keep track of.

Example program:

  perl -l <<'END'
  sub TIEHASH { return bless {}, $_[0] }
  sub FETCH { print "Fetch"; return $_[0]->{$_[1]} }
  sub STORE { print "Store"; $_[0]->{$_[1]} = $_[2] }
  tie %HASH, __PACKAGE__; my $r = \$HASH{key};
  $$r = "value1";
  print $$r;
  print $HASH{key};
  print $$r;
  $$r = "value2";
  print $$r;
  print $$r;
  $HASH{key} = "value3";
  print $$r;
  print $HASH{key};
  print $$r;
  END

Output of the above program:

  Store
  Fetch
  value1
  Fetch
  value1
  value1
  Store
  Fetch
  value2
  value2
  Store
  value2
  Fetch
  value3
  value2

See how the "value2" lines after the last "Store" should logically say
"value3". I think this quirk of pseudo-references is too subtle, and
is likely to cause subtle bugs in programs, especially code that is
given a tied hash and doesn't know it, treating it like a normal hash.
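
Code that cannot rule out a tied hash can sidestep the stale value by not holding an element reference at all and going through the hash on every access. A minimal sketch, assuming a hypothetical `element_accessors` helper and using `Tie::StdHash` as a stand-in tie class:

```perl
use strict;
use warnings;
use Tie::Hash;    # provides Tie::StdHash, used here only as a stand-in

# Hypothetical helper: a getter/setter pair that always goes through
# the hash itself, so no element proxy is kept between accesses.
sub element_accessors {
    my ($hash, $key) = @_;
    my $get = sub { return $hash->{$key} };
    my $set = sub { $hash->{$key} = $_[0]; return };
    return ($get, $set);
}

tie my %HASH, 'Tie::StdHash';
my ($get, $set) = element_accessors(\%HASH, 'key');

$set->("value2");
$HASH{key} = "value3";    # direct write, bypassing the accessors
my $seen = $get->();

print "$seen\n";
```

The cost is one closure call per access, and the accessors are not interchangeable with a scalar reference where an API expects one.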

Thanks,
-- Jamie

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl v5.8.0:

Configured by bhcompile at Wed Aug 13 11:45:59 EDT 2003.

Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.4.21-1.1931.2.382.entsmp, archname=i386-linux-thread-multi
    uname='linux str'
    config_args='-des -Doptimize=-O2 -g -pipe -march=i386 -mcpu=i686 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Dotherlibdirs=/usr/lib/perl5/5.8.0 -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef'
 useithreads=define usemultiplicity=
    useperlio= d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=un uselongdouble=
    usemymalloc=, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING -fno-strict-aliasing -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.2.2 20030222 (Red Hat Linux 3.2.2-5)', gccosandvers=''
gccversion='3.2.2 200302'
    intsize=r, longsize=r, ptrsize=5, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long'
k', ivsize=4'
ivtype='l, nvtype='double'
o_nonbl', nvsize=, Off_t='', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc'
l', ldflags =' -L/u'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lgdbm -ldb -ldl -lm -lpthread -lc -lcrypt -lutil
    perllibs=
    libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libper
    gnulibc_version='2.3.2'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so', d_dlsymun=undef, ccdlflags='-rdynamic -Wl,-rpath,/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC'
ccdlflags='-rdynamic -Wl,-rpath,/usr/lib/perl5', lddlflags='s Unicode/Normalize XS/A'

Locally applied patches:
    MAINT18379


@INC for perl v5.8.0:
    /usr/lib/perl5/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/5.8.0
    /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.0
    /usr/lib/perl5/vendor_perl
    /usr/lib/perl5/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/5.8.0
    .


Environment for perl v5.8.0:
    HOME=/home/jamie
    LANG=en_GB.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/jamie/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash
    dlflags='-share (unset)

@p5pRT

p5pRT commented Apr 10, 2004

From @muir

Just noticed something else. An assignment through a reference
to a tied hash key will disassociate any other similar references.
This happens even if there has never been a delete.

print "# block at ".__LINE__."\n";
{
  my %x;
  tie %x, 'Hash1', {};

  $x{y} = 7;
  my $a = \$x{y};
  my $b = \$x{y};
  $x{y} = 9;

  ok($$a == 9);
  ok($$b == 9);

  $$a = 10;

  ok($$b == 9); # bug
}

@p5pRT

p5pRT commented Apr 13, 2004

From @muir

This should be merged with bug # 27555.
