[documentation] Hash::Util::FieldHash #9134

Open
p5pRT opened this issue Nov 29, 2007 · 16 comments

p5pRT commented Nov 29, 2007

Migrated from rt.perl.org#47948 (status was 'open')

Searchable as RT47948$


p5pRT commented Nov 29, 2007

From A_M_C@bigpond.net.au

This is a bug report for perl from Michael Cartmell,
generated with the help of perlbug 1.36 running under perl 5.10.0.


In the documentation for Hash::Util::FieldHash, the section
Garbage-Collected Hashes says that deleting a key whilst iterating
through the hash will cause bad things to happen (I paraphrase). This is
at variance with a recent posting by Mark Dominus at
http://blog.plover.com/ where he says that adding keys will cause
unpredictable behaviour but deleting keys will not. In fact a fair bit
of trouble was taken by Larry Wall to make sure it doesn't.

Assuming MJD is correct and the implementation of hashes in 5.10.0
hasn't changed this behaviour (it was current at 5.8.8), the
documentation should warn about *creating* objects whilst iterating over
fields.
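A small sketch of the plain-hash behaviour under discussion (added for illustration, not part of the original report): deleting the key most recently returned by each() during iteration.

```perl
use strict;
use warnings;

my %h = map { $_ => 1 } 'a' .. 'e';

# Deleting the key each() just returned is documented as safe:
# every remaining key is still visited exactly once.
my @seen;
while ( my ($k, $v) = each %h ) {
    push @seen, $k;
    delete $h{$k};    # delete the current key only
}

print scalar @seen, "\n";    # prints 5
```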



Flags:
  category=library
  severity=low


Site configuration information for perl 5.10.0:

Configured by michael at Tue Nov 27 17:34:32 EST 2007.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.6.22.12-0.1-default,
    archname=i686-linux-thread-multi-64int-ld
    uname='linux lnx-main 2.6.22.12-0.1-default #1 smp 20071106 23:05:18 utc i686 athlon i386 gnulinux '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=undef, uselongdouble=define
    usemymalloc=y, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -march=i686',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='4.2.1 (SUSE Linux)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long long', ivsize=8, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.6.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.6.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -march=i686 -L/usr/local/lib'

Locally applied patches:
  RC2


@INC for perl 5.10.0:
  /net/perl/5.10.0-rc2/lib/i686-linux-thread-multi-64int-ld
  /net/perl/5.10.0-rc2/lib
  /net/perl/5.10.0-rc2/lib/site_perl/i686-linux-thread-multi-64int-ld
  /net/perl/5.10.0-rc2/lib/site_perl
  .


Environment for perl 5.10.0:
  HOME=/home/michael
  LANG=en_GB.UTF-8
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)
  PATH=/net/perl/5.10.0-rc2/bin:/opt/kde3/bin:/net/perl/5.8.8/bin:/home/michael/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/usr/lib/mit/bin:/usr/lib/mit/sbin:/usr/lib/qt3/bin
  PERL_BADLANG (unset)
  SHELL=/bin/bash


p5pRT commented Jan 13, 2012

From @jkeenan

On Thu Nov 29 02:44:21 2007, A_M_C@bigpond.net.au wrote:

-----------------------------------------------------------------
In the documentation for Hash::Util::FieldHash, the section
Garbage-Collected Hashes says that deleting a key whilst iterating
through the hash will cause bad things to happen (I paraphrase). This is
at variance with a recent posting by Mark Dominus at
http://blog.plover.com/ where he says that adding keys will cause
unpredictable behaviour but deleting keys will not. In fact a fair bit
of trouble was taken by Larry Wall to make sure it doesn't.

Assuming MJD is correct and the implementation of hashes in 5.10.0
hasn't changed this behaviour (it was current at 5.8.8), the
documentation should warn about *creating* objects whilst iterating over
fields.

I am attaching the relevant section of the POD for Hash::Util::FieldHash
as it appears in Perl 5.14.2.

p5p list: Does the OP's contention about the documentation have merit?
If so, what is to be done?

Thank you very much.
Jim Keenan


p5pRT commented Jan 13, 2012

The RT System itself - Status changed from 'new' to 'open'


p5pRT commented Jan 13, 2012

From @xdg

On Thu, Jan 12, 2012 at 8:05 PM, James E Keenan via RT
<perlbug-followup@perl.org> wrote:

Assuming MJD is correct and the implementation of hashes in 5.10.0
hasn't changed this behaviour (it was current at 5.8.8) the
documentation should warn about *creating* objects whilst iterating
over
fields.

p5p list: Does the OP's contention about the documentation have merit?
If so, what is to be done?

MJD only states that deleting the current hash entry while iterating
with each() is safe. The problem with fieldhashes is that the
destruction of an inside-out object with data stored in the hash will
delete the corresponding key. That is an *extra* restriction over
what MJD points out. It should be safe to destroy the object
corresponding to the key returned by each, as that is just like
deleting the hash entry for that key. The concern in the
documentation is avoiding code that destroys *other* objects
referenced in the hash, which is tantamount to deleting keys, which is
unsafe.

In other words, "not only do you have to not add/delete keys during
iteration, you have to not add/delete objects that store data in the
hash as well -- except for deleting the current key/object, which is
safe"

-- David
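A minimal sketch of the mechanism David describes, assuming Hash::Util::FieldHash as shipped with core Perl (the class name Widget is made up for illustration):

```perl
use strict;
use warnings;
use Hash::Util::FieldHash qw(fieldhash);

# In a field hash the keys are objects; when an object is destroyed,
# its entry vanishes automatically, as if the key had been deleted.
fieldhash my %label;

my @objects = map { bless {}, 'Widget' } 1 .. 3;  # 'Widget' is a stand-in
$label{$_} = 'some field value' for @objects;

print scalar keys %label, "\n";    # 3 entries, one per live object

# Letting one object go out of scope deletes its key behind the scenes;
# doing this inside an each() loop is what the docs warn about.
pop @objects;
print scalar keys %label, "\n";    # 2 entries remain
```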


p5pRT commented Jan 13, 2012

From @cpansprout

On Thu Jan 12 17:26:42 2012, xdaveg@gmail.com wrote:

On Thu, Jan 12, 2012 at 8:05 PM, James E Keenan via RT
<perlbug-followup@perl.org> wrote:

Assuming MJD is correct and the implementation of hashes in 5.10.0
hasn't changed this behaviour (it was current at 5.8.8) the
documentation should warn about *creating* objects whilst iterating
over
fields.

p5p list: Does the OP's contention about the documentation have merit?
If so, what is to be done?

MJD only states that deleting the current hash entry while iterating
with each() is safe. The problem with fieldhashes is that the
destruction of an inside-out object with data stored in the hash will
delete the corresponding key. That is an *extra* restriction over
what MJD points out. It should be safe to destroy the object
corresponding to the key returned by each, as that is just like
deleting the hash entry for that key. The concern in the
documentation is avoiding code that destroys *other* objects
referenced in the hash, which is tantamount to deleting keys, which is
unsafe.

In other words, "not only do you have to not add/delete keys during
iteration, you have to not add/delete objects that store data in the
hash as well -- except for deleting the current key/object, which is
safe"

My question is: Why is it unsafe at all to create or delete things from
a hash that is being iterated?

There have been at least three fixes to the deletion code since 5.14.
Could it have anything to do with those?

--

Father Chrysostomos


p5pRT commented Jan 13, 2012

From @Leont

On Fri, Jan 13, 2012 at 3:08 AM, Father Chrysostomos via RT
<perlbug-followup@perl.org> wrote:

My question is: Why is it unsafe at all to create or delete things from
a hash that is being iterated?

There have been at least three fixes to the deletion code since 5.14.
Could it have anything to do with those?

One obvious problem would be reallocation. It can reorder the buckets,
thus causing the iteration to become unreliable. This is one reason
why C++ initially supported only treemaps: they provide reliable
iteration even in the face of map mutation.

Leon


p5pRT commented Jan 13, 2012

From @xdg

On Thu, Jan 12, 2012 at 9:08 PM, Father Chrysostomos via RT
<perlbug-followup@perl.org> wrote:

My question is: Why is it unsafe at all to create or delete things from
a hash that is being iterated?

Here is the relevant MJD post, I think:
http://blog.plover.com/prog/perl/undefined.html#3

There have been at least three fixes to the deletion code since 5.14.
Could it have anything to do with those?

From MJD's post, it looks like it was legacy Larry code that had the problem.

I think it's all to avoid requiring two passes.

Should it be (now or in the future) safe to delete arbitrary keys
during iteration, then the same would apply to inside-out object
destruction.

-- David


p5pRT commented Jan 13, 2012

From @cpansprout

On Thu Jan 12 18:38:38 2012, xdaveg@gmail.com wrote:

On Thu, Jan 12, 2012 at 9:08 PM, Father Chrysostomos via RT
<perlbug-followup@perl.org> wrote:

My question is: Why is it unsafe at all to create or delete things from
a hash that is being iterated?

Here is the relevant MJD post, I think​:
http​://blog.plover.com/prog/perl/undefined.html#3

There have been at least three fixes to the deletion code since 5.14.
Could it have anything to do with those?

MJD looks like it was legacy Larry code that had the problem.

I think it's all to avoid requiring two passes.

Should it be (now or in the future) safe to delete arbitrary keys
during iteration, then the same would apply to inside-out object
destruction.

It has been safe, for as long as I’ve been familiar with the code,
except for two fairly obscure bugs, which are now fixed. (In one case,
keys() on an empty hash would not reset the iterator after the last item
was deleted during iteration; in the other, deleting the element under
the current iterator could leak memory in certain cases, but I don’t
remember the details.)

Now (and previously, most of the time), deleting an element will just
stop it from being iterated over if it has not already been iterated
over. An element added during iteration may or may not be iterated over
in the current iteration.

--

Father Chrysostomos
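The deletion half of that behaviour can be checked with a short script (a sketch added for illustration, not from the original thread): an element deleted before it has been reached is simply never returned.

```perl
use strict;
use warnings;

my %h = map { $_ => 1 } 'a' .. 'e';

# Visit one element, then delete the other four mid-iteration.
my ($first) = each %h;
delete @h{ grep { $_ ne $first } 'a' .. 'e' };

# The deleted elements were never visited, so iteration is now done;
# the one remaining element ($first) is not revisited.
my @rest;
while ( my ($k) = each %h ) {
    push @rest, $k;
}

print scalar @rest, "\n";    # prints 0
```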


p5pRT commented Jan 13, 2012

From @ikegami

On Fri, Jan 13, 2012 at 5:10 PM, Father Chrysostomos via RT
<perlbug-followup@perl.org> wrote:

Should it be (now or in the future) safe to delete arbitrary keys
during iteration, then the same would apply to inside-out object
destruction.

It has been safe, for as long as I’ve been familiar with the code,
except for two fairly obscure bugs, which are now fixed.

A bit of black box testing (attached) finds no problem getting the right
elements back from each in combination with deletions.

- Eric


p5pRT commented Jan 13, 2012

From @ikegami

use strict;
use warnings;

use Test::More tests => 2 * 26;

# First pass: delete only the current key (the one each() just
# returned), a different one on each of the 26 runs; every key should
# still be visited exactly once.
for ('a' .. 'z') {
    my %h = map { $_ => 1 } 'a' .. 'z';

    my @k;
    while (my ($k) = each(%h)) {
        push @k, $k;
        delete $h{$k} if $k eq $_;
    }

    is( join('', sort @k), join('', 'a' .. 'z') );
}

# Second pass: delete a key other than the current one ('a' deletes 'b'
# and every other key deletes 'a'), recording the deleted key as seen so
# the full alphabet is still expected.
for ('a' .. 'z') {
    my %h = map { $_ => 1 } 'a' .. 'z';

    my @k;
    while (my ($k) = each(%h)) {
        if ($k eq 'a') {
            delete $h{b}
                and push @k, 'b';
        }
        else {
            delete $h{a}
                and push @k, 'a';
        }

        push @k, $k;
    }

    is( join('', sort @k), join('', 'a' .. 'z') );
}

1;


p5pRT commented Jul 10, 2013

From @rjbs

The docs on `each` were revisited during the hash overhaul in 5.17. The
assertion is that you can delete the most recently retrieved key safely, but
deleting any other key may change the iteration order. I believe this was also
asserted on the list.

The docs for FieldHash say:

If anything you do inside the loop could cause an object to go out of scope,
a random key may be deleted from the hash you are looping over.

This matches the assertion in the `each` docs.

Unless perlfunc's entry for each is wrong, there is no contradiction.

I believe Yves said that he did work to keep deleting the current key safe.
MJD's mistake, if any, would be to say that you could delete *any* key you'd
already visited, rather than the *one* key you *just* got.

I think the docs are correct. Perhaps someone -- Yves? -- could confirm.

--
rjbs


p5pRT commented Jul 10, 2013

From @demerphq

On 10 July 2013 03:56, Ricardo Signes <perl.p5p@rjbs.manxome.org> wrote:

The docs on `each` were revisited during the hash overhaul in 5.17. The
assertion is that you can delete the most recently retrieved key safely, but
deleting any other key may change the iteration order. I believe this was also
asserted on the list.

The docs for FieldHash say:

If anything you do inside the loop could cause an object to go out of scope,
a random key may be deleted from the hash you are looping over.

This matches the assertion in the `each` docs.

Unless perlfunc's entry for each is wrong, there is no contradiction.

I believe Yves said that he did work to keep deleting the current key safe.
MJD's mistake, if any, would be to say that you could delete *any* key you'd
already visited, rather than the *one* key you *just* got.

I think the docs are correct. Perhaps someone -- Yves? -- could confirm.

My understanding of the rules regarding native perl hashes is that we
make the weakest guarantee possible​: that deleting the most recent key
returned by each is "safe", which means that we will visit the
remaining keys and we will not visit any keys we have already seen. We
do not even commit to visiting the remaining keys in the same order we
would have if we had not deleted an item.

Which means that deleting an arbitrary key is allowed to change the order.

The current implementation as far as I know actually results in *all*
deletes being safe, but we make no such guarantee.

But I believe that the documentation in question is about field
hashes, and not normal perl hashes, so it seems to me the question is
irrelevant to answering this ticket. They seem like some kind of tie,
and are definitely magic, so it seems to me they can do whatever they
like.

cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"


p5pRT commented Jul 11, 2013

From @cpansprout

On Wed Jul 10 13:15:12 2013, demerphq wrote:

But I believe that the documentation in question is about field
hashes, and not normal perl hashes, so it seems to me the question is
irrelevant to answering this ticket. They seem like some kind of tie,
and are definitely magic, so it seems to me they can do whatever they
like.

Field hashes are actually quite different from tied hashes. They are
regular perl hashes with extra magic attached to the elements to do some
extra bookkeeping behind the scenes. But that magic does not intercept
the regular store/fetch operations of plain hashes.

In fact, the field hash magic is orthogonal to ties, so you can actually
tie a field hash. namespace::clean uses that.

--

Father Chrysostomos


p5pRT commented Jul 11, 2013

From @demerphq

On 11 July 2013 02:51, Father Chrysostomos via RT
<perlbug-followup@perl.org> wrote:

On Wed Jul 10 13:15:12 2013, demerphq wrote:

But I believe that the documentation in question is about field
hashes, and not normal perl hashes, so it seems to me the question is
irrelevant to answering this ticket. They seem like some kind of tie,
and are definitely magic, so it seems to me they can do whatever they
like.

Field hashes are actually quite different from tied hashes. They are
regular perl hashes with extra magic attached to the elements to do some
extra bookkeeping behind the scenes. But that magic does not intercept
the regular store/fetch operations of plain hashes.

In fact, the field hash magic is orthogonal to ties, so you can actually
tie a field hash. namespace::clean uses that.

Be that as it may, field hashes aren't normal hashes, and until someone
carefully reviews the implementation I would assume that the author's
documentation is correct.

Clearly there is *some* form of magic involved, the magic could do
whatever it wants, including modifying the hash during traversal,
which depending on how it was implemented *could* "throw the hash
iterator".

I stand by my claim that field hashes aren't normal hashes, and should
not be expected to behave as them. It really doesn't matter if they
are a true tie, or some other kind of magic hash, the fact is they are
magic, and the only guarantees we make are about *non* magic hashes.

So IMO field hashes can do whatever they like and all the commentary
in this thread regarding the documentation of *true* perl hashes is
irrelevant.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"


p5pRT commented Jul 11, 2013

From @rjbs

Re-reading the original bug report, I believe we have been totally sidetracked
by the question of deletion. The report was that the FieldHash docs should
also warn about *creation*, rather than that they should *not* warn about
deletion.

Maybe I'll look at testing this behavior, but I expect it will be just the same
as normal hashes.

--
rjbs
