Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scalars vs globs (*{} should not return FAKE globs) #10625

Closed
p5pRT opened this issue Sep 12, 2010 · 11 comments
Closed

Scalars vs globs (*{} should not return FAKE globs) #10625

p5pRT opened this issue Sep 12, 2010 · 11 comments

Comments

@p5pRT
Copy link

p5pRT commented Sep 12, 2010

Migrated from rt.perl.org#77810 (status was 'resolved')

Searchable as RT77810$

@p5pRT
Copy link
Author

p5pRT commented Sep 12, 2010

From @cpansprout

There are various bugs in perl related to the fact that globs are scalars.

Various Perl operators need to distinguish between globs and scalars, even though the former are a subset of the latter.

Now, there are two types of globs​: autovivified globs (I call these ‘real globs’) and copies of globs (fake globs or glob copies).

The first kind, a real glob, is what you get when you write *foo (prior stash manipulation aside). A real glob is always a glob. That sv is never downgraded to anything else. Even passing it to undef does not undefine it, but only its contents.

The second occurs when you write ‘$a = *foo’. These are marked with the SVf_FAKE flag. I think the original purpose of the distinction was to allow $a to be assigned something other than a glob after it had been assigned a glob.

Now different operators make the distinction between globs and scalar in different ways. For example, the = operator (in 5.12 and earlier) or, rather, sv_setsv_flags treats only real globs on the LHS as globs. Fake globs are scalars.

Other operators, such as tie(d), untie and *{}, treat any glob as a glob, and not as a scalar.

Now the *{} dereference operator returns its operand if it sees it as a glob.

This means that
• *$glob_copy = [] is the same as $glob_copy = []
• tie *$glob_copy is the same as tie $glob_copy
• untie $scalar doesn​::t work if the last thing assigned was a glob.

(The first of those three examples is fixed in blead with a workaround for the actual problem. We now treat *{...}= as a special kind of assignment operator. List assignment is still broken.)

If the patch for #77362 (glob-to-lvalue assignment) is applied, then there will be this problem, too​:

(*$pvlv_glob) = [];

This innocent-looking statement will start assigning [] to $pvlv_glob, instead of the array slot of the GP. (Currently globs are stringified when assigned to PVLVs.)

Now the real problem here seems to be that *{} is returning glob copies. We could make this operator return GvEGV(sv) in the presence of the SVf_FAKE flag, and make all operators that need to distinguish between globs and scalars check the SVf_FAKE flag.

This would fix all these bugs in one fell swoop​:
• 77496 tied $tie when $tie holds a glob
• 77502 *{\$tie} when $tie holds a glob
• 77508 (*$glob_copy) = ...
• 77688 tie $glob_copy
• The untie case mentioned above
• 71686 globs can be un-globbed (fixed; but the fix can be reverted, simplifying the code)
• 1804 *$glob_copy = ... (likewise fixed)
• part of 77684 (opening \*$glob_copy)
And the changes to tie_fetch_count.t in the patch for 77598 will be unnecessary.

(You can consider this a meta-ticket for all those.)

(Some of those would actually still need fixing, but the fix would be to check a single flag, instead of the more complex, more intrusive patches already in some of those tickets.)

There are two problems that can result​:

• Putting a fake glob in the symbol table will have the following effect​:

  $​::{foo} = *bar; # *foo is now fake
  *foo = $baz; # aliases both *foo AND *bar to *baz.

This can be solved easily by making *{} turn the fake flag off if it is doing a symbolic dereference (or the compile-time equivalent). Having fake globs in the symbol table is questionable anyway.

• This code​:

  $foo_copy = *foo;
  *$foo_copy = *bar;

will make *foo (aka GvEGV($foo_copy)) a copy of *bar, instead of making $foo_copy a copy of *bar.

This is the most serious problem.

The only solution I can think of so far is to return some sort of magical variable that will delegate assignment to the fake glob, turning off its fake flag temporarily during the assignment. So we will be back to hacks upon hacks, but at least this hack will solve *all* the bugs above, allowing two existing hacks in blead to be removed.

This will change the meaning of \*$glob_copy, making it similar to \*$io. I think that’s a good thing.


Flags​:
  category=core
  severity=medium


Site configuration information for perl 5.13.4​:

Configured by sprout at Sun Sep 5 17​:19​:23 PDT 2010.

Summary of my perl5 (revision 5 version 13 subversion 4 patch v5.13.4-144-gf1dcae2) configuration​:
  Snapshot of​: f1dcae2
  Platform​:
  osname=darwin, osvers=10.4.0, archname=darwin-ld-2level
  uname='darwin pint.local 10.4.0 darwin kernel version 10.4.0​: fri apr 23 18​:28​:53 pdt 2010; root​:xnu-1504.7.4~1release_i386 i386 '
  config_args='-Dusedevel -des -Ui_db -Uuseithreads -Duselongdouble'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=undef, usemultiplicity=undef
  useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
  use64bitint=undef, use64bitall=undef, uselongdouble=define
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='cc', ccflags ='-fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
  optimize='-O3',
  cppflags='-no-cpp-precomp -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.2.1 (Apple Inc. build 5664)', gccosandvers=''
  intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=4, nvtype='long double', nvsize=16, Off_t='off_t', lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries​:
  ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /usr/lib
  libs=-ldbm -ldl -lm -lutil -lc
  perllibs=-ldl -lm -lutil -lc
  libc=/usr/lib/libc.dylib, so=dylib, useshrplib=false, libperl=libperl.a
  gnulibc_version=''
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
  cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib -fstack-protector'

Locally applied patches​:
 


@​INC for perl 5.13.4​:
  /usr/local/lib/perl5/site_perl/5.13.4/darwin-ld-2level
  /usr/local/lib/perl5/site_perl/5.13.4
  /usr/local/lib/perl5/5.13.4/darwin-ld-2level
  /usr/local/lib/perl5/5.13.4
  /usr/local/lib/perl5/site_perl
  .


Environment for perl 5.13.4​:
  DYLD_LIBRARY_PATH (unset)
  HOME=/Users/sprout
  LANG=en_US.UTF-8
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)
  PATH=/usr/bin​:/bin​:/usr/sbin​:/sbin​:/usr/local/bin​:/usr/X11/bin​:/usr/local/bin
  PERL_BADLANG (unset)
  SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Sep 19, 2010

From @cpansprout

On Sep 12, 2010, at 12​:28 PM, Father Chrysostomos wrote​:

Now the real problem here seems to be that *{} is returning glob copies. We could make this operator return GvEGV(sv) in the presence of the SVf_FAKE flag, and make all operators that need to distinguish between globs and scalars check the SVf_FAKE flag.

I tried just making it return a new glob if what it would return is flagged FAKE (unless it’s in the symtab, in which case it is reified), to see what would break.

t/io/defout.t fails, because select() is no longer able to make PL_defoutgv point to something other than a GV. (I would consider that a bug fix.)

In that case, I don’t know that it’s possible to set PL_defoutgv to a non-GV any more, so is that test file still needed?

Another thing this changes is the *$foo_copy = $bar case, mentioned in the original bug report.

The magical glob I suggested would complicate things too much and cause even more buggy edge cases, as the fake glob that the new glob is delegating to may not always be a glob.

Since assignment to glob copies has been so buggy and unreliable in the past, this change may be acceptable as it is. But it may be better to make *{} on the left-hand side of an assignment special, so it has the current (blead) behaviour. (That is, it returns its argument if it’s a glob, fake on not, and = turns of the fakeness temporarily during assignment.)

Does anyone else have any opinions?

@p5pRT
Copy link
Author

p5pRT commented Sep 19, 2010

From @cpansprout

From​: Father Chrysostomos <sprout@​cpan.org>

[perl #77810] Scalars vs globs

Stop *{} from returning globs with the SVf_FAKE flag on.

Inline Patch
diff -up blead/op.c blead-77810-rv2gv/op.c
--- blead/op.c	2010-09-13 08:50:10.000000000 -0700
+++ blead-77810-rv2gv/op.c	2010-09-16 17:57:10.000000000 -0700
@@ -7189,6 +7189,8 @@ Perl_ck_rvconst(pTHX_ register OP *o)
 #endif
 	    kid->op_private = 0;
 	    kid->op_ppaddr = PL_ppaddr[OP_GV];
+	    /* FAKE globs in the symbol table cause weird bugs (#77810) */
+	    SvFAKE_off(gv);
 	}
     }
     return o;
diff -up blead/pp.c blead-77810-rv2gv/pp.c
--- blead/pp.c	2010-09-06 08:29:10.000000000 -0700
+++ blead-77810-rv2gv/pp.c	2010-09-16 18:27:01.000000000 -0700
@@ -213,11 +213,19 @@ PP(pp_rv2gv)
 		}
 		sv = MUTABLE_SV(gv_fetchsv(sv, GV_ADD, SVt_PVGV));
 	    }
+	    /* FAKE globs in the symbol table cause weird bugs (#77810) */
+	    if (sv) SvFAKE_off(sv);
 	}
     }
     if (PL_op->op_private & OPpLVAL_INTRO)
 	save_gp(MUTABLE_GV(sv), !(PL_op->op_flags & OPf_SPECIAL));
-    SETs(sv);
+    if (sv && SvFAKE(sv)) {
+	SV *newsv = sv_newmortal();
+	sv_setsv(newsv, sv);
+	SvFAKE_off(newsv);
+	SETs(newsv);
+    }
+    else SETs(sv);
     RETURN;
 }
 
diff -rup blead/t/op/gv.t blead-77810-rv2gv/t/op/gv.t
--- blead/t/op/gv.t	2010-09-07 20:17:10.000000000 -0700
+++ blead-77810-rv2gv/t/op/gv.t	2010-09-16 18:07:26.000000000 -0700
@@ -12,7 +12,7 @@ BEGIN {
 use warnings;
 
 require './test.pl';
-plan( tests => 194 );
+plan( tests => 199 );
 
 # type coersion on assignment
 $foo = 'foo';
@@ -32,6 +32,34 @@ is(ref(\$foo), 'GLOB');
 is($foo, '*main::bar');
 is(ref(\$foo), 'GLOB');
 
+{
+ no warnings;
+ ${\*$foo} = undef;
+ is(ref(\$foo), 'GLOB', 'no type coersion when assigning to *{} retval');
+ $::{phake} = *bar;
+ is(
+   \$::{phake}, \*{"phake"},
+   'symbolic *{} returns symtab entry when FAKE'
+ );
+ ${\*{"phake"}} = undef;
+ is(
+   ref(\$::{phake}), 'GLOB',
+  'no type coersion when assigning to retval of symbolic *{}'
+ );
+ $::{phaque} = *bar;
+ eval '
+   is(
+     \$::{phaque}, \*phaque,
+     "compile-time *{} returns symtab entry when FAKE"
+   );
+   ${\*phaque} = undef;
+ ';
+ is(
+   ref(\$::{phaque}), 'GLOB',
+  'no type coersion when assigning to retval of compile-time *{}'
+ );
+}
+
 # type coersion on substitutions that match
 $a = *main::foo;
 $b = $a;

@p5pRT
Copy link
Author

p5pRT commented Oct 3, 2010

From @cpansprout

On Sep 27, 2010, at 1​:18 PM, Slaven Rezic wrote​:

Father Chrysostomos <sprout@​cpan.org> writes​:

On Sep 19, 2010, at 3​:18 PM, Slaven Rezic wrote​:

Father Chrysostomos <sprout@​cpan.org> writes​:

Is there any chance you could use your script to test all of CPAN with the patch at <http​://rt.perl.org/rt3/Ticket/Display.html?id=77810>?

If we can find out whether anything breaks, it will help in determining whether *{...}= needs special treatment.

What approach do you propose --- use stock perl 5.13.4 against the same
perl just with your patch?

5.13.5 would be better.

Now it's finished. After testing approx. 12800 distributions I found no
regression.

Thank you. That’s very good news.

I’m going to forward this to the RT ticket.

@p5pRT
Copy link
Author

p5pRT commented Oct 24, 2010

From @cpansprout

On Sun Sep 19 12​:07​:29 2010, sprout wrote​:

On Sep 12, 2010, at 12​:28 PM, Father Chrysostomos wrote​:

Now the real problem here seems to be that *{} is returning glob
copies. We could make this operator return GvEGV(sv) in the
presence of the SVf_FAKE flag, and make all operators that need to
distinguish between globs and scalars check the SVf_FAKE flag.

I tried just making it return a new glob if what it would return is
flagged FAKE (unless it’s in the symtab, in which case it is
reified), to see what would break.

And I’ve just applied it as 2acc331.

If this proves unacceptable for 5.14 and is reverted, it might also be a
good idea to revert the fixes for 36051 (lvalue globs), 71686 (globs can
be un-globbed) and 1804 (anon. glob breaks when assigned through), as
they all introduce regressions that this commit allows me to fix.

@p5pRT
Copy link
Author

p5pRT commented Oct 24, 2010

@cpansprout - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Oct 26, 2010

@cpansprout - Status changed from 'open' to 'resolved'

@p5pRT p5pRT closed this as completed Oct 26, 2010
@p5pRT
Copy link
Author

p5pRT commented Jan 31, 2011

From @jbenjore

On Sun, Oct 24, 2010 at 4​:22 PM, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

On Sun Sep 19 12​:07​:29 2010, sprout wrote​:

On Sep 12, 2010, at 12​:28 PM, Father Chrysostomos wrote​:

Now the real problem here seems to be that *{} is returning glob
   copies. We could make this operator return GvEGV(sv) in the
   presence of the SVf_FAKE flag, and make all operators that need to
   distinguish between globs and scalars check the SVf_FAKE flag.

I tried just making it return a new glob if what it would return is
   flagged FAKE (unless it’s in the symtab, in which case it is
   reified), to see what would break.

And I’ve just applied it as 2acc331.

If this proves unacceptable for 5.14 and is reverted, it might also be a
good idea to revert the fixes for 36051 (lvalue globs), 71686 (globs can
be un-globbed) and 1804 (anon. glob breaks when assigned through), as
they all introduce regressions that this commit allows me to fix.

This broke Data​::Dump​::Streamer
(https://rt.cpan.org/Ticket/Display.html?id=65272&results=7df022c760a8288f87e03aa553e241ec).
I'm unsure if this is now a bug in how B works. I have some un-even
success with the method call ->object_2svref "fixing" the data and
preventing the assert() failures.

Josh

@p5pRT
Copy link
Author

p5pRT commented Feb 1, 2011

From @jbenjore

On Sun, Jan 30, 2011 at 9​:03 PM, Joshua ben Jore <twists@​gmail.com> wrote​:

On Sun, Oct 24, 2010 at 4​:22 PM, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

On Sun Sep 19 12​:07​:29 2010, sprout wrote​:

On Sep 12, 2010, at 12​:28 PM, Father Chrysostomos wrote​:

Now the real problem here seems to be that *{} is returning glob
   copies. We could make this operator return GvEGV(sv) in the
   presence of the SVf_FAKE flag, and make all operators that need to
   distinguish between globs and scalars check the SVf_FAKE flag.

I tried just making it return a new glob if what it would return is
   flagged FAKE (unless it’s in the symtab, in which case it is
   reified), to see what would break.

And I’ve just applied it as 2acc331.

If this proves unacceptable for 5.14 and is reverted, it might also be a
good idea to revert the fixes for 36051 (lvalue globs), 71686 (globs can
be un-globbed) and 1804 (anon. glob breaks when assigned through), as
they all introduce regressions that this commit allows me to fix.

This broke Data​::Dump​::Streamer
(https://rt.cpan.org/Ticket/Display.html?id=65272&results=7df022c760a8288f87e03aa553e241ec).
I'm unsure if this is now a bug in how B works. I have some un-even
success with the method call ->object_2svref "fixing" the data and
preventing the assert() failures.

Have worked around this in DDS-2.26. The key here is keep the glob
reference in existence while using B to inspect it. What seems to
happen is that the glob reference is cleaned up since when I looked,
SvFLAGS(sv) == SVTYPEMASK which is what's true for values that have
just been freed. I've just pushed DDS-2.26 including the fix at
mgruberman/Data-Dump-Streamer@3e56eb8.

Incorrect

  $Bobj = B​::svref_2object(\*$name);
  $Bobj->FORM;

Correct

  my $glob_ref = \*$name;
  $Bobj = B​::svref_2object($glob_ref);
  $Bobj->FORM;

Is this a new bug in perl?

Josh

@p5pRT
Copy link
Author

p5pRT commented Feb 6, 2011

From @cpansprout

On Mon Jan 31 19​:18​:44 2011, jjore wrote​:

On Sun, Jan 30, 2011 at 9​:03 PM, Joshua ben Jore <twists@​gmail.com>
wrote​:

On Sun, Oct 24, 2010 at 4​:22 PM, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

On Sun Sep 19 12​:07​:29 2010, sprout wrote​:

On Sep 12, 2010, at 12​:28 PM, Father Chrysostomos wrote​:

Now the real problem here seems to be that *{} is returning glob
   copies. We could make this operator return GvEGV(sv) in the
   presence of the SVf_FAKE flag, and make all operators that need
to
   distinguish between globs and scalars check the SVf_FAKE flag.

I tried just making it return a new glob if what it would return
is
   flagged FAKE (unless it’s in the symtab, in which case it is
   reified), to see what would break.

And I’ve just applied it as 2acc331.

If this proves unacceptable for 5.14 and is reverted, it might also
be a
good idea to revert the fixes for 36051 (lvalue globs), 71686
(globs can
be un-globbed) and 1804 (anon. glob breaks when assigned through),
as
they all introduce regressions that this commit allows me to fix.

This broke Data​::Dump​::Streamer

(https://rt.cpan.org/Ticket/Display.html?id=65272&results=7df022c760a8288f87e03aa553e241ec).

I'm unsure if this is now a bug in how B works. I have some un-even
success with the method call ->object_2svref "fixing" the data and
preventing the assert() failures.

Have worked around this in DDS-2.26. The key here is keep the glob
reference in existence while using B to inspect it. What seems to
happen is that the glob reference is cleaned up since when I looked,
SvFLAGS(sv) == SVTYPEMASK which is what's true for values that have
just been freed. I've just pushed DDS-2.26 including the fix at
https://github.com/jbenjore/Data-Dump-
Streamer/commit/3e56eb816b5cd2ad66abd71c5568941dfdbd7bfc.

Incorrect

$Bobj = B&#8203;::svref\_2object\(\\\*$name\);
$Bobj\->FORM;

Correct

my $glob\_ref = \\\*$name;
$Bobj = B&#8203;::svref\_2object\($glob\_ref\);
$Bobj\->FORM;

Is this a new bug in perl?

This was an intentional incompatible change that made *{$glob_copy}
create a new glob with the FAKE flag off (aka an immutable glob).

It’s the glob returned by *{} that is being freed, not the reference to
it. The same thing happens with svref_2object([]). I don’t know that
there’s any way to fix that.

The variable $lhs in DDS gets a glob assigned to it at some point (I
didn’t check where), so it ends up containing glob copy (a glob marked
FAKE).

B​::svref_2object(ref \$lhs eq 'GLOB' ? \$lhs : \*$lhs) is another,
slightly more efficient workaround. Avoiding copying the glob to begin
with might be even more efficient.

@p5pRT
Copy link
Author

p5pRT commented Feb 7, 2011

From @jbenjore

On Sun, Feb 6, 2011 at 1​:19 PM, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

On Mon Jan 31 19​:18​:44 2011, jjore wrote​:

On Sun, Jan 30, 2011 at 9​:03 PM, Joshua ben Jore <twists@​gmail.com>
wrote​:

On Sun, Oct 24, 2010 at 4​:22 PM, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

On Sun Sep 19 12​:07​:29 2010, sprout wrote​:

On Sep 12, 2010, at 12​:28 PM, Father Chrysostomos wrote​:

Now the real problem here seems to be that *{} is returning glob
   copies. We could make this operator return GvEGV(sv) in the
   presence of the SVf_FAKE flag, and make all operators that need
to
   distinguish between globs and scalars check the SVf_FAKE flag.

I tried just making it return a new glob if what it would return
is
   flagged FAKE (unless it’s in the symtab, in which case it is
   reified), to see what would break.

And I’ve just applied it as 2acc331.

If this proves unacceptable for 5.14 and is reverted, it might also
be a
good idea to revert the fixes for 36051 (lvalue globs), 71686
(globs can
be un-globbed) and 1804 (anon. glob breaks when assigned through),
as
they all introduce regressions that this commit allows me to fix.

This broke Data​::Dump​::Streamer

(https://rt.cpan.org/Ticket/Display.html?id=65272&results=7df022c760a8288f87e03aa553e241ec).

I'm unsure if this is now a bug in how B works. I have some un-even
success with the method call ->object_2svref "fixing" the data and
preventing the assert() failures.

Have worked around this in DDS-2.26. The key here is keep the glob
reference in existence while using B to inspect it. What seems to
happen is that the glob reference is cleaned up since when I looked,
SvFLAGS(sv) == SVTYPEMASK which is what's true for values that have
just been freed. I've just pushed DDS-2.26 including the fix at
https://github.com/jbenjore/Data-Dump-
Streamer/commit/3e56eb816b5cd2ad66abd71c5568941dfdbd7bfc.

Incorrect

    $Bobj = B​::svref_2object(\*$name);
    $Bobj->FORM;

Correct

    my $glob_ref = \*$name;
    $Bobj = B​::svref_2object($glob_ref);
    $Bobj->FORM;

Is this a new bug in perl?

This was an intentional incompatible change that made *{$glob_copy}
create a new glob with the FAKE flag off (aka an immutable glob).

It’s the glob returned by *{} that is being freed, not the reference to
it. The same thing happens with svref_2object([]). I don’t know that
there’s any way to fix that.

The variable $lhs in DDS gets a glob assigned to it at some point (I
didn’t check where), so it ends up containing glob copy (a glob marked
FAKE).

B​::svref_2object(ref \$lhs eq 'GLOB' ? \$lhs : \*$lhs) is another,
slightly more efficient workaround. Avoiding copying the glob to begin
with might be even more efficient.

Thanks for the info and clarification that this is merely an internal
API change. I've now got to consider whether DDS is doing the wrong
thing by copying the value instead of just a reference.

Jos

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant