Lexical scoping of "my" quite odd in postfix loops #7140

Open
p5pRT opened this issue Mar 8, 2004 · 11 comments

Comments

@p5pRT

p5pRT commented Mar 8, 2004

Migrated from rt.perl.org#27517 (status was 'open')

Searchable as RT27517$

@p5pRT
Author

p5pRT commented Mar 8, 2004

From @jlokier

Created by @jlokier

Guess what this program prints?

  use strict;
  my $x = 0;
  my $a = "outer";
  my $a = "inner" while ($a++ < 3);
  print "\$a = :$a:\n";
  my $b = "outer";
  my $b = "inner" for (1, 2, 3);
  print "\$b = :$b:\n";
  my $c = "outer";
  my $c = "inner" if 1;
  print "\$c = :$c:\n";

Good guess!

  $a = ::
  $b = ::
  $c = :inner:

The values for $a and $b are very surprising. I'd expect them to
_either_ inherit the value "inner" from the inner lexical binding, or
to be outside the scope of that binding and inherit the value "outer"
from the outer binding. But no, the variables end up with values
which they were never assigned.

It's not that surprising to get undef. This must necessarily assign
undef to $n even though that value doesn't appear anywhere:

  my $n = 1 if some_random_condition();

What's surprising is to get undef when the loops do execute at least once.

This code maybe illustrates what is so unexpected:

  use strict;
  my $x = $_**2 for (1..3);
  print "$x\n";

The sensible (and DWIM) behaviours I *strongly* expect are $x == 9
_or_ a compile time error. But no, the code compiles fine, and prints "\n".

That's too surprising. I suggest changing the semantic to one of the
sensible ones, but if this Perl semantic cannot be changed now, I
suggest a compile-time warning for when variables are bound using "my"
inside a loop construct and their scope is visible outside the scope.
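
A minimal sketch of the well-defined alternative, assuming the intent is for $x to keep the value from the last iteration: declare the variable in its own statement, then let the postfix loop perform only the assignment. The variable name mirrors the example above; this does not change how Perl treats C<my> combined with a statement modifier.

```perl
use strict;
use warnings;

# Declare first, so the postfix loop only performs assignments.
my $x;
$x = $_**2 for (1..3);

print "$x\n";    # prints 9
```

Because the declaration is unconditional, the scope and lifetime of $x no longer depend on the modifier.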

Thanks,
-- Jamie

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl v5.8.0:

Configured by bhcompile at Wed Aug 13 11:45:59 EDT 2003.

Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.4.21-1.1931.2.382.entsmp, archname=i386-linux-thread-multi
    uname='linux str'
    config_args='-des -Doptimize=-O2 -g -pipe -march=i386 -mcpu=i686 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Dotherlibdirs=/usr/lib/perl5/5.8.0 -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=
    useperlio= d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=un uselongdouble=
    usemymalloc=, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING -fno-strict-aliasing -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.2.2 20030222 (Red Hat Linux 3.2.2-5)', gccosandvers=''
    intsize=r, longsize=r, ptrsize=5, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=, Off_t='', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/u'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lgdbm -ldb -ldl -lm -lpthread -lc -lcrypt -lutil
    perllibs=
    libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libper
    gnulibc_version='2.3.2'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic -Wl,-rpath,/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC'
ccdlflags='-rdynamic -Wl,-rpath,/usr/lib/perl5', lddlflags='s Unicode/Normalize XS/A'

Locally applied patches:
    MAINT18379


@INC for perl v5.8.0:
    /usr/lib/perl5/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/5.8.0
    /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.0
    /usr/lib/perl5/vendor_perl
    /usr/lib/perl5/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/5.8.0
    .


Environment for perl v5.8.0:
    HOME=/home/jamie
    LANG=en_GB.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/jamie/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash
    dlflags='-share (unset)

@p5pRT
Author

p5pRT commented Mar 9, 2004

From @rgs

Jamie Lokier wrote in perl.perl5.porters :

Guess what this program prints?

use strict;
my $x = 0;
my $a = "outer";
my $a = "inner" while ($a++ < 3);

Using the terms "outer" and "inner" is misleading.
There is only one scope here; the statement modifiers don't create any.

The perlsyn manpage documents this bug, saying that "the behaviour of a
C<my> statement modified with a statement modifier conditional or loop
construct (e.g. C<my $x if ...>) is B<undefined>."

That's too surprising. I suggest changing the semantic to one of the
sensible ones, but if this Perl semantic cannot be changed now, I
suggest a compile-time warning for when variables are bound using "my"
inside a loop construct and their scope is visible outside the scope.

The plan is to add a deprecation warning for the cases where this
"feature" is abused, notably C<my $x if 0>, and to change the semantics
of this construct in perl 5.12.

@p5pRT
Author

p5pRT commented Mar 9, 2004

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Author

p5pRT commented Mar 9, 2004

From @iabyn

On Tue, Mar 09, 2004 at 08:50:20AM -0000, Rafael Garcia-Suarez wrote:

Jamie Lokier wrote in perl.perl5.porters :

Guess what this program prints?

use strict;
my $x = 0;
my $a = "outer";
my $a = "inner" while ($a++ < 3);

Using the terms "outer" and "inner" is misleading.
There is only one scope here; the statement modifiers don't create any.

Yes, if run with 'use warnings', you get:

"my" variable $a masks earlier declaration in same scope at /tmp/p line 7.
"my" variable $b masks earlier declaration in same scope at /tmp/p line 10.
"my" variable $c masks earlier declaration in same scope at /tmp/p line 13.

--
You live and learn (although usually you just live).

@p5pRT
Author

p5pRT commented Mar 9, 2004

From @jlokier

Dave Mitchell via RT wrote:

use strict;
my $x = 0;
my $a = "outer";
my $a = "inner" while ($a++ < 3);

Using the terms "outer" and "inner" is misleading.
There is only one scope here; the statement modifiers don't create any.

Yes, if run with 'use warnings', you get:

"my" variable $a masks earlier declaration in same scope at /tmp/p line 7.
"my" variable $b masks earlier declaration in same scope at /tmp/p line 10.
"my" variable $c masks earlier declaration in same scope at /tmp/p line 13.

Ok, but that's not really the point of my bug report. Sorry if that
example was misleading.

Please consider this program:

  use strict;
  my $x = 0;
  my $a = 1 while ($x++ < 3);
  print $a;

The point is that $a is still visible after the loop, and is
C<undef>ined. I'd expect $a to either have the value 1, or to not be
visible at all and yielding an error from C<use strict> when I try to use it.

-- Jamie

@p5pRT
Author

p5pRT commented Mar 9, 2004

From @jlokier

Rafael Garcia-Suarez via RT wrote:

use strict;
my $x = 0;
my $a = "outer";
my $a = "inner" while ($a++ < 3);

Using the terms "outer" and "inner" is misleading.
There is only one scope here; the statement modifiers don't create any.

Yes, I was meaning in the sense that the second $a overrides the
earlier one. In effect it creates an inner binding contour - that's
just a matter of computer language terminology, and I apologise for
the confusion.

Please ignore that example. The important one is:

  use strict;
  my $x = 1 for (1, 2, 3);
  $x; # Visible after the loop, and undefined.

I found this setting $x to undef counterintuitive, having learned that
this sets $x to a defined value:

  my $x = 1 if 1;
  $x; # Visible after the conditional, and set to 1.

Of course there's nothing technically wrong with the loop semantic,
it's my intuition which is incorrect. The important thing is that
code in a program looked quite innocent: the loop condition was
guaranteed to execute once, and the value _assigned_ to $x could
legitimately be undef, so the code appeared to be working in some
tests when in fact it did not.

Hence the suggestion of a warning for this, or for a scope to be
created by the loop modifiers, or (shock!) for $x to retain the value
it was assigned at the last iteration. I would find the latter most
intuitive, and occasionally useful.

The perlsyn manpage documents this bug, saying that "the behaviour of a
C<my> statement modified with a statement modifier conditional or loop
construct (e.g. C<my $x if ...>) is B<undefined>."

Sneaky. That's easy to forget because it's not something one tries
often, and with "if" it works as I'd expect. I don't know whether
other people's expectations are similar to mine. I have used C<my $x
= calculation if ...> several times in real programs.

Unlike, say, C, in Perl if you try something and it works, and C<use
strict> doesn't complain, then it's usually fine, and any rules you
may get a feel for about evaluation order or "my" scoping are
dependable.

In a language like C it's essential to keep in mind what is defined
and undefined and implementation defined, because it's too easy to
write programs with non-deterministic behavior, which happen to work
in the obvious way on one machine at one phase of the moon.

For example, in C evaluation order in expressions is quite flexible,
function arguments aren't evaluated in order, C<x++> should never be
used in the same expression twice, etc. Contrarily, in Perl you may
use C<shift> several times in an expression and know exactly what it
means.

Perl is much more deterministic, and this allows the language to be
learned without memorising everything from the manuals. If an
expression works in Perl and depends on specific evaluation order or
specific "my" binding scope, it's highly likely that behaviour is
dependable in general.

For example, C<my $timeout = 30 + (my $time = time())> is well
defined in Perl and makes both variables visible to later statements;
C<if (defined (my $x = calculation)) { ... } else { ... }> makes the
variable visible in both arms of the conditional, but not after it.

These rules aren't obvious from the manual; they are learned by trying
things to discover how the language works, and then assuming that it's
consistent about it.

Point being that a mention in the manual isn't half as good as a
decent warning, or C<use strict> complaining, or semantics which
aren't surprising.

End of rant. :)

The plan is to add a deprecation warning for the cases where this
"feature" is abused, notably C<my $x if 0>, and to change the semantics
of this construct in perl 5.12.

What will the new semantics be?

I find the current semantic of C<my $x = calculation if maybe_false>
intuitive, it does exactly what I expect, and it's useful too. My
beef is with the loop ones :)

-- Jamie

@p5pRT
Author

p5pRT commented Mar 9, 2004

From @iabyn

On Tue, Mar 09, 2004 at 06:51:47PM +0000, Jamie Lokier wrote:

The plan is to add a deprecation warning for the cases where this
"feature" is abused, notably C<my $x if 0>, and to change the semantics
of this construct in perl 5.12.

What will the new semantics be?

  my $x = foo if bar

will be equivalent to:

  my $x;
  $x = foo if bar

except that the scope of $x won't expand, e.g. in

  if (my $x = foo) ...

$x still won't be visible outside the loop.

I find the current semantic of C<my $x = calculation if maybe_false>
intuitive, it does exactly what I expect, and it's useful too. My
beef is with the loop ones :)

I doubt that it currently does what you expect:

  $ ./perl -le 'for (1..10) { my $x=10 if $_==1;;print $x; $x=20 }'
  10

  20
  20
  20
  20
  20
  20
  20
  20

Most people find the printing of the value 20 unexpected.
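
For comparison, a sketch of the explicit form described above, assuming the goal is a fresh variable on every iteration. The names @seen and $i are illustrative and do not come from the thread.

```perl
use strict;
use warnings;

my @seen;    # hypothetical helper: records what each iteration observes

for my $i (1..4) {
    my $x;                   # a fresh, undefined $x on every iteration
    $x = 10 if $i == 1;      # the modifier now guards only the assignment
    push @seen, defined $x ? $x : 'undef';
    $x = 20;                 # does not carry over to the next iteration
}

print "@seen\n";    # prints: 10 undef undef undef
```

With the declaration separated from the condition, the value 20 assigned at the end of one iteration is never seen by the next.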

--
Lady Nancy Astor: If you were my husband, I would flavour your coffee
with poison.
Churchill: Madam - if I were your husband, I would drink it.

@p5pRT
Author

p5pRT commented Mar 10, 2004

From @jlokier

Dave Mitchell wrote:

my $x = foo if bar

will be equivalent to:

my $x;
$x = foo if bar

Oh. That's what I thought it was already!

except that the scope of $x won't expand, e.g. in

if (my $x = foo) ...

$x still won't be visible outside the loop.

You mean outside the conditional, I presume.
That's nice, I like it as it is.

By the way, this is inconsistent with the binding of regex results in
$0, $1, ...

Regexes matched in a loop have their result bound inside the loop, but
not outside. For example:

  while (/(.)/) {
  # $1 contains first non-newline character of $_, if any.
  }
  # $1 contains the value it had before the while.

Yet conditionals bind regex results differently:

  if (/(.)/) {
  # $1 contains first non-newline character of $_, if any.
  }
  # $1 contains first non-newline character of $_, if any.

This binding for conditionals is very useful in practice, although
it's a little counterintuitive that regex result scope is different
from "my" scope.

I have occasionally changed an "if (/.../) {...}" to a "while (/.../)
{...}", and thus introduced a not-so-obvious bug into some code that
used the regex result just after the "if".
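
One way to keep such code independent of where $1 remains visible, sketched under the assumption that only the first capture is needed: copy the capture into a lexical at the point of the match. A match in list context returns its captured groups, so nothing has to read $1 after the construct ends. The names $text and $first are illustrative.

```perl
use strict;
use warnings;

my $text = "hello";

# In list context a successful match returns the captured groups,
# so $first is set without reading $1 afterwards.
my ($first) = $text =~ /(.)/;

print defined $first ? "$first\n" : "no match\n";    # prints h
```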

I find the current semantic of C<my $x = calculation if maybe_false>
intuitive, it does exactly what I expect, and it's useful too. My
beef is with the loop ones :)

I doubt that it currently does what you expect:

$ ./perl -le 'for (1..10) { my $x=10 if $_==1;;print $x; $x=20 }'
10

20
20
20
[etc...]

Most people find the printing of the value 20 unexpected.

Oh. I'm surprised too, and I actually use that construct in a few
programs. I didn't realise the programs were buggy. That is
_really_ broken behaviour as I'm sure you agree.

It looks quite similar to the regex bug with lexicals, #26909, which
by the way nobody has replied to (perhaps it's too hard :) The
similarity is that a reference to a lexical variable is reading a
previous value which should be impossible as the dynamic scope has
been left and reentered.

So perhaps the fix to this semantic will help with #26909?

By the way, what happens if $x is assigned an object instead of 20?
Does this mean the object isn't destroyed at the end of the scope of
the loop?

(Does a quick test).

Yikes, the object isn't destroyed when its lexical scope is left
completely. This is more serious than I thought, captain.

$ perl -le 'package X; sub DESTROY { print "destroy" }
{ my $x if 0; $x = bless [],"X"; } print "finished";'

finished
destroy

Get rid of the conditional and it destroys the object when expected:

$ perl -le 'package X; sub DESTROY { print "destroy" }
{ my $x; $x = bless [],"X"; } print "finished";'

destroy
finished
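
The same check can be written as a standalone script. This is only a sketch of the well-defined case with an unconditional declaration; the class name Tracked and the @events log are hypothetical names introduced for illustration.

```perl
use strict;
use warnings;

our @events;    # hypothetical log of what happened, in order

{
    package Tracked;
    sub new     { return bless [], shift }
    sub DESTROY { push @main::events, "destroy" }
}

{
    my $obj;                 # unconditional declaration
    $obj = Tracked->new;
}                            # the last reference goes away here
push @events, "finished";

print "@events\n";    # prints: destroy finished
```

Recording the events in an array rather than printing from DESTROY makes the ordering easy to assert in a test.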

It's good news that it's to be fixed! Will that change fix the regex
lexical bug #26909?

-- Jamie

@p5pRT
Author

p5pRT commented Mar 10, 2004

From @rgs

Jamie Lokier wrote in perl.perl5.porters :

Oh. I'm surprised too, and I actually use that construct in a few
programs. I didn't realise the programs were buggy. That is
_really_ broken behaviour as I'm sure you agree.

The current plan is to fix it in 5.12 and deprecate it in 5.10.

It looks quite similar to the regex bug with lexicals, #26909, which
by the way nobody has replied to (perhaps it's too hard :)

Most strange; I did reply to this bug (mostly to say, "it's not a
bug, and this emits a warning now"), but for some reason RT doesn't
have it.

http://groups.google.com/groups?threadm=20040221225833.6d1dc31c.rgarciasuarez%40free.fr

@p5pRT
Author

p5pRT commented Mar 24, 2004

From @iabyn

On Wed, Mar 10, 2004 at 04:41:39AM +0000, Jamie Lokier wrote:

By the way, this is inconsistent with the binding of regex results in
$0, $1, ...

Regexes matched in a loop have their result bound inside the loop, but
not outside. For example:

while (/(.)/) {
    # $1 contains first non-newline character of $_, if any.
}
# $1 contains the value it had before the while.

Yet conditionals bind regex results differently:

if (/(.)/) {
    # $1 contains first non-newline character of $_, if any.
}
# $1 contains first non-newline character of $_, if any.

This binding for conditionals is very useful in practice, although
it's a little counterintuitive that regex result scope is different
from "my" scope.

I have occasionally changed an "if (/.../) {...}" to a "while (/.../)
{...}", and thus introduced a not-so-obvious bug into some code that
used the regex result just after the "if".

Well, if (X){Y} deliberately has different scope semantics to
while(X){Y}; the loop operators introduce a new scope, but the if()
is treated the same as X && do {Y}, ie the X isn't in a new scope.

Except...

It's broken as regards my declarations within the conditional:

  sub X::DESTROY { print "DESTROY\n" }
  $x = "global";
  {
      if (my $x = bless [], 'X') {
          print "inner: $x\n";
      }
      print "middle: $x\n";
  }
  print "outer: $x\n";

which outputs:

  inner: X=ARRAY(0x817e000)
  middle: global
  DESTROY
  outer: global

There is a difference between compile-time and run-time scope here.
At runtime, the if conditional does not introduce a new scope,
so the lexical isn't freed until the middle section is exited; but
at compile-time, a new scope *is* introduced, so the middle print sees the
global $x rather than the lexical one.

This inconsistency smells like a bug to me. Which behaviour is wrong,
runtime or compile-time, is open to debate, but personally I think it's
the compile-time that's wrong. I don't think it's worth fixing though,
because the backwards compatibility outweighs the benefits (I think).

Dave.

--
A power surge on the Bridge is rapidly and correctly diagnosed as a faulty
capacitor by the highly-trained and competent engineering staff.
  -- Things That Never Happen in "Star Trek" #9

@swade1987

Issues go stale after 60d of inactivity.
