Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PathTools-3.27 triggers a bug in Perl #9343

Closed
p5pRT opened this issue May 23, 2008 · 28 comments
Closed

PathTools-3.27 triggers a bug in Perl #9343

p5pRT opened this issue May 23, 2008 · 28 comments

Comments

@p5pRT
Copy link

p5pRT commented May 23, 2008

Migrated from rt.perl.org#54728 (status was 'open')

Searchable as RT54728$

@p5pRT
Copy link
Author

p5pRT commented May 23, 2008

From @jandubois

The update to PathTools-3.27 triggers a bug in the localization of $1​:

#!perl
use File​::Spec​::Win32 ();

print File​::Spec​::Win32->catfile("c​:/foo", "bar"), "\n";
"bar" =~ m/(.*)/;
print File​::Spec​::Win32->catfile("c​:/foo", $1), "\n";
"bar" =~ m/(.*)/;
print File​::Spec​::Win32->catfile("c​:/foo", "$1"), "\n";
__END__

With PathTools-3.27​:
C​:\foo\bar
C​:\foo\c​:
C​:\foo\bar

With older version of PathTools​:
C​:\foo\bar
C​:\foo\bar
C​:\foo\bar

The problem seems to be that $1 isn't properly localized. A simple
workaround in File​::Spec​::Win32 is to stringify @​_ in _canon_cat(),
e.g.

sub _canon_cat(@​)
{
  @​_ = map "$_", @​_;
  my $first = shift;

But it would be nicer if the problem can be solved in a more general
manner.

I encountered this problem when I was testing Pod​::Simple with
5.8.9-tobe, which contains this code triggering the problem
in Pod​::Simple​::HTMLBatch​:

  if( ref($chunk->[-1]) and $url =~ m{^(_[-a-z0-9_]+\.css$)} ) {
  $outfile = $self->filespecsys->catfile( $outdir, $1 );

Cheers,
-Jan

@p5pRT
Copy link
Author

p5pRT commented May 24, 2008

From @rgs

2008/5/23 via RT Jan Dubois <perlbug-followup@​perl.org>​:

The update to PathTools-3.27 triggers a bug in the localization of $1​:

#!perl
use File​::Spec​::Win32 ();

print File​::Spec​::Win32->catfile("c​:/foo", "bar"), "\n";
"bar" =~ m/(.*)/;
print File​::Spec​::Win32->catfile("c​:/foo", $1), "\n";
"bar" =~ m/(.*)/;
print File​::Spec​::Win32->catfile("c​:/foo", "$1"), "\n";
__END__

With PathTools-3.27​:
C​:\foo\bar
C​:\foo\c​:
C​:\foo\bar

Does that happen also with older perls ?

@p5pRT
Copy link
Author

p5pRT commented May 24, 2008

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented May 24, 2008

From @jandubois

On Sat, 24 May 2008, Rafael Garcia-Suarez wrote​:

2008/5/23 via RT Jan Dubois <perlbug-followup@​perl.org>​:

The update to PathTools-3.27 triggers a bug in the localization of $1​:

#!perl
use File​::Spec​::Win32 ();

print File​::Spec​::Win32->catfile("c​:/foo", "bar"), "\n";
"bar" =~ m/(.*)/;
print File​::Spec​::Win32->catfile("c​:/foo", $1), "\n";
"bar" =~ m/(.*)/;
print File​::Spec​::Win32->catfile("c​:/foo", "$1"), "\n";
__END__

With PathTools-3.27​:
C​:\foo\bar
C​:\foo\c​:
C​:\foo\bar

Does that happen also with older perls ?

Yes, it happens with 5.6.1 and 5.8.0 too.

All you have to do to reproduce it is copy the lib/File/Spec/Win32.pm
into an older Perl to test this. The problem itself is not Windows
specific; the test program above fails the same way on Linux too.

Cheers,
-Jan

@p5pRT
Copy link
Author

p5pRT commented Aug 4, 2013

From @jkeenan

On Sat May 24 07​:52​:30 2008, jdb wrote​:

On Sat, 24 May 2008, Rafael Garcia-Suarez wrote​:

2008/5/23 via RT Jan Dubois <perlbug-followup@​perl.org>​:

The update to PathTools-3.27 triggers a bug in the localization of $1​:

#!perl
use File​::Spec​::Win32 ();

print File​::Spec​::Win32->catfile("c​:/foo", "bar"), "\n";
"bar" =~ m/(.*)/;
print File​::Spec​::Win32->catfile("c​:/foo", $1), "\n";
"bar" =~ m/(.*)/;
print File​::Spec​::Win32->catfile("c​:/foo", "$1"), "\n";
__END__

With PathTools-3.27​:
C​:\foo\bar
C​:\foo\c​:
C​:\foo\bar

Does that happen also with older perls ?

Yes, it happens with 5.6.1 and 5.8.0 too.

All you have to do to reproduce it is copy the lib/File/Spec/Win32.pm
into an older Perl to test this. The problem itself is not Windows
specific; the test program above fails the same way on Linux too.

Cheers,
-Jan

Path-Tools is a CPAN distribution. In the Perl 5 core distribution,
File​::Spec appears under 'dist/Cwd/'. So I don't see why we should be
concerned with Path-Tools per se.

Be that as it may, I can't seem to reproduce this on Perl 5.18.0 on
Unix-ish systems (Darwin/PPC; Linux/i386)​:

#####
$ cat 54728_catfile.pl
use File​::Spec ();

print File​::Spec->catfile("c​:/foo", "bar"), "\n";
"bar" =~ m/(.*)/;
print File​::Spec->catfile("c​:/foo", $1), "\n";
"bar" =~ m/(.*)/;
print File​::Spec->catfile("c​:/foo", "$1"), "\n";
#####
$ perl 54728_catfile.pl
c​:/foo/bar
c​:/foo/bar
c​:/foo/bar
#####

Is this ticket closable?

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Aug 4, 2013

From @cpansprout

On Sat Aug 03 17​:05​:29 2013, jkeenan wrote​:

Path-Tools is a CPAN distribution. In the Perl 5 core distribution,
File​::Spec appears under 'dist/Cwd/'. So I don't see why we should be
concerned with Path-Tools per se.

dist/Cwd *is* PathTools. blead is upstream. The CPAN distribution name
happens to be different from the directory name in blead. I don’t know
the history of that.

Bugs relating to PathTools modules in general can go in this tracker.
Bugs relating to the CPAN (e.g., the Makefile.PL, which is not in blead)
packaging go in the CPAN queue. At least that’s how I report them. In
actuality, both categories of bugs end up in both queues.

Yes, it’s a mess.

(And I don’t know why the CPAN Makefile.PL cannot be in core.)

Is this ticket closable?

I don’t know the answer to that.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 4, 2013

From @cpansprout

On Sat Aug 03 17​:29​:47 2013, sprout wrote​:

On Sat Aug 03 17​:05​:29 2013, jkeenan wrote​:

Path-Tools is a CPAN distribution. In the Perl 5 core distribution,
File​::Spec appears under 'dist/Cwd/'. So I don't see why we should be
concerned with Path-Tools per se.

dist/Cwd *is* PathTools. blead is upstream. The CPAN distribution name
happens to be different from the directory name in blead. I don’t know
the history of that.

Bugs relating to PathTools modules in general can go in this tracker.
Bugs relating to the CPAN (e.g., the Makefile.PL, which is not in blead)
packaging go in the CPAN queue. At least that’s how I report them. In
actuality, both categories of bugs end up in both queues.

Yes, it’s a mess.

Sorry, that was meant to be an informative note, but it turned into a rant.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 4, 2013

From victor@vsespb.ru

Is this ticket closable?

I don’t know the answer to that.

tested on linux, perl 5.10.1

bug persists with PathTools 3.27 3.26
no bug in 3.2501 3.29

diff​:
https://metacpan.org/diff/release/KWILLIAMS/PathTools-3.27/SMUELLER/PathTools-3.29#lib/File/Spec/Win32.pm

seems bug is fixed, there is changelog entry​:

https://metacpan.org/source/SMUELLER/PathTools-3.29/Changes#L24

related tickets​:

http​://perlmonks.org/?node_id=671025
https://rt.cpan.org/Public/Bug/Display.html?id=33675

@p5pRT
Copy link
Author

p5pRT commented Aug 4, 2013

From @chorny

This ticket can be viewed as two tickets​:
1. Bug in PathTools (fixed)
2. This problem​:
#!perl

"2" =~ m/(.*)/;
print test1("1", $1), "\n";

sub test1 {
  'aa' =~ m/(.)/;
  return $_[0].'/'.$_[1];
}
prints "1/a" on perl 5.18.0.

Can it be considered a bug?

2013/8/4 James E Keenan via RT <perlbug-followup@​perl.org>​:

On Sat May 24 07​:52​:30 2008, jdb wrote​:

On Sat, 24 May 2008, Rafael Garcia-Suarez wrote​:

2008/5/23 via RT Jan Dubois <perlbug-followup@​perl.org>​:

The update to PathTools-3.27 triggers a bug in the localization of $1​:

#!perl
use File​::Spec​::Win32 ();

print File​::Spec​::Win32->catfile("c​:/foo", "bar"), "\n";
"bar" =~ m/(.*)/;
print File​::Spec​::Win32->catfile("c​:/foo", $1), "\n";
"bar" =~ m/(.*)/;
print File​::Spec​::Win32->catfile("c​:/foo", "$1"), "\n";
__END__

With PathTools-3.27​:
C​:\foo\bar
C​:\foo\c​:
C​:\foo\bar

Does that happen also with older perls ?

Yes, it happens with 5.6.1 and 5.8.0 too.

All you have to do to reproduce it is copy the lib/File/Spec/Win32.pm
into an older Perl to test this. The problem itself is not Windows
specific; the test program above fails the same way on Linux too.

Cheers,
-Jan

Path-Tools is a CPAN distribution. In the Perl 5 core distribution,
File​::Spec appears under 'dist/Cwd/'. So I don't see why we should be
concerned with Path-Tools per se.

Be that as it may, I can't seem to reproduce this on Perl 5.18.0 on
Unix-ish systems (Darwin/PPC; Linux/i386)​:

#####
$ cat 54728_catfile.pl
use File​::Spec ();

print File​::Spec->catfile("c​:/foo", "bar"), "\n";
"bar" =~ m/(.*)/;
print File​::Spec->catfile("c​:/foo", $1), "\n";
"bar" =~ m/(.*)/;
print File​::Spec->catfile("c​:/foo", "$1"), "\n";
#####
$ perl 54728_catfile.pl
c​:/foo/bar
c​:/foo/bar
c​:/foo/bar
#####

Is this ticket closable?

Thank you very much.
Jim Keenan

---
via perlbug​: queue​: perl5 status​: open
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=54728

--
Alexandr Ciornii, http​://chorny.net

@p5pRT
Copy link
Author

p5pRT commented Aug 4, 2013

From @mauke

On 04.08.2013 15​:12, Alexandr Ciornii wrote​:

This ticket can be viewed as two tickets​:
1. Bug in PathTools (fixed)
2. This problem​:
#!perl

"2" =~ m/(.*)/;
print test1("1", $1), "\n";

sub test1 {
'aa' =~ m/(.)/;
return $_[0].'/'.$_[1];
}
prints "1/a" on perl 5.18.0.

Can it be considered a bug?

I don't see it as a bug.

1) "2" =~ m/(.*)/ sets $1 to "2".
2) test1("1", $1) calls test1 with "1" and $1.
3) 'aa' =~ m/(.)/ sets $1 to "a".
4) $_[0].'/'.$_[1] builds a string from the arguments passed to test1.
  Perl passes arguments by alias.
  $_[0] is an alias to "1".
  $_[1] is an alias to $1, which currently contains "a".
5) Thus test1 returns "1/a".
6) Match variables are implicitly localized, so $1 reverts to "2" here.
7) "1/a\n" is printed.

--
Lukas Mai <plokinom@​gmail.com>

@p5pRT
Copy link
Author

p5pRT commented Aug 4, 2013

From @jkeenan

On Sun Aug 04 06​:14​:16 2013, chorny wrote​:

This ticket can be viewed as two tickets​:
1. Bug in PathTools (fixed)
2. This problem​:
#!perl

"2" =~ m/(.*)/;
print test1("1", $1), "\n";

sub test1 {
'aa' =~ m/(.)/;
return $_[0].'/'.$_[1];
}
prints "1/a" on perl 5.18.0.

Can it be considered a bug?

No, I don't think so.

With this line​:

'aa' =~ m/(.)/;

... you have made a new assignment to global variable $1. Hence, you
have changed the value to which $_[1] is aliased. From
'pod/perlsub.pod' (blead)​:

#####
Any arguments passed in show up in the array @​_.
Therefore, if you called a function with two
arguments, those would be stored in $_[0] and
$_[1]. The array @​_ is a local array, but its
elements are aliases for the actual scalar
parameters. In particular, if an element $_[0] is
updated, the corresponding argument is updated (or
an error occurs if it is not updatable).
#####

I'll take this ticket for the purpose of closing it within 7 days unless
someone thinks we still have a bug in Perl.

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Aug 4, 2013

From victor@vsespb.ru

Let's try replace "$1" with $x in your example and explanation (and
localize it manually)​:

########
our $x;

$x = 2;
print test1("1", $x), "\n";

sub test1 {
local $x;
$x = 'a';
return $_[0].'/'.$_[1];
}
########

1) $x=2 sets $x to "2".
2) test1("1", $1) calls test1 with "1" and $x.
3) $x = 'a' sets $x to "a".
4) $_[0].'/'.$_[1] builds a string from the arguments passed to test1.
Perl passes arguments by alias.
$_[0] is an alias to "1".
$_[1] is an alias to $x, which currently contains "a".
5) Thus test1 returns "1/a".
6) $x is explicitly localized, so $x reverts to "2" here.
7) "1/a\n" is printed.

#########

However it's not true. This example with "$x" prints 1/2

Anyway, even if we find it's not a bug, it's unique situation and can be
documented (btw who is responsible for that kind of side effects -
caller or subroutine?)

On Sun Aug 04 06​:52​:21 2013, plokinom@​gmail.com wrote​:

On 04.08.2013 15​:12, Alexandr Ciornii wrote​:

This ticket can be viewed as two tickets​:
1. Bug in PathTools (fixed)
2. This problem​:
#!perl

"2" =~ m/(.*)/;
print test1("1", $1), "\n";

sub test1 {
'aa' =~ m/(.)/;
return $_[0].'/'.$_[1];
}
prints "1/a" on perl 5.18.0.

Can it be considered a bug?

I don't see it as a bug.

1) "2" =~ m/(.*)/ sets $1 to "2".
2) test1("1", $1) calls test1 with "1" and $1.
3) 'aa' =~ m/(.)/ sets $1 to "a".
4) $_[0].'/'.$_[1] builds a string from the arguments passed to test1.
Perl passes arguments by alias.
$_[0] is an alias to "1".
$_[1] is an alias to $1, which currently contains "a".
5) Thus test1 returns "1/a".
6) Match variables are implicitly localized, so $1 reverts to "2" here.
7) "1/a\n" is printed.

@p5pRT
Copy link
Author

p5pRT commented Aug 4, 2013

From victor@vsespb.ru

So, it looks like a bug to me.

1) two explanation above, with aliasing and localized global vars does
not apply for real localized glabal vars.

2) Sometimes subroutines just cannot copy arguments ( my ($a, $b) = @​_ )
due to performance reasons (COW can help here?)

And if it uses regexps, there is no way to prevent such issues, except
documenting this in subroutine interface documentation.
(even my ($m1, $m2) = /(...)(.)/ doest not help. enclosing each regexp
in { } probably helps, unless it uses one of aliased arguments )

On Sun Aug 04 11​:13​:19 2013, vsespb wrote​:

Let's try replace "$1" with $x in your example and explanation (and
localize it manually)​:

########
our $x;

$x = 2;
print test1("1", $x), "\n";

sub test1 {
local $x;
$x = 'a';
return $_[0].'/'.$_[1];
}
########

1) $x=2 sets $x to "2".
2) test1("1", $1) calls test1 with "1" and $x.
3) $x = 'a' sets $x to "a".
4) $_[0].'/'.$_[1] builds a string from the arguments passed to test1.
Perl passes arguments by alias.
$_[0] is an alias to "1".
$_[1] is an alias to $x, which currently contains "a".
5) Thus test1 returns "1/a".
6) $x is explicitly localized, so $x reverts to "2" here.
7) "1/a\n" is printed.

#########

However it's not true. This example with "$x" prints 1/2

Anyway, even if we find it's not a bug, it's unique situation and can be
documented (btw who is responsible for that kind of side effects -
caller or subroutine?)

On Sun Aug 04 06​:52​:21 2013, plokinom@​gmail.com wrote​:

On 04.08.2013 15​:12, Alexandr Ciornii wrote​:

This ticket can be viewed as two tickets​:
1. Bug in PathTools (fixed)
2. This problem​:
#!perl

"2" =~ m/(.*)/;
print test1("1", $1), "\n";

sub test1 {
'aa' =~ m/(.)/;
return $_[0].'/'.$_[1];
}
prints "1/a" on perl 5.18.0.

Can it be considered a bug?

I don't see it as a bug.

1) "2" =~ m/(.*)/ sets $1 to "2".
2) test1("1", $1) calls test1 with "1" and $1.
3) 'aa' =~ m/(.)/ sets $1 to "a".
4) $_[0].'/'.$_[1] builds a string from the arguments passed to test1.
Perl passes arguments by alias.
$_[0] is an alias to "1".
$_[1] is an alias to $1, which currently contains "a".
5) Thus test1 returns "1/a".
6) Match variables are implicitly localized, so $1 reverts to "2" here.
7) "1/a\n" is printed.

@p5pRT
Copy link
Author

p5pRT commented Aug 5, 2013

From @iabyn

On Sun, Aug 04, 2013 at 04​:28​:32PM -0700, Victor Efimov via RT wrote​:

So, it looks like a bug to me.

1) two explanation above, with aliasing and localized global vars does
not apply for real localized glabal vars.

Except that $1 etc are *not* localised. The thing that is localised is the
current match (PL_curpm internally). $1 et al are just magic vars that
retrieve values from the current match on request.

--
Spock (or Data) is fired from his high-ranking position for not being able
to understand the most basic nuances of about one in three sentences that
anyone says to him.
  -- Things That Never Happen in "Star Trek" #19

@p5pRT
Copy link
Author

p5pRT commented Aug 5, 2013

From victor@vsespb.ru

On Mon Aug 05 03​:51​:38 2013, davem wrote​:

On Sun, Aug 04, 2013 at 04​:28​:32PM -0700, Victor Efimov via RT wrote​:

So, it looks like a bug to me.

1) two explanation above, with aliasing and localized global vars does
not apply for real localized glabal vars.

Except that $1 etc are *not* localised. The thing that is localised is the
current match (PL_curpm internally). $1 et al are just magic vars that
retrieve values from the current match on request.

Isn't it unclearly documented?

(both cases declared as "dynamically-scoped")

http​://perldoc.perl.org/perlvar.html
########
Most of the special variables related to regular expressions are side
effects. Perl sets these variables when it has a successful match, so
you should check the match result before using them. For instance​:
These variables are read-only and dynamically-scoped, unless we note
otherwise.
The dynamic nature of the regular expression variables means that their
value is limited to the block that they are in, as demonstrated by this
bit of code​:
########

http​://perldoc.perl.org/perlsub.html
########
A local modifies its listed variables to be "local" to the enclosing
block, eval, or do FILE --and to any subroutine called from within that
block. A local just gives temporary values to global (meaning package)
variables. It does not create a local variable. This is known as dynamic
scoping.
########

Also, isn't this a misfeature?

Seems we're going drop things like empty regexps
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=119095 (although it is
documented). And this issue much more useless and unclear imho.

@p5pRT
Copy link
Author

p5pRT commented Aug 6, 2013

From @iabyn

On Mon, Aug 05, 2013 at 08​:42​:48AM -0700, Victor Efimov via RT wrote​:

On Mon Aug 05 03​:51​:38 2013, davem wrote​:

On Sun, Aug 04, 2013 at 04​:28​:32PM -0700, Victor Efimov via RT wrote​:

So, it looks like a bug to me.

1) two explanation above, with aliasing and localized global vars does
not apply for real localized glabal vars.

Except that $1 etc are *not* localised. The thing that is localised is the
current match (PL_curpm internally). $1 et al are just magic vars that
retrieve values from the current match on request.

Isn't it unclearly documented?

Quite possibly

(both cases declared as "dynamically-scoped")

That's because they're both dynamically (as opposed to lexically) scoped.

Also, isn't this a misfeature?

I don't see why.

Seems we're going drop things like empty regexps
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=119095 (although it is
documented). And this issue much more useless and unclear imho.

If we were to set every capture var ($1,$2,...) on every match, rather than
just on demand, regexes would be an awful lot slower.

--
Never work with children, animals, or actors.

@p5pRT
Copy link
Author

p5pRT commented Aug 6, 2013

From victor@vsespb.ru

On Tue Aug 06 04​:24​:55 2013, davem wrote​:

On Mon, Aug 05, 2013 at 08​:42​:48AM -0700, Victor Efimov via RT wrote​:

Also, isn't this a misfeature?

I don't see why.

a) is there a valid case when user indeed wants 'aa' =~ m/(.)/ to modify
$_[1], in code like this?

"2" =~ m/(.*)/;
print test1("1", $1), "\n";

sub test1 {
'aa' =~ m/(.)/;
return $_[0].'/'.$_[1];
}

b) I think using regexps with capture groups, before using $_[], or @​_,
or &othersub; without expecting side effects is much more usable than
this feature.

@p5pRT
Copy link
Author

p5pRT commented Aug 6, 2013

From victor@vsespb.ru

I've tried to grep CPAN to find examples of use of regexps before @​_

so here they are​:

## bug​:

https://metacpan.org/source/ADAMK/Params-Util-1.07/lib/Params/Util.pm#L191
(https://rt.cpan.org/Public/Bug/Display.html?id=87649
https://rt.cpan.org/Public/Bug/Display.html?id=81276)

## possible bug​:
https://metacpan.org/source/FLORA/NEXT-0.65/lib/NEXT.pm#L147
https://metacpan.org/source/ADAMK/pip-1.19/lib/pip.pm#L99

## not affected, because no capture groups​:
https://metacpan.org/source/PJF/autodie-2.20/lib/Fatal.pm#L194
https://metacpan.org/source/MELO/Redis-1.961/lib/Redis.pm#L101
https://metacpan.org/source/FLORA/bignum-0.32/lib/bigint.pm#L34
https://metacpan.org/source/TOBYINK/Test-Tabs-0.003/lib/Test/Tabs.pm#L148
https://metacpan.org/source/GAAS/Data-Dump-1.22/lib/Data/Dump.pm#L86
https://metacpan.org/source/STBEY/Date-Calc-6.3/lib/Date/Calendar/Year.pm#L87

## not affected​: private subroutine, no possibility to call with $1, etc
https://metacpan.org/source/ADAMK/pip-1.19/lib/pip.pm#L99
https://metacpan.org/source/FLORA/bignum-0.32/lib/bigint.pm#L205
https://metacpan.org/source/GAAS/HTTP-Cookies-6.01/lib/HTTP/Cookies.pm#L598

On Tue Aug 06 04​:42​:19 2013, vsespb wrote​:

On Tue Aug 06 04​:24​:55 2013, davem wrote​:

On Mon, Aug 05, 2013 at 08​:42​:48AM -0700, Victor Efimov via RT wrote​:

Also, isn't this a misfeature?

I don't see why.

a) is there a valid case when user indeed wants 'aa' =~ m/(.)/ to modify
$_[1], in code like this?

"2" =~ m/(.*)/;
print test1("1", $1), "\n";

sub test1 {
'aa' =~ m/(.)/;
return $_[0].'/'.$_[1];
}

b) I think using regexps with capture groups, before using $_[], or @​_,
or &othersub; without expecting side effects is much more usable than
this feature.

@p5pRT
Copy link
Author

p5pRT commented Aug 6, 2013

From @iabyn

On Tue, Aug 06, 2013 at 04​:42​:20AM -0700, Victor Efimov via RT wrote​:

On Tue Aug 06 04​:24​:55 2013, davem wrote​:

On Mon, Aug 05, 2013 at 08​:42​:48AM -0700, Victor Efimov via RT wrote​:

Also, isn't this a misfeature?

I don't see why.

a) is there a valid case when user indeed wants 'aa' =~ m/(.)/ to modify
$_[1], in code like this?

"2" =~ m/(.*)/;
print test1("1", $1), "\n";

sub test1 {
'aa' =~ m/(.)/;
return $_[0].'/'.$_[1];
}

Probably not, but in the same way that the user probably doesn't want

  sub f { $_ = 1 }

to modify $_[0], but it will if called as

  f($_);

In what ever way that $1 et al are implemented, there will be strange edge
cases where it doesn't do what people expect. If we change it, it will
just swap one set of edge cases for another. By not swapping, we at least
a) don't break code which relies on the current behaviour;
b) mange to do implement it in an efficient way.

b) I think using regexps with capture groups, before using $_[], or @​_,
or &othersub; without expecting side effects is much more usable than
this feature.

I think any code that passes raw global 'magic' vars as args to subs (e.g.
without stringifying them a la "$1"), is asking for trouble​:

  foo($1, $+[0], $_, $., $!);

--
The Enterprise is involved in a bizarre time-warp experience which is in
some way unconnected with the Late 20th Century.
  -- Things That Never Happen in "Star Trek" #14

@p5pRT
Copy link
Author

p5pRT commented Aug 6, 2013

From victor@vsespb.ru

On Tue Aug 06 06​:32​:12 2013, davem wrote​:

Probably not, but in the same way that the user probably doesn't want
sub f { $_ = 1 }
to modify $_[0], but it will if called as
f($_);

if $_ is localized, then code

sub f { local $_ = 1; print $_[0] }
$_=2;
f($_);
print $_;

prints 22, so $_[0] is not modified.

is subroutine intention is to modify global $_, it should be documented,
and modified $_ and even $_[0] is expected behaviour.

subroutine that does not localize $_ and does not document this,
subroutine is buggy imho.

I think any code that passes raw global 'magic' vars as args to subs (e.g.
without stringifying them a la "$1"), is asking for trouble​:

Ok, get it. So caller responsible for this. Then it's probably thing to
report as feature request for perlcritic.

@p5pRT
Copy link
Author

p5pRT commented Aug 6, 2013

From @jkeenan

On Sun Aug 04 06​:56​:58 2013, jkeenan wrote​:

On Sun Aug 04 06​:14​:16 2013, chorny wrote​:

This ticket can be viewed as two tickets​:
1. Bug in PathTools (fixed)
2. This problem​:
#!perl

"2" =~ m/(.*)/;
print test1("1", $1), "\n";

sub test1 {
'aa' =~ m/(.)/;
return $_[0].'/'.$_[1];
}
prints "1/a" on perl 5.18.0.

Can it be considered a bug?

No, I don't think so.

With this line​:

'aa' =~ m/(.)/;

... you have made a new assignment to global variable $1. Hence, you
have changed the value to which $_[1] is aliased. From
'pod/perlsub.pod' (blead)​:

#####
Any arguments passed in show up in the array @​_.
Therefore, if you called a function with two
arguments, those would be stored in $_[0] and
$_[1]. The array @​_ is a local array, but its
elements are aliases for the actual scalar
parameters. In particular, if an element $_[0] is
updated, the corresponding argument is updated (or
an error occurs if it is not updatable).
#####

I'll take this ticket for the purpose of closing it within 7 days unless
someone thinks we still have a bug in Perl.

Since there has been extensive back-and-forth in this RT since Sunday, I
can no longer take the responsibility of closing it. So I'm reassigning
it to the ever-popular Nobody.

Nonetheless, I still don't believe there is a bug in Perl here.

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Aug 7, 2013

From victor@vsespb.ru

1) How about notice in perlvar​:
"It's not advised to pass directly regular expressions variables to
function as arguments, because function can modify it before use. Use
stringification instead​: somefunc("$1")"

before this paragraph​:

"Due to an unfortunate accident of Perl's implementation, use English
imposes a considerable performance penalty on all regular expression
matches"

(I am not submitting patch, because almost sure exact wording can be
improved)

2) Would it be useful if I try to improve tests for this (probably
op/local.t, op/magic.t) to test current behaviour and submit a patch?

On Tue Aug 06 06​:44​:56 2013, vsespb wrote​:

On Tue Aug 06 06​:32​:12 2013, davem wrote​:

Probably not, but in the same way that the user probably doesn't want
sub f { $_ = 1 }
to modify $_[0], but it will if called as
f($_);

if $_ is localized, then code

sub f { local $_ = 1; print $_[0] }
$_=2;
f($_);
print $_;

prints 22, so $_[0] is not modified.

is subroutine intention is to modify global $_, it should be documented,
and modified $_ and even $_[0] is expected behaviour.

subroutine that does not localize $_ and does not document this,
subroutine is buggy imho.

I think any code that passes raw global 'magic' vars as args to subs
(e.g.
without stringifying them a la "$1"), is asking for trouble​:

Ok, get it. So caller responsible for this. Then it's probably thing to
report as feature request for perlcritic.

@p5pRT
Copy link
Author

p5pRT commented Aug 9, 2013

From @epa

IMHO, this is a longstanding misfeature in Perl. It is easy to write

  our $options = 'debug';
  sub foo {
  say 'debug flag set' if $options =~ /debug/;
  say $_[0] + $_[1];
  }

And it is also easy to write

  my $x = '123';
  $x =~ /(\d+)/ or die;
  foo(5, $1);

No warning is given in either case. So, depending on which way you look at
it, you have a subroutine foo() which is buggy if called with $1, or you have
some buggy user code calling it, depending on the details of how foo() is
implemented.

It would be far better if $1 were truly localized, so that the regexp match
inside foo() did not have any effect on the $1 used by the calling code
(which has been passed by reference).

It would even be better, or at least easier to understand, if $1 were global.
Then the puzzled programmer could see that calling foo() has modified $1 as
a side effect.

But at present $1 exists in a strange halfway house where it is neither truly
global nor truly local. It's a global variable where Perl shuffles around the
value behind the scenes; but this does not give the same behaviour as​:

  our $o;
  sub foo {
  local $o = 'a';
  say $_[0] + $_[1];
  }
  local $o = 123;
  foo(5, $o);

Even though foo() is setting the value of $o, this doesn't break the
argument passing because $o is localized. If $1 were localized properly
then it wouldn't break either.

--
Ed Avis <eda@​waniasset.com>

@p5pRT
Copy link
Author

p5pRT commented Aug 12, 2013

From @iabyn

On Fri, Aug 09, 2013 at 10​:26​:37AM +0000, Ed Avis wrote​:

But at present $1 exists in a strange halfway house where it is neither truly
global nor truly local. It's a global variable where Perl shuffles around the
value behind the scenes;

In this respect, $1 is no different than for example $. :

  sub f {
  print "$_[0]\n"; # prints 1
  open my $fh, $0 or die;
  <$fh>; <$fh>;
  print "$_[0]\n"; # prints 2
  }

  open my $fh, $0 or die;
  <$fh>;
  f($.);

but this does not give the same behaviour as​:

our $o;
sub foo \{
    local $o = 'a';
    say $\_\[0\] \+ $\_\[1\];
\}
local $o = 123;
foo\(5\, $o\);

Even though foo() is setting the value of $o, this doesn't break the
argument passing because $o is localized. If $1 were localized properly
then it wouldn't break either.

Well yes, but for a successful match, this would involve​:

* for $1​:
  * save the existing scalar slot of *1 on the savestack;
  * create a new SV and copy the relevant substring of the match string
  to it;
  * put that new SV in the scalar slot of *1;
  * on scope exit, undo the above and free the SV;
* ditto for $2, and any other captures;
* possibly ditto for $`, $&amp;, $&;
* possibly ditto for ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH};
* for @​-​:
  similarly localise, creating a new array with new indices;
* something similar for @​+, %-, %+;
* possibly something similar for $+, $^N, $^R.

Which collectively would impose a significant performance penalty for each
successful match.

Potentially these vars could remain magic, but when localised, a new var with
the same magic attached could be created, where the magic points to a
different match object (i.e. PL_curpm). This would avoid having to copy
the relevant substring to each var until needed, but would still involve
lots of SV creation and destruction and typeglob+savestack activity for
each successful match.

--
"Strange women lying in ponds distributing swords is no basis for a system
of government. Supreme executive power derives from a mandate from the
masses, not from some farcical aquatic ceremony."
  -- Dennis, "Monty Python and the Holy Grail"

@p5pRT
Copy link
Author

p5pRT commented Aug 20, 2013

From @epa

Dave Mitchell <davem <at> iabyn.com> writes​:

But at present $1 exists in a strange halfway house where it is neither
truly global nor truly local.

You are right to note that $. and other punctuation variables have the
same behaviour.

I suggested that $1 should be 'localized properly' but you noted that this
might involve quite a performance penalty for every regexp match.

A simpler answer, I suggest, is that when variables like $1 and $. are passed
as arguments to subroutines, a copy be made in a read-only temporary scalar.
That would get rid of these perennial bug reports or surprises. How about it?

--
Ed Avis <eda@​waniasset.com>

@p5pRT
Copy link
Author

p5pRT commented Aug 20, 2013

From victor@vsespb.ru

It looks to me that example with $. is irrelevant here. $. is not localized
(unlike $1)

sub f {
  open my $fh, $0 or die;
  <$fh>; <$fh>;
  print $.;
}

open my $fh, $0 or die;
<$fh>;
print $.;
f();
print $.;

__END__
prints 122

2013/8/20 Ed Avis <eda@​waniasset.com>

Dave Mitchell <davem <at> iabyn.com> writes​:

But at present $1 exists in a strange halfway house where it is neither
truly global nor truly local.

You are right to note that $. and other punctuation variables have the
same behaviour.

I suggested that $1 should be 'localized properly' but you noted that this
might involve quite a performance penalty for every regexp match.

A simpler answer, I suggest, is that when variables like $1 and $. are
passed
as arguments to subroutines, a copy be made in a read-only temporary
scalar.
That would get rid of these perennial bug reports or surprises. How about
it?

--
Ed Avis <eda@​waniasset.com>

@toddr
Copy link
Member

toddr commented Feb 4, 2020

I suggest closing this ticket. I get that it is 2 issues and the first related to the title has been addressed. What I suggest is that if localizing $1 is important then that sounds like a new issue that should be discussed in its own issue.

@toddr toddr added the Closable? We might be able to close this ticket, but we need to check with the reporter label Feb 4, 2020
@jkeenan
Copy link
Contributor

jkeenan commented Feb 5, 2020

I suggest closing this ticket. I get that it is 2 issues and the first related to the title has been addressed. What I suggest is that if localizing $1 is important then that sounds like a new issue that should be discussed in its own issue.

Yes. Anyone who believes that localizing $1 is important should open a new issue. Closing this one. Thank you very much.

@jkeenan jkeenan closed this as completed Feb 5, 2020
@jkeenan jkeenan removed the Closable? We might be able to close this ticket, but we need to check with the reporter label Feb 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants