require parses barewords strangely #11824

Open
p5pRT opened this issue Dec 25, 2011 · 9 comments

@p5pRT

p5pRT commented Dec 25, 2011

Migrated from rt.perl.org#107004 (status was 'open')

Searchable as RT107004$

@p5pRT

p5pRT commented Dec 25, 2011

From @cpansprout

When require is followed by a bareword, if that bareword is followed by an operator that has higher precedence than require, it falls back to treating its argument as a normal expression, but not quite.

Here is the special behaviour:

$ perl5.15.5 -MO=Concise -e 'require a::b'
5 <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
4 <1> require sK/1 ->5
3 <$> const[PV "a/b.pm"] s/BARE ->4
-e syntax OK

Notice the "a/b.pm".

Now if we put ‘. 1’ after it (+1 doesn’t work, because of bug #105924):

$ perl5.15.5 -MO=Concise -we 'require a::b . 1'
7 <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
6 <1> require sK/1 ->7
5 <2> concat[t1] sK/2 ->6
3 <$> const[PV "a::b"] s/BARE ->4
4 <$> const[IV 1] s ->5
-e syntax OK

That I can understand, as a::b is a string in the absence of any other interpretation, ‘a::b . 1’ being the argument to require, which is more than a single bareword.

But if there is a subroutine named a::b, things get strange:

$ perl5.15.5 -MO=Concise -we 'sub a::b; require a::b . 1'
7 <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
6 <1> require sK/1 ->7
5 <2> concat[t1] sK/2 ->6
3 <$> const[PV "a::b"] s/BARE ->4
4 <$> const[IV 1] s ->5
-e syntax OK

The a::b should be a subroutine call.

This may be related to #36333.


Flags:
  category=core
  severity=low


Site configuration information for perl 5.15.5:

Configured by sprout at Sun Dec 18 11:26:14 PST 2011.

Summary of my perl5 (revision 5 version 15 subversion 5) configuration:
  Snapshot of: 5dca8ed
  Platform:
  osname=darwin, osvers=10.5.0, archname=darwin-2level
  uname='darwin pint.local 10.5.0 darwin kernel version 10.5.0: fri nov 5 23:20:39 pdt 2010; root:xnu-1504.9.17~1release_i386 i386 '
  config_args='-de -Dusedevel -DDEBUGGING=-g'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=undef, usemultiplicity=undef
  useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
  use64bitint=undef, use64bitall=undef, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler:
  cc='cc', ccflags ='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
  optimize='-O3 -g',
  cppflags='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.2.1 (Apple Inc. build 5664)', gccosandvers=''
  intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries:
  ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /usr/lib
  libs=-ldbm -ldl -lm -lutil -lc
  perllibs=-ldl -lm -lutil -lc
  libc=, so=dylib, useshrplib=false, libperl=libperl.a
  gnulibc_version=''
  Dynamic Linking:
  dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
  cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib -fstack-protector'

Locally applied patches:


@INC for perl 5.15.5:
  /usr/local/lib/perl5/site_perl/5.15.5/darwin-2level
  /usr/local/lib/perl5/site_perl/5.15.5
  /usr/local/lib/perl5/5.15.5/darwin-2level
  /usr/local/lib/perl5/5.15.5
  /usr/local/lib/perl5/site_perl
  .


Environment for perl 5.15.5:
  DYLD_LIBRARY_PATH (unset)
  HOME=/Users/sprout
  LANG=en_US.UTF-8
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)
  PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/local/bin
  PERL_BADLANG (unset)
  SHELL=/bin/bash

@p5pRT

p5pRT commented May 4, 2012

From @cpansprout

When ‘require’ is followed by a bareword, it is treated specially in
three different ways:

1. The token following it is forced to be a bareword, even if there is a
subroutine with that name, and even under strict mode.
2. The bareword has its double colons changed to slashes and has .pm
appended to the end.
3. A stash is autovivified during compilation.

Items 1 and 3 happen in the tokeniser, based on whether the token that
immediately follows the ‘require’ token is an identifier. Item 2
happens in op.c when the op tree is being built, and depends on whether
the child op is a single constant that was a bareword.

This means that cases like ‘require a::b . "foo"’ treat a::b as a
bareword, exempt from strict mode, but that bareword does not undergo
the s|::|/|g and .= ".pm" treatment. So the bareword is only half special.
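To make the "half special" case concrete, here is a minimal sketch of the name rewrite from item 2, done on an ordinary string. The helper name `bareword_to_path` is hypothetical and only for illustration; it is not a perl API and it is not the code in op.c.

```perl
use strict;
use warnings;

# Hypothetical helper (not a perl API): mimics on a plain string what
# item 2 does to a lone bareword argument of require.
sub bareword_to_path {
    my ($name) = @_;
    (my $path = $name) =~ s{::}{/}g;    # a::b -> a/b
    return "$path.pm";                  # a/b  -> a/b.pm
}

# Lone bareword: the name gets the full rewrite.
print bareword_to_path('a::b'), "\n";

# 'require a::b . "foo"': the rewrite is skipped, so the literal name
# is what ends up being concatenated.
print 'a::b' . 'foo', "\n";
```

The first line printed is `a/b.pm` and the second is `a::bfoo`, which matches the difference between the two const ops in the Concise output quoted in this ticket.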

(This also means that ‘require foo’ is allowed under strict, but
‘require(foo)’ isn’t, even though the latter, when used outside of
strict mode, turns into ‘require "foo.pm"’. However, ‘require(foo)’
treats foo as a sub call if there is a foo sub in scope. I don’t want
to deal with that just yet.)
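A small probe along these lines can show how a given perl treats the two spellings. It is only a sketch: `compiles` is a hypothetical helper, and the results for the parenthesised forms depend on the perl version and on whether a foo sub is in scope, so no output is claimed here.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a snippet without running it and report
# whether compilation succeeded.
sub compiles {
    my ($code) = @_;
    local $@;
    # Wrapping the snippet in an anonymous sub means any 'require' in it
    # is compiled but never executed, so no file lookup happens.
    my $ok = eval "my \$unused = sub { $code }; 1";
    return $ok ? 1 : 0;
}

for my $snippet (
    'use strict; require foo',
    'use strict; require(foo)',
    'no strict; require(foo)',
) {
    printf "%-26s %s\n", $snippet, compiles($snippet) ? 'compiles' : 'rejected';
}
```

Because the pragma is written inside each snippet, the strictness in effect for the probe script itself does not influence the result.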

I’m wondering whether it would be possible to make the require code in
toke.c scan past the bareword and see whether it is followed by
something that would make the child op of require into something more
than just a bareword; i.e., an infix op of higher precedence than
require (<< >> and above) or an opening parenthesis (for require foo()).
If such a character is *not* found, then the bareword can get its
special treatment via S_force_word, etc. Can anyone see anything I’ve
missed?
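Purely to make the idea concrete, the check could be approximated on raw text as below. This is a deliberately crude sketch under stated assumptions: `bareword_stands_alone` is a hypothetical name, the operator list covers only some of the operators at ‘<< >>’ and above (the ‘x’ operator is left out), and the real tokeniser in toke.c works quite differently.

```perl
use strict;
use warnings;

# Hypothetical, text-based approximation of the proposed check. It takes
# the source text that follows the 'require' keyword.
sub bareword_stands_alone {
    my ($after_require) = @_;

    # Consume optional whitespace, one identifier (with :: parts), then
    # whitespace; capture whatever follows. No identifier means no bareword.
    my ($rest) = $after_require =~ /\A\s*[A-Za-z_]\w*(?:::\w+)*\s*(.*)\z/s
        or return 0;

    # '(' or an infix operator binding tighter than named unary operators
    # makes the argument more than a single bareword.
    return $rest =~ m{\A(?:\(|<<|>>|\*\*|=~|!~|->|[-+.*/%])} ? 0 : 1;
}

for my $text ('a::b', 'a::b . 1', 'foo()', 'foo or die') {
    printf "%-12s %s\n", $text,
        bareword_stands_alone($text) ? 'special treatment' : 'normal expression';
}
```

A fixed operator list like this is exactly what would stop working if the set of infix operators were ever extensible.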

Simply moving all the bareword handling to op.c won’t work, because
‘require foo’ will turn into ‘require(foo())’ if there is a foo sub.
Though it’s usually possible to detect that, it isn’t with constant subs.

--

Father Chrysostomos

@p5pRT

p5pRT commented May 4, 2012

@cpansprout - Status changed from 'new' to 'open'

@p5pRT

p5pRT commented May 5, 2012

From @cpansprout

On Thu May 03 22:47:36 2012, sprout wrote:

> When ‘require’ is followed by a bareword, it is treated specially in
> three different ways:
>
> 1. The token following it is forced to be a bareword, even if there is a
> subroutine with that name, and even under strict mode.
> 2. The bareword has its double colons changed to slashes and has .pm
> appended to the end.
> 3. A stash is autovivified during compilation.
>
> Items 1 and 3 happen in the tokeniser, based on whether the token that
> immediately follows the ‘require’ token is an identifier. Item 2
> happens in op.c when the op tree is being built, and depends on whether
> the child op is a single constant that was a bareword.
>
> This means that cases like ‘require a::b . "foo"’ treat a::b as a
> bareword, exempt from strict mode, but that bareword does not undergo
> the s|::|/|g and .= ".pm" treatment. So the bareword is only half
> special.
>
> (This also means that ‘require foo’ is allowed under strict, but
> ‘require(foo)’ isn’t, even though the latter, when used outside of
> strict mode, turns into ‘require "foo.pm"’. However, ‘require(foo)’
> treats foo as a sub call if there is a foo sub in scope. I don’t want
> to deal with that just yet.)
>
> I’m wondering whether it would be possible to make the require code in
> toke.c scan past the bareword and see whether it is followed by
> something that would make the child op of require into something more
> than just a bareword; i.e., an infix op of higher precedence than
> require (<< >> and above) or an opening parenthesis (for require foo()).
> If such a character is *not* found, then the bareword can get its
> special treatment via S_force_word, etc. Can anyone see anything I’ve
> missed?

I would prefer to avoid this method, if possible. If we introduce
pluggable infix operators, this won’t work or will complicate things.

> Simply moving all the bareword handling to op.c won’t work, because
> ‘require foo’ will turn into ‘require(foo())’ if there is a foo sub.
> Though it’s usually possible to detect that, it isn’t with constant subs.

I wonder whether we should somehow record the name of a sub (more
precisely, the name whereby it is invoked) that is inlined as a constant.

--

Father Chrysostomos

@p5pRT

p5pRT commented May 11, 2012

From @cpansprout

The plot thickens:

$ perl -e 'sub v123{a} require v123()'
syntax error at -e line 1, near "require v123("
Execution of -e aborted due to compilation errors.

$ perl -e 'sub v123{a} require +v123()'
Warning: Use of "require" without parentheses is ambiguous at -e line 1.
Can't locate a in @INC (...) at -e line 1.

$ perl -e 'sub v123{a} require v123.""'
Warning: Use of "require" without parentheses is ambiguous at -e line 1.
Can't locate v123 in @INC (...) at -e line 1.

$ perl -e 'sub v123{a} require v123 .""'
Warning: Use of "require" without parentheses is ambiguous at -e line 1.
Can't locate { in @INC (...) at -e line 1.

$ perl -e 'sub v123{a} require v123'
Perl v123.0.0 required--this is only v5.10.1, stopped at -e line 1.

$ perl -e 'sub v123{a} require +v123'
Warning: Use of "require" without parentheses is ambiguous at -e line 1.
Can't locate a in @INC (...) at -e line 1.

That whitespace is significant is troubling. That v123 could be
interpreted as a bareword with no => following it is more troubling.

It is not at all clear how these things are supposed to behave.

--

Father Chrysostomos

@p5pRT

p5pRT commented Aug 23, 2013

From @nwc10

On Thu, May 10, 2012 at 06:38:26PM -0700, Father Chrysostomos via RT wrote:

> The plot thickens:

> It is not at all clear how these things are supposed to behave.

Agree. It isn't.

(No-one else seems to have commented)

Nicholas Clark
