New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perl5: false warning on "Multidimensional syntax not supported" #16535
Comments
From wolf-dietrich_moeller@t-online.deHi, Best regards use warnings;
my @x = (['a','b']);
my @y = ('a','b');
my $z = 'rst';
my $i = 's';
print ' 6: ',$x[0][index $z,$i],"\n";
print ' 7: ',$y[index $z,$i],"\n";
print ' 8: ',$y[index($z,$i)],"\n";
print ' 9: ',$y[index 'rst',$i],"\n";
print '10: ',@y[index $z,$i],"\n"; Output:
Perl Info
|
From @jkeenanI've been able to reproduce this in perl-5.24.1. Hence, it's not a regression introduced in the 5.27 development cycle and is not a blocker for 5.28.0. Thank you very much. -- |
The RT System itself - Status changed from 'new' to 'open' |
This output seems to be consistent going back to 5.8 at the least. This sounds more like a feature request or by design. |
It smells like a bug to me, it is decidedly odd that the line-7 example warns when none of the other 4 examples do. If I encountered this I'd also be reporting it. |
On Fri, 31 Jan 2020 at 11:49, Hugo van der Sanden ***@***.***> wrote:
It smells like a bug to me, it is decidedly odd that the line-7 example
warns when none of the other 4 examples do. If I encountered this I'd also
be reporting it.
It is definitely a bug, or more specifically a broken heuristic.
See toke.c line 4955:
if (*s == '[') {
PL_tokenbuf[0] = '@';
if (ckWARN(WARN_SYNTAX)) {
char *t = s+1;
while ( isSPACE(*t)
|| isWORDCHAR_lazy_if_safe(t, PL_bufend, UTF)
|| *t == '$')
{
t += UTF ? UTF8SKIP(t) : 1;
}
if (*t++ == ',') {
PL_bufptr = skipspace(PL_bufptr); /* XXX can
realloc */
while (t < PL_bufend && *t != ']')
t++;
Perl_warner(aTHX_ packWARN(WARN_SYNTAX),
"Multidimensional syntax %" UTF8f " not
supported",
UTF8fARG(UTF,(int)((t - PL_bufptr) +
1), PL_bufptr));
}
}
}
As written it doesnt grok that this is a function call. The parens case
works because when it falls out of the loop it is pointing at a '('.
This shouldn't be too hard to fix either. I have to say however, it feels
wrong to me that this is done in toke.c. It feels like it really should be
resolved somehow in the parser, not the toker. I bet Zefram would know how
to fix it properly if he felt like it.
Yves
|
Fix issue #16535 - $t[index $x, $y] should not throw Multidimensional array warnings. The heuristic for detecting lists in array subscripts is implemented in toke.c, which means it is not particularly reliable. There are lots of ways that code might return a list in an array subscript. So for instance $t[do{ $x, $y }] should throw a warning but doesn't. On the other hand, we can make this warning less likely to happen by being a touch more careful about how we parse the inside of the square brackets so we do not throw an exception from $t[index $x,$y]. Really this should be moved to the parser so we do not need to rely on fallable heuristics, and also into the runtime so that if we have $t[f()] and f() returns a list we can also warn there. But for now this improves things somewhat.
On Fri, 31 Jan 2020 at 13:35, demerphq ***@***.***> wrote:
On Fri, 31 Jan 2020 at 11:49, Hugo van der Sanden <
***@***.***> wrote:
> It smells like a bug to me, it is decidedly odd that the line-7 example
> warns when none of the other 4 examples do. If I encountered this I'd also
> be reporting it.
>
It is definitely a bug, or more specifically a broken heuristic.
See toke.c line 4955:
if (*s == '[') {
PL_tokenbuf[0] = '@';
if (ckWARN(WARN_SYNTAX)) {
char *t = s+1;
while ( isSPACE(*t)
|| isWORDCHAR_lazy_if_safe(t, PL_bufend, UTF)
|| *t == '$')
{
t += UTF ? UTF8SKIP(t) : 1;
}
if (*t++ == ',') {
PL_bufptr = skipspace(PL_bufptr); /* XXX can
realloc */
while (t < PL_bufend && *t != ']')
t++;
Perl_warner(aTHX_ packWARN(WARN_SYNTAX),
"Multidimensional syntax %" UTF8f "
not supported",
UTF8fARG(UTF,(int)((t - PL_bufptr) +
1), PL_bufptr));
}
}
}
As written it doesnt grok that this is a function call. The parens case
works because when it falls out of the loop it is pointing at a '('.
This shouldn't be too hard to fix either. I have to say however, it feels
wrong to me that this is done in toke.c. It feels like it really should be
resolved somehow in the parser, not the toker. I bet Zefram would know how
to fix it properly if he felt like it.
Anyway, until someone who understands the grammar much better than me steps
up to move this to the parser I have pushed smoke-me/fix_issue_16535 to
close this ticket.
Cheers,
yves
…--
perl -Mre=debug -e "/just|another|perl|hacker/"
|
+1 The OP's program PASS on smoke-me/fix_issue_16535 branch. |
demerphq wrote:
This shouldn't be too hard to fix either. I have to say however, it feels
wrong to me that this is done in toke.c. It feels like it really should be
resolved somehow in the parser, not the toker.
If there's a consensus to make this a semantic heuristic rather than a
syntactic one, it would be fairly easy to make it so. There are three
productions in perly.y that generate an OP_AELEM, and each coerces the
index expression to scalar. It is at that point, immediately before
applying the context, that the index expression optree could sensibly
be examined, and a warning generated if it looked too list-like.
One would have to come up with a specific semantic heuristic. Note that
many situations that currently generate the "Multidimensional syntax
not supported" warning also generate an entirely semantic "Useless use
in void context" warning, so the separate "Multidimensional syntax"
warning looks a bit redundant. One might want to trigger it on exactly
the same semantic condition that generates the "Useless use" warning.
More radically, one might want to entirely merge the warnings, making the
"Multidimensional syntax" part just a parenthetical note in "Useless use"
warnings that arise in this context.
I'm not going to offer an opinion on what conditions ought to trigger
this warning. I swore off that class of debate years ago.
…-zefram
|
On Sat, 1 Feb 2020, 12:27 Zefram, ***@***.***> wrote:
demerphq wrote:
>This shouldn't be too hard to fix either. I have to say however, it feels
>wrong to me that this is done in toke.c. It feels like it really should be
>resolved somehow in the parser, not the toker.
If there's a consensus to make this a semantic heuristic rather than a
syntactic one, it would be fairly easy to make it so. There are three
productions in perly.y that generate an OP_AELEM, and each coerces the
index expression to scalar. It is at that point, immediately before
applying the context, that the index expression optree could sensibly
be examined, and a warning generated if it looked too list-like.
One would have to come up with a specific semantic heuristic. Note that
many situations that currently generate the "Multidimensional syntax
not supported" warning also generate an entirely semantic "Useless use
in void context" warning, so the separate "Multidimensional syntax"
warning looks a bit redundant. One might want to trigger it on exactly
the same semantic condition that generates the "Useless use" warning.
More radically, one might want to entirely merge the warnings, making the
"Multidimensional syntax" part just a parenthetical note in "Useless use"
warnings that arise in this context.
I'm not going to offer an opinion on what conditions ought to trigger
this warning. I swore off that class of debate years ago.
Problem is I think you are one of the few who can offer a cogent summary of
the options available to us. So while i appreciate your desire to stay out
of any debates on the subject we need your insight before we can even start
any debate on this.
Personally my thought is that if someone uses a list as an array subscript
we should warn. Not sure how to state that more formally. Eg imo if we warn
on
$array[$x,$y]
Then we should also warn on
$array[do{ $x, $y}]
But we shouldn't warn at compile time if it's a function call unless the
function returns a list.
FWIW when I was writing tests I didnt see any "useless use of..." warnings
from things like
$array[0x1,0x2]
Cheers
Yves
… |
demerphq wrote:
Personally my thought is that if someone uses a list as an array subscript
we should warn. Not sure how to state that more formally.
It might be as simple as the top-level op being an OP_LIST.
But we shouldn't warn at compile time if it's a function call unless the
function returns a list.
One can't generally tell at compile time whether the function would
want to return a list (however that's defined). A function call is
pretty opaque.
FWIW when I was writing tests I didnt see any "useless use of..." warnings
from things like
$array[0x1,0x2]
The constant 1 (however spelled) is exempt from that warning, because
of its common use as a filler value. If you were to swap the two list
elements in the above expression, you'd get a warning for useless use
of a constant 2.
…-zefram
|
On Sat, 1 Feb 2020, 20:42 Zefram, ***@***.***> wrote:
demerphq wrote:
>Personally my thought is that if someone uses a list as an array subscript
>we should warn. Not sure how to state that more formally.
It might be as simple as the top-level op being an OP_LIST.
I'd be fine with that personally...
>But we shouldn't warn at compile time if it's a function call unless the
>function returns a list.
One can't generally tell at compile time whether the function would
want to return a list (however that's defined). A function call is
pretty opaque.
Sorry, my mistake in how I phrased that. I should have said that detecting
a list return should be a runtime check.
>FWIW when I was writing tests I didnt see any "useless use of..." warnings
>from things like
>
>$array[0x1,0x2]
The constant 1 (however spelled) is exempt from that warning, because
of its common use as a filler value. If you were to swap the two list
elements in the above expression, you'd get a warning for useless use
of a constant 2.
Oh. Interesting. I'll write more tests. Thanks!
Cheers
Yves
… |
demerphq wrote:
Sorry, my mistake in how I phrased that. I should have said that detecting
a list return should be a runtime check.
That would be difficult, and would give that version of the warning
a very different character from the compile-time versions. Since the
function call gets put into scalar context, the function *can't* return
a list at runtime. To detect an attempt to return a list, i.e., the
return value coming from an OP_LIST (or whatever the criterion is)
that inherited scalar context from the function call, you'd need to
add a bunch of machinery to the op execution system. That's quite an
unattractive approach, from the core maintenance point of view.
…-zefram
|
On Sat, 1 Feb 2020, 21:06 Zefram, ***@***.***> wrote:
demerphq wrote:
>Sorry, my mistake in how I phrased that. I should have said that detecting
>a list return should be a runtime check.
That would be difficult, and would give that version of the warning
a very different character from the compile-time versions. Since the
function call gets put into scalar context, the function *can't* return
a list at runtime. To detect an attempt to return a list, i.e., the
return value coming from an OP_LIST (or whatever the criterion is)
that inherited scalar context from the function call, you'd need to
add a bunch of machinery to the op execution system. That's quite an
unattractive approach, from the core maintenance point of view.
Ok. Fair enough. Scratch that then.
Thanks,
Yves
… |
I wrote:
It might be as simple as the top-level op being an OP_LIST.
Further thought: it would make sense to match the condition for a hash
index expression to be treated as multidimensional. This condition is
in fact the top-level op being an OP_LIST.
…-zefram
|
Fix issue #16535 - $t[index $x, $y] should not throw Multidimensional array warnings. The heuristic for detecting lists in array subscripts is implemented in toke.c, which means it is not particularly reliable. There are lots of ways that code might return a list in an array subscript. So for instance $t[do{ $x, $y }] should throw a warning but doesn't. On the other hand, we can make this warning less likely to happen by being a touch more careful about how we parse the inside of the square brackets so we do not throw an exception from $t[index $x,$y]. Really this should be moved to the parser so we do not need to rely on fallable heuristics, and also into the runtime so that if we have $t[f()] and f() returns a list we can also warn there. But for now this improves things somewhat.
On Fri, 31 Jan 2020 at 23:59, James E Keenan <notifications@github.com>
wrote:
On Fri, 31 Jan 2020 at 13:35, demerphq *@*.***> wrote:
[snip]
Anyway, until someone who understands the grammar much better than me
steps up to move this to the parser I have pushed smoke-me/fix_issue_16535
to close this ticket. Cheers, yves
[snip]
+1
The OP's program PASS on smoke-me/fix_issue_16535 branch.
I have merged this change as 41eecd5 but
I think we should leave the ticket open a while so Zefram can have some
time to come up with something better.
cheers,
yves
…--
perl -Mre=debug -e "/just|another|perl|hacker/"
|
This was fixed by 41eecd5 |
Migrated from rt.perl.org#133155 (status was 'open')
Searchable as RT133155$
The text was updated successfully, but these errors were encountered: