Problem with superscripts when there is no number in front of it (³² == 9) #4787

Closed
p6rt opened this issue Nov 26, 2015 · 17 comments
Labels
RFC Request For Comments

Comments

p6rt commented Nov 26, 2015

Migrated from rt.perl.org#126732 (status was 'rejected')

Searchable as RT126732$

p6rt commented Nov 26, 2015

From @AlexDaniel

Code​:
say ³²

Result​:
9

This should probably be a syntax error.

p6rt commented Dec 23, 2015

From mattoates@gmail.com

If no other numeric literal is given as a base along with the Unicode
exponent characters, exponentiation is done on the first digit of the
exponent. I feel this should at least warn that the base is being
assumed, if not be a compile-time error. Worse, silently assuming a
single-digit base feels really bad, since the code succeeds in a way
defined by the implementation rather than explicitly by the programmer.

$ perl6
say ⁸⁸
16777216
say 8⁸
16777216

p6rt commented Jul 5, 2016

From @zoffixznet

| This should probably be a syntax error.

There's agreement on that. Possibly to force the user to add parentheses: (³)² == 9

There's been a discussion on the topic today, including why such behaviour is observed: http://irclog.perlgeek.de/perl6/2016-07-05#i_12788472

Since superscripts are in the 'No' category, the first one is parsed as a numeric literal, so it's the same as ⑤², except with ³ instead of ⑤
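
The 'No' classification is a Unicode property, so it can be checked outside Rakudo as well. A minimal Python sketch, assuming the standard `unicodedata` module agrees with the Unicode tables Rakudo consults for these characters (the helper name `describe` is made up for illustration):

```python
import unicodedata

def describe(ch):
    """Return (General_Category, numeric value) for a single character."""
    return unicodedata.category(ch), unicodedata.numeric(ch)

# Both the superscript and the circled digit report category "No",
# which is why either one can stand in term position as a numeral.
for ch in "³⑤":
    print(ch, describe(ch))
```

Under that assumption, ³ and ⑤ differ only in their numeric value (3 versus 5), not in how they are classified.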

p6rt commented Jul 5, 2016

The RT System itself - Status changed from 'new' to 'open'

p6rt commented Dec 29, 2016

From @ronaldxs

Looks like it should be merged with https://rt-archive.perl.org/perl6/Ticket/Display.html?id=126732

p6rt commented Dec 29, 2016

The RT System itself - Status changed from 'new' to 'open'

p6rt commented Jun 5, 2017

From @zoffixznet

FWIW, I rescind all of my previous comments on the matter and now think no special casing should be done to error out on ³² or anything like that.

The only people I see complaining about it are those who just type it up randomly to see what it'd do; i.e. not an issue in real programs. I see no sufficient argument to add special casing in code, documentation, and tests, without solving any real problems.

p6rt commented Jun 5, 2017

From @zoffixznet

More commentary on the issue​: https://irclog.perlgeek.de/perl6/2017-06-05#i_14686462

Looking at all the No chars ( https://gist.github.com/Whateverable/02fff4038552bff6f31f34042016cb9eb ) there are plenty of candidates that look even worse than two superscripts. So IMO, this ticket is a clear candidate for rejection.

p6rt commented Jun 7, 2017

From @AlexDaniel

“The only people I see complaining about it are those who just type it up randomly to see what it'd do”

We had a bunch of segfaults and overflows that could only be caused by people throwing random stuff into the compiler. And yes, very often we had to go through this “wait, but normal people will not see this” idea, and in every case it was fixed in rakudo instead (for example, because this kind of stuff makes the language look fragile).

Let's resolve the issue by adding an error message and not by closing our eyes on this.

On 2017-06-05 03​:19​:58, cpan@​zoffix.com wrote​:

| FWIW, I rescind all of my previous comments on the matter and now
| think no special casing should be done to error out on ³² or anything
| like that.
|
| The only people I see complaining about it are those who just type it
| up randomly to see what it'd do; i.e. not an issue in real programs. I
| see no sufficient argument to add special casing in code,
| documentation, and tests, without solving any real problems.

p6rt commented Jun 7, 2017

From @zoffixznet

On Wed, 07 Jun 2017 08​:48​:20 -0700, alex.jakimenko@​gmail.com wrote​:

| We had a bunch of segfaults

Segfaults are program memory access errors. Here, we're talking about well-defined
behaviour that you wish to make more complex on entirely arbitrary whim by special-casing
the compiler, documentation, tests, and any program that relies on this well-defined behaviour.

What baffles me is we have several people calling for the ban on The Superscripts, yet no one
appears to have any issues with ⅟², 𑁓², ౸², ㆒², 𐌣², and 𑁒² which are also perfectly valid sequences. I'll
tell you why: because no one typed that crap up on IRC and then said it looks weird to them. It's the sole
reason this ticket exists, and there isn't a single real program in the world where making superscripts
special-cased and erroring out on them would've done any good.

The price of your proposal is unwanted complexity and IMO your justification for introducing it is entirely
insufficient, poorly-defined, and arbitrary. The purpose of errors is to help people, not to make the language more complex.

p6rt commented Jun 7, 2017

From @zoffixznet

On Wed, 07 Jun 2017 14​:09​:25 -0700, jo@​durchholz.org wrote​:

| There's also the issue that undefined behaviour tends to become exploitable as part of a security hole.
| So I'm seconding Alex-Daniel on this.

It's not undefined. My entire point is that these sequences parse because of the well-defined behaviour
that a No character can be used as a literal numeral. It just so happens that superscripts are No numerals,
which is why they're allowed to be used as the leading numeral.

What's undefined is why you, and Alex-Daniel, whom you're seconding, are choosing to ban superscripts here. If it's aesthetics alone,
then there are plenty of other characters that fit the bill. Will you be special-casing them as well? Will we create a
Yucky Character Unicode Committee to police unsightly combinations? That's what's undefined here.

p6rt commented Jun 7, 2017

From @zoffixznet

Quoting Joachim Durchholz <jo@​durchholz.org>​:

| Actually I'd like to *remove* a special case: That ² is to be interpreted as 2

But it's NOT a special case. You can use any character with No property as a numeric
literal. That's. The. Entire. Rule that governs the behaviour under examination in this
ticket. No special cases. No "unless followed by a superscript char". Just​: "Any `No` char is good"

There's quite a bunch of these chars​: https://gist.github.com/Whateverable/d94b6a42532a4c1262df794d9be799f3

So just as you can write​:
 
  <Zoffix> m​: say ⅓²
  <camelia> rakudo-moar 0a1008​: OUTPUT​: «0.111111␤»

You can write​:

  <Zoffix> m​: say ²²
  <camelia> rakudo-moar 0a1008​: OUTPUT​: «4␤»

In BOTH of these cases a No numeral is used with a ² "postfix n" operator.
Entirely identical. No special casing.

| But before deciding how to deal with them: What are the
| supposed/expected meaning of the following constructs?


²

| 3² should be 3 squared, i.e. 9

Indeed that is it. Same as in regular mathematics. 3 raised to the power of 2:

  <Zoffix> m​: say 3²
  <camelia> rakudo-moar 0a1008​: OUTPUT​: «9␤»

| x² to be x squared, i.e. an expression

Indeed that is it. Same as in regular mathematics. `x` raised to the power of 2:

  <Zoffix> m​: my \x = 42; say x²
  <camelia> rakudo-moar 0a1008​: OUTPUT​: «1764␤»

| ² confuses me, does that even make sense?

It's the same as the stated rule above: "Any `No` char can be used as a numeric literal"

  <Zoffix> m​: dd [.unival, .uniprop] with '²'
  <camelia> rakudo-moar 0a1008​: OUTPUT​: «[2, "No"]␤»

As you can see above, ² is indeed a `No` char, and its Unicode numeric value is 2, which is
the numeral you get.

| What happens actually is that in ³², the ³ is taken as a base and the
| ² as an exponent (that's why we're getting ³² == 9).
| If the superscripts were handled just like normal literal digits, the
| result should be 32.

No, the description above is incorrect and conflates `Nd` Unicode characters that can be
used as **digits** with `No` chars that can be used as **numerals**. They can't be used as
individual **digits**. And since Perl 6 expects a term here to be followed by an op,
there's no ambiguity about what `²` in `³²` is supposed to be interpreted as. It's an
operator, so we have a `No` numeral followed by an operator; or 3 raised to the power of 2.
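
The `Nd` versus `No` split is visible from other languages too. A small Python sketch, assuming Python's `unicodedata` tables match Rakudo's for these characters; `is_decimal_digit` is a hypothetical helper, not anything from Rakudo:

```python
import unicodedata

def is_decimal_digit(ch):
    """True if the character carries a Unicode decimal-digit value (category Nd)."""
    try:
        unicodedata.decimal(ch)
        return True
    except ValueError:
        # Characters such as ³ have a numeric value but no decimal-digit value.
        return False

for ch in "3³":
    print(ch, unicodedata.category(ch), is_decimal_digit(ch), unicodedata.numeric(ch))
```

Here `3` is an `Nd` digit that can combine positionally with other digits, while `³` is a `No` numeral: it has the numeric value 3 but no decimal-digit value, which matches the point that it stands alone as a numeral and does not form part of `32`.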

| *If* Perl6 is interpreting superscripts specially (and it already
| does)

That statement is incorrect. It doesn't.

| , *then* I expect it to recognize a bare ² as an exponent

That's incorrect, because it expects a term in that position, not an operator.

| since bare exponents do not make any sense

They do. In that context they're a term and you can use ANY `No` character as a numeral.

| I.e. no specialcasing at all beyond what's already there.

There's no special casing. There's just one rule​: "you can use `No` characters as numerals".

Simple. Elegant. Easy to remember.

p6rt commented Jun 8, 2017

From @zoffixznet

Quoting Joachim Durchholz <jo@​durchholz.org>​:

On 08.06.2017 at 01:11, Zoffix Znet via RT wrote:

Quoting Joachim Durchholz <jo@​durchholz.org>​:
| That cannot be correct. There's that other rule that turns
| superscripts into exponents.

Except it IS correct. There's no "other rule". There are No characters as literals
and superscript power operators.

| What I don't know is how Rakudo distinguishes ²
| (superscript/exponent) from 2 (never assumed to be an exponent).
| Ideally it would be some Unicode property, but I do not happen to
| know whether Unicode already offers a property for that.

I already explained that. Perl 6 expects an operator at that position. That's
how the entire freaking language works. You can't keep ignoring the most basic
rule of the language, while trying to advocate how that language should work.

| Since we're talking about what Perl *should* do, which means
| "programmer expectations", the details of how the parsing is done do
| not matter *that* much

Except they do. This isn't a detail of parsing. It's the way the language works:
ops follow terms and terms follow ops. That's why there's no ambiguity.
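
The term-then-operator alternation can be sketched as a toy evaluator. This is a hedged Python illustration only, not Rakudo's grammar: the function names are invented, it handles just one term followed by an optional superscript power, and it assumes Python's `unicodedata` agrees with Rakudo on which characters are `No`.

```python
import unicodedata

SUPERSCRIPT_DIGITS = "⁰¹²³⁴⁵⁶⁷⁸⁹"
TO_ASCII = str.maketrans(SUPERSCRIPT_DIGITS, "0123456789")

def parse_term(src):
    """Term position: a run of ASCII digits, or ONE character of category No."""
    if src and src[0].isascii() and src[0].isdigit():
        end = 1
        while end < len(src) and src[end].isascii() and src[end].isdigit():
            end += 1
        return float(src[:end]), src[end:]
    if src and unicodedata.category(src[0]) == "No":
        return unicodedata.numeric(src[0]), src[1:]
    raise ValueError(f"expected a term at {src!r}")

def parse_postfix(base, src):
    """Operator position: a run of superscript digits is a power postfix."""
    if not src:
        return base
    if all(ch in SUPERSCRIPT_DIGITS for ch in src):
        return base ** int(src.translate(TO_ASCII))
    raise ValueError(f"expected an operator at {src!r}")

def evaluate(src):
    """Parse one term, then whatever follows it in operator position."""
    base, rest = parse_term(src)
    return parse_postfix(base, rest)

print(evaluate("³²"))  # 9.0
print(evaluate("8⁸"))  # 16777216.0
print(evaluate("⁸⁸"))  # 16777216.0
```

The same character `²` lands in `parse_term` when it comes first and in `parse_postfix` when it follows a term, so the position alone decides its role and no rule mentions superscripts in term position specifically.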

| It turns ² into an exponent in some contexts, and into a bare number
| in others.
| That's pretty special when compared to how it interprets 2.

It interprets 2 just the same way. There's no ambiguity in this code​:

  <Zoffix> m​: sub infix​:<2> { $^a + $^b }; say 2 2 2
  <camelia> rakudo-moar 1ac799​: OUTPUT​: «4␤»

Because when an op is expected, there's just one op named `2`. And when
a term is expected, there's just one term named `2`.

| And since nobody is going to use that

Yes! Exactly. You've put the nail in your own coffin with that one. No one is ever
going to use that. So special-casing Perl 6's standard and simple behaviour
in implementations, documentation, and tests, along with any third-party user programs,
just to throw an error doesn't make any sense, as the error will not help anyone.

And if you still maintain that ²² should be banned, I invite you to answer my
original question that you evaded. Why are superscripts special?
Why are you not banning ⅟², 𑁓², ౸², ㆒², 𐌣², and 𑁒² as well? And if you are,
who are the members of the Yucky Character Unicode Committee to police unsightly combinations?

p6rt commented Jun 8, 2017

From @zoffixznet

After a conversation in #perl6-dev [1], I'm rejecting this ticket.

Unlike invisible operators (RT#128159), there's no security risk involved.
Unlike `&0` (RT#128159), there's no ambiguity in what the non-throwing behaviour is supposed to be.
This is just a peculiar intersection of two very well-defined rules, and there does not appear to be a sufficient reason to mangle those rules with special cases, all for the sake of banning this intersection.

[1] https://irclog.perlgeek.de/perl6-dev/2017-06-08#i_14703885

p6rt commented Jun 8, 2017

@zoffixznet - Status changed from 'open' to 'rejected'

@p6rt p6rt closed this as completed Jun 8, 2017

p6rt commented Jun 9, 2017

From @zoffixznet

Tests covering the discussed behaviour​: Raku/roast@a520037d7f

@p6rt p6rt added the RFC Request For Comments label Jan 5, 2020