Report information
Id: 133695
Status: pending release
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: haukex [at] zero-g.net
Cc:
AdminCc:

Operating System: (no value)
PatchStatus: (no value)
Severity: low
Type: unknown
Perl Version: (no value)
Fixed In: (no value)



Subject: Range Operator inconsistency?
Date: Wed, 28 Nov 2018 16:09:18 +0100
From: Hauke D <haukex [...] zero-g.net>
To: perlbug [...] perl.org
Dear P5P,

As first reported on PerlMonks in this thread:
https://www.perlmonks.org/?node_id=1226434

perlop says: "The range operator (in list context) makes use of the
magical auto-increment algorithm if the operands are strings. ... If the
final value specified is not in the sequence that the magical increment
would produce, the sequence goes until the next value would be longer
than the final value specified."

And yet there are some really strange inconsistencies with respect to
the produced ranges, sometimes the strings appear to be treated as
integers, sometimes they don't. In particular, compare "0".."-1", which
produces "0" through "99", to "1".."-1", which produces the empty list.

Some more test cases from Perl 5.26.0 on Linux are below. (A note on the
output: unfortunately Data::Dump numifies strings that look like
integers - e.g. "0".."99" does in fact produce the *strings* "0" through
"99" and "0".." -1 " the *strings* "0" through "9999", despite them
being shown as numbers below.)

$ perl -wMstrict -MData::Dump -e' dd "0".."-1" '
(0 .. 99)
$ perl -wMstrict -MData::Dump -e' dd "1".."-1" '
()
$ perl -wMstrict -MData::Dump -e' dd "01".."-1" '
("01", "02", "03", "04", "05", "06", "07", "08", "09", 10 .. 99)
$ perl -wMstrict -MData::Dump -e' dd "90".."-1" '
()
$ perl -wMstrict -MData::Dump -e' dd "1".."xx" '
(1 .. 99)
$ perl -wMstrict -MData::Dump -e' dd "11".."xx" '
(11 .. 99)
$ perl -wMstrict -MData::Dump -e' dd "90".."xx" '
(90 .. 99)
$ perl -wMstrict -MData::Dump -e' dd "-1".."xx" '
-1
$ perl -wMstrict -MData::Dump -e' dd "0".." -1 " '
(0 .. 9999)
$ perl -wMstrict -MData::Dump -e' dd " 0 ".." -1 " '
()
$ perl -wMstrict -MData::Dump -e' dd " 11 ".." -1 " '
()
$ perl -wMstrict -MData::Dump -e' dd "0.0".."-1.0" '
"0.0"
$ perl -wMstrict -MData::Dump -e' dd " 0.0 ".." -1.0 " '
()
$ perl -wMstrict -MData::Dump -e' dd "0.0".." 1.0 " '
"0.0"
$ perl -wMstrict -MData::Dump -e' dd " 0.0 ".."1.0" '
(0, 1)

Thanks, Regards,
-- Hauke D
RT-Send-CC: perl5-porters [...] perl.org
Hi all,

Now with a test file attached.

Best,
-- Hauke D
Subject: rt133695.pl
#!/usr/bin/env perl
use warnings;
use strict;
use Test::More;

# https://rt.perl.org/Public/Bug/Display.html?id=133695
# http://perldoc.perl.org/perlop.html#Range-Operators
# "The range operator (in list context) makes use of the magical
# auto-increment algorithm if the operands are strings. ... If the
# final value specified is not in the sequence that the magical
# increment would produce, the sequence goes until the next value
# would be longer than the final value specified. ... If the initial
# value specified isn't part of a magical increment sequence (that
# is, a non-empty string matching /^[a-zA-Z]*[0-9]*\z/ ), only the
# initial value will be returned."

sub gen_expect_range {
    my ($first, $last, $inseq) = @_;
    return [ $first ] unless $first =~ /^[a-zA-Z]*[0-9]*\z/;
    my $x = "$first";
    my @out;
    if ($inseq) { # is the $last value in the magic autoinc seq?
        while ( $x ne $last ) { push @out, $x++; }
    }
    else {
        while ( length($x) <= length($last) ) { push @out, $x++; }
    }
    return \@out;
}

sub runtest {
    my ($first, $last, $inseq) = @_;
    my $got = [ "$first" .. "$last" ];
    my $expected = gen_expect_range($first, $last, $inseq);
    is_deeply $got, $expected or diag explain $got;
}

# currently passing tests:
runtest("0", "-1", 0);
runtest("00", "-1", 0);
runtest("1", "xx", 0);
runtest("11", "xx", 0);
runtest("90", "xx", 0);
runtest("-1", "xx", 0);
runtest("0", " -1 ", 0);
runtest("0.0", "-1.0", 0);
runtest("0.0", " 1.0 ", 0);

# currently failing tests:
runtest("1", "-1", 0);
runtest("90", "-1", 0);
runtest(" 0 ", " -1 ", 0);
runtest(" 11 ", " -1 ", 0);
runtest(" 0.0 ", " -1.0 ", 0);
runtest(" 0.0 ", "1.0", 0);

done_testing;
To: Hauke D via RT <perlbug-comment [...] perl.org>
Date: Thu, 29 Nov 2018 12:05:03 +0000
Subject: Re: [perl #133695] Range Operator inconsistency?
From: Dave Mitchell <davem [...] iabyn.com>
CC: perl5-porters [...] perl.org
On Wed, Nov 28, 2018 at 07:56:34AM -0800, Hauke D via RT wrote:
> > As first reported on PerlMonks in this thread:
> > https://www.perlmonks.org/?node_id=1226434
> >
> > perlop says: "The range operator (in list context) makes use of the
> > magical auto-increment algorithm if the operands are strings. ... If the
> > final value specified is not in the sequence that the magical increment
> > would produce, the sequence goes until the next value would be longer
> > than the final value specified."
> >
> > And yet there are some really strange inconsistencies with respect to
> > the produced ranges, sometimes the strings appear to be treated as
> > integers, sometimes they don't. In particular, compare "0".."-1", which
> > produces "0" through "99", to "1".."-1", which produces the empty list.

Perl internally tries very hard to treat the range args as numeric where
possible, and has a special exception for the string "0". The relevant
macro from pp_ctl.c (reformatted for clarity) is:

/* This code tries to decide if "$left .. $right" should use the
   magical string increment, or if the range is numeric (we make
   an exception for .."0" [#18165]). AMS 20021031. */

#define RANGE_IS_NUMERIC(left,right) (
       SvNIOKp(left)
    || (SvOK(left) && !SvPOKp(left))
    || SvNIOKp(right)
    || (SvOK(right) && !SvPOKp(right))
    || (
            (
                (!SvOK(left) && SvOK(right))
             || (
                    (!SvOK(left) || looks_like_number(left))
                 && SvPOKp(left)
                 && *SvPVX_const(left) != '0')
            )
         && (!SvOK(right) || looks_like_number(right))
       )
)

Frankly I don't understand all those conditions; they are a lot more
specific than the docs.

-- 
A power surge on the Bridge is rapidly and correctly diagnosed as a
faulty capacitor by the highly-trained and competent engineering staff.
    -- Things That Never Happen in "Star Trek" #9
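
The last branch of the macro quoted above is the one that matters when both operands are ordinary strings, and it can be sketched in plain Perl. This is only a rough model under stated assumptions: the helper name string_range_is_numeric is hypothetical, Scalar::Util's looks_like_number stands in for the C function of the same name, and the sketch covers only two defined plain strings (no undef, no values with numeric flags set). It mirrors the macro as quoted in this ticket, which may not match every Perl version.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper (not part of perl): models only the final branch of
# RANGE_IS_NUMERIC as quoted above, for two defined plain strings.
sub string_range_is_numeric {
    my ($left, $right) = @_;
    return ( looks_like_number($left)
          && substr($left, 0, 1) ne '0'      # models *SvPVX_const(left) != '0'
          && looks_like_number($right) ) ? 1 : 0;
}

# Operand pairs taken from the test cases in this ticket.
for my $pair ( ["0", "-1"], ["1", "-1"], ["01", "-1"], ["90", "-1"], ["1", "xx"] ) {
    my ($left, $right) = @$pair;
    printf qq{"%s".."%s": %s\n}, $left, $right,
        string_range_is_numeric($left, $right)
            ? "numeric range"
            : "magical string increment";
}
```

Under this model, "1".."-1" and "90".."-1" are classified as numeric, while "0".."-1", "01".."-1" and "1".."xx" fall through to the string increment, which is consistent with the outputs reported in the original message.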
Subject: [PATCH] Range Operator inconsistency
RT-Send-CC: perl5-porters [...] perl.org, davem [...] iabyn.com
Hi,

Thanks for looking into this!

The code comment in the code you showed [1] mentions #18165 [2] which
references #18114 [3] where a reply by Slaven Rezic makes sense to me:
'There is a special handling for numeric strings beginning with a "0".
This is to allow things like "01".."31" to preserve the leading zero
for one-digit numbers.' The basic behavior appears to go all the way
back to 5.000 [4].

[1] https://perl5.git.perl.org/perl.git/blob/23665de87341f4f3452009759d4fc95ce30b8ced:/pp_ctl.c#l1179
[2] https://rt.perl.org/Public/Bug/Display.html?id=18165
[3] https://rt.perl.org/Public/Bug/Display.html?id=18114
[4] https://perl5.git.perl.org/perl.git/blob/refs/tags/perl-5.000:/pp_ctl.c#l694

So my interpretation of the rules is this: If the left and right
operands are strings, then check if they looks_like_number. If they
do, treat them as integers. However, make an exception when the
left-hand side begins with "0", for the reason stated above.

The key word here is *begins* with zero; the condition
*SvPVX_const(left)!='0' causes this inconsistency:

-3..-1 and "-3".."-1" are (-3,-2,-1)
-2..-1 and "-2".."-1" are (-2,-1)
-1..-1 and "-1".."-1" are (-1)
1..-1 and "1".."-1" are ()
however:
0..-1 is () but "0".."-1" is (0..99)

That latter behavior may be in line with "01".."-1", which is
("01","02","03",...), but IMO it's still surprising, and in any case
the fact that strings that look like numbers are treated as such
appears to be undocumented.

I have two alternative proposals: (A) leave the behavior as-is, but
document it, or (B) change the behavior so that the above condition is
'if the LHS is a string that begins with 0, except for the string "0"
itself' (and document it) - this would cause the "01".."31" case to
still work, but also cause "0".."-1" to act like 0..-1.

Patches for both A (just document) and B (change behavior) are
attached, with tests included (a full build passes all tests on my
end). My internals knowledge is quite limited so I hope my use of
SvCUR in the second patch is correct.

My personal preference is option B, since it gets rid of the above
inconsistency, but I understand that if there are worries about
backwards compatibility, option A may be better in that respect. The
way I've worded the documentation pretty much nails down the behavior
and wouldn't allow for future changes; a third option might be to word
the documentation more loosely and leave the door open for future
changes.

Thanks, Regards,
-- Hauke D

P.S. The attachment "rt133695.pl" in my previous message contains an
off-by-one error, but in an unused branch of code, so the output and
conclusions produced by the script are still correct (as long as
$inseq is always false, which it currently is).
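
To make the difference between the existing rule and proposal (B) concrete, the two conditions can be compared side by side in plain Perl. This is a minimal sketch, not the patch itself: both helper names are hypothetical, Scalar::Util's looks_like_number stands in for the C function, length() stands in for SvCUR, and only two defined plain strings are considered.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: the rule as described in this ticket, where any
# left-hand string beginning with "0" forces the magical string increment.
sub numeric_under_current_rule {
    my ($left, $right) = @_;
    return 0 unless looks_like_number($left) && looks_like_number($right);
    return substr($left, 0, 1) ne '0' ? 1 : 0;
}

# Hypothetical helper: proposal (B), where the leading-zero exception
# applies only when the left-hand string is longer than one character.
sub numeric_under_proposal_b {
    my ($left, $right) = @_;
    return 0 unless looks_like_number($left) && looks_like_number($right);
    my $leading_zero = substr($left, 0, 1) eq '0' && length($left) > 1;
    return $leading_zero ? 0 : 1;
}

for my $pair ( ["0", "-1"], ["00", "-1"], ["01", "31"], ["1", "-1"] ) {
    my ($left, $right) = @$pair;
    printf qq{"%s".."%s": current=%s proposal_b=%s\n}, $left, $right,
        numeric_under_current_rule($left, $right) ? "numeric" : "string",
        numeric_under_proposal_b($left, $right)   ? "numeric" : "string";
}
```

In this model only the "0".."-1" pair changes classification between the two rules; "00".."-1" and "01".."31" stay on the string increment either way, which is the stated intent of proposal (B).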
Subject: rt133695_rangeop_zero_A_doc_only.patch
From 52296ca221128e2ed89d2f9e39520dcb96801eb9 Mon Sep 17 00:00:00 2001
From: Hauke D <haukex@zero-g.net>
Date: Fri, 30 Nov 2018 13:56:10 +0100
Subject: [PATCH] (perl #133695) Document range op details

"-2".."-1" is the same as -2..-1 and "1".."-1" is the same as 1..-1,
but "0".."-1" is the same as "0".."99". This patch documents the rules
for the range operator in list context with both operands being strings
more explicitly. See also #18165 and #18114.
---
 pod/perlop.pod | 85 +++++++++++++++++++++++++++++++++++++++-----------
 pp_ctl.c       |  3 +-
 t/op/range.t   | 24 +++++++++++++-
 3 files changed, 92 insertions(+), 20 deletions(-)

diff --git a/pod/perlop.pod b/pod/perlop.pod
index d6adbd11f2..9ff980e9b4 100644
--- a/pod/perlop.pod
+++ b/pod/perlop.pod
@@ -1081,26 +1081,82 @@ And now some examples as a list operator:
     @foo = @foo[0 .. $#foo]; # an expensive no-op
     @foo = @foo[$#foo-4 .. $#foo]; # slice last 5 items
 
-The range operator (in list context) makes use of the magical
-auto-increment algorithm if the operands are strings. You
-can say
+Because each operand is evaluated in integer form, S<C<2.18 .. 3.14>> will
+return two elements in list context.
 
-    @alphabet = ("A" .. "Z");
+    @list = (2.18 .. 3.14); # same as @list = (2 .. 3);
 
-to get all normal letters of the English alphabet, or
+The range operator in list context can make use of the magical
+auto-increment algorithm if both operands are strings, subject to the
+following rules:
 
-    $hexdigit = (0 .. 9, "a" .. "f")[$num & 15];
+=over
+
+=item *
+
+With one exception (below), if both strings look like numbers to Perl,
+the magic increment will not be applied, and the strings will be treated
+as numbers (more specifically, integers) instead.
+
+For example, C<"-2".."2"> is the same as C<-2..2>, C<"1".."-1"> is the
+same as C<1..-1> (producing the empty list), and C<"2.18".."3.14">
+produces C<2, 3>.
 
-to get a hexadecimal digit, or
+=item *
+
+The exception to the above rule is when the left-hand string begins with
+C<0>, including the string C<"0"> itself. In this case, the magic
+increment I<will> be applied, even though strings like C<"01"> would
+normally look like a number to Perl.
+
+For example, C<"01".."04"> produces C<"01", "02", "03", "04">, and
+C<"0".."-1"> produces C<"0"> through C<"99"> - this may seem
+surprising, but see the following rules for why it works this way.
+To get dates with leading zeros, you can say:
 
     @z2 = ("01" .. "31");
     print $z2[$mday];
 
-to get dates with leading zeros.
+If you want to force strings to be interpreted as numbers, you could say
+
+    @numbers = ( 0+$first .. 0+$last );
+
+=item *
+
+If the initial value specified isn't part of a magical increment
+sequence (that is, a non-empty string matching C</^[a-zA-Z]*[0-9]*\z/>),
+only the initial value will be returned.
+
+For example, C<"ax".."az"> produces C<"ax", "ay", "az">, but
+C<"*x".."az"> produces only C<"*x">.
+
+=item *
+
+For other initial values that are strings that do follow the rules of the
+magical increment, the corresponding sequence will be returned.
+
+For example, you can say
+
+    @alphabet = ("A" .. "Z");
+
+to get all normal letters of the English alphabet, or
+
+    $hexdigit = (0 .. 9, "a" .. "f")[$num & 15];
+
+to get a hexadecimal digit.
+
+=item *
 
 If the final value specified is not in the sequence that the magical
 increment would produce, the sequence goes until the next value would
-be longer than the final value specified.
+be longer than the final value specified. If the length of the final
+string is shorter than the first, the empty list is returned.
+
+For example, C<"a".."--"> is the same as C<"a".."zz">, C<"0".."xx">
+produces C<"0"> through C<"99">, and C<"aaa".."--"> returns the empty
+list.
+
+=back
 
 As of Perl 5.26, the list-context range operator on strings works as expected
 in the scope of L<< S<C<"use feature 'unicode_strings">>|feature/The
@@ -1108,10 +1164,8 @@ in the scope of L<< S<C<"use feature 'unicode_strings">>|feature/The
 that feature, it exhibits L<perlunicode/The "Unicode Bug">: its behavior
 depends on the internal encoding of the range endpoint.
 
-If the initial value specified isn't part of a magical increment
-sequence (that is, a non-empty string matching C</^[a-zA-Z]*[0-9]*\z/>),
-only the initial value will be returned. So the following will only
-return an alpha:
+Because the magical increment only works on non-empty strings matching
+C</^[a-zA-Z]*[0-9]*\z/>, the following will only return an alpha:
 
     use charnames "greek";
     my @greek_small = ("\N{alpha}" .. "\N{omega}");
@@ -1131,11 +1185,6 @@ you could use the pattern C</(?:(?=\p{Greek})\p{Lower})+/> (or the
 L<experimental feature|perlrecharclass/Extended Bracketed Character Classes>
 C<S</(?[ \p{Greek} & \p{Lower} ])+/>>).
 
-Because each operand is evaluated in integer form, S<C<2.18 .. 3.14>> will
-return two elements in list context.
-
-    @list = (2.18 .. 3.14); # same as @list = (2 .. 3);
-
 =head2 Conditional Operator
 X<operator, conditional> X<operator, ternary> X<ternary> X<?:>
 
diff --git a/pp_ctl.c b/pp_ctl.c
index 17d4f0d14a..2da942aa88 100644
--- a/pp_ctl.c
+++ b/pp_ctl.c
@@ -1178,7 +1178,8 @@ PP(pp_flip)
 
 /* This code tries to decide if "$left .. $right" should use the
    magical string increment, or if the range is numeric (we make
-   an exception for .."0" [#18165]). AMS 20021031. */
+   an exception for .."0" [#18165]). AMS 20021031.
+   See also [#133695] - the rules are now documented in perlop. */
 
 #define RANGE_IS_NUMERIC(left,right) ( \
     SvNIOKp(left) || (SvOK(left) && !SvPOKp(left)) || \
diff --git a/t/op/range.t b/t/op/range.t
index 19ae1269ca..18eaa1fe0c 100644
--- a/t/op/range.t
+++ b/t/op/range.t
@@ -9,7 +9,7 @@ BEGIN {
 
 use Config;
 
-plan (146);
+plan (162);
 
 is(join(':',1..5), '1:2:3:4:5');
 
@@ -112,6 +112,28 @@ is(join(":","-4".."-0") , "-4:-3:-2:-1:0");
 is(join(":","-4\n".."0\n") , "-4:-3:-2:-1:0");
 is(join(":","-4\n".."-0\n"), "-4:-3:-2:-1:0");
 
+# [#133695] document inconsistency between "0".."-1" and 0..-1
+is(join(":","-2".."-1") , "-2:-1");
+is(join(":","-1".."-1") , "-1");
+is(join(":", 0 .. -1 ) , "");
+is(join(":","0".."-1") , "0:1:2:3:4:5:6:7:8:9:10:11:12:13:14:15:16:17:18:19:20:21:22:23:24:25:26:27:28:29:30:31:32:33:34:35:36:37:38:39:40:41:42:43:44:45:46:47:48:49:50:51:52:53:54:55:56:57:58:59:60:61:62:63:64:65:66:67:68:69:70:71:72:73:74:75:76:77:78:79:80:81:82:83:84:85:86:87:88:89:90:91:92:93:94:95:96:97:98:99");
+is(join(":","1".."-1") , "");
+
+# these test the statements made in the documentation
+# regarding the rules of string ranges
+is(join(":","-2".."2"), join(":",-2..2));
+is(join(":","2.18".."3.14"), "2:3");
+is(join(":","01".."04"), "01:02:03:04");
+# "0".."-1" tested above
+is(join(":","00".."31"), "00:01:02:03:04:05:06:07:08:09:10:11:12:13:14:15:16:17:18:19:20:21:22:23:24:25:26:27:28:29:30:31");
+is(join(":","ax".."az"), "ax:ay:az");
+is(join(":","*x".."az"), "*x");
+is(join(":","A".."Z"), "A:B:C:D:E:F:G:H:I:J:K:L:M:N:O:P:Q:R:S:T:U:V:W:X:Y:Z");
+is(join(":", 0..9,"a".."f"), "0:1:2:3:4:5:6:7:8:9:a:b:c:d:e:f");
+is(join(":","a".."--"), join(":","a".."zz"));
+is(join(":","0".."xx"), "0:1:2:3:4:5:6:7:8:9:10:11:12:13:14:15:16:17:18:19:20:21:22:23:24:25:26:27:28:29:30:31:32:33:34:35:36:37:38:39:40:41:42:43:44:45:46:47:48:49:50:51:52:53:54:55:56:57:58:59:60:61:62:63:64:65:66:67:68:69:70:71:72:73:74:75:76:77:78:79:80:81:82:83:84:85:86:87:88:89:90:91:92:93:94:95:96:97:98:99");
+is(join(":","aaa".."--"), "");
+
 # undef should be treated as 0 for numerical range
 is(join(":",undef..2), '0:1:2');
 is(join(":",-2..undef), '-2:-1:0');
-- 
2.19.2
Subject: rt133695_rangeop_zero_B_change.patch
From cd2b39ae22f1a9e2090cea546da9a2c3884bf22e Mon Sep 17 00:00:00 2001
From: Hauke D <haukex@zero-g.net>
Date: Fri, 30 Nov 2018 13:06:07 +0100
Subject: [PATCH] (perl #133695) "0".."-1" should act like 0..-1

Previously, *any* string beginning with 0, including the string "0"
itself, would be subject to the magic string auto-increment, instead of
being treated like a number. This meant that "-2".."-1" was the same as
-2..-1 and "1".."-1" was the same as 1..-1, but "0".."-1" was the same
as "0".."99". This patch fixes that inconsistency, while still allowing
ranges like "01".."31" to produce the strings "01", "02", ... "31",
which is what the "begins with 0" exception was intended for.

This patch also expands the documentation in perlop and states the
rules for the range operator in list context with both operands being
strings more explicitly. See also #18165 and #18114.
---
 pod/perlop.pod | 84 +++++++++++++++++++++++++++++++++++++++-----------
 pp_ctl.c       | 10 ++++--
 t/op/range.t   | 23 +++++++++++++-
 3 files changed, 95 insertions(+), 22 deletions(-)

diff --git a/pod/perlop.pod b/pod/perlop.pod
index d6adbd11f2..d4101ff544 100644
--- a/pod/perlop.pod
+++ b/pod/perlop.pod
@@ -1081,26 +1081,81 @@ And now some examples as a list operator:
     @foo = @foo[0 .. $#foo]; # an expensive no-op
     @foo = @foo[$#foo-4 .. $#foo]; # slice last 5 items
 
-The range operator (in list context) makes use of the magical
-auto-increment algorithm if the operands are strings. You
-can say
+Because each operand is evaluated in integer form, S<C<2.18 .. 3.14>> will
+return two elements in list context.
 
-    @alphabet = ("A" .. "Z");
+    @list = (2.18 .. 3.14); # same as @list = (2 .. 3);
 
-to get all normal letters of the English alphabet, or
+The range operator in list context can make use of the magical
+auto-increment algorithm if both operands are strings, subject to the
+following rules:
 
-    $hexdigit = (0 .. 9, "a" .. "f")[$num & 15];
+=over
+
+=item *
+
+With one exception (below), if both strings look like numbers to Perl,
+the magic increment will not be applied, and the strings will be treated
+as numbers (more specifically, integers) instead.
+
+For example, C<"-2".."2"> is the same as C<-2..2>, and
+C<"2.18".."3.14"> produces C<2, 3>.
 
-to get a hexadecimal digit, or
+=item *
+
+The exception to the above rule is when the left-hand string begins with
+C<0> and is longer than one character, in this case the magic increment
+I<will> be applied, even though strings like C<"01"> would normally look
+like a number to Perl.
+
+For example, C<"01".."04"> produces C<"01", "02", "03", "04">, and
+C<"00".."-1"> produces C<"00"> through C<"99"> - this may seem
+surprising, but see the following rules for why it works this way.
+To get dates with leading zeros, you can say:
 
     @z2 = ("01" .. "31");
     print $z2[$mday];
 
-to get dates with leading zeros.
+If you want to force strings to be interpreted as numbers, you could say
+
+    @numbers = ( 0+$first .. 0+$last );
+
+=item *
+
+If the initial value specified isn't part of a magical increment
+sequence (that is, a non-empty string matching C</^[a-zA-Z]*[0-9]*\z/>),
+only the initial value will be returned.
+
+For example, C<"ax".."az"> produces C<"ax", "ay", "az">, but
+C<"*x".."az"> produces only C<"*x">.
+
+=item *
+
+For other initial values that are strings that do follow the rules of the
+magical increment, the corresponding sequence will be returned.
+
+For example, you can say
+
+    @alphabet = ("A" .. "Z");
+
+to get all normal letters of the English alphabet, or
+
+    $hexdigit = (0 .. 9, "a" .. "f")[$num & 15];
+
+to get a hexadecimal digit.
+
+=item *
 
 If the final value specified is not in the sequence that the magical
 increment would produce, the sequence goes until the next value would
-be longer than the final value specified.
+be longer than the final value specified. If the length of the final
+string is shorter than the first, the empty list is returned.
+
+For example, C<"a".."--"> is the same as C<"a".."zz">, C<"0".."xx">
+produces C<"0"> through C<"99">, and C<"aaa".."--"> returns the empty
+list.
+
+=back
 
 As of Perl 5.26, the list-context range operator on strings works as expected
 in the scope of L<< S<C<"use feature 'unicode_strings">>|feature/The
@@ -1108,10 +1163,8 @@ in the scope of L<< S<C<"use feature 'unicode_strings">>|feature/The
 that feature, it exhibits L<perlunicode/The "Unicode Bug">: its behavior
 depends on the internal encoding of the range endpoint.
 
-If the initial value specified isn't part of a magical increment
-sequence (that is, a non-empty string matching C</^[a-zA-Z]*[0-9]*\z/>),
-only the initial value will be returned. So the following will only
-return an alpha:
+Because the magical increment only works on non-empty strings matching
+C</^[a-zA-Z]*[0-9]*\z/>, the following will only return an alpha:
 
     use charnames "greek";
     my @greek_small = ("\N{alpha}" .. "\N{omega}");
@@ -1131,11 +1184,6 @@ you could use the pattern C</(?:(?=\p{Greek})\p{Lower})+/> (or the
 L<experimental feature|perlrecharclass/Extended Bracketed Character Classes>
 C<S</(?[ \p{Greek} & \p{Lower} ])+/>>).
 
-Because each operand is evaluated in integer form, S<C<2.18 .. 3.14>> will
-return two elements in list context.
-
-    @list = (2.18 .. 3.14); # same as @list = (2 .. 3);
-
 =head2 Conditional Operator
 X<operator, conditional> X<operator, ternary> X<ternary> X<?:>
 
diff --git a/pp_ctl.c b/pp_ctl.c
index 17d4f0d14a..e820a9df02 100644
--- a/pp_ctl.c
+++ b/pp_ctl.c
@@ -1177,14 +1177,18 @@ PP(pp_flip)
 }
 
 /* This code tries to decide if "$left .. $right" should use the
-   magical string increment, or if the range is numeric (we make
-   an exception for .."0" [#18165]). AMS 20021031. */
+   magical string increment, or if the range is numeric. Initially,
+   an exception was made for *any* string beginning with "0" (see
+   [#18165], AMS 20021031), but now that is only applied when the
+   string's length is also >1 - see the rules now documented in
+   perlop [#133695] */
 
 #define RANGE_IS_NUMERIC(left,right) ( \
     SvNIOKp(left) || (SvOK(left) && !SvPOKp(left)) || \
     SvNIOKp(right) || (SvOK(right) && !SvPOKp(right)) || \
     (((!SvOK(left) && SvOK(right)) || ((!SvOK(left) || \
-      looks_like_number(left)) && SvPOKp(left) && *SvPVX_const(left) != '0')) \
+      looks_like_number(left)) && SvPOKp(left) \
+      && !(*SvPVX_const(left) == '0' && SvCUR(left)>1 ) )) \
      && (!SvOK(right) || looks_like_number(right))))
 
 PP(pp_flop)
diff --git a/t/op/range.t b/t/op/range.t
index 19ae1269ca..2deefc61cf 100644
--- a/t/op/range.t
+++ b/t/op/range.t
@@ -9,7 +9,7 @@ BEGIN {
 
 use Config;
 
-plan (146);
+plan (162);
 
 is(join(':',1..5), '1:2:3:4:5');
 
@@ -112,6 +112,27 @@ is(join(":","-4".."-0") , "-4:-3:-2:-1:0");
 is(join(":","-4\n".."0\n") , "-4:-3:-2:-1:0");
 is(join(":","-4\n".."-0\n"), "-4:-3:-2:-1:0");
 
+# [#133695] "0".."-1" should be the same as 0..-1
+is(join(":","-2".."-1") , "-2:-1");
+is(join(":","-1".."-1") , "-1");
+is(join(":","0".."-1") , "");
+is(join(":","1".."-1") , "");
+
+# these test the statements made in the documentation
+# regarding the rules of string ranges
+is(join(":","-2".."2"), join(":",-2..2));
+is(join(":","2.18".."3.14"), "2:3");
+is(join(":","01".."04"), "01:02:03:04");
+is(join(":","00".."-1"), "00:01:02:03:04:05:06:07:08:09:10:11:12:13:14:15:16:17:18:19:20:21:22:23:24:25:26:27:28:29:30:31:32:33:34:35:36:37:38:39:40:41:42:43:44:45:46:47:48:49:50:51:52:53:54:55:56:57:58:59:60:61:62:63:64:65:66:67:68:69:70:71:72:73:74:75:76:77:78:79:80:81:82:83:84:85:86:87:88:89:90:91:92:93:94:95:96:97:98:99");
+is(join(":","00".."31"), "00:01:02:03:04:05:06:07:08:09:10:11:12:13:14:15:16:17:18:19:20:21:22:23:24:25:26:27:28:29:30:31");
+is(join(":","ax".."az"), "ax:ay:az");
+is(join(":","*x".."az"), "*x");
+is(join(":","A".."Z"), "A:B:C:D:E:F:G:H:I:J:K:L:M:N:O:P:Q:R:S:T:U:V:W:X:Y:Z");
+is(join(":", 0..9,"a".."f"), "0:1:2:3:4:5:6:7:8:9:a:b:c:d:e:f");
+is(join(":","a".."--"), join(":","a".."zz"));
+is(join(":","0".."xx"), "0:1:2:3:4:5:6:7:8:9:10:11:12:13:14:15:16:17:18:19:20:21:22:23:24:25:26:27:28:29:30:31:32:33:34:35:36:37:38:39:40:41:42:43:44:45:46:47:48:49:50:51:52:53:54:55:56:57:58:59:60:61:62:63:64:65:66:67:68:69:70:71:72:73:74:75:76:77:78:79:80:81:82:83:84:85:86:87:88:89:90:91:92:93:94:95:96:97:98:99");
+is(join(":","aaa".."--"), "");
+
 # undef should be treated as 0 for numerical range
 is(join(":",undef..2), '0:1:2');
 is(join(":",-2..undef), '-2:-1:0');
-- 
2.19.2
RT-Send-CC: perl5-porters [...] perl.org
I think I prefer B too. It would be nice to find out what anyone else thinks.

Unfortunately I don't think I'd want to put a change in behaviour into core at this point in the release cycle.

Tony
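The rule described in the quoted message can be sketched in plain Perl. This is only a model of the decision for two defined string operands, not the actual `pp_ctl.c` code: the helper name `range_is_numeric` and its `$allow_bare_zero` flag are hypothetical, and `Scalar::Util::looks_like_number` stands in for the internal check of the same name.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper, not a core function. It models only the case where
# both operands are defined strings, as discussed in this ticket.
# $allow_bare_zero = 0 models the existing rule; 1 models proposal B.
sub range_is_numeric {
    my ($left, $right, $allow_bare_zero) = @_;

    # Both operands must look like numbers, otherwise the magical
    # string increment is used.
    return 0 unless looks_like_number($left) && looks_like_number($right);

    # Existing rule: a left operand whose first character is "0" stays a
    # string, so that "01".."31" keeps its leading zeros.
    my $leading_zero = substr($left, 0, 1) eq '0';

    # Proposal B: same, except that the exact string "0" counts as a number.
    $leading_zero = 0 if $allow_bare_zero && $left eq '0';

    return $leading_zero ? 0 : 1;
}

for my $pair (["0", "-1"], ["1", "-1"], ["01", "31"], ["1", "xx"]) {
    my ($l, $r) = @$pair;
    my $old = range_is_numeric($l, $r, 0) ? "numeric" : "string";
    my $new = range_is_numeric($l, $r, 1) ? "numeric" : "string";
    print qq{"$l".."$r": existing rule = $old, proposal B = $new\n};
}
```

In this model only the pair "0".."-1" changes classification between the two rules, which matches the intent stated for proposal B: "01".."31" still takes the string path, while "0".."-1" is treated like 0..-1.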
On Wed, 13 Feb 2019 15:59:02 -0800, tonyc wrote:
> On Fri, 30 Nov 2018 06:09:07 -0800, haukex@zero-g.net wrote:
> > [...]
> [...]
Now we're in a brand new release cycle, so I think it's time to revisit this ticket.

Personally, I think option B is better; it's unlikely that anything relies on the current (broken) behaviour.
On Tue, 06 Aug 2019 23:58:10 -0700, me@xenu.pl wrote:
> [...]
I've applied the B version to blead, so we should find out if anything depends on the old behaviour.

Leaving open for now.

Tony
On Wed, 07 Aug 2019 18:19:54 -0700, tonyc wrote:
> [...]
Closing. Tony

