Feature: list slices to the end #12981

Open
p5pRT opened this issue May 21, 2013 · 42 comments

Comments

@p5pRT

p5pRT commented May 21, 2013

Migrated from rt.perl.org#118089 (status was 'open')

Searchable as RT118089$

@p5pRT

p5pRT commented May 21, 2013

From @epa

Created by @epa

Sometimes you want to slice a list from some index to the end.
For example, to get the whole list apart from the first element.
If it is in a named array variable then you can say

  my @end_part = @a[ 1 .. $#a ];

But this becomes more difficult when the list is an intermediate
expression. It would be handy to have a syntax for slicing to
the end of a list, like this​:

  my @end_part = @a[ 1 .. ];

Here only the left part of the '..' is given and the end of the
slice is implicitly the end of the list. For symmetry, it would
also be useful to have

  my @start_part = @a[ .. 5 ];

If overloading the .. operator in this way is too yucky, you
could instead consider borrowing the list slice notation
from Python. That would be a bigger addition to the language
but would fix an un-C-like off-by-one feeling with the current
way of specifying an inclusive range. (List slices in Python
are specified as start​:end giving the range start..end-1.)
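
With current Perl, the intermediate-expression case usually needs a temporary array or a helper. The following is a minimal sketch of three common workarounds, not a proposal; `gimme_list` and `slice_from` are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an expression that returns a list.
sub gimme_list { return (10, 20, 30, 40) }

# Workaround 1: discard the first element in a list assignment.
my (undef, @rest_a) = gimme_list();

# Workaround 2: copy into a temporary array so that $#tmp is available.
my @rest_b = do { my @tmp = gimme_list(); @tmp[ 1 .. $#tmp ] };

# Workaround 3: a hypothetical helper. Inside it the list is @_,
# so $#_ is the index of the last element.
sub slice_from {
    my $from = shift;
    return @_[ $from .. $#_ ];
}
my @rest_c = slice_from(1, gimme_list());

print "@rest_a\n@rest_b\n@rest_c\n";
```

The first form only covers "everything but the first N elements"; the helper generalises to any start index at the cost of a sub call.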

Perl Info

Flags:
    category=core
    severity=wishlist

Site configuration information for perl 5.16.3:

Configured by Red Hat, Inc. at Thu Apr 11 09:48:29 UTC 2013.

Summary of my perl5 (revision 5 version 16 subversion 3) configuration:
   
  Platform:
    osname=linux, osvers=2.6.32-358.2.1.el6.x86_64, archname=x86_64-linux-thread-multi
    uname='linux buildvm-08.phx2.fedoraproject.org 2.6.32-358.2.1.el6.x86_64 #1 smp wed feb 20 12:17:37 est 2013 x86_64 x86_64 x86_64 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4  -m64 -mtune=generic -Dccdlflags=-Wl,--enable-new-dtags -Dlddlflags=-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4  -m64 -mtune=generic -Wl,-z,relro  -DDEBUGGING=-g -Dversion=5.16.3 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl5 -Dsitearch=/usr/local/lib64/perl5 -Dprivlib=/usr/share/perl5 -Dvendorlib=/usr/share/perl5/vendor_perl -Darchlib=/usr/lib64/perl5 -Dvendorarch=/usr/lib64/perl5/vendor_perl -Darchname=x86_64-linux-thread-multi -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64 -Duseshrplib -Dusethreads -Duseithreads -Dusedtrace=/usr/bin/dtrace -Duselargefiles -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dscriptdir=/usr/bin -Dusesitecustomize'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.7.2 20121109 (Red Hat 4.7.2-8)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -fstack-protector'
    libpth=/usr/local/lib64 /lib64 /usr/lib64
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc -lgdbm_compat
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.16'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,--enable-new-dtags -Wl,-rpath,/usr/lib64/perl5/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -Wl,-z,relro '

Locally applied patches:
    


@INC for perl 5.16.3:
    /home/eda/lib/perl5/
    /usr/local/lib64/perl5
    /usr/local/share/perl5
    /usr/lib64/perl5/vendor_perl
    /usr/share/perl5/vendor_perl
    /usr/lib64/perl5
    /usr/share/perl5
    .


Environment for perl 5.16.3:
    HOME=/home/eda
    LANG=en_GB.UTF-8
    LANGUAGE (unset)
    LC_COLLATE=C
    LC_CTYPE=en_GB.UTF-8
    LC_MESSAGES=en_GB.UTF-8
    LC_MONETARY=en_GB.UTF-8
    LC_NUMERIC=en_GB.UTF-8
    LC_TIME=en_GB.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/eda/bin:/home/eda/bin:/usr/local/bin:/usr/bin:/sbin:/usr/sbin:/sbin:/usr/sbin
    PERL5LIB=/home/eda/lib/perl5/
    PERL_BADLANG (unset)
    SHELL=/bin/bash

-- 
Ed Avis <eda@waniasset.com>


______________________________________________________________________
This email has been scanned by the Symantec Email Security.cloud service.
For more information please visit http://www.symanteccloud.com
______________________________________________________________________

@p5pRT

p5pRT commented May 22, 2013

From j.imrie1@virginmedia.com

On 21/05/2013 15​:15, Ed Avis (via RT) wrote​:

# New Ticket Created by "Ed Avis"
# Please include the string​: [perl #118089]
# in the subject line of all future correspondence about this issue.
# <URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=118089 >

This is a bug report for perl from eda@​waniasset.com,
generated with the help of perlbug 1.39 running under perl 5.16.3.

-----------------------------------------------------------------
[Please describe your issue here]

Sometimes you want to slice a list from some index to the end.
For example, to get the whole list apart from the first element.
If it is in a named array variable then you can say

 my @end_part = @a[ 1 .. $#a ];

But this becomes more difficult when the list is an intermediate
expression. It would be handy to have a syntax for slicing to
the end of a list, like this​:

 my @end_part = @a[ 1 .. ];

Here only the left part of the '..' is given and the end of the
slice is implicitly the end of the list. For symmetry, it would
also be useful to have

 my @start_part = @a[ .. 5 ];

If overloading the .. operator in this way is too yucky, you
could instead consider borrowing the list slice notation
from Python. That would be a bigger addition to the language
but would fix an un-C-like off-by-one feeling with the current
way of specifying an inclusive range. (List slices in Python
are specified as start​:end giving the range start..end-1.)

As we have removed $#, can we now use it for the end of the current list?

my @end_part = @a[ 1 .. $# ];

Thus making $# the index of the last element in the current list.

John

@p5pRT

p5pRT commented May 22, 2013

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

p5pRT commented Jun 9, 2013

From @rjbs

For the record, I am open to a patch to provide this kind of change, and am not
particular as to which of the suggestions is used, as long as the work is
sound.

--
rjbs

@p5pRT

p5pRT commented Jul 7, 2013

From @cpansprout

On Sun Jun 09 06​:55​:59 2013, perl.p5p@​rjbs.manxome.org wrote​:

For the record, I am open to a patch to provide this kind of change,
and am not
particular as to which of the suggestions is used, as long as the work
is
sound.

Of the two suggestions, I think $# is best. If list and array slices
could record the length in an interpreter variable localised to the
brackets on the right, $# could be a magical variable accessing that.

That would make all three of these work​:

# Omit the first element
@a = (gimme_list())[1..$#];
@a = @b[1..$#];
sub end { $# }
@a = (gimme_list())[1..end];

Do we want that?

As for the [1..] suggestion, would that be a postfix .. operator applied
to the 1? Or would ..] be a special way to end the subscript? I.e.,
should [5.., 2] be allowed or not? And if we can access this number,
why restrict it to ranges?

--

Father Chrysostomos

@p5pRT

p5pRT commented Jul 7, 2013

From @epa

In the @​a[1..] form I envisaged that 1.. would be syntax valid only inside list slices - not adding a .. postfix operator to the language in general. (Similarly the $# proposal adds a variable valid only inside list slices.)

How about going with the $# proposal but adding a small bit of sugar that the $# can be omitted if it comes at the end of the slice expression after .., similar to how a semicolon can be omitted at the end of a BLOCK?


@p5pRT

p5pRT commented Jul 7, 2013

From @druud62

On 21/05/2013 16​:15, Ed Avis wrote​:

Sometimes you want to slice a list from some index to the end.
[...] It would be handy to have a syntax for slicing to
the end of a list, like this​:

 my @end_part = @a[ 1 .. ];

Here only the left part of the '..' is given and the end of the
slice is implicitly the end of the list. For symmetry, it would
also be useful to have

 my @start_part = @a[ .. 5 ];

Or support negative indexes inside slicers​:

  my @​end_part = @​a[ 1 .. _(-1) ];

(the _() to express topical run time action)

Or a topical $# alias, like $_#.

  my @​end_part = @​a[ 1 .. $_# ];

--
Ruud

@p5pRT

p5pRT commented Jul 8, 2013

From @davidnicol

On Sun, Jul 7, 2013 at 1​:51 AM, Dr.Ruud <rvtol+usenet@​isolution.nl> wrote​:

Or support negative indexes inside slicers​:

my @​end_part = @​a[ 1 .. _(-1) ];

(the _() to express topical run time action)

since negative numbers already represent counting from the end of an array,
in both scalar element accesses and C<splice>, it seems to me that altering
the dot-dot operator's meaning inside slice specifiers to allow negative
numbers would be the most compatible new syntax, as well as simplifying
slicing to almost-the-end.
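
For reference, here is a minimal sketch of how negative indices and ranges interact in current Perl. It shows why `1 .. -1` cannot simply be written in a slice today: the range is expanded before the slice ever sees it. The array contents are illustrative only.

```perl
use strict;
use warnings;

my @a = ('a' .. 'f');

# Negative indices already count from the end, even inside slices.
my @last_three = @a[ -3 .. -1 ];    # indices -3, -2, -1

# But 1 .. -1 is an empty range (left > right), so this slice is empty.
my @empty = @a[ 1 .. -1 ];

# "Almost to the end" currently needs the array's name for $#a.
my @middle = @a[ 1 .. $#a - 1 ];

print "last three: @last_three\n";
print "empty has ", scalar(@empty), " elements\n";
print "middle: @middle\n";
```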

a magical $# clearly allows the most flexibility -- there's no way to
express "give me the right half of this array" without a temporary
variable, otherwise. We don't have that currently, and it's beyond the
scope of the feature request under discussion.

  my @​right_half = @​a[ (0.5 + $# / 2) .. $# ];

is that really an improvement over

  my @​right_half = @​a[ (0.5 + $#a / 2) .. $#a ];

?

having a magic variable that is localized in this new way will be tricky,
and will slow everything down a little as it will have to get localized and
set often.

Allowing a malformed range operation, like C< 3 .. -2 >, within slicing
operations, to resolve the right side with respect to the array being
sliced, might be simpler to implement, by making a new "range in slice"
operator that is slightly different from the normal range operator in that
it handles a negative on the right side and a positive on the left side by
referring to a generally inaccessible $^ReferenceToArrayBeingSliced
variable ---- or better yet, by promoting a range operator appearing in an
array slice to a method call against the array, rather than an operator that
produces indices.

if the range operator when appearing inside a slice stops being just an
expression that yields indices and instead becomes part of the slicing
syntax itself, fancy stuff like lazy arrays that produce lazy slices, or,
non-numeric ranges that slice associative arrays without producing millions
of garbage empty slots, will be possible.

  my %Ns = %Phonebook{ 'N' .. 'NZZZ' }; # might resolve to one SELECT call
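
Without such a range-aware slice, one way to get a comparable subset today is to filter the existing keys by string comparison instead of expanding `'N' .. 'NZZZ'` into a list of candidate keys. This is a minimal sketch, assuming upper-case keys and an ordinary in-memory hash; the phonebook contents and the `keys_in_range` helper are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the entries whose keys fall within
# [$lo, $hi] under plain string comparison.
sub keys_in_range {
    my ($href, $lo, $hi) = @_;
    my @keys = grep { $_ ge $lo && $_ le $hi } keys %$href;
    my %subset;
    @subset{@keys} = @{$href}{@keys};    # hash slice assignment
    return %subset;
}

my %Phonebook = (
    MARY  => '555-0101',
    NADIA => '555-0102',
    NEIL  => '555-0103',
    OLGA  => '555-0104',
);

my %Ns = keys_in_range(\%Phonebook, 'N', 'NZZZ');
print join(', ', sort keys %Ns), "\n";
```

This visits every key once, so it does not give the single-query behaviour imagined above for a tied or database-backed hash.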

dln

@p5pRT

p5pRT commented Jul 8, 2013

From @ikegami

On Sat, Jul 6, 2013 at 8​:36 PM, Father Chrysostomos via RT <
perlbug-followup@​perl.org> wrote​:

That would make all three of these work​:

@​a = (gimme_list())[1..$#];

Small note​: Right now, the index expression is evaluated before the list
expression, so you'd also have to change the (undocumented) operand
evaluation order for list slices.

@p5pRT

p5pRT commented Jul 8, 2013

From @druud62

On 08/07/2013 07​:00, David Nicol wrote​:

On Sun, Jul 7, 2013 at 1​:51 AM, Dr.Ruud <rvtol+usenet@​isolution.nl
<mailto​:rvtol+usenet@​isolution.nl>> wrote​:

Or support negative indexes inside slicers:

   my @end_part = @a[ 1 .. _(-1) ];

(the _() to express topical run time action)

since negative numbers already represent counting from the end of an
array, in both scalar element accesses and C<splice>, it seems to me
that altering the dot-dot operator's meaning inside slice specifiers to
allow negative numbers would be the most compatible new syntax, as well
as simplifying slicing to almost-the-end.

Well sure, but it 'calls' _(), so I meant no actual negative number
there, but a topical interpretation of it.

--
Ruud

@p5pRT

p5pRT commented Jul 8, 2013

From @cpansprout

On Mon Jul 08 06​:43​:06 2013, ikegami@​adaelis.com wrote​:

On Sat, Jul 6, 2013 at 8​:36 PM, Father Chrysostomos via RT <
perlbug-followup@​perl.org> wrote​:

That would make all three of these work​:

@​a = (gimme_list())[1..$#];

Small note​: Right now, the index expression is evaluated before the list
expression, so you'd also have to change the (undocumented) operand
evaluation order for list slices.

Of course, I had forgotten about that. I should know, considering that
I implemented this function​:

$ ./perl -Ilib -MO=Concise,CORE​::__PACKAGE__ -e0
CORE​::__PACKAGE__​:
7 <1> leavesublv[1 ref] K/REFC,1 ->(end)
- <@​> lineseq KP ->7
1 <$> coreargs(PV "__PACKAGE__") v ->2
6 <2> lslice K/2 ->7
- <1> ex-list lK ->4
2 <0> pushmark s ->3
3 <$> const[IV 0] s ->4
- <1> ex-list lK ->6
4 <0> pushmark s ->5
5 <0> caller[t1] l ->6
-e syntax OK

I was not sure whether I was entirely comfortable with the proposed
feature. Now I think I am against it, since it really can’t work.
Shall I mark this ticket as rejected?

--

Father Chrysostomos

@p5pRT

p5pRT commented Jul 8, 2013

From @davidnicol

On Mon, Jul 8, 2013 at 3​:45 PM, Father Chrysostomos via RT <
perlbug-followup@​perl.org> wrote​:

Now I think I am against it, since it really can’t work.

it isn't possible to recognize range-in-slice-indices at parse time and
turn it into an internal operation that takes three arguments, the
container, the left range point, and the right range point? Would that
require more than one token of lookahead? A fix-up pass that replaces
slices when the index is a range operation would also work, but that's messy

this is what's in perly.y

  | ary '[' expr ']' /* array slice */
  { $$ = op_prepend_elem(OP_ASLICE,
  newOP(OP_PUSHMARK, 0),
  newLISTOP(OP_ASLICE, 0,
  list($3),
  ref($1, OP_ASLICE)));
  TOKEN_GETMAD($2,$$,'[');
  TOKEN_GETMAD($4,$$,']');
  }

so I'm talking about adding something like this

  | ary '[' term DOTDOT term ']' /* array range slice */
  { $$ = op_prepend_elem(OP_ASLICE_RANGE,
  newOP(OP_PUSHMARK, 0),
  newLISTOP(OP_ASLICE_RANGE, 0,
  scalar($3), scalar($5),
  ref($1, OP_ASLICE_RANGE)));
  TOKEN_GETMAD($2,$$,'[');
  TOKEN_GETMAD($4,$$,DOTDOT);
  TOKEN_GETMAD($6,$$,']');
  }

before that, and creating OP_ASLICE_RANGE.

which might mean making a second "expr" that doesn't include DOTDOT in its
termbinop, if yacc is greedy about what it puts into the expr slot, or
otherwise handling DOTDOT differently.

@p5pRT

p5pRT commented Jul 8, 2013

From j.imrie1@virginmedia.com

On 08/07/2013 19​:41, Dr.Ruud wrote​:

On 08/07/2013 07​:00, David Nicol wrote​:

On Sun, Jul 7, 2013 at 1​:51 AM, Dr.Ruud <rvtol+usenet@​isolution.nl
<mailto​:rvtol+usenet@​isolution.nl>> wrote​:

Or support negative indexes inside slicers:

   my @end_part = @a[ 1 .. _(-1) ];

(the _() to express topical run time action)

since negative numbers already represent counting from the end of an
array, in both scalar element accesses and C<splice>, it seems to me
that altering the dot-dot operator's meaning inside slice specifiers to
allow negative numbers would be the most compatible new syntax, as well
as simplifying slicing to almost-the-end.

Well sure, but it 'calls' _(), so I meant no actual negative number
there, but a topical interpretation of it.

Isn't _() already a valid sub call? Given that $_, @_ and _ (the default
file handle) exist, why not &_ as well? In which case someone may have
already defined 'sub _'.

John

@p5pRT

p5pRT commented Jul 9, 2013

From @cpansprout

On Mon Jul 08 14​:20​:45 2013, davidnicol@​gmail.com wrote​:

On Mon, Jul 8, 2013 at 3​:45 PM, Father Chrysostomos via RT <
perlbug-followup@​perl.org> wrote​:

Now I think I am against it, since it really can’t work.

it isn't possible to recognize range-in-slice-indices at parse time and
turn it into an internal operation that takes three arguments, the
container, the left range point, and the right range point? Would that
require more than one token of lookahead? A fix-up pass that replaces
slices when the index is a range operation would also work, but that's
messy

this is what's in perly.y

  | ary '[' expr ']' /* array slice */
  { $$ = op_prepend_elem(OP_ASLICE,
  newOP(OP_PUSHMARK, 0),
  newLISTOP(OP_ASLICE, 0,
  list($3),
  ref($1, OP_ASLICE)));
  TOKEN_GETMAD($2,$$,'[');
  TOKEN_GETMAD($4,$$,']');
  }

so I'm talking about adding something like this

  | ary '[' term DOTDOT term ']' /* array range slice */
  { $$ = op_prepend_elem(OP_ASLICE_RANGE,
  newOP(OP_PUSHMARK, 0),
  newLISTOP(OP_ASLICE_RANGE, 0,
  scalar($3), scalar($5),
  ref($1, OP_ASLICE_RANGE)));
  TOKEN_GETMAD($2,$$,'[');
  TOKEN_GETMAD($4,$$,DOTDOT);
  TOKEN_GETMAD($6,$$,']');
  }

before that, and creating OP_ASLICE_RANGE.

which might mean making a second "expr" that doesn't include DOTDOT in its
termbinop, if yacc is greedy about what it puts into the expr slot, or
otherwise handling DOTDOT differently.

If that is to apply to list slices as well, then it would mean that
slight changes in syntax, such as (func1())[1..func2()] vs
(func1())[(1..func2())], would result in the functions being called in a
different order.

I don’t feel comfortable with that at all.

--

Father Chrysostomos

@p5pRT

p5pRT commented Jul 9, 2013

From @davidnicol

On Mon, Jul 8, 2013 at 8​:00 PM, Father Chrysostomos via RT <
perlbug-followup@​perl.org> wrote​:

If that is to apply to list slices as well, then it would mean that
slight changes in syntax, such as (func1())[1..func2()] vs
(func1())[(1..func2())], would result in the functions being called in a
different order.

I don’t feel comfortable with that at all.

The change in undocumented evaluation order is required for a
variable-based solution, as the magic last-index variable must be bound
before the left and right terms are evaluated, which is a change from how
it is now​:

$ perl -wle 'sub A { print "A"; 0..9 } sub L { print "L"; 2 } sub R { print "R"; 6 } print ((A)[L .. R])'
print (...) interpreted as function at -e line 1.
L
R
A
23456
$

With a new operator solution, there is no reason for the evaluation order
to change.

I don't know all the tools available within the blocks emitted by perly.y.

Are you saying that we currently lack a framework for invoking a new
three-argument operator, after evaluating the left term, the right term,
and the array expression, in that order, and furthermore, that crafting the
required infrastructure would be impossible?

Surely you aren't.

@p5pRT

p5pRT commented Jul 9, 2013

From @rjbs

* John Imrie <j.imrie1@​virginmedia.com> [2013-07-08T18​:43​:25]

Well sure, but it 'calls' _(), so I meant no actual negative
number there, but a topical interpretation of it.

Isn't _() already a valid sub call? Given that $_, @_ and _ (the
default file handle) exist, why not &_ as well? In which case
someone may have already defined 'sub _'.

Yes, and it's occasionally used because of its package-transcending property​:

  package X { sub _ { warn "called by " . caller; } }
  package Y { sub hello { _() } }
  Y​::hello()

--
rjbs

@p5pRT

p5pRT commented Jul 9, 2013

From @ap

* John Imrie <j.imrie1@​virginmedia.com> [2013-07-09 00​:45]​:

In which case someone may have already defined 'sub _'

You think so? :-)

--
*AUTOLOAD=*_;sub _{s/​::([^​:]*)$/print$1,(",$\/"," ")[defined wantarray]/e;chop;$_}
&Just->another->Perl->hack;
#Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT

p5pRT commented Jul 9, 2013

From @druud62

On 09/07/2013 00​:43, John Imrie wrote​:

On 08/07/2013 19​:41, Dr.Ruud wrote​:

On 08/07/2013 07​:00, David Nicol wrote​:

On Sun, Jul 7, 2013 at 1​:51 AM, Dr.Ruud <rvtol+usenet@​isolution.nl
<mailto​:rvtol+usenet@​isolution.nl>> wrote​:

Or support negative indexes inside slicers:

   my @end_part = @a[ 1 .. _(-1) ];

(the _() to express topical run time action)

since negative numbers already represent counting from the end of an
array, in both scalar element accesses and C<splice>, it seems to me
that altering the dot-dot operator's meaning inside slice specifiers to
allow negative numbers would be the most compatible new syntax, as well
as simplifying slicing to almost-the-end.

Well sure, but it 'calls' _(), so I meant no actual negative number
there, but a topical interpretation of it.

Isn't _() already a valid sub call? Given that $_, @_ and _ (the default
file handle) exist, why not &_ as well? In which case someone may have
already defined 'sub _'.

Nobody is married to the _() looks. It is about the topicalizing.

For a scalar topicalizer, maybe pick $_().
And/or for last-index, consider $#().

--
Ruud

@p5pRT

p5pRT commented Jul 9, 2013

From @epa

Adding a variant of DOTDOT that works only inside list slices is tricky for the reasons Fr C. suggests.
If a new operator is to be added then I do suggest looking closely at Python's list slices,
which in my opinion are superior to the Perl way of specifying an inclusive range.

--
Ed Avis <eda@​waniasset.com>


@p5pRT

p5pRT commented Jul 9, 2013

From @cpansprout

On Mon Jul 08 19​:20​:00 2013, davidnicol@​gmail.com wrote​:

On Mon, Jul 8, 2013 at 8​:00 PM, Father Chrysostomos via RT <
perlbug-followup@​perl.org> wrote​:

If that is to apply to list slices as well, then it would mean that
slight changes in syntax, such as (func1())[1..func2()] vs
(func1())[(1..func2())], would result in the functions being called in a
different order.

I don’t feel comfortable with that at all.

The change in undocumented evaluation order is required for a
variable-based solution, as the magic last-index variable must be bound
before the left and right terms are evaluated, which is a change from how
it is now​:

$ perl -wle 'sub A { print "A"; 0..9 } sub L { print "L"; 2 } sub R { print "R"; 6 } print ((A)[L .. R])'
print (...) interpreted as function at -e line 1.
L
R
A
23456
$

With a new operator solution, there is no reason for the evaluation order
to change.

I don't know all the tools available within the blocks emitted by perly.y.

Are you saying that we currently lack a framework for invoking a new
three-argument operator, after evaluating the left term, the right term,
and the array expression, in that order, and furthermore, that
crafting the
required infrastructure would be impossible?

Surely you aren't.

OK, I misunderstood. Yes, if the .. is part of the slice operator
(instead of a separate range), it can execute its arguments in any
order. In which case the syntax would be​:

( a )[ b..c, d.., e..]

but no parentheses allowed around the [d..].

--

Father Chrysostomos

@p5pRT

p5pRT commented Jul 9, 2013

From @cpansprout

On Tue Jul 09 01​:44​:34 2013, eda@​waniasset.com wrote​:

Adding a variant of DOTDOT that works only inside list slices is
tricky for the reasons Fr C. suggests.
If a new operator is to be added then I do suggest looking closely at
Python's list slices,
which in my opinion are superior to the Perl way of specifying an
inclusive range.

I don’t know Python. Could you summarise its slices?

--

Father Chrysostomos

@p5pRT

p5pRT commented Jul 9, 2013

From @epa

http​://stackoverflow.com/questions/509211/the-python-slice-notation summarizes Python's list slices.
The key point is that it makes a half-closed interval where you specify 'from X, up to but not including Y'.
To my mind this is more in the C tradition of zero-based indexing and makes off-by-one errors rarer.

a[start:end]  # items start through end-1
a[start:]     # items start through the rest of the array
a[:end]       # items from the beginning through end-1

So the current @​a[ 1 .. $#a ] would become @​a[ 1 : ]
and @​a[ 0 .. $#a - 1 ] would become @​a[ : -1 ].
@​a[ $x .. $y - 1 ] would be @​a[ $x : $y ].

Admittedly in perl, writing scalar(@​a) for the number of elements in the array is more typing than $#a
for the index of the last element. I guess the $#a syntax was added specially for use with the .. operator
and if working with half-closed intervals it is less needed.
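
The half-open semantics can be tried out in current Perl with an ordinary sub. The following is a minimal sketch under stated assumptions, not a proposed implementation: `half_open_slice` is a hypothetical helper, `undef` stands in for an omitted endpoint, and negative endpoints count from the end as in Python.

```perl
use strict;
use warnings;

# Hypothetical helper: return the elements in the half-open interval
# [start, end) of the array referenced by $aref.
sub half_open_slice {
    my ($aref, $start, $end) = @_;
    my $n = @$aref;

    $start = 0  unless defined $start;    # omitted start: from the beginning
    $end   = $n unless defined $end;      # omitted end: to the end

    # Negative endpoints count from the end.
    $start += $n if $start < 0;
    $end   += $n if $end   < 0;

    # Clamp to the valid range.
    $start = 0  if $start < 0;
    $end   = $n if $end   > $n;

    return () if $start >= $end;
    return @{$aref}[ $start .. $end - 1 ];
}

my @a = (10, 20, 30, 40, 50);
my @tail   = half_open_slice(\@a, 1);            # like a[1:]
my @init   = half_open_slice(\@a, undef, -1);    # like a[:-1]
my @middle = half_open_slice(\@a, 1, 3);         # like a[1:3]

print "@tail\n@init\n@middle\n";
```

Because the array is passed by reference, the helper can only be applied to an intermediate list by wrapping it in `[ ... ]`, which is the inconvenience the original request set out to remove.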

--
Ed Avis <eda@​waniasset.com>


@p5pRT

p5pRT commented Jul 9, 2013

From @cpansprout

On Tue Jul 09 04​:42​:42 2013, sprout wrote​:

On Mon Jul 08 19​:20​:00 2013, davidnicol@​gmail.com wrote​:

On Mon, Jul 8, 2013 at 8​:00 PM, Father Chrysostomos via RT <
perlbug-followup@​perl.org> wrote​:

If that is to apply to list slices as well, then it would mean that
slight changes in syntax, such as (func1())[1..func2()] vs
(func1())[(1..func2())], would result in the functions being called in a
different order.

I don’t feel comfortable with that at all.

The change in undocumented evaluation order is required for a
variable-based solution, as the magic last-index variable must be bound
before the left and right terms are evaluated, which is a change
from how
it is now​:

$ perl -wle 'sub A { print "A"; 0..9 } sub L { print "L"; 2 } sub R { print "R"; 6 } print ((A)[L .. R])'
print (...) interpreted as function at -e line 1.
L
R
A
23456
$

With a new operator solution, there is no reason for the evaluation
order
to change.

I don't know all the tools available within the blocks emitted by
perly.y.

Are you saying that we currently lack a framework for invoking a new
three-argument operator, after evaluating the left term, the right term,
and the array expression, in that order, and furthermore, that
crafting the
required infrastructure would be impossible?

Surely you aren't.

OK, I misunderstood. Yes, if the .. is part of the slice operator
(instead of a separate range), it can execute its arguments in any
order. In which case the syntax would be​:

( a )[ b..c, d.., e..]

but no parentheses allowed around the [d..].

But low-precedence operators like ‘or’ and ‘and’ complicate things
further. If .. is part of the list slice syntax and perly.y treats it
that way, then it will have lower precedence than ‘or’, as in​:

( a )[$b or $c..]

Having the lexer and parser treat postfix .. as a special operator that
makes op.c croak at compile time if it is not a direct child of a slice
might work.

But then branch folding will change the meaning in cases like this​:

( a )[$0 or $2..] # compile-time error
( b )[ 0 or $2..] # ok

--

Father Chrysostomos

@p5pRT

p5pRT commented Jul 9, 2013

From @davidnicol

TLDR​: David Nicol openly and cheerfully struggles with concepts of yacc
grammars, eventually concluding that the approach he has been championing
will require too much lookahead. Then he suggests a post-parse fixup
approach that would be done at parse-time when range operations appear near
the top of slice index expressions.

On Tue, Jul 9, 2013 at 10​:31 AM, Father Chrysostomos via RT <
perlbug-followup@​perl.org> wrote​:

OK, I misunderstood. Yes, if the .. is part of the slice operator
(instead of a separate range), it can execute its arguments in any
order. In which case the syntax would be​:

( a )[ b..c, d.., e..]

but no parentheses allowed around the [d..].

Python's range operator implies the starting 0 or the ending -1. Perl's
doesn't.

But low-precedence operators like ‘or’ and ‘and’ complicate things
further. If .. is part of the list slice syntax and perly.y treats it
that way, then it will have lower precedence than ‘or’, as in​:

( a )[$b or $c..]

I'm imagining the rangey slice only happening when the rangey slice syntax
is triggered, otherwise, DOTDOT is the same binary infix operator it has
always been. Implying the starting 0 and the ending -1 could also happen
then too, okay -- would perly.y see '$b or $c' as a term? Also, the
precedence shift wouldn't be a problem, just something to document

Having the lexer and parser treat postfix .. as a special operator that
makes op.c croak at compile time if it is not a direct child of a slice
might work.

But then branch folding will change the meaning in cases like this​:

( a )[$0 or $2..] # compile-time error
( b )[ 0 or $2..] # ok

if a new clause in perly.y will work, then so will three new clauses, one
for [ term .. term ] and one for [ term .. ] and one for [ .. term ]. The
second would imply a -1 for the end, the third would imply a 0 for the
start.

How do we guard against [ a .. b , c .. d ] triggering the new syntax? It
should get parsed as an expression yielding the index list like normal, not
a range-slice between a and scalar (b, c .. d). Maybe commas could create
series of ranges, but that may be beyond the expressive power of Bison. Can
Bison create lists and map over them? Using an action in a recursively
matched rule?

  range_slice_expression: term DOTDOT term
      { push the terms onto the temporary ranges list }
    | term DOTDOT
      { push the term, and -1, onto the temp.r.l. }
    | DOTDOT term
      { push 0, and the term, onto the temp.r.l. }
    | expr
      { push something that indicates that this one is not a range }
    ;

  range_slice_expression: range_slice_expression ',' range_slice_expression

  ary '[' range_slice_expression ']'
      { build a list based on the temporaries set up during matching of
        range_expression; clear the list of ranges }

I don't think it can be made to work without more than one token of
lookahead​:

  @array[ !($. % 5) .. !($. % 7) ? 0 .. 4 : -5 .. -1 ]

This should match '[' expr ']' and not '[' term DOTDOT term ']', because
the token after the second negated modulo term is a hook (?), not a right
square bracket or a comma.

But to realize this, the parser must maintain ambiguity for longer than one
token. So that won't work.

That leaves examining the emitted expression tree of the slice indices to
see whether it has any range operators at the top level, or at the
next-to-top level if the top-level ops are commas.

@p5pRT

p5pRT commented Jul 9, 2013

From aaron@priven.com

Hmm.

@a[qs/new slice syntax, whatever it is/]

Then the syntax doesn't get evaluated until the slicing actually happens.

(Personally I like inclusive ranges.)

--
Aaron Priven, aaron@​priven.com, www.priven.com/aaron


@p5pRT

p5pRT commented Jul 9, 2013

From tchrist@perl.com

TLDR​: David Nicol openly and cheerfully struggles with concepts of

What a curious way to spell "ABSTRACT"! :)

  yacc grammars, eventually concluding that the approach he has
  been championing will require too much lookahead. Then he
  suggests a post-parse fixup approach that would be done at
  parse-time when range operations appear near the top of slice
  index expressions.

--tom

@p5pRT

p5pRT commented Jul 10, 2013

From @cpansprout

We seem to be talking past each other somewhat.... But this
conversation is fascinating nonetheless.

On Tue Jul 09 12​:55​:04 2013, davidnicol@​gmail.com wrote​:

I'm imagining the rangey slice only happening when the rangey slice syntax
is triggered, otherwise, DOTDOT is the same binary infix operator it has
always been.

Implying the starting 0 and the ending -1 could also happen
then too, okay -- would perly.y see '$b or $c' as a term? Also, the
precedence shift wouldn't be a problem, just something to document

I think it *would* be a problem. It would be too surprising. (I know,
you seem to find nothing surprising. :-)

Having the lexer and parser treat postfix .. as a special operator that
makes op.c croak at compile time if it is not a direct child of a slice
might work.

But then branch folding will change the meaning in cases like this​:

( a )[$0 or $2..] # compile-time error
( b )[ 0 or $2..] # ok

if a new clause in perly.y will work, then so will three new clauses, one
for [ term .. term ] and one for [ term .. ] and one for [ .. term ].

I don’t see why we need special handling for [ term .. term ], since it
would work exactly the same way (wrt observable behaviour) as it
currently does. Also, [ .. term ] is not necessary, as the starting
index is always known.

Having a special rule for parsing [ term .. ] rules out things like
[ a.., b..c ].

[experimental grammar rules skipped]
...
I don't think it can be made to work without more than one token of
lookahead​:

  @array[ !($. % 5) .. !($. % 7) ? 0 .. 4 : -5 .. -1 ]

should match '[' expr ']' and not match '[' term DOTDOT term ']'

because the token after the second notted modulo term is a hook, not a
right square bracket or a comma

but to realize this, the parser must maintain ambiguity for longer
than one
token. So that won't work.

And that leaves, examining the emitted expression tree of the slice
indices
to see if it has any range operators in it at the top level -- or, at the
next-to-top if the top are commas.

The only thing we need special handling for is postfix C<..>. If we
make it a rule that it is a syntactic special case occurring in [...]
and only before , => or ] then it is simply a matter of having the lexer
look for the next term and emitting a POSTDOTDOT instead of a DOTDOT.

Then op.c can weed out illegal uses of it in finalize_op.
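The lookahead rule can be sketched in Perl itself. This is only an illustration of the decision, not lexer code: `is_postfix_dotdot` is a hypothetical helper, and it skips only whitespace, where the real lexer would also have to skip comments.

```perl
use strict;
use warnings;

# Hypothetical helper: given the source text that follows a '..' token,
# decide whether that '..' is postfix (open-ended) under the proposed rule.
sub is_postfix_dotdot {
    my ($rest) = @_;
    $rest =~ s/\A\s+//;                       # skip leading whitespace
    return $rest =~ /\A(?:,|\]|=>)/ ? 1 : 0;  # only before ',', ']' or '=>'
}

# Example: in "@a[1.., 3..5]" the text after the first '..' is ", 3..5]",
# so that '..' would become POSTDOTDOT; the text after the second is "5]",
# so that one stays an ordinary DOTDOT.
```

Anything else after the `..` leaves it as the ordinary binary operator, so existing code keeps its meaning.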

--

Father Chrysostomos

@p5pRT

p5pRT commented Jul 10, 2013

From @cpansprout

On Tue Jul 09 21​:27​:33 2013, sprout wrote​:

Also, [ .. term ] is not necessary, as the starting
index is always known.

Additionally, (foo)[substr $bar,0,1, .. 5] is already valid syntax.

--

Father Chrysostomos

@p5pRT

p5pRT commented Jul 10, 2013

From @davidnicol

On Tue, Jul 9, 2013 at 11​:27 PM, Father Chrysostomos via RT <
perlbug-followup@​perl.org> wrote​:

We seem to be talking past each other somewhat.... But this
conversation is fascinating nonetheless.

Yes indeed, starting with different visions of the goal.

if a new clause in perly.y will work, then so will three new clauses, one

for [ term .. term ] and one for [ term .. ] and one for [ .. term ].

I don’t see why we need special handling for [ term .. term ], since it
would work exactly the same way (wrt observable behaviour) as it
currently does. Also, [ .. term ] is not necessary, as the starting
index is always known.

The ending index is also always known​: it's -1. If we're implying -1 to
make pythonists feel more at-home, we should imply 0 too. That's why [ ..
term ] is necessary, for balance.

The only thing we need special handling for is postfix C<..>. If we

make it a rule that it is a syntactic special case occurring in [...]
and only before , => or ] then it is simply a matter of having the lexer
look for the next term and emitting a POSTDOTDOT instead of a DOTDOT.

The op.c can weed out illegal uses of it in finalize_op.

so you wouldn't support

  @some = @array[ 0 .. -3 ]

as a clearer way of writing

  splice @some = @array, -2;

Instead of slicing to the end when the RHS of DOTDOT is empty, I envision
using arbitrary negative numbers to mean count-from-the-top, the way scalar
indexing works.

What if the DOTDOT operator produces a "range object" that can be treated
as an iterator? (for (RANGE) acts like an iterator now.) The range object
contains at least the start and end points. The list a range object
flattens to would be unusual when all of the following conditions are true:

  start is non-negative
  end is negative
  the list the range object is flattening into will become slice indices

When all three of those are true, the range object accesses the target
array's length and adds it to the end number.
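A rough sketch of that flattening rule as a plain function, since no range object exists today. `flatten_range_for_slice` is a hypothetical name, and the third condition (the result is destined for a slice) is assumed to hold because the caller uses it as a slice index list.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten a deferred ($start, $end) range into index
# numbers for an array of $len elements, using the three conditions above.
# The caller is assumed to be building slice indices (the third condition).
sub flatten_range_for_slice {
    my ($start, $end, $len) = @_;
    $end += $len if $start >= 0 && $end < 0;   # count the end from the end
    return $start .. $end;
}

my @array = ('a' .. 'g');                       # 7 elements
my @some  = @array[ flatten_range_for_slice(0, -3, scalar @array) ];
# @some is ('a', 'b', 'c', 'd', 'e')
```

Ranges with a non-negative end, such as 2 .. 4, pass through unchanged, so ordinary slices keep their current meaning.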

And then, the TIEARRAY and TIEHASH interfaces can get more complicated, by
supporting arbitrary range-slice semantics.

@p5pRT

p5pRT commented Jul 10, 2013

From @davidnicol

On Wed, Jul 10, 2013 at 12​:00 AM, David Nicol <davidnicol@​gmail.com> wrote​:

And then, the TIEARRAY and TIEHASH interfaces can get more complicated, by
supporting arbitrary range-slice semantics.

For example, a sparse array could return only its existing elements when
sliced with a range.

@p5pRT

p5pRT commented Jul 10, 2013

From @cpansprout

On Tue Jul 09 22​:01​:01 2013, davidnicol@​gmail.com wrote​:

On Tue, Jul 9, 2013 at 11​:27 PM, Father Chrysostomos via RT <
perlbug-followup@​perl.org> wrote​:

I don’t see why we need special handling for [ term .. term ], since it
would work exactly the same way (wrt observable behaviour) as it
currently does. Also, [ .. term ] is not necessary, as the starting
index is always known.

The ending index is also always known​: it's -1. If we're implying -1 to
make pythonists feel more at-home,

No, not for that reason. Forget Python. :-) But because it is
convenient in some_complex_expression->[...].

we should imply 0 too. That's why [ ..
term ] is necessary, for balance.

The only thing we need special handling for is postfix C<..>. If we

make it a rule that it is a syntactic special case occurring in [...]
and only before , => or ] then it is simply a matter of having the lexer
look for the next term and emitting a POSTDOTDOT instead of a DOTDOT.

The op.c can weed out illegal uses of it in finalize_op.

so you wouldn't support

  @some = @array[ 0 .. -3 ]

as a clearer way of writing

  splice @some = @array, -2;

I was going to say ‘Definitely not’, but then, as I started to explain
why, I realised my reasoning was flawed.

That is perhaps not such a bad idea, but for backward compatibility it
should only happen under use v5.20.

With that, we don’t even need open-ended ranges, so we don’t have to
abuse perl’s syntax.

If this range syntax only works that way directly inside a slice (not as
an operand to an operator within the slice), then what should 1..-1 do
elsewhere? It should probably croak, to avoid subtle changes to the
code giving wildly different results (empty lists). Of course, 1..0
would have to continue giving an empty list.

Instead of slicing to the end when the RHS of DOTDOT is empty, I envision
using arbitrary negative numbers to mean count-from-the-top, the way
scalar
indexing works.

What if the DOTDOT operator produces a "range object" that can be treated
as an iterator -- for (RANGE) acts like an iterator now -- the range
object
contains at least the start and end points.

If slices can propagate ‘range context’ to their operands the way other
contexts are applied (and if range ops will return special internal
range SVs), then the whole low-precedence branch folding I mentioned
resolves itself. [$a ? 1..-2 : 3..-4 ] will work.

The list a range object
flattens to would be unusual when all of the following conditions are
true​:

start is non-negative
end is negative
the list the range object is flattening into will become slice indices

when all three of those are true, the range object accesses the target
array's length and adds it to the end number.

Yes, that all works out nicely.

And then, the TIEARRAY and TIEHASH interfaces can get more complicated, by
supporting arbitrary range-slice semantics.

Stop it! You’re making my head spin!

--

Father Chrysostomos

@p5pRT

p5pRT commented Jul 11, 2013

From aaron@priven.com

How would new slicing syntax work with object methods and other times you'd want to pass a slice specification to a sub?

I can see how you could redefine stuff inside @a[in_here], but not $obj->slice(in_here). Obviously if it's just an accessor you can do (@{$obj->elements})[slice], but if something else happens in the method that doesn't work.

--
Aaron Priven, aaron@​priven.com, www.priven.com/aaron

@p5pRT

p5pRT commented Jul 11, 2013

From @davidnicol

On Wed, Jul 10, 2013 at 7​:03 PM, Aaron Priven <aaron@​priven.com> wrote​:

How would new slicing syntax work with object methods and other times
you'd want to pass a slice specification to a sub?

I can see how you could redefine stuff inside @​a[in_here], but not
$obj->slice(in_here). Obviously if it's just an accessor you can do
(@​{$obj->elements})(slice), but if something else happens in the method
that doesn't work.

It wouldn't. That's another fine argument against introducing situational
syntaxes.

If ranges were to defer their expansion as long as possible, a range such
as 5 .. -2 could exist as a range object and get passed as part of a slice
specification, where it would mean something (from the sixth element to the
second element from the end) instead of getting immediately evaluated to
the empty list. How to wrap that up for passing as one argument, however?
Maybe within square brackets, as the contents of an array-ref, with
expansion deferred as long as possible: if you defer the flattening until
actually constructing the slice, you have succeeded.
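To make the array-ref packaging concrete, here is a sketch that emulates the deferral by hand. The `Sequence` class and its `slice` method are hypothetical, and the caller writes the pair as `[5, -2]` explicitly because today's `..` would expand before the method ever saw it.

```perl
use strict;
use warnings;

package Sequence;   # hypothetical class, for illustration only

sub new {
    my ($class, @elements) = @_;
    return bless { elements => [@elements] }, $class;
}

# Each argument is either a plain index or an array ref [start, end] whose
# expansion is deferred until the length of the target is known.
sub slice {
    my ($self, @spec) = @_;
    my $elements = $self->{elements};
    my @idx;
    for my $item (@spec) {
        if (ref $item eq 'ARRAY') {
            my ($start, $end) = @$item;
            $end += @$elements if $start >= 0 && $end < 0;
            push @idx, $start .. $end;
        }
        else {
            push @idx, $item;
        }
    }
    return @{$elements}[@idx];
}

package main;

my $obj  = Sequence->new('a' .. 'h');
my @part = $obj->slice(0, [5, -2]);
# @part is ('a', 'f', 'g')
```

The method can do other work before or after the lookup, which is the case a plain accessor plus slice cannot cover.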

@p5pRT

p5pRT commented Jul 11, 2013

From aaron@priven.com

On Wed, Jul 10, 2013 at 7​:03 PM, Aaron Priven <aaron@​priven.com> wrote​:
How would new slicing syntax work with object methods and other times you'd want to pass a slice specification to a sub?

On Jul 10, 2013, at 5​:41 PM, David Nicol wrote​:

It wouldn't. That's another fine argument against introducing situational syntaxes.

If ranges were to defer their expansion as long as possible, a range such as 5 .. -2 could exist as a range object and get passed as part of a slice specification, where it would mean something (from the sixth element to the second element from the end) instead of getting immediately evaluated to the empty list. How to wrap that up for passing as one argument however? Maybe within square brackets, as contents of an array-ref, expansion deferred as long as possible, so if you defer the flattening until actually constructing the slice, you have succeeded.

Yes, this is what I meant by suggesting a new slice syntax with a quote-like operator. So​:

@ary[qs/slice syntax goes here/]

or

$obj->slice(qs/slice syntax goes here/)

And it would become a slice object of some sort or other, that would not be processed until the slice actually occurred.

Like regex syntax, it could have basically nothing to do with regular perl syntax. (I had this whole idea of using Set​::IntSpan objects before realizing that negative numbers mean something completely different to Set​::IntSpan. Still, there's something about using hyphens instead of dots for ranges that appeals to me)

Of course, [qs/3-/] probably wouldn't satisfy people looking to make [3..] work. I'm not sure how to introduce it more succinctly given the paucity of ASCII characters. @​a​:[3-5] ? With Unicode we could have @​a⁅3-5⁆ or something.

Or perhaps none of this makes any sense at all.

--
Aaron Priven, aaron@​priven.com, http​://www.priven.com/aaron

@p5pRT

p5pRT commented Jul 11, 2013

From @ap

* Aaron Priven <aaron@​priven.com> [2013-07-11 08​:20]​:

Yes, this is what I meant by suggesting a new slice syntax with
a quote-like operator. So​:

@​ary[qs/slice syntax goes here/]

or

$obj->slice(qs/slice syntax goes here/)

And it would become a slice object of some sort or other, that would
not be processed until the slice actually occurred.

Cf. List::Maker, maybe.

--
*AUTOLOAD=*_;sub _{s/..([^​:]*)$/()[print$1,(",$\/"," ")[defined wantarray]]/e;$_}
&Just->another->Perl->hack;
#Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT

p5pRT commented Jul 14, 2013

From @cpansprout

I started to implement two of the suggestions in this ticket, but then
ran into pad bugs that I have already fixed on the sprout/padconst
branch. So this is stalled until that is merged. I am waiting for
either Chip to concede or Ricardo to give a green light. See
<https://rt-archive.perl.org/perl5/Ticket/Display.html?id=109744#txn-1230929>.

--

Father Chrysostomos

@p5pRT

p5pRT commented Aug 19, 2013

From @cpansprout

On Wed Jul 10 23​:17​:35 2013, aaron@​priven.com wrote​:

On Wed, Jul 10, 2013 at 7​:03 PM, Aaron Priven <aaron@​priven.com>
wrote​:
How would new slicing syntax work with object methods and other times
you'd want to pass a slice specification to a sub?

On Jul 10, 2013, at 5​:41 PM, David Nicol wrote​:

It wouldn't. That's another fine argument against introducing
situational syntaxes.

If ranges were to defer their expansion as long as possible, a range
such as 5 .. -2 could exist as a range object and get passed as
part of a slice specification, where it would mean something (from
the sixth element to the second element from the end) instead of
getting immediately evaluated to the empty list. How to wrap that
up for passing as one argument however? Maybe within square
brackets, as contents of an array-ref, expansion deferred as long
as possible, so if you defer the flattening until actually
constructing the slice, you have succeeded.

Yes, this is what I meant by suggesting a new slice syntax with a
quote-like operator. So​:

@​ary[qs/slice syntax goes here/]

or

$obj->slice(qs/slice syntax goes here/)

And it would become a slice object of some sort or other, that would
not be processed until the slice actually occurred.

Like regex syntax, it could have basically nothing to do with regular
perl syntax. (I had this whole idea of using Set​::IntSpan objects
before realizing that negative numbers mean something completely
different to Set​::IntSpan. Still, there's something about using
hyphens instead of dots for ranges that appeals to me)

Of course, [qs/3-/] probably wouldn't satisfy people looking to make
[3..] work. I'm not sure how to introduce it more succinctly given
the paucity of ASCII characters. @​a​:[3-5] ? With Unicode we could
have @​a⁅3-5⁆ or something.

Or perhaps none of this makes any sense at all.

I started to implement [1..] and found myself liking your idea more and
more.

A CPAN module could provide a function that allows slice(@​array,
qs(...)) or slice(@​array, "1.., 3..5") as well as slice(a_list(), ...).
There is plenty of room for experimentation here, and it doesn’t have
to be in core.
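As one possible shape for such a function, here is a minimal sketch. It is not an existing CPAN API: the name `slice`, the choice of an array reference as first argument, and the tiny spec grammar (comma-separated integers, `N..M`, or open-ended `N..`) are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical slice(): takes an array reference and a spec string such as
# "1.., 3..5" and returns the selected elements. Not an existing CPAN API.
sub slice {
    my ($aref, $spec) = @_;
    my $last = $#$aref;
    my @idx;
    for my $part (split /\s*,\s*/, $spec) {
        if ($part =~ /\A\s*(-?\d+)\s*\.\.\s*(-?\d+)?\s*\z/) {
            # "N.." has no end point, so it runs to the last index.
            my ($from, $to) = ($1, defined $2 ? $2 : $last);
            push @idx, $from .. $to;
        }
        elsif ($part =~ /\A\s*(-?\d+)\s*\z/) {
            push @idx, $1;                      # single index
        }
        else {
            die "Bad slice spec: '$part'\n";
        }
    }
    return @{$aref}[@idx];
}

my @array = ('a' .. 'f');
my @got   = slice(\@array, "1.., 3..5");
# @got is ('b', 'c', 'd', 'e', 'f', 'd', 'e', 'f')
```

Because the spec is only parsed once the array is in hand, the open end can be resolved without any change to the core `..` operator. A list-returning call can be handled the same way with `slice([ a_list() ], "1..")`.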

Anyway, my unfinished work is on the sprout/slice branch.

--

Father Chrysostomos

@p5pRT

p5pRT commented Aug 19, 2013

From @cpansprout

On Mon Aug 19 07​:30​:09 2013, sprout wrote​:

Anyway, my unfinished work is on the sprout/slice branch.

And also attached, but without the perly.* regenerated. Including the
latter made RT choke.

--

Father Chrysostomos

@p5pRT

p5pRT commented Aug 19, 2013

From @cpansprout

From b3fb563 Mon Sep 17 00​:00​:00 2001
From​: Father Chrysostomos <sprout@​cpan.org>
Date​: Thu, 11 Jul 2013 13​:12​:47 -0700
Subject​: [PATCH] wip
MIME-Version​: 1.0
Content-Type​: text/plain; charset=UTF-8
Content-Transfer-Encoding​: 8bit

Only array slices (not list slices) are handled so far, and they are buggy​:
• Negative indices do not work yet.
• @​a[1.., 2..] dies at compile time; you can only use one FOO.. and it
  must be first in the list.

Inline Patch
diff --git a/embed.h b/embed.h
index d89782f..51cd616 100644
--- a/embed.h
+++ b/embed.h
@@ -1056,6 +1056,7 @@
 #define ck_sassign(a)		Perl_ck_sassign(aTHX_ a)
 #define ck_select(a)		Perl_ck_select(aTHX_ a)
 #define ck_shift(a)		Perl_ck_shift(aTHX_ a)
+#define ck_slice(a)		Perl_ck_slice(aTHX_ a)
 #define ck_smartmatch(a)	Perl_ck_smartmatch(aTHX_ a)
 #define ck_sort(a)		Perl_ck_sort(aTHX_ a)
 #define ck_spair(a)		Perl_ck_spair(aTHX_ a)
diff --git a/ext/Opcode/Opcode.pm b/ext/Opcode/Opcode.pm
index f71e700..7593117 100644
--- a/ext/Opcode/Opcode.pm
+++ b/ext/Opcode/Opcode.pm
@@ -6,7 +6,7 @@ use strict;
 
 our($VERSION, @ISA, @EXPORT_OK);
 
-$VERSION = "1.25";
+$VERSION = "1.26";
 
 use Carp;
 use Exporter ();
@@ -350,7 +350,7 @@ These memory related ops are not included in :base_core because they
 can easily be used to implement a resource attack (e.g., consume all
 available memory).
 
-    concat repeat join range
+    concat repeat join range postdotdot
 
     anonlist anonhash
 
diff --git a/op.c b/op.c
index fd8868f..83d3e1d 100644
--- a/op.c
+++ b/op.c
@@ -1752,7 +1752,7 @@ S_finalize_op(pTHX_ OP* o)
 	/* Relocate sv to the pad for thread safety.
 	 * Despite being a "constant", the SV is written to,
 	 * for reference counts, sv_upgrade() etc. */
-	if (cSVOPo->op_sv) {
+	if (cSVOPo->op_sv && cSVOPo->op_sv != &PL_sv_placeholder) {
 	    const PADOFFSET ix = pad_alloc(OP_CONST, SVf_READONLY);
 	    if (o->op_type != OP_METHOD_NAMED
 		&& cSVOPo->op_sv == &PL_sv_undef) {
@@ -1881,6 +1881,8 @@ S_finalize_op(pTHX_ OP* o)
 	break;
     }
 
+    case OP_POSTDOTDOT: Perl_croak(aTHX_ "Postfix .. outside of slice");
+
     case OP_SUBST: {
 	if (cPMOPo->op_pmreplrootu.op_pmreplroot)
 	    finalize_op(cPMOPo->op_pmreplrootu.op_pmreplroot);
@@ -9603,6 +9605,28 @@ Perl_ck_shift(pTHX_ OP *o)
 }
 
 OP *
+Perl_ck_slice(pTHX_ OP *o)
+{
+    dVAR;
+    OP *kid;
+
+    PERL_ARGS_ASSERT_CK_SLICE;
+
+    kid = cBINOPo->op_first;
+    if (kid->op_type != OP_PUSHMARK && kid->op_flags & OPf_KIDS)
+	kid = kUNOP->op_first;
+
+    for (kid = kid->op_sibling; kid; kid = kid->op_sibling) {
+	if (kid->op_type == OP_POSTDOTDOT) {
+	    kid->op_type = OP_CONST;
+	    kid->op_ppaddr = PL_ppaddr[OP_CONST];
+	}
+    }
+
+    return o;
+}
+
+OP *
 Perl_ck_sort(pTHX_ OP *o)
 {
     dVAR;
diff --git a/opcode.h b/opcode.h
index 540dc0b..c159e1a 100644
--- a/opcode.h
+++ b/opcode.h
@@ -142,6 +142,7 @@
 #define Perl_pp_custom Perl_unimplemented_op
 #define Perl_pp_reach Perl_pp_rkeys
 #define Perl_pp_rvalues Perl_pp_rkeys
+#define Perl_pp_postdotdot Perl_unimplemented_op
 START_EXTERN_C
 
 #ifndef DOINIT
@@ -525,6 +526,7 @@ EXTCONST char* const PL_op_name[] = {
 	"introcv",
 	"clonecv",
 	"padrange",
+	"postdotdot",
 	"freed",
 };
 #endif
@@ -910,6 +912,7 @@ EXTCONST char* const PL_op_desc[] = {
 	"private subroutine",
 	"private subroutine",
 	"list of private variables",
+	"postfix .. in slice",
 	"freed op",
 };
 #endif
@@ -1309,6 +1312,7 @@ EXT Perl_ppaddr_t PL_ppaddr[] /* or perlvars.h */
 	Perl_pp_introcv,
 	Perl_pp_clonecv,
 	Perl_pp_padrange,
+	Perl_pp_postdotdot,	/* implemented by Perl_unimplemented_op */
 }
 #endif
 #ifdef PERL_PPADDR_INITED
@@ -1458,7 +1462,7 @@ EXT Perl_check_t PL_check[] /* or perlvars.h */
 	Perl_ck_null,		/* aelemfast */
 	Perl_ck_null,		/* aelemfast_lex */
 	Perl_ck_null,		/* aelem */
-	Perl_ck_null,		/* aslice */
+	Perl_ck_slice,		/* aslice */
 	Perl_ck_each,		/* aeach */
 	Perl_ck_each,		/* akeys */
 	Perl_ck_each,		/* avalues */
@@ -1475,7 +1479,7 @@ EXT Perl_check_t PL_check[] /* or perlvars.h */
 	Perl_ck_split,		/* split */
 	Perl_ck_join,		/* join */
 	Perl_ck_null,		/* list */
-	Perl_ck_null,		/* lslice */
+	Perl_ck_slice,		/* lslice */
 	Perl_ck_fun,		/* anonlist */
 	Perl_ck_fun,		/* anonhash */
 	Perl_ck_fun,		/* splice */
@@ -1704,6 +1708,7 @@ EXT Perl_check_t PL_check[] /* or perlvars.h */
 	Perl_ck_null,		/* introcv */
 	Perl_ck_null,		/* clonecv */
 	Perl_ck_null,		/* padrange */
+	Perl_ck_null,		/* postdotdot */
 }
 #endif
 #ifdef PERL_CHECK_INITED
@@ -2093,6 +2098,7 @@ EXTCONST U32 PL_opargs[] = {
 	0x00000040,	/* introcv */
 	0x00000040,	/* clonecv */
 	0x00000040,	/* padrange */
+	0x00000600,	/* postdotdot */
 };
 #endif
 
diff --git a/opnames.h b/opnames.h
index 5502ba4..9ef8118 100644
--- a/opnames.h
+++ b/opnames.h
@@ -391,10 +391,11 @@ typedef enum opcode {
 	OP_INTROCV	 = 374,
 	OP_CLONECV	 = 375,
 	OP_PADRANGE	 = 376,
+	OP_POSTDOTDOT	 = 377,
 	OP_max		
 } opcode;
 
-#define MAXO 377
+#define MAXO 378
 #define OP_FREED MAXO
 
 /* the OP_IS_* macros are optimized to a simple range check because
diff --git a/perly.y b/perly.y
index 0f98f59..1c99024 100644
--- a/perly.y
+++ b/perly.y
@@ -80,7 +80,7 @@
 %token <i_tkval> FORMAT SUB ANONSUB PACKAGE USE
 %token <i_tkval> WHILE UNTIL IF UNLESS ELSE ELSIF CONTINUE FOR
 %token <i_tkval> GIVEN WHEN DEFAULT
-%token <i_tkval> LOOPEX DOTDOT YADAYADA
+%token <i_tkval> LOOPEX DOTDOT YADAYADA POSTDOTDOT
 %token <i_tkval> FUNC0 FUNC1 FUNC UNIOP LSTOP
 %token <i_tkval> RELOP EQOP MULOP ADDOP
 %token <i_tkval> DOLSHARP DO HASHBRACK NOAMP
@@ -111,7 +111,7 @@
 %left <i_tkval> ','
 %right <i_tkval> ASSIGNOP
 %right <i_tkval> '?' ':'
-%nonassoc DOTDOT YADAYADA
+%nonassoc DOTDOT YADAYADA POSTDOTDOT
 %left <i_tkval> OROR DORDOR
 %left <i_tkval> ANDAND
 %left <i_tkval> BITOROP
@@ -1001,6 +1001,13 @@ termbinop:	term ASSIGNOP term                     /* $x = $y */
 			      token_getmad($2,(OP*)op,'o');
 			    });
 			}
+	|	term POSTDOTDOT                        /* $x.. in slice */
+			{
+			  $$ = op_append_elem(OP_LIST,scalar($1),
+					      newSVOP(OP_POSTDOTDOT, 0,
+						      &PL_sv_placeholder));
+			  TOKEN_GETMAD($2,$$,'o');
+			}
 	|	term ANDAND term                       /* $x && $y */
 			{ $$ = newLOGOP(OP_AND, 0, $1, $3);
 			  TOKEN_GETMAD($2,$$,'o');
diff --git a/pp.c b/pp.c
index b61192d..9a95850 100644
--- a/pp.c
+++ b/pp.c
@@ -4296,6 +4296,7 @@ PP(pp_aslice)
     if (SvTYPE(av) == SVt_PVAV) {
 	const bool localizing = PL_op->op_private & OPpLVAL_INTRO;
 	bool can_preserve = FALSE;
+	SV **dst = MARK;
 
 	if (localizing) {
 	    MAGIC *mg;
@@ -4317,8 +4318,26 @@ PP(pp_aslice)
 	}
 
 	while (++MARK <= SP) {
+	  I32 elem = SvIV(*MARK);
+	  I32 count;
+	  if (MARK < SP && MARK[1] == &PL_sv_placeholder) {
+	    count = AvFILL(av) - elem + 1;
+	    if (count < 0) count = 0;
+	    else if (count > 1) {
+		I32  off = MARK - PL_stack_base;
+		I32 doff = dst  - PL_stack_base;
+		EXTEND(SP, count-1);
+		MARK = PL_stack_base +  off;
+		dst  = PL_stack_base + doff;
+		Move(MARK, MARK+count-1, SP-MARK+1, SV *);
+		SP   += count-1;
+		MARK += count-1;		
+	    }
+	    MARK++; /* skip placeholder */
+	  }
+	  else count = 1;
+	  while (count) {
 	    SV **svp;
-	    I32 elem = SvIV(*MARK);
 	    bool preeminent = TRUE;
 
 	    if (localizing && can_preserve) {
@@ -4340,8 +4359,12 @@ PP(pp_aslice)
 			SAVEADELETE(av, elem);
 		}
 	    }
-	    *MARK = svp ? *svp : &PL_sv_undef;
+	    *++dst = svp ? *svp : &PL_sv_undef;
+
+	    count--, elem++;
+	  }
 	}
+	SP = dst;
     }
     if (GIMME != G_ARRAY) {
 	MARK = ORIGMARK;
diff --git a/proto.h b/proto.h
index e027627..600031b 100644
--- a/proto.h
+++ b/proto.h
@@ -580,6 +580,12 @@ PERL_CALLCONV OP *	Perl_ck_shift(pTHX_ OP *o)
 #define PERL_ARGS_ASSERT_CK_SHIFT	\
 	assert(o)
 
+PERL_CALLCONV OP *	Perl_ck_slice(pTHX_ OP *o)
+			__attribute__warn_unused_result__
+			__attribute__nonnull__(pTHX_1);
+#define PERL_ARGS_ASSERT_CK_SLICE	\
+	assert(o)
+
 PERL_CALLCONV OP *	Perl_ck_smartmatch(pTHX_ OP *o)
 			__attribute__warn_unused_result__
 			__attribute__nonnull__(pTHX_1);
diff --git a/regen/opcode.pl b/regen/opcode.pl
index a081c64..69f013a 100755
--- a/regen/opcode.pl
+++ b/regen/opcode.pl
@@ -67,7 +67,8 @@ my %alias;
 # Format is "this function" => "does these op names"
 my @raw_alias = (
 		 Perl_do_kv => [qw( keys values )],
-		 Perl_unimplemented_op => [qw(padany mapstart custom)],
+		 Perl_unimplemented_op => [qw(padany mapstart custom
+					      postdotdot)],
 		 # All the ops with a body of { return NORMAL; }
 		 Perl_pp_null => [qw(scalar regcmaybe lineseq scope)],
 
diff --git a/regen/opcodes b/regen/opcodes
index 9c86d69..61d69b0 100644
--- a/regen/opcodes
+++ b/regen/opcodes
@@ -216,7 +216,7 @@ rv2av		array dereference	ck_rvconst	dt1
 aelemfast	constant array element	ck_null		s$	A S
 aelemfast_lex	constant lexical array element	ck_null		d0	A S
 aelem		array element		ck_null		s2	A S
-aslice		array slice		ck_null		m@	A L
+aslice		array slice		ck_slice	m@	A L
 
 aeach		each on array		ck_each		%	A
 akeys		keys on array		ck_each		t%	A
@@ -243,7 +243,7 @@ join		join or string		ck_join		mst@	S L
 # List operators.
 
 list		list			ck_null		m@	L
-lslice		list slice		ck_null		2	H L L
+lslice		list slice		ck_slice	2	H L L
 anonlist	anonymous list ([])	ck_fun		ms@	L
 anonhash	anonymous hash ({})	ck_fun		ms@	L
 
@@ -551,3 +551,4 @@ padcv		private subroutine	ck_null		d0
 introcv		private subroutine	ck_null		d0
 clonecv		private subroutine	ck_null		d0
 padrange	list of private variables	ck_null		d0
+postdotdot	postfix .. in slice	ck_null		$
diff --git a/toke.c b/toke.c
index 4f3eee9..339eb16 100644
--- a/toke.c
+++ b/toke.c
@@ -377,6 +377,7 @@ static struct debug_tokens {
     { PLUGSTMT,		TOKENTYPE_OPVAL,	"PLUGSTMT" },
     { PMFUNC,		TOKENTYPE_OPVAL,	"PMFUNC" },
     { POSTDEC,		TOKENTYPE_NONE,		"POSTDEC" },
+    { POSTDOTDOT,	TOKENTYPE_NONE,		"POSTDOTDOT" },
     { POSTINC,		TOKENTYPE_NONE,		"POSTINC" },
     { POWOP,		TOKENTYPE_OPNUM,	"POWOP" },
     { PREDEC,		TOKENTYPE_NONE,		"PREDEC" },
@@ -6776,7 +6777,13 @@ Perl_yylex(pTHX)
 		    pl_yylval.ival = OPf_SPECIAL;
 		}
 		else
+		{
+		    char * const s2 = skipspace(s);
+		    if (*s2 == ',' || *s2 == ']'
+		     || (*s2 == '=' && s2[1] == '>'))
+			OPERATOR(POSTDOTDOT);
 		    pl_yylval.ival = 0;
+		}
 		OPERATOR(DOTDOT);
 	    }
 	    if (*s == '=' && !PL_lex_allbrackets &&

@p5pRT

p5pRT commented Aug 19, 2013

From @epa

Thanks for your work on this.

--
Ed Avis <eda@​waniasset.com>

@p5pRT

p5pRT commented Aug 19, 2013

From @cpansprout

On Mon Aug 19 07​:30​:09 2013, sprout wrote​:

I started to implement [1..] and found myself liking your idea more and
more.

A CPAN module could provide a function that allows slice(@​array,
qs(...)) or slice(@​array, "1.., 3..5") as well as slice(a_list(), ...).
There is plenty of room for experimentation here, and it doesn’t have
to be in core.

Anyway, my unfinished work is on the sprout/slice branch.

The reason I posted this is that I’m not sure whether it would be
worthwhile to continue work at this stage. I would prefer to see a CPAN
implementation of a slice function, rather than [1..] in core.

Is there enough support for this that it would be worthwhile to continue
with the patch?

--

Father Chrysostomos

@p5pRT

p5pRT commented Aug 29, 2013

From zefram@fysh.org

Father Chrysostomos via RT wrote​:

The reason I posted this is that I'm not sure whether it would be
worthwhile to continue work at this stage. I would prefer to see a CPAN
implementation of a slice function, rather than [1..] in core.

I'd also prefer to see it on CPAN in some form than to further complicate
core operators.

-zefram
