perldwim document - Code completions, and other optimizations Perl does for you #13908

Open
p5pRT opened this issue Jun 6, 2014 · 16 comments

@p5pRT

p5pRT commented Jun 6, 2014

Migrated from rt.perl.org#122055 (status was 'open')

Searchable as RT122055$

@p5pRT

p5pRT commented Jun 6, 2014

From @b2gills

I went and created a document that lists several modifications that
Perl makes for you, ranging from optimizations to changes Perl makes
so that it "Does What I Mean".

The purpose of this document is to help people find this information,
which is otherwise difficult to find and is often passed from programmer
to programmer in an ad hoc fashion.

The reason I combined optimizations with DWIM code changes is that the
reasons you would want to know about them are, for the most part, the same.

I have previously sent this to P5P, where at least Yves Orton (demerphq)
showed approval for such a document being added to the Perl distribution.
(I have added to it in the meantime.)

Here is a short snippet of the file as viewed with perldoc
(the full patch should be attached).


NAME
  perldwim - Code completions, and other optimizations Perl does for you

DESCRIPTION
  This document lists the modifications Perl does to your code which allow
  you to write your code in a clean way, and have your code still do what
  you actually want.

  That is, Perl will rewrite the opcodes to do what you probably wanted
  Perl to do. This is commonly referred to as "Do What I Mean" (DWIM)
  within the Perl community.

  These can reduce the amount of code you would normally have to type,
  improve the clarity of your code, or they can improve performance.

  Knowing about these modifications can help you determine the best way to
  structure your code. It can also help you determine which versions of
  Perl your code will work on, or which versions it will work on the best.

OPCODE-TREE MODIFICATIONS
  "<>"
  while( <> ){...}

  gets turned into:

  while( <ARGV> ){...}

  Which is a construct that reads from the files in @ARGV, or "STDIN"
  if @ARGV is empty.

  See "I/O Operators" in perlop for more details.

  range in for loop conditional
  for( 1..10 ){...}
  for( 1..$n ){...}
  for( "a".."z" ){...}

  Normally the ranges would get turned into something like a normal
  array containing all of the items to be iterated over.

  Instead the two endpoints are left in the op-tree, with the rest of
  the items to be iterated over generated as they are needed.

  This is particularly helpful if the range is large.

  for( 0..65536 ){...}
  for( 'a'..'perl' ){...} # 285076

@p5pRT

p5pRT commented Jun 6, 2014

From @b2gills

0001-Create-the-perldwim-document.patch
From e0bdad934a2a2335bdab2977c13ce692b6d23919 Mon Sep 17 00:00:00 2001
From: Brad Gilbert <b2gills@gmail.com>
Date: Fri, 6 Jun 2014 17:49:34 -0500
Subject: [PATCH] Create the perldwim document

---
 MANIFEST         |   1 +
 pod/perldwim.pod | 287 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 288 insertions(+)
 create mode 100644 pod/perldwim.pod

diff --git a/MANIFEST b/MANIFEST
index baf405f..3f58e8b 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -4488,6 +4488,7 @@ pod/perldelta.pod		Perl changes since previous version
 pod/perldiag.pod		Perl diagnostic messages
 pod/perldsc.pod			Perl data structures intro
 pod/perldtrace.pod		Perl's support for DTrace
+pod/perldwim.pod		Code completions, and other optimizations Perl does for you
 pod/perlebcdic.pod		Considerations for running Perl on EBCDIC platforms
 pod/perlembed.pod		Perl ways to embed perl in your C or C++ application
 pod/perlexperiment.pod		A listing of experimental features in Perl
diff --git a/pod/perldwim.pod b/pod/perldwim.pod
new file mode 100644
index 0000000..afc5f0f
--- /dev/null
+++ b/pod/perldwim.pod
@@ -0,0 +1,287 @@
+=encoding utf8
+
+=head1 NAME
+
+perldwim - Code completions, and other optimizations Perl does for you
+
+=head1 DESCRIPTION
+
+This document lists the modifications Perl does to your code which allow you to
+write your code in a clean way, and have your code still do what you actually
+want.
+
+That is, Perl will rewrite the opcodes to do what you probably wanted Perl
+to do. This is commonly referred to as "Do What I Mean" (DWIM) within the Perl
+community.
+
+These can reduce the amount of code you would normally have to type,
+improve the clarity of your code, or they can improve performance.
+
+Knowing about these modifications can help you determine the best way to
+structure your code.
+It can also help you determine which versions of Perl your code will work on,
+or which versions it will work on the best.
+
+=head1 OPCODE-TREE MODIFICATIONS
+
+=over 4
+
+=item C<< <> >>
+
+    while( <> ){...}
+
+gets turned into:
+
+    while( <ARGV> ){...}
+
+Which is a construct that reads from the files in L<C<@ARGV>|perlvar/"@ARGV">,
+or C<STDIN> if L<C<@ARGV>|perlvar/"@ARGV"> is empty.
+
+See L<perlop/"I/O Operators"> for more details.
+
+=item range in for loop conditional
+
+    for( 1..10 ){...}
+    for( 1..$n ){...}
+    for( "a".."z" ){...}
+
+Normally the ranges would get turned into something like a normal array
+containing all of the items to be iterated over.
+
+Instead the two endpoints are left in the op-tree, with the rest of the items
+to be iterated over generated as they are needed.
+
+This is particularly helpful if the range is large.
+
+    for( 0..65536 ){...}
+    for( 'a'..'perl' ){...} # 285076
+
+=item constant folding
+
+Perl will replace most basic operations on constants with the result.
+
+So these
+
+    my $v = 4 * 10 ** 3 + 3 * 10 ** 2 + 2 * 10 ** 1 + 1 * 10 ** 0;
+    my $five = 5 || 3;
+    my $three = 5 && 3;
+    my $true = 5 < 6;
+    my $false = "a" gt "b";
+    my $neg = "a" cmp "b";
+
+become
+
+    my $v = 4321;
+    my $five = 5;
+    my $three = 3;
+    my $true = 1;
+    my $false = !1;
+    my $neg = -1;
+
+There are many more ops and functions that Perl will constant fold,
+such as C<sin>, C<!>, C<< < >>, and C<cmp>.
+
+This helps when Perl is trying to reduce code that has a
+L<constant value as a conditional|/"constant value in conditional">.
+
+=item constant value in conditional
+
+In constructs where Perl can determine at compile time that it doesn't need to
+check the conditional at run-time, it will optimize it to the fastest code
+that still has the same side effects.
+
+These constructs will effectively be no-ops.
+
+    if( 0 ){...}
+    unless( 1 ){...}
+    while( 0 ){...}
+    while( 0 ){...}continue{...}
+    until( 1 ){...}
+    until( 1 ){...}continue{...}
+    for( ; 0 ; ){...}
+    for( ; 0 ; ){...}continue{...}
+
+While these will end up as C<do> blocks.
+
+    if( 1 ){...}
+    unless( 0 ){...}
+
+The loop constructs that can get entered into will be mostly the same,
+but without the check of the conditional.
+
+    while( 1 ){...}
+    until( 0 ){...}
+    for( ; 1 ; ){...}
+    while(){...}
+
+Constructs that have a section to be run when true, and one to be run when
+false will also get optimized.
+
+    if( 0 ){ say 'true' }else{ say 'false' } # do{ say 'false' }
+    $v = 0 ? 'true' : 'false'; # $v = 'false';
+
+Knowing this you can help performance by storing values that cannot change
+over the course of your program, in constants.
+
+    use constant DEBUG => 0;
+    if( DEBUG ){...}
+
+You can create subroutines that will act similar to constants.
+
+    sub True (){ !!1 }
+    sub False (){ !!0 }
+
+It is usually clearer to use the L<constant> pragma though.
+
+    use constant True => !!1;
+    use constant False => !!0;
+    # or
+    use constant {
+        True => !!1,
+        False => !!0,
+    };
+
+=item iterator function in while loop conditional
+
+    while( readline $fh ){...}
+    while( <> ){...}
+    while( <STDIN> ){...}
+    while( glob '"*e f*"' ){...}
+    while( <"*e f*"> ){...}
+    while( readdir $dh ){...}
+    while( each %h ){...}
+    while( each @a ){...}
+
+Normally the while loop would stop when the return value of one of these ops
+was false, and would not store the result anywhere.
+That would be rather pointless,
+and the loop would stop early if the function returned C<0>.
+
+Instead it stores the result in C<$_>, and stops when the op returns C<undef>.
+
+In other words it changes this:
+
+    while( <> ){...}
+
+into this
+
+    while( defined( $_ = <> ) ){...}
+
+This also applies to the conditional part of c-style C<for> loops.
+
+    for ( ; <> ; ) {...}
+    for ( ; defined( $_ = <> ) ; ) {...}
+
+Perl also adds the C<defined> check if you explicitly store the result
+in a variable.
+
+    while( my $line = readline $fh ){...}
+    while( defined( my $line = readline $fh ) ){...}
+
+This code modification helps reduce the amount of code you have to write;
+which reduces the possibility for errors, and improves clarity of intent.
+
+See L<perlop/"I/O Operators"> for more information.
+
+Support for C<readdir> was added in L<5.12|perlfunc/"readdir DIRHANDLE">
+with commit
+114c60ecb1f775ef1deb4fdc8fb8e3a6f343d13d
+
+Support for C<each> was added in L<5.18|perl5180delta/"Selected Bug Fixes">
+with commit
+8ae39f603f0f5778c160e18e08df60affbd5a620
+
+=item padrange op
+
+This single op can, in some circumstances, replace the sequence of a
+pushmark followed by one or more padsv/padav/padhv ops, and possibly
+a trailing C<list> op.
+
+This is generally more efficient, but is particularly so in the case
+of the C<void/INTRO> combination: formerly, C<my($a,$b,@c,%d);> would be compiled as
+
+    pushmark; padsv[$a]; padsv[$b]; padav[@c]; padhv[%d]; list; nextstate
+
+which would have the effect of pushing C<$a>, C<$b> onto the stack, then
+pushing the (non-existent) elements of C<@c>, then pushing the C<%d> C<HV>;
+then C<pp_list> would pop all the elements except the last, C<%d>;
+finally C<nextstate> would pop C<%d>.
+Instead C<padrange> skips all the pushing and popping in void context.
+
+This also handles constructs like
+
+    my ($x,$y) = @_
+
+with the C<OPf_SPECIAL> flag to indicate that in addition,
+C<@_> should be pushed onto the stack, skipping an
+additional C<pushmark>/C<gv[*_]>/C<rv2av> combination.
+
+
+So in total the above construct
+goes from being
+
+    3  <0> pushmark s
+    4  <$> gv(*_) s
+    5  <1> rv2av[t3] lK/1
+    6  <0> pushmark sRM*/128
+    7  <0> padsv[$x:1,2] lRM*/LVINTRO
+    8  <0> padsv[$y:1,2] lRM*/LVINTRO
+    9  <2> aassign[t4] vKS
+
+to
+
+    3  <0> padrange[$x:1,2; $y:1,2] l*/LVINTRO,2 ->4
+    4  <2> aassign[t4] vKS
+
+Added in L<5.18|perl5180delta/Internal Changes> with the first commit being
+a7fd8ef68b459a13ba95615ec125e2e7ba656b47
+
+=item De Morgan's law
+
+Perl will use De Morgan's law to reduce the number of ops.
+
+So these
+
+    if ( $p and (!$x || !$y) ) { ... }
+    if ( !$x || !$y ) { ... }
+
+become
+
+    if ( $p and not $x && $y ) { ... }
+    unless ( $x and $y ) { ... }
+
+This should not affect your code, unless you are using L<overload|overload>
+improperly.
+
+=item enable features when declaring the lowest supported version
+
+If you declare the minimum version of Perl with a
+L<C<use>|perlfunc/"use VERSION"> statement, Perl will also enable
+any features that came with that version of Perl.
+
+That is this:
+
+    use 5.10.0;
+
+is the same as:
+
+    BEGIN{
+        require 5.10.0;
+        require feature;
+        feature->import(':5.10')
+    }
+
+This is not limited to just enabling features, it will also enable full
+L<strict> mode if the declared version is at least 5.11.0.
+
+See L<feature> for more information.
+
+Added in L<5.10|perl5100delta/"The feature pragma">
+
+The enabling of strict mode was added in
+L<5.12|perl5120delta/"Implicit strictures"> with the first commit being
+53eb19dd57d98e5a28ec6e1a56a1a40ce469145f
+
+=back
+
+=cut
-- 
1.9.1

@p5pRT

p5pRT commented Jun 7, 2014

From @jkeenan

On Fri Jun 06 16:10:12 2014, brad wrote:

> I went and created a document that lists several modifications that
> Perl does for you. Ranging from optimizations to changes Perl does
> so that it "Does What I Mean"
>
> The purpose of this document is to enable people to find this information
> which is otherwise difficult to find, and is often passed from programmer
> to programmer in an ad-hoc fashion
>
> The reason I combined optimizations with DWIM code changes, is that the
> reasons you would want to know about them are for the most part the same.
>
> I have previously sent this to P5P where at least Yves Orton (demerphq)
> showed approval for such a document being added to the Perl distribution.
> ( I have added to it in the mean time )
>
> Here is a short snippet of the file as viewed with perldoc
> ( full patch should be attached )

I like this. There are a few places where the grammar needs to be ironed out, but we don't need to do that yet. We first need to have people review the document for accuracy.

Thank you very much.
Jim Keenan

@p5pRT

p5pRT commented Jun 7, 2014

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

p5pRT commented Jun 7, 2014

From @xdg

On Fri, Jun 6, 2014 at 7:10 PM, Brad Gilbert <perlbug-followup@perl.org> wrote:

> The purpose of this document is to enable people to find this information
> which is otherwise difficult to find, and is often passed from programmer
> to programmer in an ad-hoc fashion

+1 or more

I'm not sure "perldwim" is the right name, but I love the idea of
consolidating such things for easy reference.

I would omit the padrange optimization -- this isn't something that
most people would know or care about. The other optimizations make
sense because we don't want people worrying about how to code such
things to be faster.

David

--
David Golden <xdg@xdg.me> Twitter/IRC: @xdg

@p5pRT

p5pRT commented Jun 7, 2014

From @b2gills

On Fri, Jun 6, 2014 at 8:12 PM, David Golden <xdg@xdg.me> wrote:

> On Fri, Jun 6, 2014 at 7:10 PM, Brad Gilbert <perlbug-followup@perl.org> wrote:
>
> > The purpose of this document is to enable people to find this information
> > which is otherwise difficult to find, and is often passed from programmer
> > to programmer in an ad-hoc fashion
>
> +1 or more
>
> I'm not sure "perldwim" is the right name, but I love the idea of
> consolidating such things for easy reference.
>
> I would omit the padrange optimization -- this isn't something that
> most people would know or care about. The other optimizations make
> sense because we don't want people worrying about how to code such
> things to be faster.
>
> David
>
> --
> David Golden <xdg@xdg.me> Twitter/IRC: @xdg

Actually there are probably ways of writing a program where you can
prevent the padrange optimization from working. I think it is very useful
to show that Perl optimizes lexical variables if they are next to each
other. It also shows that the best way (currently) to write a subroutine
is with a line like: ` my($x,$y,$z) = @_; ` instead of shifting off every
element.
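
To make the comparison concrete, here is a minimal sketch of the two styles. The sub names are made up for illustration and are not part of the patch. Both subs return the same result; only the first declares its lexicals next to each other in a single list assignment from `@_`, which is the form the padrange commit message talks about.

```perl
use strict;
use warnings;

# Hypothetical example subs; the names are not from the patch.

# Unpack all arguments in one list assignment from @_.
sub volume_list {
    my ($x, $y, $z) = @_;
    return $x * $y * $z;
}

# Unpack the same arguments one shift at a time.
sub volume_shift {
    my $x = shift;
    my $y = shift;
    my $z = shift;
    return $x * $y * $z;
}

print volume_list(2, 3, 4), "\n";
print volume_shift(2, 3, 4), "\n";
```

Whether the difference is measurable in a given program is a separate question; the sketch only shows the two spellings side by side.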

I do think that section can be whittled down much further than it is currently.
( I mostly just copy-pasted it from the very well-written commit messages. )
It really should just be a top level view, instead of an in-depth view.

I do think that this is just the tip of the iceberg of the things that should
go into it.

As to the name:

There is a reason I decided to put DWIM stuff and optimizations together:
they are mostly implemented the same way, and they have a large effect
on each other. Not to mention that in some cases it could be hard to
decide which of the two a given change is.

For example `for(1...10){...}` could be considered DWIM because
I don't actually want an array of 10 items, I want a loop.
It could also be considered an optimization as it doesn't create
those elements.

So I figured perldwim was better than perloptimize.
Although, thinking about it now, I might be swinging the other way,
as the document can show you how to optimize your
code for readability (using DWIMery) as much as for performance.

Anyway, I just thought that the distribution could use such a
document regardless of the name.

@p5pRT

p5pRT commented Jun 7, 2014

From @druud62

On 2014-06-07 01:10, Brad Gilbert wrote:

> while( <ARGV> ){...}

I would put a space after "while", because it is not a function call.

> for( 1..10 ){...}
> for( 1..$n ){...}
> for( "a".."z" ){...}

Likewise.

> for( 0..65536 ){...}
> for( 'a'..'perl' ){...} # 285076

--
Ruud

@p5pRT

p5pRT commented Jun 7, 2014

From @Abigail

On Fri, Jun 06, 2014 at 04:10:13PM -0700, Brad Gilbert wrote:

+
+=head1 OPCODE-TREE MODIFICATIONS
+
+=over 4
+
+=item C<< <> >>
+
+ while( <> ){...}
+
+gets turned into:
+
+ while( <ARGV> ){...}
+
+Which is a construct that reads from the files in L<C<@ARGV>|perlvar/"@ARGV">,

This actually gets turned into:

  while (defined ($_ = <ARGV>)) {...}

+or C<STDIN> if L<C<@ARGV>|perlvar/"@ARGV"> is empty.
+
+See L<perlop/"I/O Operators"> for more details.
+
+=item range in for loop conditional
+
+ for( 1..10 ){...}
+ for( 1..$n ){...}
+ for( "a".."z" ){...}
+
+Normally the ranges would get turned into something like a normal array
+containing all of the items to be iterated over.

Not an array. A list.

+Instead the two endpoints are left in the op-tree, with the rest of the items
+to be iterated over generated as they are needed.

Does the intended audience have any idea what the op-tree is? Or what
"left in the op-tree" implies?

+
+This is particularly helpful if the range is large.
+
+ for( 0..65536 ){...}
+ for( 'a'..'perl' ){...} # 285076

What's "285076"?

+
+=item constant folding
+
+Perl will replace most basic operations on constants with the result.
+
+So these
+
+ my $v = 4 * 10 ** 3 + 3 * 10 ** 2 + 2 * 10 ** 1 + 1 * 10 ** 0;
+ my $five = 5 || 3;
+ my $three = 5 && 3;
+ my $true = 5 < 6;
+ my false = "a" gt "b";
+ my $neg = "a" cmp "b";
+
+become
+
+ my $v = 4321;
+ my $five = 5;
+ my $three = 3;
+ my $true = 1;
+ my $false = !1;
+ my $neg = "a" cmp "b";
+
+There are many more ops and functions that Perl will constant fold,
+such as C<sin> C<!> C<< < >> c<cmp>.

This is nice, but is this a DWIM? Will knowing this fact actually cause
people to write different programs?

+
+This helps when Perl is trying to reduce code that has a
+L<constant value as a conditional|/"constant value in conditional">.
+
+=item constant value in conditional
+
+In constructs where Perl can determine at compile time that it doesn't need to
+check the conditional at run-time, it will optimize it to the fastest code
+that still has the same side effects.
+
+These constructs will effectively be no-ops.
+
+ if( 0 ){...}
+ unless( 1 ){...}
+ while( 0 ){...}
+ while( 0 ){...}continue{...}
+ until( 1 ){...}
+ until( 1 ){...}continue{...}
+ for( ; 0 ; ){...}
+ for( ; 0 ; ){...}continue{...}
+
+While these will end up as C<do> blocks.
+
+ if( 1 ){...}
+ unless( 0 ){...}
+
+The loop constructs that can get entered into will be mostly the same,
+but without the check of the conditional.
+
+ while( 1 ){...}
+ until( 0 ){...}
+ for( ; 1 ; ){...}
+ while(){...}
+
+Constructs that have a section to be run when true, and one to be run when
+false will also get optimized.
+
+ if( 0 ){ say 'true' }else{ say 'false' } # do{ say 'false' }
+ $v = 0 ? 'true' : 'false'; # $v = 'false';
+
+Knowing this you can help performance by storing values that cannot change
+over the course of your program, in constants.
+
+ use constant DEBUG => 0;
+ if( DEBUG ){...}
+
+You can create subroutines that will act similar to constants.
+
+ sub True (){ !!1 }
+ sub False (){ !!0 }

Two things:
  - I think we encourage people to *not* use True and False constants.
  Because this leads to confusion; for example, 2 is true, but
  C<< 2 == True >> is a false statement.
  - What's with the !!1 and !!0? What kind of DWIM is going on here?
  Does using !!1/!!0 instead of 1/0 actually serve a purpose? If
  it does, is that obvious for the intended audience? (It's certainly
  not obvious to me).
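
The first point can be seen in a short sketch. The constant name `True` is taken from the patch; the variable names are illustrative only.

```perl
use strict;
use warnings;
use constant True => !!1;

my $value = 2;

# 2 is a true value in boolean context ...
my $is_true = $value ? 1 : 0;

# ... but it is not numerically equal to the True constant, which is 1.
my $equals_true = ( $value == True ) ? 1 : 0;

print "is true: $is_true, equals True: $equals_true\n";
```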

+
+It is usually clearer to use the L<constant> pragma though.
+
+ use constant True => !!1;
+ use constant False => !!0;
+ # or
+ use constant {
+ True => !!1,
+ False => !!0,
+ };
+
+=item iterator function in while loop conditional
+
+ while( readline $fh ){...}
+ while( <> ){...}
+ while( <STDIN> ){...}
+ while( glob '"*e f*"' ){...}
+ while( <"*e f*"> ){...}
+ while( readdir $dh ){...}
+ while( each %h ){...}
+ while( each @a ){...}
+
+Normally the while loop would stop when the return value of these ops were
+false and don't store the result anywhere.
+Which is both rather pointless,
+and would stop early if the function returned C<0>.
+
+Instead it stores the result in C<$_>, and stops when they return C<undef>.
+
+In other words it changes this:
+
+ while( <> ){...}
+
+into this
+
+ while( defined( $_ = <> ) ){...}

But earlier you said it turns into

  while (<ARGV>) {...}

+
+This also applies to the conditional part of c-style C<for> loops.
+
+ for ( ; <> ; ) {...}
+ for ( ; defined( $_ = <> ) ; ) {...}
+
+Perl also adds the C<defined> check if you explicitly store the result
+in a variable.
+
+ while( my $line = readline $fh ){...}
+ while( defined( my $line = readline $fh ) ){...}
+
+This code modification helps reduce the amount of code you have to write;
+which reduces the possibility for errors, and improves clarity of intent.
+
+See L<perlop/"I/O Operators"> for more information.
+
+Support for C<readdir> was added in L<5.12|perlfunc/"readdir DIRHANDLE">
+with commit
+114c60ecb1f775ef1deb4fdc8fb8e3a6f343d13d
+
+Support for C<each> was added in L<5.18|perl5180delta/"Selected Bug Fixes">
+with commit
+8ae39f603f0f5778c160e18e08df60affbd5a620
+
+=item padrange op
+
+This single op can, in some circumstances, replace the sequence of a
+pushmark followed by one or more padsv/padav/padhv ops, and possibly
+a trailing C<list> op.

I think this item will confuse the majority of the readers of this
document, reaching for "perldoc -f padrange" to find about this Perl
function.

+=item De Morgan's law
+
+Perl will use De Morgan's law to reduce the number of ops.
+
+So these
+
+ if ( $p and (!$x || !$y) ) { ... }
+ if ( !$x || !$y ) { ... }
+
+become
+
+ if ( $p and not $x && $y ) { ... }
+ unless ( $x and $y ) { ... }
+
+This should not affect your code, unless you are using L<overload|overload>
+improperly.

So, that's not a DWIM, but a "do something else than what I mean"...
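
For plain, non-overloaded values the two spellings agree on their truth value for every combination of inputs, which a small check can confirm. This is only a sketch of the logical equivalence; it says nothing about which op-tree perl actually builds.

```perl
use strict;
use warnings;

# Compare the original and the rewritten condition for a handful of
# true and false values, counting any disagreement.
my $mismatches = 0;
for my $x ( 0, 1, '', 'a', undef ) {
    for my $y ( 0, 1, '', 'a', undef ) {
        my $original  = ( !$x || !$y )    ? 1 : 0;
        my $rewritten = ( !( $x && $y ) ) ? 1 : 0;
        $mismatches++ if $original != $rewritten;
    }
}

print "mismatches: $mismatches\n";
```

With overloaded `!` or `bool` the two forms call different methods a different number of times, which is where the caveat in the patch comes from.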

+=item enable features when declaring the lowest supported version
+
+If you declare the minimum version of Perl with a
+L<C<use>|perlfunc/"use VERSION"> statement, Perl will also enable
+any features that came with that version of Perl.
+
+That is this:
+
+ use 5.10.0;
+
+is the same as:
+
+ BEGIN{
+ require 5.10.0;
+ require feature;
+ feature->import(':5.10')
+ }
+
+This is not limited to just enabling features, it will also enable full
+L<strict> mode if the declared version is at least 5.11.0.
+
+See L<feature> for more information.
+
+Added in L<5.10|perl5100delta/"The feature pragma">
+
+The enabling of strict mode was added in
+L<5.12|perl5120delta/"Implicit strictures"> with the first commit being
+53eb19dd57d98e5a28ec6e1a56a1a40ce469145f

My overall feeling is that this document is just a hodgepodge of random
Perl/perl factoids, without a clear audience (some factoids appeal to
a beginning Perl programmer, others to an XS or perl internals coder),
and without a clear structure.

The points you bring up are good in general, I just don't see the value
of putting them together in a "dwim" document.

Abigail

@p5pRT

p5pRT commented Jun 7, 2014

From @jhi

On Saturday-201406-07, 17:32, Abigail wrote:

> This is nice, but is this a DWIM? Will knowing this fact actually cause
> people to write different programs?

I don't know whether it's "DWIM" (and I have doubts about the naming of
the proposed page), but showing that constant folding happens encourages
people not to do the folding themselves, and instead leave the
computations in their original form, which is probably somehow more
informational than the folded result.

my $V = 4/3 * PI() * $r ** 3;

instead of

my $V = 4.18879 * $r ** 3;
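
One way to check that the readable form costs nothing at run time is to ask B::Deparse what perl compiled. A minimal sketch, using the purely numeric example from the patch rather than `PI()` (which would need a constant to be defined first):

```perl
use strict;
use warnings;
use B::Deparse;

# Compile a sub containing only constant arithmetic, then deparse it.
my $code = sub {
    my $v = 4 * 10 ** 3 + 3 * 10 ** 2 + 2 * 10 ** 1 + 1 * 10 ** 0;
    return $v;
};

my $text = B::Deparse->new->coderef2text($code);
print $text, "\n";
```

The deparsed body should show the folded value 4321 in place of the arithmetic; the exact layout of the output varies between perl versions.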

@p5pRT

p5pRT commented Jun 7, 2014

From @b2gills

On Sat, Jun 7, 2014 at 4:32 PM, Abigail <abigail@abigail.be> wrote:

> On Fri, Jun 06, 2014 at 04:10:13PM -0700, Brad Gilbert wrote:

+
+=head1 OPCODE-TREE MODIFICATIONS
+
+=over 4
+
+=item C<< <> >>
+
+ while( <> ){...}
+
+gets turned into:
+
+ while( <ARGV> ){...}
+
+Which is a construct that reads from the files in L<C<@ARGV>|perlvar/"@ARGV">,

> This actually gets turned into:
>
>   while (defined ($_ = <ARGV>)) {...}

The defined check and setting $_ is taken care of elsewhere in the doc.

+or C<STDIN> if L<C<@ARGV>|perlvar/"@ARGV"> is empty.
+
+See L<perlop/"I/O Operators"> for more details.
+
+=item range in for loop conditional
+
+ for( 1..10 ){...}
+ for( 1..$n ){...}
+ for( "a".."z" ){...}
+
+Normally the ranges would get turned into something like a normal array
+containing all of the items to be iterated over.

> Not an array. A list.

That's why I said it was something like a normal array.

+Instead the two endpoints are left in the op-tree, with the rest of the items
+to be iterated over generated as they are needed.

> Does the intended audience have any idea what the op-tree is? Or what
> "left in the op-tree" implies?

No and yes. That is, it's intended as a resource similar to an
index in a book. So its intended audience is everybody who
is looking for information on opcode modifications that happen
behind the scenes.

So each item in there should be rather short with a
link of where to find more information.

+
+This is particularly helpful if the range is large.
+
+ for( 0..65536 ){...}
+ for( 'a'..'perl' ){...} # 285076

> What's "285076"?

$ perl -E'for("a".."perl"){$count++} say $count'
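
The same count can be cross-checked by hand. The sketch below walks the range once and compares the result with the arithmetic: every one-, two- and three-letter string, plus the four-letter strings from 'aaaa' up to and including 'perl'. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Walk the range once, counting the items it produces.
my $count = 0;
$count++ for 'a' .. 'perl';

# The same number worked out by hand. The letters of 'perl' sit at
# zero-based alphabet positions p=15, e=4, r=17, l=11.
my $expected = 26 + 26**2 + 26**3
             + ( 15 * 26**3 + 4 * 26**2 + 17 * 26 + 11 ) + 1;

print "$count $expected\n";
```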

+
+=item constant folding
+
+Perl will replace most basic operations on constants with the result.
+
+So these
+
+ my $v = 4 * 10 ** 3 + 3 * 10 ** 2 + 2 * 10 ** 1 + 1 * 10 ** 0;
+ my $five = 5 || 3;
+ my $three = 5 && 3;
+ my $true = 5 < 6;
+ my false = "a" gt "b";
+ my $neg = "a" cmp "b";
+
+become
+
+ my $v = 4321;
+ my $five = 5;
+ my $three = 3;
+ my $true = 1;
+ my $false = !1;
+ my $neg = "a" cmp "b";
+
+There are many more ops and functions that Perl will constant fold,
+such as C<sin> C<!> C<< < >> c<cmp>.

> This is nice, but is this a DWIM? Will knowing this fact actually cause
> people to write different programs?

Yes actually.

Just look at t/test_pl/_num_to_alpha.t
If I had wanted to optimize it, I might have replaced

  is( _num_to_alpha(26 ** 3 + 26 ** 2 + 26 + 1 ), 'AAAB');

with

  is( _num_to_alpha( 18279 ), 'AAAB');

(In that case it would definitely have been premature, and still would
not have actually achieved anything.)

Or if I needed a variable to start with the value of sin(253),
writing the expression is more accurate than writing out the resultant
value in the source of a program.

+
+This helps when Perl is trying to reduce code that has a
+L<constant value as a conditional|/"constant value in conditional">.
+
+=item constant value in conditional
+
+In constructs where Perl can determine at compile time that it doesn't need to
+check the conditional at run-time, it will optimize it to the fastest code
+that still has the same side effects.
+
+These constructs will effectively be no-ops.
+
+ if( 0 ){...}
+ unless( 1 ){...}
+ while( 0 ){...}
+ while( 0 ){...}continue{...}
+ until( 1 ){...}
+ until( 1 ){...}continue{...}
+ for( ; 0 ; ){...}
+ for( ; 0 ; ){...}continue{...}
+
+While these will end up as C<do> blocks.
+
+ if( 1 ){...}
+ unless( 0 ){...}
+
+The loop constructs that can get entered into will be mostly the same,
+but without the check of the conditional.
+
+ while( 1 ){...}
+ until( 0 ){...}
+ for( ; 1 ; ){...}
+ while(){...}
+
+Constructs that have a section to be run when true, and one to be run when
+false will also get optimized.
+
+ if( 0 ){ say 'true' }else{ say 'false' } # do{ say 'false' }
+ $v = 0 ? 'true' : 'false'; # $v = 'false';
+
+Knowing this you can help performance by storing values that cannot change
+over the course of your program, in constants.
+
+ use constant DEBUG => 0;
+ if( DEBUG ){...}
+
+You can create subroutines that will act similar to constants.
+
+ sub True (){ !!1 }
+ sub False (){ !!0 }

> Two things:
> - I think we encourage people to *not* use True and False constants.
>   Because this leads to confusion; for example, 2 is true, but
>   C<< 2 == True >> is a false statement.
> - What's with the !!1 and !!0? What kind of DWIM is going on here?
>   Does using !!1/!!0 instead of 1/0 actually serve a purpose? If
>   it does, is that obvious for the intended audience? (It's certainly
>   not obvious to me).

Well 0 is not the same as !!0, and the same is true to a lesser
extent with 1 and !!1.
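
A small sketch of that difference, with illustrative variable names: as a string, `!!0` is the empty string rather than "0", while as a number it is still 0.

```perl
use strict;
use warnings;

my $zero  = 0;
my $false = !!0;

# As strings they differ: 0 is "0", while !!0 is the empty string.
my $zero_str  = "$zero";
my $false_str = "$false";

# As numbers both are 0, and using !!0 numerically does not warn.
my $same_number = ( $zero == $false ) ? 1 : 0;

printf "0 => '%s', !!0 => '%s', numerically equal: %d\n",
    $zero_str, $false_str, $same_number;
```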

It was just an example, perhaps these would not attract your ire​:

  sub thousand (){ 1_000 }
  sub million (){ 1_000_000 }
  sub billion (){ 1_000_000_000 }
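
As a rough check that such subs really behave like constants, B::Deparse can show what a caller compiles to. This is a sketch; it assumes the sub is defined, with its empty prototype, before the calling code is compiled.

```perl
use strict;
use warnings;
use B::Deparse;

# Empty prototype plus a constant body makes the sub eligible for inlining.
sub thousand () { 1_000 }

# The call below is compiled after the definition above has been seen.
my $code = sub { return thousand * 3 };

my $text = B::Deparse->new->coderef2text($code);
print $text, "\n";
```

The deparsed body should mention 3000 rather than a call to `thousand`, which suggests the call was inlined and the multiplication folded.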

+
+It is usually clearer to use the L<constant> pragma though.
+
+ use constant True => !!1;
+ use constant False => !!0;
+ # or
+ use constant {
+ True => !!1,
+ False => !!0,
+ };
+
+=item iterator function in while loop conditional
+
+ while( readline $fh ){...}
+ while( <> ){...}
+ while( <STDIN> ){...}
+ while( glob '"*e f*"' ){...}
+ while( <"*e f*"> ){...}
+ while( readdir $dh ){...}
+ while( each %h ){...}
+ while( each @a ){...}
+
+Normally the while loop would stop when the return value of these ops were
+false and don't store the result anywhere.
+Which is both rather pointless,
+and would stop early if the function returned C<0>.
+
+Instead it stores the result in C<$_>, and stops when they return C<undef>.
+
+In other words it changes this:
+
+ while( <> ){...}
+
+into this
+
+ while( defined( $_ = <> ) ){...}

> But earlier you said it turns into
>
>   while (<ARGV>) {...}

That is not the op-tree modification I was trying to show in this
section.
In this particular case I should have actually put a file handle
in there.
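
With an explicit file handle the effect of the added `defined` check is easy to see. The sketch below reads from an in-memory file (the data and variable names are made up) whose last line is "0" with no trailing newline. That string is false, so a plain truth test would end the loop one line early.

```perl
use strict;
use warnings;

# An in-memory file whose last line is "0" with no trailing newline.
my $data = "first\n0";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my @lines;
while ( my $line = <$fh> ) {    # treated as defined( my $line = <$fh> )
    push @lines, $line;
}
close $fh;

print scalar(@lines), " lines read\n";
```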

+
+This also applies to the conditional part of c-style C<for> loops.
+
+ for ( ; <> ; ) {...}
+ for ( ; defined( $_ = <> ) ; ) {...}
+
+Perl also adds the C<defined> check if you explicitly store the result
+in a variable.
+
+ while( my $line = readline $fh ){...}
+ while( defined( my $line = readline $fh ) ){...}
+
+This code modification helps reduce the amount of code you have to write;
+which reduces the possibility for errors, and improves clarity of intent.
+
+See L<perlop/"I/O Operators"> for more information.
+
+Support for C<readdir> was added in L<5.12|perlfunc/"readdir DIRHANDLE">
+with commit
+114c60ecb1f775ef1deb4fdc8fb8e3a6f343d13d
+
+Support for C<each> was added in L<5.18|perl5180delta/"Selected Bug Fixes">
+with commit
+8ae39f603f0f5778c160e18e08df60affbd5a620
+
+=item padrange op
+
+This single op can, in some circumstances, replace the sequence of a
+pushmark followed by one or more padsv/padav/padhv ops, and possibly
+a trailing C<list> op.

> I think this item will confuse the majority of the readers of this
> document, reaching for "perldoc -f padrange" to find about this Perl
> function.

Perhaps I should call it an "internal OPCODE" to help prevent any confusion,
along with saying that it is impossible to see the difference at the Perl level.

+=item De Morgan's law
+
+Perl will use De Morgan's law to reduce the number of ops.
+
+So these
+
+ if ( $p and (!$x || !$y) ) { ... }
+ if ( !$x || !$y ) { ... }
+
+become
+
+ if ( $p and not $x && $y ) { ... }
+ unless ( $x and $y ) { ... }
+
+This should not affect your code, unless you are using L<overload|overload>
+improperly.
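
For plain (non-overloaded) values the two spellings can be checked exhaustively. This is a minimal sketch, not a view into the op tree: `original` and `rewritten` are hypothetical names, and the code only confirms that both forms agree for every combination of true and false inputs.

```perl
use strict;
use warnings;

# The condition as written in the source.
sub original {
    my ( $p, $x, $y ) = @_;
    my $r = ( $p and ( !$x || !$y ) ) ? 1 : 0;
    return $r;
}

# The condition after applying De Morgan's law.
# "not" binds more loosely than "&&", so this is: $p and not( $x && $y )
sub rewritten {
    my ( $p, $x, $y ) = @_;
    my $r = ( $p and not $x && $y ) ? 1 : 0;
    return $r;
}

my $mismatches = 0;
for my $p ( 0, 1 ) {
    for my $x ( 0, 1 ) {
        for my $y ( 0, 1 ) {
            $mismatches++ if original( $p, $x, $y ) != rewritten( $p, $x, $y );
        }
    }
}
print "mismatches: $mismatches\n";    # expected to be zero
```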

So, that's not a DWIM, but a "do something else than what I mean"...

I combined DWIM and opcode optimizations because they are implemented
the same way, and in some cases are difficult to categorize uniquely.

I had some difficulty figuring out a name that was
both DWIM and optimize; while at the same time also neither.

If you have a better name I would like to hear it.

+=item enable features when declaring the lowest supported version
+
+If you declare the minimum version of Perl with a
+L<C<use>|perlfunc/"use VERSION"> statement, Perl will also enable
+any features that came with that version of Perl.
+
+That is this:
+
+ use 5.10.0;
+
+is the same as:
+
+ BEGIN{
+ require 5.10.0;
+ require feature;
+ feature->import(':5.10')
+ }
+ }
+
+This is not limited to just enabling features, it will also enable full
+L<strict> mode if the declared version is at least 5.11.0.
+
+See L<feature> for more information.
+
+Added in L<5.10|perl5100delta/"The feature pragma">
+
+The enabling of strict mode was added in
+L<5.12|perl5120delta/"Implicit strictures"> with the first commit being
+53eb19dd57d98e5a28ec6e1a56a1a40ce469145f
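
A small way to observe both effects at once is sketched below. It assumes a perl of at least 5.12; `strict_vars_enabled` is a hypothetical helper that uses a string C<eval>, which compiles in the enclosing lexical scope and so inherits whatever strictures the C<use VERSION> line turned on.

```perl
use 5.012;    # no explicit "use strict" or "use feature" anywhere

# Hypothetical helper: returns 1 if assigning to an undeclared variable
# fails to compile, which is what strict 'vars' causes.
sub strict_vars_enabled {
    my $compiled = eval q{ $undeclared_variable = 1; 1 };
    return $compiled ? 0 : 1;
}

# "say" is only available because the version declaration
# enabled the matching feature bundle.
say "strict vars enabled: ", strict_vars_enabled();
```

Changing the first line to `use 5.010;` should leave C<say> working but report that strict is not enabled, matching the 5.11.0 threshold described above.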

My overall feeling is that this document is just a hodgepodge of random
Perl/perl factoids, without a clear audience (some factoids appeal to
a beginning Perl programmer, others to an XS or perl internals coder),
and without a clear structure.

That was rather the point. It was meant as a place to find where
the information you want is located. Which requires a short description.
It also seemed easy to just add the commit id for further XS/internals
research.

Since some of the things are almost undocumented it
may have had some feature creep.

( I have yet to find any freely available resource that discusses the
`for(0..10){...}` optimization. I only found out about that from reading a
comment somewhere (StackOverflow I think) )

The points you bring up are good in general, I just don't see the value
of putting them together in a "dwim" document.

Abigail

I assume that means that you would rather see some of it moved out
to other documents, and possibly a name change?

Really I just wanted to throw something against the wall to see what
sticks. Which is why I initially just sent it to P5P, where it only
got a response from Yves.

If you can think about how this information should be structured
from a "thousand foot view" perspective it would be helpful.


p5pRT commented Jun 7, 2014

From @khwilliamson

On 06/07/2014 04:43 PM, Brad Gilbert wrote:

Really I just wanted to throw something against the wall to see what
sticks. Which is why I initially just sent it to P5P, where it only
got a response from Yves.

I think something like this is a fine idea; I thought several people
responded favorably, so there was no need for me to chime in.


p5pRT commented Jun 8, 2014

From @neilb

I went and created a document that lists several modifications that
Perl does for you. Ranging from optimizations to changes Perl does
so that it "Does What I Mean"

++ - this is both interesting and very useful. Reading it made me immediately want to go trawling through my modules for things to adjust, and has already changed one aspect of my personal coding style.

DWIM doesn't feel like the right name - it's more like idiomatic perl, "working with the compiler", "behind the curtain". The latter feels the most appropriate, but perlcurtain is a bit opaque :-)

Thank you for creating this.

Neil


p5pRT commented Jun 8, 2014

From @xdg

On Fri, Jun 6, 2014 at 9:43 PM, Brad Gilbert <b2gills@gmail.com> wrote:

The best way (currently) to write a subroutine
is with a line like: ` my($x,$y,$z) = @_; ` instead of shifting off every
element.

Then that's the point to make. You don't need to explain the gory
details of why.

There are reasons to shift, such as C<< my $x = shift // default() >>
so you might mention that to avoid an overreaction of avoiding shift
at all costs.
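
To make the two styles concrete, here is a rough sketch with invented names (`greet_list`, `greet_shift`, `default_greeting`); it only illustrates the list-assignment form next to the shift-with-default form and says nothing about their relative speed.

```perl
use 5.010;    # for the // (defined-or) operator
use strict;
use warnings;

# Hypothetical fallback used when no greeting is passed.
sub default_greeting { return 'Hello' }

# List assignment from @_: the form recommended above.
sub greet_list {
    my ( $name, $greeting ) = @_;
    $greeting //= default_greeting();
    return "$greeting, $name";
}

# Shifting, with the default applied in the same statement.
sub greet_shift {
    my $name     = shift;
    my $greeting = shift // default_greeting();
    return "$greeting, $name";
}

say greet_list('Ann');
say greet_shift( 'Bob', 'Hi' );
```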

--
David Golden <xdg@xdg.me> Twitter/IRC: @xdg


p5pRT commented Jun 9, 2014

From @b2gills

On Sun, Jun 8, 2014 at 9:56 AM, David Golden <xdg@xdg.me> wrote:

On Fri, Jun 6, 2014 at 9:43 PM, Brad Gilbert <b2gills@gmail.com> wrote:

The best way (currently) to write a subroutine
is with a line like: ` my($x,$y,$z) = @_; ` instead of shifting off every
element.

Then that's the point to make. You don't need to explain the gory
details of why.

There are reasons to shift, such as C<< my $x = shift // default() >>
so you might mention that to avoid an overreaction of avoiding shift
at all costs.

--
David Golden <xdg@xdg.me> Twitter/IRC: @xdg

I will reiterate that the padrange section was a bit out of place
because I mostly copy-pasted from the commit messages.
It should have been written from scratch.
( to some extent I knew that when I was doing it, but they
were just too well written for me to pass-up )

I think my current proposal should be scrapped and restarted
with slightly different goals. ( and a different name )

Perhaps one that is more of a tutorial style which comes
from the point of view of someone trying to optimize her code.
I should probably put it up on Github as it will take longer.

At a later point it might be useful to have another document that goes
into all the gritty details of a few of them.


p5pRT commented Jun 9, 2014

From @jplinderman

+1 for concept, but I agree with many that "perldwim" is not the right name.

In particular, the "poster child"

  while (my $line = <$fh>) ...

caused me considerable confusion when I preceded it with

  if (my $lastline = <$fh>) ...

and I got barked at. Why it's "what I mean" in one context but not the
other is not obvious. If perl cannot consistently guess "what I mean"
about such similar constructs, I'd prefer it encourage me to "say what I
mean", as it does with the "if".
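
The asymmetry can be reproduced with a one-line input of `0`. This is a sketch under the assumption that the "bark" is the compile-time warning about testing with defined(); the helpers `if_plain` and `if_defined` are hypothetical, and the warning is silenced locally so the example runs quietly.

```perl
use strict;
use warnings;

# Hypothetical helper: "if" gets no implicit defined() check,
# so a line containing just "0" counts as false.
sub if_plain {
    my ($data) = @_;
    open my $fh, '<', \$data or die "open: $!";
    no warnings 'misc';    # silences: Value of <HANDLE> construct can be "0"
    if ( my $line = <$fh> ) { return 1 }
    return 0;
}

# Hypothetical helper: saying what you mean with an explicit defined().
sub if_defined {
    my ($data) = @_;
    open my $fh, '<', \$data or die "open: $!";
    if ( defined( my $line = <$fh> ) ) { return 1 }
    return 0;
}

printf "plain: %d, defined: %d\n", if_plain("0"), if_defined("0");
```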


p5pRT commented Jun 9, 2014

From @karenetheridge

On Sun, Jun 08, 2014 at 08:14:50PM -0500, Brad Gilbert wrote:

I think my current proposal should be scrapped and restarted
with slightly different goals. ( and a different name )

Perhaps a new document in the perlfaq family?
