suboptimal memory usage of repetition operator #13793

Open
p5pRT opened this issue May 1, 2014 · 22 comments

Comments

@p5pRT

p5pRT commented May 1, 2014

Migrated from rt.perl.org#121780 (status was 'open')

Searchable as RT121780$

@p5pRT

p5pRT commented May 1, 2014

From efimov@reg.ru

suboptimal memory usage of repetition operator

===
my $s1;
$s1 .= "x" for (1..200*1024*1024);
print "ok\n";
sleep 1000;
$s1 .= '';

Uses 200Mb (during sleep())

===
my $s1;
$s1 = "x" x (200*1024*1024);
print "ok\n";
sleep 1000;
$s1 .= '';

Uses 400Mb (during sleep())

reproduced in 5.14 and 5.19
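
One way to observe the effect without watching the process during sleep() is to read the resident set size before and after building the string. This is only a rough sketch: it assumes Linux (it reads /proc/self/status), `rss_kb` is a hypothetical helper name, and the size is scaled down to 20 MB. The growth it reports depends on the perl version and the allocator.

```perl
use strict;
use warnings;

# Hypothetical helper: resident set size in kB, read from Linux procfs.
# Returns undef where /proc/self/status is not available.
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

my $size = 20 * 1024 * 1024;    # 20 MB, scaled down from the report's 200 MB
my $char = 'x';

my $before = rss_kb();
my $s      = $char x $size;     # run-time repetition, as in the later examples
my $after  = rss_kb();

if (defined $before && defined $after) {
    printf "string length: %d bytes, RSS grew by about %d kB\n",
        length($s), $after - $before;
}
else {
    printf "no /proc/self/status here; string length: %d bytes\n", length($s);
}
```

If the growth is roughly twice the string length, the temporary holding the result of the repetition is still alive next to the copy in $s.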

perl -V
Summary of my perl5 (revision 5 version 14 subversion 2) configuration​:

  Platform​:
  osname=linux, osvers=2.6.42-37-generic,
archname=x86_64-linux-gnu-thread-multi
  uname='linux panlong 2.6.42-37-generic #58-ubuntu smp thu jan 24
15​:28​:10 utc 2013 x86_64 x86_64 x86_64 gnulinux '
  config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN
-Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr
-Dprivlib=/usr/share/perl/5.14 -Darchlib=/usr/lib/perl/5.14
-Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5
-Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local
-Dsitelib=/usr/local/share/perl/5.14.2
-Dsitearch=/usr/local/lib/perl/5.14.2 -Dman1dir=/usr/share/man/man1
-Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1
-Dsiteman3dir=/usr/local/man/man3 -Duse64bitint -Dman1ext=1
-Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh
-Ud_ualarm -Uusesfio -Uusenm -Ui_libutil -DDEBUGGING=-g -Doptimize=-O2
-Duseshrplib -Dlibperl=libperl.so.5.14.2 -des'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=define, usemultiplicity=define
  useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
  use64bitint=define, use64bitall=define, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize='-O2 -g',
  cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing
-pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.6.3', gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries​:
  ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /lib/x86_64-linux-gnu /lib/../lib
/usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/lib
  libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
  perllibs=-ldl -lm -lpthread -lc -lcrypt
  libc=, so=so, useshrplib=true, libperl=libperl.so.5.14.2
  gnulibc_version='2.15'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
  cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib
-fstack-protector'

Characteristics of this binary (from libperl)​:
  Compile-time options​: MULTIPLICITY PERL_DONT_CREATE_GVSV
  PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP
  PERL_PRESERVE_IVUV USE_64_BIT_ALL USE_64_BIT_INT
  USE_ITHREADS USE_LARGE_FILES USE_PERLIO USE_PERL_ATOF
  USE_REENTRANT_API
  Locally applied patches​:
  DEBPKG​:debian/arm_thread_stress_timeout -
http​://bugs.debian.org/501970 Raise the timeout of
ext/threads/shared/t/stress.t to accommodate slower build hosts
  DEBPKG​:debian/cpan_definstalldirs - Provide a sensible INSTALLDIRS
default for modules installed from CPAN.
  DEBPKG​:debian/db_file_ver - http​://bugs.debian.org/340047 Remove
overly restrictive DB_File version check.
  DEBPKG​:debian/doc_info - Replace generic man(1) instructions with
Debian-specific information.
  DEBPKG​:debian/enc2xs_inc - http​://bugs.debian.org/290336 Tweak
enc2xs to follow symlinks and ignore missing @​INC directories.
  DEBPKG​:debian/libnet_config_path - Set location of libnet.cfg to
/etc/perl/Net as /usr may not be writable.
  DEBPKG​:debian/m68k_thread_stress - http​://bugs.debian.org/517938
http​://bugs.debian.org/495826 Disable some threads tests on m68k for
now due to missing TLS.
  DEBPKG​:debian/mod_paths - Tweak @​INC ordering for Debian
  DEBPKG​:debian/module_build_man_extensions -
http​://bugs.debian.org/479460 Adjust Module​::Build manual page
extensions for the Debian Perl policy
  DEBPKG​:debian/prune_libs - http​://bugs.debian.org/128355 Prune the
list of libraries wanted to what we actually need.
  DEBPKG​:fixes/net_smtp_docs - [rt.cpan.org #36038]
http​://bugs.debian.org/100195 Document the Net​::SMTP 'Port' option
  DEBPKG​:debian/perlivp - http​://bugs.debian.org/510895 Make perlivp
skip include directories in /usr/local
  DEBPKG​:debian/disable-zlib-bundling - Disable zlib bundling in
Compress​::Raw​::Zlib
  DEBPKG​:debian/cpanplus_definstalldirs -
http​://bugs.debian.org/533707 Configure CPANPLUS to use the site
directories by default.
  DEBPKG​:debian/cpanplus_config_path - Save local versions of
CPANPLUS​::Config​::System into /etc/perl.
  DEBPKG​:debian/deprecate-with-apt - http​://bugs.debian.org/580034
Point users to Debian packages of deprecated core modules
  DEBPKG​:fixes/hurd-ccflags - [a190e64]
http​://bugs.debian.org/587901 [perl #92244] Make hints/gnu.sh append
to $ccflags rather than overriding them
  DEBPKG​:debian/squelch-locale-warnings -
http​://bugs.debian.org/508764 Squelch locale warnings in Debian
package maintainer scripts
  DEBPKG​:debian/skip-upstream-git-tests - Skip tests specific to the
upstream Git repository
  DEBPKG​:fixes/extutils-cbuilder-cflags - [011e8fb]
http​://bugs.debian.org/624460 [perl #89478] Append CFLAGS and LDFLAGS
to their Config.pm counterparts in EU​::CBuilder
  DEBPKG​:fixes/module-build-home-directory -
http​://bugs.debian.org/624850 [rt.cpan.org #67893] Fix failing tilde
test when run under a UID without a passwd entry
  DEBPKG​:debian/patchlevel - http​://bugs.debian.org/567489 List
packaged patches for 5.14.2-6ubuntu2.4 in patchlevel.h
  DEBPKG​:fixes/h2ph-multiarch - [e7ec705]
http​://bugs.debian.org/625808 [perl #90122] Make h2ph correctly search
gcc include directories
  DEBPKG​:fixes/index-tainting - [3b36395]
http​://bugs.debian.org/291450 [perl #64804] RT 64804​: tainting with
index() of a constant
  DEBPKG​:debian/skip-kfreebsd-crash - http​://bugs.debian.org/628493
[perl #96272] Skip a crashing test case in t/op/threads.t on
GNU/kFreeBSD
  DEBPKG​:fixes/document_makemaker_ccflags -
http​://bugs.debian.org/628522 [rt.cpan.org #68613] Document that
CCFLAGS should include $Config{ccflags}
  DEBPKG​:fixes/sys-syslog-socket-timeout-kfreebsd.patch -
http​://bugs.debian.org/627821 [rt.cpan.org #69997] Use a socket
timeout on GNU/kFreeBSD to catch ICMP port unreachable messages
  DEBPKG​:fixes/hurd-hints - http​://bugs.debian.org/636609 Improve
general GNU hints, needed for GNU/Hurd.
  DEBPKG​:fixes/pod_fixes - [7698aed] http​://bugs.debian.org/637816
Fix typos in several pod/perl*.pod files
  DEBPKG​:debian/find_html2text - http​://bugs.debian.org/640479
Configure CPAN​::Distribution with correct name of html2text
  DEBPKG​:fixes/digest_eval_hole - http​://bugs.debian.org/644108
Close the eval "require $module" security hole in
Digest->new($algorithm)
  DEBPKG​:fixes/hurd-ndbm - [f0d0a20] [perl #102680]
http​://bugs.debian.org/645989 Add GNU/Hurd hints for NDBM_File
  DEBPKG​:fixes/sysconf.t-posix - [8040185] [perl #102888]
http​://bugs.debian.org/646016 Fix hang in ext/POSIX/t/sysconf.t on
GNU/Hurd
  DEBPKG​:fixes/hurd-largefile - [1fda587] [perl #103014]
http​://bugs.debian.org/645790 enable LFS on GNU/Hurd
  DEBPKG​:debian/hurd_test_todo_syslog -
http​://bugs.debian.org/650093 Disable failing GNU/Hurd tests in
cpan/Sys-Syslog/t/syslog.t
  DEBPKG​:fixes/hurd_skip_itimer_virtual - [rt.cpan.org #72754]
http​://bugs.debian.org/650094 Skip interval timer tests in Time​::HiRes
on GNU/Hurd
  DEBPKG​:debian/hurd_test_skip_socketpair -
http​://bugs.debian.org/650186 Disable failing GNU/Hurd tests
ext/Socket/t/socketpair.t
  DEBPKG​:debian/hurd_test_skip_sigdispatch -
http​://bugs.debian.org/650188 Disable failing GNU/Hurd tests
op/sigdispatch.t
  DEBPKG​:debian/hurd_test_skip_stack - http​://bugs.debian.org/650175
Disable failing GNU/Hurd tests dist/threads/t/stack.t
  DEBPKG​:debian/hurd_test_skip_recv - http​://bugs.debian.org/650095
Disable failing GNU/Hurd tests cpan/autodie/t/recv.t
  DEBPKG​:debian/hurd_test_skip_libc - http​://bugs.debian.org/650097
Disable failing GNU/Hurd tests dist/threads/t/libc.t
  DEBPKG​:debian/hurd_test_skip_pipe - http​://bugs.debian.org/650187
Disable failing GNU/Hurd tests io/pipe.t
  DEBPKG​:debian/hurd_test_skip_io_pipe -
http​://bugs.debian.org/650096 Disable failing GNU/Hurd tests
dist/IO/t/io_pipe.t
  DEBPKG​:fixes/CVE-2012-5195 - avoid calling memset with a negative count
  DEBPKG​:fixes/CVE-2012-5526 - [PATCH 1/4] CR escaping for P3P header
  DEBPKG​:CVE-2013-1667.patch - [PATCH] Prevent premature hsplit()
calls, and only trigger REHASH after hsplit()
  DEBPKG​:CVE-2012-6329.patch -
http​://bugs.debian.org/cgi-bin/bugreport.cgi?bug=695224 [1735f6f] fix
arbitrary command execution via _compile function in Maketext.pm
  Built under linux
  Compiled at Feb 4 2014 23​:11​:19
  %ENV​:
  PERLBREW_BASHRC_VERSION="0.67"
  PERLBREW_HOME="/home/vse/.perlbrew"
  PERLBREW_MANPATH=""
  PERLBREW_PATH="/home/perlbrew/bin"
  PERLBREW_ROOT="/home/perlbrew"
  PERLBREW_VERSION="0.67"
  @​INC​:
  /etc/perl
  /usr/local/lib/perl/5.14.2
  /usr/local/share/perl/5.14.2
  /usr/lib/perl5
  /usr/share/perl5
  /usr/lib/perl/5.14
  /usr/share/perl/5.14
  /usr/local/lib/site_perl
  .

@p5pRT

p5pRT commented May 2, 2014

From @druud62

On 2014-05-02 01:23, Victor Efimov wrote:

my $s1;
$s1 = "x" x (200*1024*1024);
print "ok\n";
sleep 1000;
$s1 .= '';

Uses 400Mb (during sleep())

reproduced in 5.14 and 5.19

This "similar thing" uses 200 MB in 5.18, but 100 MB in 5.19​:

perl -wE'
  open my $fh, "<", "100_MB.bin" or die $!;
  my $s = do { local $/; <$fh> };
  sleep 1000;
' &

In 5.18 it is better coded like​:

my $s; { local $/; $s = <$fh> }

--
Ruud

@p5pRT

p5pRT commented May 2, 2014

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

p5pRT commented May 2, 2014

From @Abigail

On Thu, May 01, 2014 at 04:23:33PM -0700, Victor Efimov wrote:

# New Ticket Created by Victor Efimov
# Please include the string​: [perl #121780]
# in the subject line of all future correspondence about this issue.
# <URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=121780 >

suboptimal memory usage of repetition operator

===
my $s1;
$s1 .= "x" for (1..200*1024*1024);
print "ok\n";
sleep 1000;
$s1 .= '';

Uses 200Mb (during sleep())

===
my $s1;
$s1 = "x" x (200*1024*1024);
print "ok\n";
sleep 1000;
$s1 .= '';

Uses 400Mb (during sleep())

That's because the latter contains "x" x (200*1024*1024) twice.
It exists as a RHS value (calculated at compile time), which is then
copied into $s1. Perl doesn't throw away the value, as it doesn't
know whether it will be needed again.

The former just has "x" on the RHS, and that gets reused 200*1024*1024
times.

You can reduce the memory usage of the latter by forcing Perl to release
the memory of the expression as soon as it's done by placing it in a
string eval. But you'll pay if you execute it more than once in the
lifetime of the process:

  my $s1;
  $s1 = eval '"x" x (200*1024*1024)';
  print "ok\n";
  sleep 1000;
  $s1 .= '';

Abigail
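
To see whether a given perl folds the repetition at compile time, the compiled code can be printed back with the core B::Deparse module. This is a small sketch, not a statement about any particular release: whether the first sub shows a folded string constant depends on the perl version, and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use B::Deparse;

my $deparse = B::Deparse->new;

# Both operands are literals, so the compiler may fold this expression.
my $literal = sub { my $s = "x" x 5; return $s };

# The count is a variable, so the repetition has to stay a run-time op.
my $count   = 5;
my $runtime = sub { my $s = "x" x $count; return $s };

print "literal operands:\n", $deparse->coderef2text($literal), "\n";
print "variable count:\n",   $deparse->coderef2text($runtime), "\n";
```

If the first listing shows the finished string rather than the `x` operator, the value was computed at compile time and is stored in the op tree.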

@p5pRT

p5pRT commented May 2, 2014

From efimov@reg.ru

2014-05-02 12​:38 GMT+04​:00 Abigail <abigail@​abigail.be>​:

That's because the latter contains "x" x (200*1024*1024) twice.
It exists as a RHS value (calculated at compile time), which is then

At compile time? But the same thing happens at run time (no constants):

my $s1;
my $size = 200*1024*1024;
my $char = "x";
$s1 = $char x $size;
print "ok\n";
sleep 1000;

@p5pRT

p5pRT commented May 2, 2014

From @Abigail

On Fri, May 02, 2014 at 12​:45​:05PM +0400, Victor Efimov wrote​:

2014-05-02 12​:38 GMT+04​:00 Abigail <abigail@​abigail.be>​:

That's because the latter contains "x" x (200*1024*1024) twice.
It exists as a RHS value (calculated at compile time), which is then

at compile time? but same happening when it's runtime (no constants)​:

my $s1;
my $size = 200*1024*1024;
my $char = "x";
$s1 = $char x $size;
print "ok\n";
sleep 1000;

Same thing happens, except it now does it at run time. It calculates
the RHS, keeps the value, and puts a copy of the value in $s1.

Abigail

@p5pRT

p5pRT commented May 2, 2014

From efimov@reg.ru

Indeed. So this is _not_ a case of the RHS being copied to the LHS and
then immediately discarded (with perl simply not releasing the memory
back to the OS).

===
my $size = 200*1024*1024;
my $char = "x";
my $s1;
$s1 = $char x $size;
my $s2;
$s2 = $char x $size;
print "ok\n";

(800 Mb)

Instead, the RHS of each statement is saved for later reuse (though I
don't understand why that is needed or how it can be reused).

And this should then take 600 MB, but it takes 800 MB too:

===
my $size = 200*1024*1024;
my $char = "x";
{
  my $s1;
  $s1 = $char x $size;
}
{
  my $s2;
  $s2 = $char x $size;
}

print "ok\n";
sleep 1000;

===


@p5pRT

p5pRT commented May 2, 2014

From @Abigail

On Fri, May 02, 2014 at 01​:09​:18PM +0400, Victor Efimov wrote​:

indeed. That is _not_ issue when RHS side copied to LHS side and
immediately discarded (but perl did not release memory back to OS).

===
my $size = 200*1024*1024;
my $char = "x";
my $s1;
$s1 = $char x $size;
my $s2;
$s2 = $char x $size;
print "ok\n";

(800 Mb)

Instead, RHS of each statement saved for later re-use (through I don't
understand why it needed and how it can be reused).

It keeps the internal structures around, so it later doesn't have to
build them again.

And that should take 600Mb then, but it takes 800 too​:

===
my $size = 200*1024*1024;
my $char = "x";
{
my $s1;
$s1 = $char x $size;
}
{
my $s2;
$s2 = $char x $size;
}

print "ok\n";
sleep 1000;

Nothing is reused here, you end up with two pairs of two copies​:
the first $char x $size, and $s1, and the second $char x $size and $s2.

This, however, will take about 600Mb​:

  my $size = 200*1024*1024;
  my $char = "x";
  my @s;
  for (1 .. 2) {
      push @s => $char x $size;
      print "ok\n";
  }
  sleep 1000;

Here we have three copies​: $char x $size, $s [0] and $s [1].

Abigail

@p5pRT
Copy link
Author

p5pRT commented May 2, 2014

From efimov@reg.ru

2014-05-02 13​:23 GMT+04​:00 Abigail <abigail@​abigail.be>​:

Nothing is reused here, you end up with two pairs of two copies​:
the first $char x $size, and $s1, and the second $char x $size and $s2.

But isn't that a memory leak? $s1 goes out of scope and should be
discarded, and its memory re-used.

@p5pRT

p5pRT commented May 2, 2014

From @Leont

On Fri, May 2, 2014 at 10​:45 AM, Victor Efimov <efimov@​reg.ru> wrote​:

2014-05-02 12​:38 GMT+04​:00 Abigail <abigail@​abigail.be>​:

That's because the latter contains "x" x (200*1024*1024) twice.
It exists as a RHS value (calculated at compile time), which is then

at compile time? but same happening when it's runtime (no constants)​:

my $s1;
my $size = 200*1024*1024;
my $char = "x";
$s1 = $char x $size;
print "ok\n";
sleep 1000;

You are correct that that is a bug. It seems PADTMPs were not treated as
temporaries, while TEMP SVs were (as is evidenced by the eval in
Abigail's example). Fortunately, it seems Father C has already fixed this
in 9ffd39a.

Leon

@p5pRT

p5pRT commented May 2, 2014

From @iabyn

On Fri, May 02, 2014 at 10​:38​:16AM +0200, Abigail wrote​:

That's because the latter contains "x" x (200*1024*1024) twice.
It exists as a RHS value (calculated at compile time),

I rather wonder whether we should not in fact be constant-folding the
repeat operator. For example

  if ($unlikely_condition) {
      $x = 'x' x 200_000_000;
  }
  else {
      $x = '';
  }

will consume 200Mb even if that branch is never taken.

--
O Unicef Clearasil!
Gibberish and Drivel!
  -- "Bored of the Rings"
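
For code that cannot afford the compile-time allocation, one workaround is to make the count a run-time value, so that the compiler has nothing to fold. The sketch below assumes a hypothetical helper named `filler`; it illustrates the idea and is not something proposed in this ticket.

```perl
use strict;
use warnings;

# Hypothetical helper: because $n is only known at run time, the
# repetition cannot be turned into a constant when the file is compiled.
sub filler {
    my ($n) = @_;
    return 'x' x $n;
}

my $unlikely_condition = 0;    # stand-in for the rarely true test
my $x = $unlikely_condition ? filler(200_000_000) : '';

print "length of \$x: ", length($x), "\n";
```

The large string is only built if the branch is actually taken, at the cost of a sub call and of losing the folding in the cases where it would have helped.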

@p5pRT

p5pRT commented May 2, 2014

From @Abigail

On Fri, May 02, 2014 at 03​:22​:17PM +0100, Dave Mitchell wrote​:

On Fri, May 02, 2014 at 10​:38​:16AM +0200, Abigail wrote​:

That's because the latter contains "x" x (200*1024*1024) twice.
It exists as a RHS value (calculated at compile time),

I rather wonder whether we should not in fact be constant-folding the
repeat operator. For example

if ($unlikely_condition) {
    $x = 'x' x 200_000_000;
}
else {
    $x = '';
}

will consume 200Mb even if that branch is never taken.

Yeah, that's what I meant by "calculated at compile time".

I'm very aware of the effects of constant-folding, once bringing down
the website of $WORK, because it ran out of memory due to code that
wasn't even executed (but constant folded).

Abigail

@p5pRT

p5pRT commented May 2, 2014

From @lizmat

On 02 May 2014, at 16​:27, Abigail <abigail@​abigail.be> wrote​:

On Fri, May 02, 2014 at 03​:22​:17PM +0100, Dave Mitchell wrote​:

On Fri, May 02, 2014 at 10​:38​:16AM +0200, Abigail wrote​:

That's because the latter contains "x" x (200*1024*1024) twice.
It exists as a RHS value (calculated at compile time),

I rather wonder whether we should not in fact be constant-folding the
repeat operator. For example

if ($unlikely_condition) {
    $x = 'x' x 200_000_000;
}
else {
    $x = '';
}

will consume 200Mb even if that branch is never taken.

Yeah, that's what I meant by "calculated at compile time".

I'm very aware of the effects of constant-folding, once bringing down
the website of $WORK, because it ran out of memory due to code that
wasn't even executed (but constant folded).

Yeah. Talking about side-effects :-)

Liz

@p5pRT

p5pRT commented May 3, 2014

From @tsee

On 05/02/2014 04​:22 PM, Dave Mitchell wrote​:

On Fri, May 02, 2014 at 10​:38​:16AM +0200, Abigail wrote​:

That's because the latter contains "x" x (200*1024*1024) twice.
It exists as a RHS value (calculated at compile time),

I rather wonder whether we should not in fact be constant-folding the
repeat operator. For example

 if ($unlikely_condition) {
     $x = 'x' x 200_000_000;
 }
 else {
     $x = '';
 }

will consume 200Mb even if that branch is never taken.

I think this is a corner case. Few real programs have string constants
that large and if they do, they generally do for a reason.

This is always a trade-off. Advocatus diaboli​: Should we stop allocation
of scratchpads/targets because it's additional memory in a branch we
never take? (Obviously, that would be a bad call.)

Constant folding is not rocket science. People generally know it's going
to happen and deal with the fallout. The case that Abigail described was
vastly more intricate than the above and it's the only such issue I've
ever heard of.

Maybe the proper solution (in an ideal world, I mean) would be constant
folding being performed at run-time if the branch is first taken (or
only after it's taken N times). Which gets kind of uncomfortably close
to tracing JIT compilation. (I know about all the practical issues such
as not allowing modification of OP trees at run time.)

--Steffen
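
The "compute it only when the branch is first taken" idea can be approximated in user code today with a `state` variable, which is initialised on the first call only. This is a user-level analogue under stated assumptions, not the in-core mechanism discussed above: `big_buffer` and `$builds` are hypothetical names, and the size is scaled down.

```perl
use strict;
use warnings;
use feature 'state';

my $builds = 0;    # counts how often the buffer is actually constructed

sub big_buffer {
    my $size = 1_000_000;    # variable count, so nothing is folded at compile time
    # The initialiser of a state variable runs on the first call only.
    state $buf = do { $builds++; 'x' x $size };
    return \$buf;            # hand out a reference to avoid copying the string
}

print "builds before first use: $builds\n";
my $ref = big_buffer();
big_buffer();
print "builds after two calls: $builds, length: ", length($$ref), "\n";
```

Nothing is allocated until `big_buffer` is first called, and later calls reuse the same string.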

@p5pRT

p5pRT commented May 3, 2014

From @avar

On Sat, May 3, 2014 at 11​:14 AM, Steffen Mueller <smueller@​cpan.org> wrote​:

The case that Abigail described was vastly more intricate
than the above and it's the only such issue I've ever heard of.

Are we remembering different issues? The one I remember came down to​:

  @​cache {1 .. 999_999} = ();

I.e. just perl sizing the variable at compile-time even though it
might not be used. Hardly vastly more intricate, or were you thinking
about some other case?

@p5pRT

p5pRT commented May 3, 2014

From @Abigail

On Sat, May 03, 2014 at 12:44:19PM +0200, Ævar Arnfjörð Bjarmason wrote:

On Sat, May 3, 2014 at 11​:14 AM, Steffen Mueller <smueller@​cpan.org> wrote​:

The case that Abigail described was vastly more intricate
than the above and it's the only such issue I've ever heard of.

Are we remembering different issues? The one I remember came down to​:

@cache {1 .. 999_999} = ();

I.e. just perl sizing the variable at compile-time even though it
might not be used. Hardly vastly more intricate, or were you thinking
about some other case?

That was the one.

Abigail

@p5pRT

p5pRT commented May 3, 2014

From @lizmat

On 03 May 2014, at 21​:30, Abigail <abigail@​abigail.be> wrote​:

On Sat, May 03, 2014 at 12:44:19PM +0200, Ævar Arnfjörð Bjarmason wrote:

On Sat, May 3, 2014 at 11​:14 AM, Steffen Mueller <smueller@​cpan.org> wrote​:

The case that Abigail described was vastly more intricate
than the above and it's the only such issue I've ever heard of.

Are we remembering different issues? The one I remember came down to​:

@​cache {1 .. 999_999} = ();

I.e. just perl sizing the variable at compile-time even though it
might not be used. Hardly vastly more intricate, or were you thinking
about some other case?

That was the one.

Yup, the one.

Although the hash was *not* sized at compile time​: that happened at runtime using the array created by 1..999999 at compile time. :-)

So the idea behind this was doing an optimisation at runtime, with unfortunate severe side-effects at compile time.

Liz

@p5pRT

p5pRT commented May 3, 2014

From @lizmat

On 03 May 2014, at 21​:53, Elizabeth Mattijsen <liz@​dijkmat.nl> wrote​:

On 03 May 2014, at 21​:30, Abigail <abigail@​abigail.be> wrote​:

On Sat, May 03, 2014 at 12:44:19PM +0200, Ævar Arnfjörð Bjarmason wrote:

On Sat, May 3, 2014 at 11​:14 AM, Steffen Mueller <smueller@​cpan.org> wrote​:

The case that Abigail described was vastly more intricate
than the above and it's the only such issue I've ever heard of.

Are we remembering different issues? The one I remember came down to​:

@​cache {1 .. 999_999} = ();

I.e. just perl sizing the variable at compile-time even though it
might not be used. Hardly vastly more intricate, or were you thinking
about some other case?

That was the one.

Yup, the one.

Although the hash was *not* sized at compile time​: that happened at runtime using the array created by 1..999999 at compile time. :-)

So the idea behind this was doing an optimisation at runtime, with unfortunate severe side-effects at compile time.

Actually, going down memory lane, the very quick fix was​:

  eval '@cache {1 .. 999_999} = ()';

Liz

@p5pRT

p5pRT commented May 3, 2014

From @perhunter

On 05/03/2014 04​:14 PM, Elizabeth Mattijsen wrote​:

Actually, going down memory lane, the very quick fix was​:

eval '@cache {1 .. 999_999} = ()';

why not a quick fix of​:

  $max = 999_999 ;
  @​cache{ 1 .. $max } = () ;

that shouldn't be able to trigger any compile time 'optimizations' of
the generated list.

uri

--
Uri Guttman - The Perl Hunter
The Best Perl Jobs, The Best Perl Hackers
http​://PerlHunter.com
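
If the aim of the original slice assignment was only to presize the hash, assigning to keys() is another option, since it asks perl to preallocate buckets without building a list of keys at all. This is a sketch under that assumption about the intent; unlike the slice assignment it does not create any entries, so code that relies on exists() being true for every key would behave differently.

```perl
use strict;
use warnings;

my %cache;
my $max = 999_999;

# Ask perl to preallocate buckets. No list of keys is generated and no
# entries are created.
keys(%cache) = $max;

print "entries after presizing: ", scalar(keys %cache), "\n";

$cache{42} = undef;
print "entries after one store: ", scalar(keys %cache), "\n";
```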

@p5pRT

p5pRT commented May 4, 2014

From @tsee

On 05/03/2014 12:44 PM, Ævar Arnfjörð Bjarmason wrote:

On Sat, May 3, 2014 at 11​:14 AM, Steffen Mueller <smueller@​cpan.org> wrote​:

The case that Abigail described was vastly more intricate
than the above and it's the only such issue I've ever heard of.

Are we remembering different issues? The one I remember came down to​:

 @cache {1 .. 999_999} = ();

I.e. just perl sizing the variable at compile-time even though it
might not be used. Hardly vastly more intricate, or were you thinking
about some other case?

Then I misremembered. But again, this is just a case of "don't do that"
if you ask me. And the above could rather easily be spotted.

--Steffen

@p5pRT

p5pRT commented May 4, 2014

From @Abigail

On Sun, May 04, 2014 at 09​:28​:48AM +0200, Steffen Mueller wrote​:

On 05/03/2014 12:44 PM, Ævar Arnfjörð Bjarmason wrote:

On Sat, May 3, 2014 at 11​:14 AM, Steffen Mueller <smueller@​cpan.org> wrote​:

The case that Abigail described was vastly more intricate
than the above and it's the only such issue I've ever heard of.

Are we remembering different issues? The one I remember came down to​:

 @cache {1 .. 999_999} = ();

I.e. just perl sizing the variable at compile-time even though it
might not be used. Hardly vastly more intricate, or were you thinking
about some other case?

Then I misremembered. But again, this is just a case of "don't do that"
if you ask me. And the above could rather easily be spotted.

How many Perl programmers outside of p5p have any idea what perl is
doing at compile time? I'd wager most Perl programmers have no idea
what constant folding is.

It's also rarely documented. Perl, the language, is documented well,
in man pages, books and online. But the runtime, its optimizations,
and its actions at compile time? Far less.

Now, I'm not arguing things should change. But compile time actions
will keep biting people. It certainly bit me.

Abigail

@p5pRT

p5pRT commented May 10, 2014

From @bulk88

On Sat May 03 02​:14​:59 2014, smueller@​cpan.org wrote​:

I think this is a corner case. Few real programs have string constants
that large and if they do, they generally do for a reason.

I agree with that. Anyone who writes something like that must have some legitimate purpose. If you do write it, make sure your process has the free memory available for it. If it doesn't, you're accepting the risk of a random process termination, and P5P can't help you with that.

This is always a trade-off. Advocatus diaboli​: Should we stop allocation
of scratchpads/targets because it's additional memory in a branch we
never take? (Obviously, that would be a bad call.)

Constant folding is not rocket science. People generally know it's going
to happen and deal with the fallout. The case that Abigail described was
vastly more intricate than the above and it's the only such issue I've
ever heard of.

Maybe the proper solution (in an ideal world, I mean) would be constant
folding being performed at run-time if the branch is first taken (or
only after it's taken N times). Which gets kind of uncomfortably close
to tracing JIT compilation. (I know about all the practical issues such
as not allowing modification of OP trees at run time.)

I have a different idea, but my recommendation on this ticket is to do nothing. The idea: if the constant folding results in an SV with a value larger than 10 MB, reinstate the original optree.

--
bulk88 ~ bulk88 at hotmail.com
