Undocumented whether string constants are efficient #7853

Closed
p5pRT opened this issue Mar 27, 2005 · 9 comments

p5pRT commented Mar 27, 2005

Migrated from rt.perl.org#34584 (status was 'resolved')

Searchable as RT34584$

p5pRT commented Mar 27, 2005

From porton@ex-code.com

Created by porton@ex-code.com

It is unclear from the constant(3pm) manpage whether string
constants are efficient, especially for long strings.
It should be clearly documented in constant(3pm).

Well, anyway they should be made efficient if they are not.

# Example of a string constant
use constant str => "...";

Perl Info

Flags:
    category=docs
    severity=low

Site configuration information for perl v5.8.4:

Configured by Debian Project at Mon Oct 25 01:52:37 EST 2004.

Summary of my perl5 (revision 5 version 8 subversion 4) configuration:
  Platform:
    osname=linux, osvers=2.4.27-ti1211, archname=i386-linux-thread-multi
    uname='linux kosh 2.4.27-ti1211 #1 sun sep 19 18:17:45 est 2004 i686 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.4 -Dsitearch=/usr/local/lib/perl/5.8.4 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.4 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='3.3.5 (Debian 1:3.3.5-1)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libperl.so.5.8.4
    gnulibc_version='2.3.2'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    


@INC for perl v5.8.4:
    /usr/lib/perl5/5.005/i386-linux
    /etc/perl
    /usr/local/lib/perl/5.8.4
    /usr/local/share/perl/5.8.4
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.8
    /usr/share/perl/5.8
    /usr/local/lib/site_perl
    .


Environment for perl v5.8.4:
    HOME=/home/porton
    LANG (unset)
    LANGUAGE=en_US
    LC_COLLATE=ru_RU.KOI8-R
    LC_CTYPE=ru_RU.KOI8-R
    LC_MESSAGES=en_US
    LC_MONETARY=ru_RU.KOI8-R
    LC_NUMERIC=C
    LC_TIME=C
    LD_LIBRARY_PATH=/usr/local/lib
    LOGDIR (unset)
    PATH=~/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/games:/sbin:/usr/sbin:/usr/local/sbin
    PERL5LIB=/usr/lib/perl5/5.005/i386-linux
    PERL_BADLANG (unset)
    SHELL=/bin/bash


p5pRT commented Mar 27, 2005

From @schwern

On Sun, Mar 27, 2005 at 01:51:26PM -0000, Victor Porton wrote:

It is unclear from the constant(3pm) manpage whether string
constants are efficient, especially for long strings.
It should be clearly documented in constant(3pm).

Well, anyway they should be made efficient if they are not.

# Example of a string constant
use constant str => "...";

constant.pm is pretty clear about the mechanics...

TECHNICAL NOTES
  In the current implementation, scalar constants are actually inlinable
  subroutines. As of version 5.004 of Perl, the appropriate scalar
  constant is inserted directly in place of some subroutine calls, thereby
  saving the overhead of a subroutine call. See "Constant Functions" in
  perlsub for details about how and when this happens.

The long string will be inlined straight into the code, so the question
becomes not if constant.pm is efficient but what Perl does about repeated
long strings. Presumably by "efficient" you mean "does it eat more memory".
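
A minimal sketch of the mechanics the TECHNICAL NOTES describe: `use constant` installs an ordinary subroutine with an empty prototype, which is what makes it a candidate for inlining. The constant name `GREETING` is made up for this illustration and does not come from the report.

```perl
use strict;
use warnings;

# GREETING is a hypothetical constant used only for this illustration.
use constant GREETING => "hello, world";

# A constant is really a subroutine, so it can be called like one...
my $via_call = &GREETING();

# ...and it has an empty prototype, which is what lets perl inline it.
my $proto = prototype(\&GREETING);

print "value: ", GREETING, "\n";
print "via sub call: $via_call\n";
print "prototype: '", (defined $proto ? $proto : 'undef'), "'\n";
```

Whether a given use of the constant is actually inlined is decided at compile time, as covered in "Constant Functions" in perlsub.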

Well, let's do some testing. First we slurp in a big list of words and
assign that to a constant then mention it once in the program.

$ perl5.8.6 -wle 'BEGIN { open WORDS, "/usr/dict/words" or die $!; } use constant STR => join "", <WORDS>; sleep 99; print STR'

schwern 9760 0.0 4.9 48764 38856 std S 11:40AM 0:00.90 perl5.8.6 -wle BEGIN { open WORDS, "/usr/dict/words" or die $!;

Then we mention it twice.

$ perl5.8.6 -wle 'BEGIN { open WORDS, "/usr/dict/words" or die $!; } use constant STR => join "", <WORDS>; sleep 99; print STR; print STR' &

schwern 9766 0.4 5.3 51196 41288 std S 11:41AM 0:00.90 perl5.8.6 -wle BEGIN { open WORDS, "/usr/dict/words" or die $!;

Three times...

schwern 9772 0.0 5.6 53628 43720 std S 11:42AM 0:00.99 perl5.8.6 -wle BEGIN { open WORDS, "/usr/dict/words" or die $!;

It's going up by 2432K each time, which is just about how long STR happens to be.

$ perl5.8.6 -wle 'BEGIN { open WORDS, "/usr/dict/words" or die $!; } use constant STR => join "", <WORDS>; print length(STR) / 1024;'
2428.5400390625
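
The 2432K figure can be checked from the three ps lines. This is only a sketch of the arithmetic, assuming the fifth and sixth columns are VSZ and RSS in kilobytes (the usual `ps aux` layout); the variable names are made up for the illustration.

```perl
use strict;
use warnings;

# VSZ and RSS columns (assumed to be in KB) copied from the three ps lines
# for one, two and three mentions of STR.
my @vsz = (48764, 51196, 53628);
my @rss = (38856, 41288, 43720);

# Growth between consecutive runs.
my @vsz_delta = map { $vsz[$_] - $vsz[$_ - 1] } 1 .. $#vsz;
my @rss_delta = map { $rss[$_] - $rss[$_ - 1] } 1 .. $#rss;

print "VSZ growth per extra mention: @vsz_delta KB\n";
print "RSS growth per extra mention: @rss_delta KB\n";

# Length of STR in KB, as reported by the length(STR) / 1024 run.
my $str_kb = 2428.5400390625;
printf "growth beyond the string itself: %.2f KB\n", $vsz_delta[0] - $str_kb;
```

Both columns grow by the same amount per extra mention, a few kilobytes more than the string length itself.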

So the answer is that long, repeated, inlined strings don't have any special
efficiency about them.

If you want a string which cannot be changed and is memory efficient you can
alias a global to a reference to a string constant.

  our $STR; *STR = \"Some string";
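
A small sketch of how that alias behaves, assuming a perl where string literals are read-only values (an implementation behavior, not something this thread guarantees). The alias can be read like any global, but assigning to it dies.

```perl
use strict;
use warnings;

our $STR;
*STR = \"Some string";    # $STR now aliases the read-only literal

print "value: $STR\n";

# Assigning to the alias should die rather than change the string.
my $modified = eval { $STR = "something else"; 1 };
my $error    = $@;

print $modified ? "assignment succeeded\n" : "assignment failed: $error";
```

Because `$STR` is a plain variable rather than a sub call, nothing is inlined at each use; every mention refers to the one scalar.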

Whether or not this should be documented in constant.pm... probably just
so long as it's clear it's an *-->implementation detail<--* and can change
with each version of Perl.

p5pRT commented Mar 27, 2005

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Mar 27, 2005

From @demerphq

On Sun, 27 Mar 2005 11:49:47 -0800, Michael G Schwern <schwern@pobox.com> wrote:

If you want a string which cannot be changed and is memory efficient you can
alias a global to a reference to a string constant.

  our $STR; *STR = \"Some string";

Whether or not this should be documented in constant.pm... probably just
so long as it's clear it's an *-->implementation detail<--* and can change
with each version of Perl.

I know why you are adding the caveat, but really is it necessary in
this case? Is this likely to change or has it ever changed in the
past?

I'm mostly curious because, given the current state of the docs, I think
the general assumption would be that this is how it will always
work. And I know I've written code that assumes this.

Cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented Mar 27, 2005

From @schwern

On Sun, Mar 27, 2005 at 10:15:39PM +0200, demerphq wrote:

Whether or not this should be documented in constant.pm... probably just
so long as it's clear it's an *-->implementation detail<--* and can change
with each version of Perl.

I know why you are adding the caveat, but really is it necessary in
this case? Is this likely to change or has it ever changed in the
past?

I don't believe it's ever changed in the past but that's not much of an
assurance. There are lots of quirks which have survived from Perl 1 (split
has some) that could get resolved.

Given that we (try to) share hash keys and Arthur's off somewhere working on
Copy-on-Write, you never know what performance hack someone's going to try.

It might make sense to share all constant strings, integers, etc... rather
than allocating and deallocating memory for them all the time. I know
Python and perhaps Ruby already do this trick with small integers.
Would it be worth it? Would less allocation/deallocation and less memory
used offset the extra management cost? Won't know until someone tries.

I'm mostly curious because, given the current state of the docs, I think
the general assumption would be that this is how it will always
work. And I know I've written code that assumes this.

Besides the question of whether or not it's a good idea to share strings,
it's an IMPLEMENTATION DETAIL and not a LANGUAGE FEATURE. I cannot stress this
enough. Language features are to be relied upon and do not change forever
and ever, amen. Implementation details are variable across versions.
Implementation details which are documented without noting that fact tend to
be used by people and have a way of becoming language features because too
many people rely on them to change them.

So it's worth an extra 5 words to avoid mortgaging the future.

p5pRT commented Mar 28, 2005

From @ysth

On Sun, Mar 27, 2005 at 01:32:11PM -0800, Michael G Schwern wrote:

On Sun, Mar 27, 2005 at 10:15:39PM +0200, demerphq wrote:

Whether or not this should be documented in constant.pm... probably just
so long as it's clear it's an *-->implementation detail<--* and can change
with each version of Perl.

I know why you are adding the caveat, but really is it necessary in
this case? Is this likely to change or has it ever changed in the
past?

I don't believe it's ever changed in the past but that's not much of an
assurance. There are lots of quirks which have survived from Perl 1 (split
has some) that could get resolved.

Given that we (try to) share hash keys and Arthur's off somewhere working on
Copy-on-Write, you never know what performance hack someone's going to try.

It might make sense to share all constant strings, integers, etc... rather
than allocating and deallocating memory for them all the time.

One complicating issue with this is that even constants are "owned" by
their code, and code is subject to being freed at runtime, albeit not
in the normal case. It's hard to think of a case where global sharing
of constants and code destruction could interact differently, though.
Perhaps destroying a coderef containing constants with surviving
weakrefs:

$ perl -MScalar::Util=weaken -wle'$x=eval"sub{\\1}"; weaken($y=&$x); print "before:$y"; undef $x; print "after:$y"'
before:SCALAR(0x1017e184)
Use of uninitialized value in concatenation (.) or string at -e line 1.
after:

p5pRT commented Mar 28, 2005

From @schwern

On Sun, Mar 27, 2005 at 07:07:45PM -0800, Yitzchak Scott-Thoennes wrote:

One complicating issue with this

"This" being whether or not sharing constants is viable as opposed to the
original documentation issue (just for those listening on RT).

is that even constants are "owned" by
their code, and code is subject to being freed at runtime, albeit not
in the normal case. It's hard to think of a case where global sharing
of constants and code destruction could interact differently, though.
Perhaps destroying a coderef containing constants with surviving
weakrefs:

$ perl -MScalar::Util=weaken -wle'$x=eval"sub{\\1}"; weaken($y=&$x); print "before:$y"; undef $x; print "after:$y"'
before:SCALAR(0x1017e184)
Use of uninitialized value in concatenation (.) or string at -e line 1.
after:

What I was thinking is a situation where the SV structs are distinct but the
string contained inside points to the same memory.

So instead of...

$ perl -MDevel::Peek -wle 'my $foo = "foo"; my $bar = "foo"; print Dump $foo; print Dump $bar'
SV = PV(0x801460) at 0x801234
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,POK,pPOK)
  PV = 0x101c60 "foo"\0
  CUR = 3
  LEN = 4

SV = PV(0x8014a8) at 0x809fc0
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,POK,pPOK)
  PV = 0x100db0 "foo"\0
  CUR = 3
  LEN = 4

It would be

SV = PV(0x801460) at 0x801234
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,POK,pPOK)
  PV = 0x101c60 "foo"\0
  CUR = 3
  LEN = 4

SV = PV(0x8014a8) at 0x809fc0
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,POK,pPOK)
  PV = 0x101c60 "foo"\0
  CUR = 3
  LEN = 4

With sufficient cleverness to deallocate the string when the last SV
referencing it goes away as well as to copy-on-write.

Again, whether the extra management would be worth it... dunno. A first
task might be to examine some "typical" Perl code and watch what strings
get allocated, how often and how often there are duplicates. Class names,
for example, might come up a lot.

p5pRT commented Mar 29, 2005

@rgs - Status changed from 'open' to 'resolved'

@p5pRT p5pRT closed this as completed Mar 29, 2005
p5pRT commented Mar 30, 2005

From @Abigail

On Sun, Mar 27, 2005 at 11:49:47AM -0800, Michael G Schwern wrote:

If you want a string which cannot be changed and is memory efficient you can
alias a global to a reference to a string constant.

  our $STR; *STR = \"Some string";

Whether or not this should be documented in constant.pm... probably just
so long as it's clear it's an *-->implementation detail<--* and can change
with each version of Perl.

Alternatively, one can use the less cryptic:

  use Readonly;
  Readonly my $STR => "Some string";

Abigail
