Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excessive memory waste from duplicate CopFILE in threaded perl #12940

Closed
p5pRT opened this issue May 4, 2013 · 32 comments
Closed

Excessive memory waste from duplicate CopFILE in threaded perl #12940

p5pRT opened this issue May 4, 2013 · 32 comments

Comments

@p5pRT
Copy link

p5pRT commented May 4, 2013

Migrated from rt.perl.org#117855 (status was 'open')

Searchable as RT117855$

@p5pRT
Copy link
Author

p5pRT commented May 4, 2013

From @timbunce

Created by timbo@timac.local

A perl with ithreads enabled saves duplicate copies of the perl source file
name for every CvCOPFILE. That's a lot of wasteful duplication.
This isn't new. It goes back to at least 5.8.

To see how much this costs I modified a copy of Devel​::SizeMe to do

  warn("%p cop_file %s\n", basecop->cop_file, basecop->cop_file);

and ran

  perl -Mblib -MDancer -MDevel​::SizeMe=​:all -e 'perl_size()' 2>x

then

  grep 'cop_file' x | sort -u | perl -pe 's/^\S+ cop_file //' | wc -c

which reported 568813 bytes consumed by cop_file strings.

Then I ran the same but with an extra sort -u at the end​:

  grep 'cop_file' x | sort -u | perl -pe 's/^\S+ cop_file //' | sort -u | wc

which reports there there were only 9074 bytes of distinct cop_file strings
(in 108 distinct cop_file strings).

So that's over 550KB wasted in duplicate by cop_file strings in a relatively
small 'application'.

That's a lot of wasted memory. The scale of the waste is proportional to the
length of the directory paths the modules are loaded from. E.g., mine were​:
/Users/timbo/perl5/perlbrew/perls/perl-5.8.9-thr/lib/site_perl/5.8.9/HTTP/Body/XFormsMultipart.pm

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl v5.8.9:

Configured by timbo at Thu Oct 11 16:23:44 IST 2012.

Summary of my perl5 (revision 5 version 8 subversion 9) configuration:
  Platform:
    osname=darwin, osvers=11.4.2, archname=darwin-thread-multi-2level
    uname='darwin timac.local 11.4.2 darwin kernel version 11.4.2: thu aug 23 16:25:48 pdt 2012; root:xnu-1699.32.7~1release_x86_64 x86_64 '
    config_args='-de -Dprefix=/Users/timbo/perl5/perlbrew/perls/perl-5.8.9-thr -Duseithreads -Aeval:scriptdir=/Users/timbo/perl5/perlbrew/perls/perl-5.8.9-thr/bin'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=define uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -I/usr/local/include -I/opt/local/include',
    optimize='-O3',
    cppflags='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -I/usr/local/include -I/opt/local/include'
    ccversion='', gccversion='4.2.1 Compatible Apple Clang 3.1 (tags/Apple/clang-318.0.61)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -L/usr/local/lib -L/opt/local/lib'
    libpth=/usr/local/lib /opt/local/lib /usr/lib
    libs=-lgdbm -ldbm -ldl -lm -lutil -lc
    perllibs=-ldl -lm -lutil -lc
    libc=, so=dylib, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib'

Locally applied patches:
    


@INC for perl v5.8.9:
    /Users/timbo/perl5/perlbrew/perls/perl-5.8.9-thr/lib/5.8.9/darwin-thread-multi-2level
    /Users/timbo/perl5/perlbrew/perls/perl-5.8.9-thr/lib/5.8.9
    /Users/timbo/perl5/perlbrew/perls/perl-5.8.9-thr/lib/site_perl/5.8.9/darwin-thread-multi-2level
    /Users/timbo/perl5/perlbrew/perls/perl-5.8.9-thr/lib/site_perl/5.8.9
    .


Environment for perl v5.8.9:
    HOME=/Users/timbo
    LANG=en_IE.UTF-8
    LANGUAGE (unset)
    LC_ALL=en_IE.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PERLBREW_BASHRC_VERSION=0.44
    PERLBREW_HOME=/Users/timbo/.perlbrew
    PERLBREW_MANPATH=/Users/timbo/perl5/perlbrew/perls/perl-5.8.9-thr/man
    PERLBREW_PATH=/Users/timbo/perl5/perlbrew/bin:/Users/timbo/perl5/perlbrew/perls/perl-5.8.9-thr/bin
    PERLBREW_PERL=perl-5.8.9-thr
    PERLBREW_ROOT=/Users/timbo/perl5/perlbrew
    PERLBREW_VERSION=0.44
    PERLCRITIC=/Users/timbo/.setdev/perlcriticrc
    PERLTIDY=/Users/timbo/.setdev/perltidyrc
    SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Jul 5, 2013

From @cpansprout

On Sat May 04 04​:56​:48 2013, timbo wrote​:

This is a bug report for perl from timbo@​timac.local,
generated with the help of perlbug 1.39 running under perl v5.8.9.

-----------------------------------------------------------------
[Please describe your issue here]

A perl with ithreads enabled saves duplicate copies of the perl source
file
name for every CvCOPFILE. That's a lot of wasteful duplication.
This isn't new. It goes back to at least 5.8.

To see how much this costs I modified a copy of Devel​::SizeMe to do

warn\("%p cop\_file %s\\n"\, basecop\->cop\_file\, basecop\->cop\_file\);

and ran

perl \-Mblib \-MDancer \-MDevel​::SizeMe=​:all \-e 'perl\_size\(\)' 2>x

then

grep 'cop\_file' x | sort \-u | perl \-pe 's/^\\S\+ cop\_file //' | wc

-c

which reported 568813 bytes consumed by cop_file strings.

Then I ran the same but with an extra sort -u at the end​:

grep 'cop\_file' x | sort \-u | perl \-pe 's/^\\S\+ cop\_file //' | sort

-u | wc

which reports there there were only 9074 bytes of distinct cop_file
strings
(in 108 distinct cop_file strings).

So that's over 550KB wasted in duplicate by cop_file strings in a
relatively
small 'application'.

That's a lot of wasted memory. The scale of the waste is proportional
to the
length of the directory paths the modules are loaded from. E.g., mine
were​:
/Users/timbo/perl5/perlbrew/perls/perl-5.8.9-
thr/lib/site_perl/5.8.9/HTTP/Body/XFormsMultipart.pm

We could change the implementation to store GVs in the pad for threaded
CopFILEGV. That would make the pad one pointer larger for every statement.

Are there any cases where that would have a downside? I cannot think of
any.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 5, 2013

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Jul 5, 2013

From @cpansprout

On Fri Jul 05 13​:37​:21 2013, sprout wrote​:

On Sat May 04 04​:56​:48 2013, timbo wrote​:

This is a bug report for perl from timbo@​timac.local,
generated with the help of perlbug 1.39 running under perl v5.8.9.

-----------------------------------------------------------------
[Please describe your issue here]

A perl with ithreads enabled saves duplicate copies of the perl source
file
name for every CvCOPFILE. That's a lot of wasteful duplication.
This isn't new. It goes back to at least 5.8.

To see how much this costs I modified a copy of Devel​::SizeMe to do

warn\("%p cop\_file %s\\n"\, basecop\->cop\_file\, basecop\->cop\_file\);

and ran

perl \-Mblib \-MDancer \-MDevel​::SizeMe=​:all \-e 'perl\_size\(\)' 2>x

then

grep 'cop\_file' x | sort \-u | perl \-pe 's/^\\S\+ cop\_file //' | wc

-c

which reported 568813 bytes consumed by cop_file strings.

Then I ran the same but with an extra sort -u at the end​:

grep 'cop\_file' x | sort \-u | perl \-pe 's/^\\S\+ cop\_file //' | sort

-u | wc

which reports there there were only 9074 bytes of distinct cop_file
strings
(in 108 distinct cop_file strings).

So that's over 550KB wasted in duplicate by cop_file strings in a
relatively
small 'application'.

That's a lot of wasted memory. The scale of the waste is proportional
to the
length of the directory paths the modules are loaded from. E.g., mine
were​:
/Users/timbo/perl5/perlbrew/perls/perl-5.8.9-
thr/lib/site_perl/5.8.9/HTTP/Body/XFormsMultipart.pm

We could change the implementation to store GVs in the pad for threaded
CopFILEGV. That would make the pad one pointer larger for every
statement.

Are there any cases where that would have a downside? I cannot think of
any.

It cannot go in the lexical pad, because caller needs to access it. I
should know that, since I ran into the same problem with CopSTASH.

So I think we need another global array like PL_stashpad, this time for
GVs. That will use even less memory, as there are more subroutines than
files.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2013

From @cpansprout

On Sat May 04 04​:56​:48 2013, timbo wrote​:

A perl with ithreads enabled saves duplicate copies of the perl source
file
name for every CvCOPFILE. That's a lot of wasteful duplication.
This isn't new. It goes back to at least 5.8.

To see how much this costs I modified a copy of Devel​::SizeMe to do

warn\("%p cop\_file %s\\n"\, basecop\->cop\_file\, basecop\->cop\_file\);

and ran

perl \-Mblib \-MDancer \-MDevel​::SizeMe=​:all \-e 'perl\_size\(\)' 2>x

then

grep 'cop\_file' x | sort \-u | perl \-pe 's/^\\S\+ cop\_file //' | wc

-c

which reported 568813 bytes consumed by cop_file strings.

Then I ran the same but with an extra sort -u at the end​:

grep 'cop\_file' x | sort \-u | perl \-pe 's/^\\S\+ cop\_file //' | sort

-u | wc

which reports there there were only 9074 bytes of distinct cop_file
strings
(in 108 distinct cop_file strings).

So that's over 550KB wasted in duplicate by cop_file strings in a
relatively
small 'application'.

That's a lot of wasted memory. The scale of the waste is proportional
to the
length of the directory paths the modules are loaded from. E.g., mine
were​:
/Users/timbo/perl5/perlbrew/perls/perl-5.8.9-
thr/lib/site_perl/5.8.9/HTTP/Body/XFormsMultipart.pm

Please test the attached patch and see whether it reduces memory usage
for you. Testing it with Devel​::Size will *not* work, as it will need
to be updated for this. (It is needfully very intrusive into perl’s guts.)

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2013

From @cpansprout

Inline Patch
diff --git a/MANIFEST b/MANIFEST
index 7369af2..bc10cd6 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -3852,7 +3852,7 @@ ext/XS-APItest/t/cleanup.t	test stack behaviour on unwinding
 ext/XS-APItest/t/clone-with-stack.t	test clone with CLONEf_COPY_STACKS works
 ext/XS-APItest/t/cophh.t	test COPHH API
 ext/XS-APItest/t/coplabel.t	test cop_*_label
-ext/XS-APItest/t/copstash.t	test alloccopstash
+ext/XS-APItest/t/cop.t		test other cop stuff
 ext/XS-APItest/t/copyhints.t	test hv_copy_hints_hv() API
 ext/XS-APItest/t/customop.t	XS::APItest: tests for custom ops
 ext/XS-APItest/t/eval-filter.t	Simple source filter/eval test
diff --git a/cop.h b/cop.h
index e33dc15..cfa976f 100644
--- a/cop.h
+++ b/cop.h
@@ -389,7 +389,8 @@ struct cop {
 #ifdef USE_ITHREADS
     PADOFFSET	cop_stashoff;	/* offset into PL_stashpad, for the
 				   package the line was compiled in */
-    char *	cop_file;	/* file name the following line # is from */
+    PADOFFSET	cop_filegvoff;	/* PL_filegv offset, for the file name the
+				   following line # is from */
 #else
     HV *	cop_stash;	/* package line was compiled in */
     GV *	cop_filegv;	/* file the following line # is from */
@@ -404,54 +405,32 @@ struct cop {
 };
 
 #ifdef USE_ITHREADS
-#  define CopFILE(c)		((c)->cop_file)
-#  define CopFILEGV(c)		(CopFILE(c) \
-				 ? gv_fetchfile(CopFILE(c)) : NULL)
-				 
-#  ifdef NETWARE
-#    define CopFILE_set(c,pv)	((c)->cop_file = savepv(pv))
-#    define CopFILE_setn(c,pv,l)  ((c)->cop_file = savepv((pv),(l)))
-#  else
-#    define CopFILE_set(c,pv)	((c)->cop_file = savesharedpv(pv))
-#    define CopFILE_setn(c,pv,l)  ((c)->cop_file = savesharedpvn((pv),(l)))
-#  endif
-
-#  define CopFILESV(c)		(CopFILE(c) \
-				 ? GvSV(gv_fetchfile(CopFILE(c))) : NULL)
-#  define CopFILEAV(c)		(CopFILE(c) \
-				 ? GvAV(gv_fetchfile(CopFILE(c))) : NULL)
-#  define CopFILEAVx(c)		(assert_(CopFILE(c)) \
-				   GvAV(gv_fetchfile(CopFILE(c))))
+#  define CopFILEGV(c)		PL_filegvpad[(c)->cop_filegvoff]
+#  define CopFILEGV_set(c,gv)	((c)->cop_filegvoff = (gv) \
+				 ? allocfilegv((GV *)SvREFCNT_inc_NN(gv)) \
+				 : 0)
 
 #  define CopSTASH(c)           PL_stashpad[(c)->cop_stashoff]
 #  define CopSTASH_set(c,hv)	((c)->cop_stashoff = (hv)		\
 				    ? alloccopstash(hv)			\
 				    : 0)
-#  ifdef NETWARE
-#    define CopFILE_free(c) SAVECOPFILE_FREE(c)
-#  else
-#    define CopFILE_free(c)	(PerlMemShared_free(CopFILE(c)),(CopFILE(c) = NULL))
-#  endif
+#  define CopFILE_free(c)	S_CopFILE_free(aTHX_ c)
 #else
 #  define CopFILEGV(c)		((c)->cop_filegv)
 #  define CopFILEGV_set(c,gv)	((c)->cop_filegv = (GV*)SvREFCNT_inc(gv))
-#  define CopFILE_set(c,pv)	CopFILEGV_set((c), gv_fetchfile(pv))
-#  define CopFILE_setn(c,pv,l)	CopFILEGV_set((c), gv_fetchfile_flags((pv),(l),0))
-#  define CopFILESV(c)		(CopFILEGV(c) ? GvSV(CopFILEGV(c)) : NULL)
-#  define CopFILEAV(c)		(CopFILEGV(c) ? GvAV(CopFILEGV(c)) : NULL)
-#  ifdef DEBUGGING
-#    define CopFILEAVx(c)	(assert(CopFILEGV(c)), GvAV(CopFILEGV(c)))
-#  else
-#    define CopFILEAVx(c)	(GvAV(CopFILEGV(c)))
-# endif
-#  define CopFILE(c)		(CopFILEGV(c) && GvSV(CopFILEGV(c)) \
-				    ? SvPVX(GvSV(CopFILEGV(c))) : NULL)
 #  define CopSTASH(c)		((c)->cop_stash)
 #  define CopSTASH_set(c,hv)	((c)->cop_stash = (hv))
 #  define CopFILE_free(c)	(SvREFCNT_dec(CopFILEGV(c)),(CopFILEGV(c) = NULL))
 
 #endif /* USE_ITHREADS */
 
+#define CopFILE_set(c,pv)	CopFILEGV_set((c), gv_fetchfile(pv))
+#define CopFILE_setn(c,pv,l)	CopFILEGV_set((c), gv_fetchfile_flags((pv),(l),0))
+#define CopFILESV(c)		(CopFILEGV(c) ? GvSV(CopFILEGV(c)) : NULL)
+#define CopFILEAV(c)		(CopFILEGV(c) ? GvAV(CopFILEGV(c)) : NULL)
+#define CopFILEAVx(c)		(assert_(CopFILEGV(c)) GvAV(CopFILEGV(c)))
+#define CopFILE(c)		(CopFILEGV(c) && GvSV(CopFILEGV(c)) \
+				    ? SvPVX(GvSV(CopFILEGV(c))) : NULL)
 #define CopSTASHPV(c)		(CopSTASH(c) ? HvNAME_get(CopSTASH(c)) : NULL)
    /* cop_stash is not refcounted */
 #define CopSTASHPV_set(c,pv)	CopSTASH_set((c), gv_stashpv(pv,GV_ADD))
diff --git a/embed.fnc b/embed.fnc
index a6c17ee..7fdacf1 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -1020,6 +1020,7 @@ p	|PADOFFSET|allocmy	|NN const char *const name|const STRLEN len\
 				|const U32 flags
 #ifdef USE_ITHREADS
 AMp	|PADOFFSET|alloccopstash|NN HV *hv
+AMp	|PADOFFSET|allocfilegv	|NN GV *gv
 #endif
 : Used in perly.y
 pR	|OP*	|oopsAV		|NN OP* o
@@ -2643,4 +2644,8 @@ op	|void	|populate_isa	|NN const char *name|STRLEN len|...
 Xop	|bool	|feature_is_enabled|NN const char *const name \
 		|STRLEN namelen
 
+: Some static inline functions that implement macros need predeclaration
+: because they are used inside other static inline functions.
+oi	|void	|SvREFCNT_dec_NN|NN SV *sv
+
 : ex: set ts=8 sts=4 sw=4 noet:
diff --git a/embed.h b/embed.h
index 6f3ac5a..5794812 100644
--- a/embed.h
+++ b/embed.h
@@ -804,6 +804,7 @@
 #endif
 #if defined(USE_ITHREADS)
 #define alloccopstash(a)	Perl_alloccopstash(aTHX_ a)
+#define allocfilegv(a)		Perl_allocfilegv(aTHX_ a)
 #define any_dup(a,b)		Perl_any_dup(aTHX_ a,b)
 #define cx_dup(a,b,c,d)		Perl_cx_dup(aTHX_ a,b,c,d)
 #define dirp_dup(a,b)		Perl_dirp_dup(aTHX_ a,b)
diff --git a/embedvar.h b/embedvar.h
index ef2fa68..240d205 100644
--- a/embedvar.h
+++ b/embedvar.h
@@ -146,6 +146,9 @@
 #define PL_exitlist		(vTHX->Iexitlist)
 #define PL_exitlistlen		(vTHX->Iexitlistlen)
 #define PL_fdpid		(vTHX->Ifdpid)
+#define PL_filegvpad		(vTHX->Ifilegvpad)
+#define PL_filegvpadix		(vTHX->Ifilegvpadix)
+#define PL_filegvpadmax		(vTHX->Ifilegvpadmax)
 #define PL_filemode		(vTHX->Ifilemode)
 #define PL_firstgv		(vTHX->Ifirstgv)
 #define PL_forkprocess		(vTHX->Iforkprocess)
diff --git a/ext/B/B.pm b/ext/B/B.pm
index 8b13dea..599d58a 100644
--- a/ext/B/B.pm
+++ b/ext/B/B.pm
@@ -1224,6 +1224,8 @@ Since perl 5.17.1
 
 =item file
 
+=item filegvoff (threaded only)
+
 =item cop_seq
 
 =item arybase
diff --git a/ext/B/B.xs b/ext/B/B.xs
index fbe6be6..b222cc8 100644
--- a/ext/B/B.xs
+++ b/ext/B/B.xs
@@ -680,7 +680,11 @@ struct OP_methods {
 #ifdef USE_ITHREADS
     STR_WITH_LEN("pmoffset"),IVp,     offsetof(struct pmop, op_pmoffset),/*20*/
     STR_WITH_LEN("filegv"),  0,       -1,                                /*21*/
+#  if PERL_VERSION < 19
     STR_WITH_LEN("file"),    char_pp, offsetof(struct cop, cop_file),    /*22*/
+#  else
+    STR_WITH_LEN("file"),    0,       -1,                                /*22*/
+#  endif
     STR_WITH_LEN("stash"),   0,       -1,                                /*23*/
 #  if PERL_VERSION < 17
     STR_WITH_LEN("stashpv"), char_pp, offsetof(struct cop, cop_stashpv), /*24*/
@@ -718,6 +722,11 @@ struct OP_methods {
     STR_WITH_LEN("warnings"),0,       -1,                                /*44*/
     STR_WITH_LEN("io"),      0,       -1,                                /*45*/
     STR_WITH_LEN("hints_hash"),0,     -1,                                /*46*/
+#  if PERL_VERSION < 19 || !defined(USE_ITHREADS)
+    STR_WITH_LEN("filegvoff"),0,      -1,                                /*47*/
+#  else
+    STR_WITH_LEN("filegvoff"),PADOFFSETp,offsetof(struct cop, cop_filegvoff),/*47*/
+#  endif
 };
 
 #include "const-c.inc"
@@ -1022,7 +1031,7 @@ next(o)
 		ret = make_sv_object(aTHX_ (SV *)CopFILEGV((COP*)o));
 		break;
 #endif
-#ifndef USE_ITHREADS
+#if !defined(USE_ITHREADS) || PERL_VERSION >= 19
 	    case 22: /* file */
 		ret = sv_2mortal(newSVpv(CopFILE((COP*)o), 0));
 		break;
diff --git a/ext/XS-APItest/APItest.xs b/ext/XS-APItest/APItest.xs
index 3f76dd7..b4f5560 100644
--- a/ext/XS-APItest/APItest.xs
+++ b/ext/XS-APItest/APItest.xs
@@ -3387,6 +3387,13 @@ CODE:
 OUTPUT:
     RETVAL
 
+bool
+test_allocfilegv()
+CODE:
+    RETVAL = PL_filegvpad[allocfilegv(PL_defgv)] == PL_defgv;
+OUTPUT:
+    RETVAL
+
 #endif
 
 bool
diff --git a/ext/XS-APItest/t/cop.t b/ext/XS-APItest/t/cop.t
new file mode 100644
index 0000000..b5571e6
--- /dev/null
+++ b/ext/XS-APItest/t/cop.t
@@ -0,0 +1,10 @@
+use Config;
+use Test::More;
+BEGIN { plan skip_all => 'no threads' unless $Config{useithreads} }
+
+plan tests => 2;
+
+use XS::APItest;
+
+ok test_alloccopstash;
+ok test_allocfilegv;
diff --git a/ext/XS-APItest/t/copstash.t b/ext/XS-APItest/t/copstash.t
deleted file mode 100644
index 8ed98a2..0000000
--- a/ext/XS-APItest/t/copstash.t
+++ /dev/null
@@ -1,9 +0,0 @@
-use Config;
-use Test::More;
-BEGIN { plan skip_all => 'no threads' unless $Config{useithreads} }
-
-plan tests => 1;
-
-use XS::APItest;
-
-ok test_alloccopstash;
diff --git a/gv.c b/gv.c
index 8449047..21017c0 100644
--- a/gv.c
+++ b/gv.c
@@ -2103,12 +2103,8 @@ Perl_gv_check(pTHX_ const HV *stash)
 		    continue;
 		file = GvFILE(gv);
 		CopLINE_set(PL_curcop, GvLINE(gv));
-#ifdef USE_ITHREADS
-		CopFILE(PL_curcop) = (char *)file;	/* set for warning */
-#else
-		CopFILEGV(PL_curcop)
-		    = gv_fetchfile_flags(file, HEK_LEN(GvFILE_HEK(gv)), 0);
-#endif
+		/* set file name for warning */
+		CopFILE_setn(PL_curcop, file, HEK_LEN(GvFILE_HEK(gv)));
 		Perl_warner(aTHX_ packWARN(WARN_ONCE),
 			"Name \"%"HEKf"::%"HEKf
 			"\" used only once: possible typo",
diff --git a/inline.h b/inline.h
index 7aeb93d..669f671 100644
--- a/inline.h
+++ b/inline.h
@@ -23,6 +23,20 @@ S_av_top_index(pTHX_ AV *av)
     return AvFILL(av);
 }
 
+/* ------------------------------- cop.h ------------------------------ */
+
+#ifdef USE_ITHREADS
+PERL_STATIC_INLINE void
+S_CopFILE_free(pTHX_ COP * const c)
+{
+    GV * const gv = CopFILEGV(c);
+    if (!gv) return;
+    if (SvREFCNT(gv) == 1) PL_filegvpad[c->cop_filegvoff] = NULL;
+    SvREFCNT_dec_NN(gv);
+    c->cop_filegvoff = 0;
+}
+#endif
+
 /* ------------------------------- cv.h ------------------------------- */
 
 PERL_STATIC_INLINE I32 *
@@ -108,6 +122,7 @@ PERL_STATIC_INLINE void
 S_SvREFCNT_dec_NN(pTHX_ SV *sv)
 {
     U32 rc = SvREFCNT(sv);
+    PERL_ARGS_ASSERT_SVREFCNT_DEC_NN;
     if (LIKELY(rc > 1))
 	SvREFCNT(sv) = rc - 1;
     else
diff --git a/intrpvar.h b/intrpvar.h
index f6827f2..c085d54 100644
--- a/intrpvar.h
+++ b/intrpvar.h
@@ -657,6 +657,9 @@ PERLVAR(I, regex_padav,   AV *)		/* All regex objects, indexed via the
 PERLVAR(I, stashpad,    HV **)		/* for CopSTASH */
 PERLVARI(I, stashpadmax, PADOFFSET, 64)
 PERLVARI(I, stashpadix, PADOFFSET, 0)
+PERLVAR(I, filegvpad,    GV **)		/* for CopFILEGV */
+PERLVARI(I, filegvpadmax, PADOFFSET, 64)
+PERLVARI(I, filegvpadix, PADOFFSET, 0)
 #endif
 
 #ifdef USE_REENTRANT_API
diff --git a/op.c b/op.c
index 8a30264..af9b00c 100644
--- a/op.c
+++ b/op.c
@@ -647,31 +647,64 @@ C<PL_stashpad> for the stash passed to it.
 */
 
 #ifdef USE_ITHREADS
+
 PADOFFSET
-Perl_alloccopstash(pTHX_ HV *hv)
+S_alloc_global_pad_slot(pTHX_ SV *sv, svtype type, SV ***padp,
+			      PADOFFSET *ixp, PADOFFSET *maxp)
 {
     PADOFFSET off = 0, o = 1;
     bool found_slot = FALSE;
+    SV **pad = *padp;
 
-    PERL_ARGS_ASSERT_ALLOCCOPSTASH;
-
-    if (PL_stashpad[PL_stashpadix] == hv) return PL_stashpadix;
+    if (pad[*ixp] == sv) return *ixp;
 
-    for (; o < PL_stashpadmax; ++o) {
-	if (PL_stashpad[o] == hv) return PL_stashpadix = o;
-	if (!PL_stashpad[o] || SvTYPE(PL_stashpad[o]) != SVt_PVHV)
+    for (; o < *maxp; ++o) {
+	if (pad[o] == sv) return *ixp = o;
+	if (!pad[o] || SvTYPE(pad[o]) != type)
 	    found_slot = TRUE, off = o;
     }
     if (!found_slot) {
-	Renew(PL_stashpad, PL_stashpadmax + 10, HV *);
-	Zero(PL_stashpad + PL_stashpadmax, 10, HV *);
-	off = PL_stashpadmax;
-	PL_stashpadmax += 10;
+	Renew(*padp, *maxp + 10, SV *);
+	pad = *padp;
+	Zero(pad + *maxp, 10, SV *);
+	off = *maxp;
+	*maxp += 10;
     }
 
-    PL_stashpad[PL_stashpadix = off] = hv;
+    pad[*ixp = off] = sv;
     return off;
 }
+
+PADOFFSET
+Perl_alloccopstash(pTHX_ HV *hv)
+{
+    PERL_ARGS_ASSERT_ALLOCCOPSTASH;
+    return S_alloc_global_pad_slot(aTHX_
+		(SV *)hv, SVt_PVHV, (SV ***)&PL_stashpad, &PL_stashpadix,
+		&PL_stashpadmax
+	   );
+}
+#endif
+
+/*
+=for apidoc allocfilegv
+
+Available only under threaded builds, this function allocates an entry in
+C<PL_filegvpad> for the GV passed to it.
+
+=cut
+*/
+
+#ifdef USE_ITHREADS
+PADOFFSET
+Perl_allocfilegv(pTHX_ GV *gv)
+{
+    PERL_ARGS_ASSERT_ALLOCFILEGV;
+    return S_alloc_global_pad_slot(aTHX_
+		(SV *)gv, SVt_PVGV, (SV ***)&PL_filegvpad, &PL_filegvpadix,
+		&PL_filegvpadmax
+	   );
+}
 #endif
 
 /* free the body of an op without examining its contents.
@@ -10906,7 +10939,7 @@ Perl_rpeep(pTHX_ OP *o)
 		    firstcop->cop_line = secondcop->cop_line;
 #ifdef USE_ITHREADS
 		    firstcop->cop_stashoff = secondcop->cop_stashoff;
-		    firstcop->cop_file = secondcop->cop_file;
+		    firstcop->cop_filegvoff = secondcop->cop_filegvoff;
 #else
 		    firstcop->cop_stash = secondcop->cop_stash;
 		    firstcop->cop_filegv = secondcop->cop_filegv;
@@ -10918,7 +10951,7 @@ Perl_rpeep(pTHX_ OP *o)
 
 #ifdef USE_ITHREADS
 		    secondcop->cop_stashoff = 0;
-		    secondcop->cop_file = NULL;
+		    secondcop->cop_filegvoff = 0;
 #else
 		    secondcop->cop_stash = NULL;
 		    secondcop->cop_filegv = NULL;
diff --git a/perl.c b/perl.c
index bad66f5..daf3375 100644
--- a/perl.c
+++ b/perl.c
@@ -286,6 +286,7 @@ perl_construct(pTHXx)
     Perl_av_create_and_push(aTHX_ &PL_regex_padav, newSVpvs(""));
     PL_regex_pad = AvARRAY(PL_regex_padav);
     Newxz(PL_stashpad, PL_stashpadmax, HV *);
+    Newxz(PL_filegvpad, PL_filegvpadmax, GV *);
 #endif
 #ifdef USE_REENTRANT_API
     Perl_reentrant_init(aTHX);
@@ -1091,6 +1092,7 @@ perl_destruct(pTHXx)
 
 #ifdef USE_ITHREADS
     Safefree(PL_stashpad); /* must come after sv_clean_all */
+    Safefree(PL_filegvpad);
 #endif
 
     AvREAL_off(PL_fdpid);		/* no surviving entries */
diff --git a/proto.h b/proto.h
index 2389ed8..76edd33 100644
--- a/proto.h
+++ b/proto.h
@@ -35,6 +35,11 @@ PERL_CALLCONV void	Perl_Slab_Free(pTHX_ void *op)
 #define PERL_ARGS_ASSERT_SLAB_FREE	\
 	assert(op)
 
+PERL_STATIC_INLINE void	S_SvREFCNT_dec_NN(pTHX_ SV *sv)
+			__attribute__nonnull__(pTHX_1);
+#define PERL_ARGS_ASSERT_SVREFCNT_DEC_NN	\
+	assert(sv)
+
 PERL_CALLCONV bool	Perl__is_uni_FOO(pTHX_ const U8 classnum, const UV c)
 			__attribute__warn_unused_result__;
 
@@ -7617,6 +7622,11 @@ PERL_CALLCONV PADOFFSET	Perl_alloccopstash(pTHX_ HV *hv)
 #define PERL_ARGS_ASSERT_ALLOCCOPSTASH	\
 	assert(hv)
 
+PERL_CALLCONV PADOFFSET	Perl_allocfilegv(pTHX_ GV *gv)
+			__attribute__nonnull__(pTHX_1);
+#define PERL_ARGS_ASSERT_ALLOCFILEGV	\
+	assert(gv)
+
 PERL_CALLCONV void*	Perl_any_dup(pTHX_ void* v, const PerlInterpreter* proto_perl)
 			__attribute__warn_unused_result__
 			__attribute__nonnull__(pTHX_2);
diff --git a/scope.c b/scope.c
index 3ac3990..2464590 100644
--- a/scope.c
+++ b/scope.c
@@ -1231,6 +1231,11 @@ Perl_leave_scope(pTHX_ I32 base)
 	case SAVEt_READONLY_OFF:
 	    SvREADONLY_off(ARG0_SV);
 	    break;
+#ifdef USE_ITHREADS
+	case SAVEt_COPFILEFREE:
+	    CopFILE_free((COP *)ARG0_PTR);
+	    break;
+#endif
 	default:
 	    Perl_croak(aTHX_ "panic: leave_scope inconsistency %u", type);
 	}
diff --git a/scope.h b/scope.h
index 235212f..97aa1b6 100644
--- a/scope.h
+++ b/scope.h
@@ -39,12 +39,14 @@
 #define SAVEt_PARSER		19
 #define SAVEt_STACK_POS		20
 #define SAVEt_READONLY_OFF	21
+#ifdef USE_ITHREADS
+# define SAVEt_COPFILEFREE	22
+#endif
 
-#define SAVEt_ARG1_MAX		21
+#define SAVEt_ARG1_MAX		22
 
 /* two args */
 
-#define SAVEt_APTR		22
 #define SAVEt_AV		23
 #define SAVEt_DESTRUCTOR	24
 #define SAVEt_DESTRUCTOR_X	25
@@ -69,17 +71,18 @@
 #define SAVEt_SVREF		44
 #define SAVEt_VPTR		45
 #define SAVEt_ADELETE		46
+#define SAVEt_APTR		47
 
-#define SAVEt_ARG2_MAX		46
+#define SAVEt_ARG2_MAX		47
 
 /* three args */
 
-#define SAVEt_DELETE		47
 #define SAVEt_HELEM		48
 #define SAVEt_PADSV_AND_MORTALIZE 49
 #define SAVEt_SET_SVFLAGS	50
 #define SAVEt_GVSLOT		51
 #define SAVEt_AELEM		52
+#define SAVEt_DELETE		53
 
 #define SAVEf_SETMAGIC		1
 #define SAVEf_KEEPOLDELEM	2
@@ -301,8 +304,11 @@ scope has the given name. Name must be a literal string.
 
 #ifdef USE_ITHREADS
 #  define SAVECOPSTASH_FREE(c)	SAVEIV((c)->cop_stashoff)
-#  define SAVECOPFILE(c)	SAVEPPTR(CopFILE(c))
-#  define SAVECOPFILE_FREE(c)	SAVESHAREDPV(CopFILE(c))
+#  define SAVECOPFILE(c)	SAVEIV((c)->cop_filegvoff)
+#  define SAVECOPFILE_FREE(c) ( \
+	SAVEIV((c)->cop_filegvoff),			\
+	save_pushptr((void *)(c), SAVEt_COPFILEFREE)	\
+    )
 #else
 #  /* XXX not refcounted */
 #  define SAVECOPSTASH_FREE(c)	SAVESPTR(CopSTASH(c))
diff --git a/sv.c b/sv.c
index 0e33556..40694a6 100644
--- a/sv.c
+++ b/sv.c
@@ -13396,10 +13396,6 @@ perl_clone_using(PerlInterpreter *proto_perl, UV flags,
 
     Zero(PL_sv_consts, SV_CONSTS_COUNT, SV*);
 
-    /* This PV will be free'd special way so must set it same way op.c does */
-    PL_compiling.cop_file    = savesharedpv(PL_compiling.cop_file);
-    ptr_table_store(PL_ptr_table, proto_perl->Icompiling.cop_file, PL_compiling.cop_file);
-
     ptr_table_store(PL_ptr_table, &proto_perl->Icompiling, &PL_compiling);
     PL_compiling.cop_warnings = DUP_WARNINGS(PL_compiling.cop_warnings);
     CopHINTHASH_set(&PL_compiling, cophh_copy(CopHINTHASH_get(&PL_compiling)));
@@ -13461,6 +13457,14 @@ perl_clone_using(PerlInterpreter *proto_perl, UV flags,
 	for (; o < PL_stashpadmax; ++o)
 	    PL_stashpad[o] = hv_dup(proto_perl->Istashpad[o], param);
     }
+    PL_filegvpadmax	= proto_perl->Ifilegvpadmax;
+    PL_filegvpadix	= proto_perl->Ifilegvpadix ;
+    Newx(PL_filegvpad, PL_filegvpadmax, GV *);
+    {
+	PADOFFSET o = 0;
+	for (; o < PL_filegvpadmax; ++o)
+	    PL_filegvpad[o] = gv_dup(proto_perl->Ifilegvpad[o], param);
+    }
 
     /* shortcuts to various I/O objects */
     PL_ofsgv            = gv_dup_inc(proto_perl->Iofsgv, param);

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2013

From @cpansprout

On Fri Jul 05 22​:51​:10 2013, sprout wrote​:

Please test the attached patch and see whether it reduces memory usage
for you. Testing it with Devel​::Size will *not* work, as it will need
to be updated for this. (It is needfully very intrusive into perl’s
guts.)

You can also find it on the sprout/copfile branch.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 31, 2013

From @cpansprout

On Fri Jul 05 22​:51​:10 2013, sprout wrote​:

Please test the attached patch and see whether it reduces memory usage
for you. Testing it with Devel​::Size will *not* work, as it will need
to be updated for this. (It is needfully very intrusive into perl’s
guts.)

I tried it myself on lib/unicore/TestProp.pl. (I added ‘sleep;’ to the
top, so I was testing memory usage after compilation.) The usage
dropped from 119MB to 118MB. That is not particularly significant.

Is it worth applying this patch, or is it just churn?

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 31, 2013

From @timbunce

On Wed, Jul 31, 2013 at 12​:34​:52AM -0700, Father Chrysostomos via RT wrote​:

On Fri Jul 05 22​:51​:10 2013, sprout wrote​:

Please test the attached patch and see whether it reduces memory usage
for you.

I'm sorry I've not got around to this till now. And thank you for
working on it!

I tried it myself on lib/unicore/TestProp.pl. (I added ‘sleep;’ to the
top, so I was testing memory usage after compilation.) The usage
dropped from 119MB to 118MB. That is not particularly significant.

How was the usage measured, exactly? What was it before compilation?

Also, I don't see a lib/unicore/TestProp.pl file either in blead or
sprout/copfile.

Is it worth applying this patch, or is it just churn?

I think it's worth it but let's check it out some more first.

Tim.

@p5pRT
Copy link
Author

p5pRT commented Jul 31, 2013

From @cpansprout

On Wed Jul 31 02​:46​:54 2013, timbo wrote​:

On Wed, Jul 31, 2013 at 12​:34​:52AM -0700, Father Chrysostomos via RT
wrote​:

On Fri Jul 05 22​:51​:10 2013, sprout wrote​:

Please test the attached patch and see whether it reduces memory usage
for you.

I'm sorry I've not got around to this till now. And thank you for
working on it!

I tried it myself on lib/unicore/TestProp.pl. (I added ‘sleep;’ to the
top, so I was testing memory usage after compilation.) The usage
dropped from 119MB to 118MB. That is not particularly significant.

How was the usage measured, exactly?

I looked at the ‘Virtual Memory’ column in Apple’s Activity Monitor.

What was it before compilation?

I have just retaken the numbers, without -DDEBUGGING this time.

With BEGIN{sleep}​: 17.4 MB in both cases
With just ‘sleep;’​:
  unpatched​: 100.9 MB
  patched​: 102 MB

Also, I don't see a lib/unicore/TestProp.pl file either in blead or
sprout/copfile.

That gets built by lib/unicore/mktables. It makes a good test because
it has 100,000 lines of generated code. Run make first and you will see it.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 3, 2013

From @cpansprout

On Wed Jul 31 02​:46​:54 2013, timbo wrote​:

I think it's worth it but let's check it out some more first.

A bug that used to occur only under non-threaded builds can now happen
under threads, too​:

$ ./perl -Ilib -e 'BEGIN{${"_<-e"}=\1} warn __FILE__'
??C? at ??C? line 1.

It looks as though the solution is to change

#define CopFILE(c) (CopFILEGV(c) && GvSV(CopFILEGV(c)) \
  ? SvPVX(GvSV(CopFILEGV(c))) : NULL)

to

#define CopFILE(c) (CopFILEGV(c) && GvSV(CopFILEGV(c)) \
  ? SvPVX(GvSV(CopFILEGV(c))) : NULL)

but I have another idea that might simplify the code and reduce memory
usage even more....

Where do I put tests for things like that __FILE__ bug?
t/comp/retainedlines.t has similar tests for CopFILEGV stuff, but
__FILE__ has nothing to do with retained lines.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 4, 2013

From @cpansprout

On Sat Aug 03 14​:31​:06 2013, sprout wrote​:

but I have another idea that might simplify the code and reduce memory
usage even more....

The other idea is to use HEKs for storing the file name in the cop, both
for threads and no threads.

Currently threads use separately allocated PVs, while non-threaded
builds use GV pointers. The latter are optimised for the debugger while
the former are not really optimised for anything.

CopFILE(GV) is used for error reporting, __FILE__, and the debugger.

Using HEKs will optimise for compilation time, make no noticeable
difference to __FILE__ and error reporting, while slowing down the
debugger a tiny bit on non-threaded builds.

If nobody objects I will go ahead and use HEKs for that.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 4, 2013

From @timbunce

On Sat, Aug 03, 2013 at 10​:52​:13PM -0700, Father Chrysostomos via RT wrote​:

The other idea is to use HEKs for storing the file name in the cop, both
for threads and no threads.

Currently threads use separately allocated PVs, while non-threaded
builds use GV pointers. The latter are optimised for the debugger while
the former are not really optimised for anything.

CopFILE(GV) is used for error reporting, __FILE__, and the debugger.

Using HEKs will optimise for compilation time, make no noticeable
difference to __FILE__ and error reporting, while slowing down the
debugger a tiny bit on non-threaded builds.

If nobody objects I will go ahead and use HEKs for that.

Sounds good to me. Thanks!

Tim.

@p5pRT
Copy link
Author

p5pRT commented Aug 5, 2013

From @cpansprout

On Sun Aug 04 05​:12​:08 2013, timbo wrote​:

On Sat, Aug 03, 2013 at 10​:52​:13PM -0700, Father Chrysostomos via RT
wrote​:

The other idea is to use HEKs for storing the file name in the cop, both
for threads and no threads.

Currently threads use separately allocated PVs, while non-threaded
builds use GV pointers. The latter are optimised for the debugger while
the former are not really optimised for anything.

CopFILE(GV) is used for error reporting, __FILE__, and the debugger.

Using HEKs will optimise for compilation time, make no noticeable
difference to __FILE__ and error reporting, while slowing down the
debugger a tiny bit on non-threaded builds.

If nobody objects I will go ahead and use HEKs for that.

Sounds good to me. Thanks!

I mistakenly though that HEKs were shared between threads. They are
not. Adding a pad to store the HEKs would still be necessary under
threads. And then we would need an API function for B​::C to use, but
HEKs are not part of the API. I didn’t want to dig any further.

So I committed my patch (with fixes) as c82ecf3.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 5, 2013

@cpansprout - Status changed from 'open' to 'resolved'

@p5pRT p5pRT closed this as completed Aug 5, 2013
@p5pRT
Copy link
Author

p5pRT commented Aug 5, 2013

From @cpansprout

On Sat Aug 03 14​:31​:06 2013, sprout wrote​:

On Wed Jul 31 02​:46​:54 2013, timbo wrote​:

I think it's worth it but let's check it out some more first.

A bug that used to occur only under non-threaded builds can now happen
under threads, too​:

$ ./perl -Ilib -e 'BEGIN{${"_<-e"}=\1} warn __FILE__'
??C? at ??C? line 1.

That I fixed in 1311cfc and similar bugs in subsequent commits.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 5, 2013

From @khwilliamson

This is a bug report for perl from khw@​karl.(none),
generated with the help of perlbug 1.39 running under perl 5.19.3.


./blead/Porting/bisect.pl -Doptimize=-ggdb3 -Aoptimize=-O0 -DDEBUGGING
-Dcc=g++ -Dusemorebits -Dusethreads --start v5.19.2 -- ./perl -Ilib
t/TEST op/chr.t

c82ecf3 is the first bad commit
commit c82ecf3
Author​: Father Chrysostomos <sprout@​cpan.org>
Date​: Fri Jul 5 22​:51​:50 2013 -0700

  [perl #117855] Store CopFILEGV in a pad under ithreads

  This saves having to allocate a separate string buffer for every cop
  (control op; every statement has one).

  Under non-threaded builds, every cop has a pointer to the GV for that
  source file, namely *{"_<filename"}.

  Under threaded builds, the name of the GV used to be stored instead.

  Now we store an offset into the per-interpreter PL_filegvpad, which
  points to the GV.

  This makes no significant speed difference, but it reduces mem-
  ory usage.
bisect run success
That took 2525 seconds

I don't know which Configure options did it; I tried bisect first with
just -Dusethreads, and it did not fail.



Flags​:
  category=core
  severity=critical


Site configuration information for perl 5.19.3​:

Configured by khw at Mon Aug 5 07​:58​:22 MDT 2013.

Summary of my perl5 (revision 5 version 19 subversion 3) configuration​:
  Commit id​: d2af8e8
  Platform​:
  osname=linux, osvers=2.6.35-32-generic-pae,
archname=i686-linux-thread-multi-64int-ld
  uname='linux karl 2.6.35-32-generic-pae #67-ubuntu smp mon mar 5
21​:23​:19 utc 2012 i686 gnulinux '
  config_args='-des -Dprefix=/home/khw/blead -Dusedevel
-D'optimize=-ggdb3' -A'optimize=-ggdb3' -A'optimize=-O0' -Dman1dir=none
-Dman3dir=none -DDEBUGGING -Dcc=g++ -Dusemorebits -Dusethreads'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=define, usemultiplicity=define
  useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
  use64bitint=define, use64bitall=undef, uselongdouble=define
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='g++', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize=' -ggdb3 -O0',
  cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.4.5', gccosandvers=''
  intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
  ivtype='long long', ivsize=8, nvtype='long double', nvsize=12,
Off_t='off_t', lseeksize=8
  alignbytes=4, prototype=define
  Linker and Libraries​:
  ld='g++', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /lib/../lib /usr/lib/../lib /lib /usr/lib
/usr/lib/i686-linux-gnu
  libs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
  perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
  libc=/lib/libc-2.12.1.so, so=so, useshrplib=false, libperl=libperl.a
  gnulibc_version='2.12'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
  cccdlflags='-fPIC', lddlflags='-shared -ggdb3 -ggdb3 -O0
-L/usr/local/lib -fstack-protector'


@​INC for perl 5.19.3​:

/home/khw/blead/lib/perl5/site_perl/5.19.3/i686-linux-thread-multi-64int-ld
  /home/khw/blead/lib/perl5/site_perl/5.19.3
  /home/khw/blead/lib/perl5/5.19.3/i686-linux-thread-multi-64int-ld
  /home/khw/blead/lib/perl5/5.19.3
  /home/khw/blead/lib/perl5/site_perl
  .


Environment for perl 5.19.3​:
  HOME=/home/khw
  LANG=en_US.UTF-8
  LANGUAGE=en_US​:en
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)

PATH=/home/khw/bin​:/home/khw/perl5/perlbrew/bin​:/home/khw/print/bin​:/bin​:/usr/local/sbin​:/usr/local/bin​:/usr/sbin​:/usr/bin​:/sbin​:/usr/games​:/home/khw/cxoffice/bin
  PERL5OPT=-w
  PERL_BADLANG (unset)
  PERL_POD_PEDANTIC=1
  SHELL=/bin/ksh

@p5pRT
Copy link
Author

p5pRT commented Aug 5, 2013

From @cpansprout

On Mon Aug 05 12​:42​:46 2013, public@​khwilliamson.com wrote​:

./blead/Porting/bisect.pl -Doptimize=-ggdb3 -Aoptimize=-O0 -DDEBUGGING
-Dcc=g++ -Dusemorebits -Dusethreads --start v5.19.2 -- ./perl -Ilib
t/TEST op/chr.t

c82ecf3 is the first bad commit
commit c82ecf3
Author​: Father Chrysostomos <sprout@​cpan.org>
Date​: Fri Jul 5 22​:51​:50 2013 -0700

 \[perl \#117855\] Store CopFILEGV in a pad under ithreads

 This saves having to allocate a separate string buffer for every

cop
(control op; every statement has one).

 Under non\-threaded builds\, every cop has a pointer to the GV for

that
source file, namely *{"_<filename"}.

 Under threaded builds\, the name of the GV used to be stored

instead.

 Now we store an offset into the per\-interpreter PL\_filegvpad\,

which
points to the GV.

 This makes no significant speed difference\, but it reduces mem\-
 ory usage\.

bisect run success
That took 2525 seconds

I don't know which Configure options did it; I tried bisect first with
just -Dusethreads, and it did not fail.

I tried exactly the same options, and, at least under miniperl,
t/op/chr.t does not fail. What was the failure?

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 6, 2013

From @khwilliamson

On 08/05/2013 01​:50 PM, Father Chrysostomos via RT wrote​:

On Mon Aug 05 12​:42​:46 2013, public@​khwilliamson.com wrote​:

./blead/Porting/bisect.pl -Doptimize=-ggdb3 -Aoptimize=-O0 -DDEBUGGING
-Dcc=g++ -Dusemorebits -Dusethreads --start v5.19.2 -- ./perl -Ilib
t/TEST op/chr.t

c82ecf3 is the first bad commit
commit c82ecf3
Author​: Father Chrysostomos <sprout@​cpan.org>
Date​: Fri Jul 5 22​:51​:50 2013 -0700

  \[perl \#117855\] Store CopFILEGV in a pad under ithreads

  This saves having to allocate a separate string buffer for every

cop
(control op; every statement has one).

  Under non\-threaded builds\, every cop has a pointer to the GV for

that
source file, namely *{"_<filename"}.

  Under threaded builds\, the name of the GV used to be stored

instead.

  Now we store an offset into the per\-interpreter PL\_filegvpad\,

which
points to the GV.

  This makes no significant speed difference\, but it reduces mem\-
  ory usage\.

bisect run success
That took 2525 seconds

I don't know which Configure options did it; I tried bisect first with
just -Dusethreads, and it did not fail.

I tried exactly the same options, and, at least under miniperl,
t/op/chr.t does not fail. What was the failure?

tail of the harness output​:
  Test Summary Report
  -------------------
  op/chr.t
(Wstat​: 0 Tests​: 42 Failed​: 1)
  Failed test​: 10
  op/lc.t
(Wstat​: 0 Tests​: 128 Failed​: 1)
  Failed test​: 125
  ../cpan/List-Util/t/readonly.t
(Wstat​: 0 Tests​: 11 Failed​: 0)
  TODO passed​: 10
  ../lib/bytes.t
(Wstat​: 0 Tests​: 20 Failed​: 5)
  Failed tests​: 8-12
  ../lib/feature.t
(Wstat​: 0 Tests​: 57 Failed​: 1)
  Failed test​: 3
  ../lib/locale.t
(Wstat​: 0 Tests​: 461 Failed​: 7)
  Failed tests​: 266, 270, 322, 326, 370, 410, 450
  Files=2334, Tests=690888, 1334 wallclock secs (175.47 usr 6.51 sys +
1174.02 cusr 43.70 csys = 1399.70 CPU)
  Result​: FAIL
  make​: *** [test_harness] Error 15

=============================
op/chr.t .. # Failed test 10 - at op/chr.t line 28
# got "\x{fffd}"
# expected "\000"
op/chr.t .. Failed 1/42 subtests

not ok 125 - lc of above-ASCII Latin1 is itself under use bytes
# Failed test 125 - lc of above-ASCII Latin1 is itself under use bytes
at op/lc.t line 278
# got "\x{e0}"
# expected "\x{c0}"

========================
# From lib/feature/bundle
ok 1
ok 2
PROG​:
# Standard feature bundle, 5.11
use feature "​:5.11";
say ord uc chr 233;
EXPECTED​:
201
GOT​:
233

@p5pRT
Copy link
Author

p5pRT commented Aug 6, 2013

@cpansprout - Status changed from 'resolved' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Aug 6, 2013

From @cpansprout

On Mon Aug 05 20​:25​:00 2013, public@​khwilliamson.com wrote​:

op/chr.t .. # Failed test 10 - at op/chr.t line 28
# got "\x{fffd}"
# expected "\000"
op/chr.t .. Failed 1/42 subtests

OK, so I broke ‘use bytes’. Isn’t that a feature? :-)

It looks as though cop->cop_hints is getting muddled up somehow, but I
cannot see how.

This is the failing test​:

  use bytes; # Backward compatibility.
  is(chr(-0.1), "\x00");

How small can you reduce chr.t while still getting the failure?

Does ./miniperl -Ilib -Mbytes -le 'print ord chr -0.1' give the wrong
answer?

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 6, 2013

From [Unknown Contact. See original ticket]

On Mon Aug 05 20​:25​:00 2013, public@​khwilliamson.com wrote​:

op/chr.t .. # Failed test 10 - at op/chr.t line 28
# got "\x{fffd}"
# expected "\000"
op/chr.t .. Failed 1/42 subtests

OK, so I broke ‘use bytes’. Isn’t that a feature? :-)

It looks as though cop->cop_hints is getting muddled up somehow, but I
cannot see how.

This is the failing test​:

  use bytes; # Backward compatibility.
  is(chr(-0.1), "\x00");

How small can you reduce chr.t while still getting the failure?

Does ./miniperl -Ilib -Mbytes -le 'print ord chr -0.1' give the wrong
answer?

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 6, 2013

From @khwilliamson

On 08/06/2013 06​:45 AM, Father Chrysostomos via RT wrote​:

On Mon Aug 05 20​:25​:00 2013, public@​khwilliamson.com wrote​:

op/chr.t .. # Failed test 10 - at op/chr.t line 28
# got "\x{fffd}"
# expected "\000"
op/chr.t .. Failed 1/42 subtests

OK, so I broke ‘use bytes’. Isn’t that a feature? :-)

It looks as though cop->cop_hints is getting muddled up somehow, but I
cannot see how.

This is the failing test​:

 use bytes; \# Backward compatibility\.
 is\(chr\(\-0\.1\)\, "\\x00"\);

How small can you reduce chr.t while still getting the failure?
# Failed test 1 - at op/chr.t line 28
  # got "\x{fffd}"
  # expected "\000"
  # Looks like you planned 42 tests but ran 1.
  1..42
  not ok 1

So, even if it is the only test that is run (commenting out the others),
it fails.

Does ./miniperl -Ilib -Mbytes -le 'print ord chr -0.1' give the wrong
answer?

  ./miniperl -Ilib -Mbytes -le 'print ord chr -0.1'
0

@p5pRT
Copy link
Author

p5pRT commented Aug 6, 2013

From @khwilliamson

On 08/06/2013 11​:40 AM, Karl Williamson wrote​:

On 08/06/2013 06​:45 AM, Father Chrysostomos via RT wrote​:

On Mon Aug 05 20​:25​:00 2013, public@​khwilliamson.com wrote​:

op/chr.t .. # Failed test 10 - at op/chr.t line 28
# got "\x{fffd}"
# expected "\000"
op/chr.t .. Failed 1/42 subtests

OK, so I broke ‘use bytes’. Isn’t that a feature? :-)

It looks as though cop->cop_hints is getting muddled up somehow, but I
cannot see how.

This is the failing test​:

 use bytes; \# Backward compatibility\.
 is\(chr\(\-0\.1\)\, "\\x00"\);

How small can you reduce chr.t while still getting the failure?
# Failed test 1 - at op/chr.t line 28
# got "\x{fffd}"
# expected "\000"
# Looks like you planned 42 tests but ran 1.
1..42
not ok 1

So, even if it is the only test that is run (commenting out the others),
it fails.

Does ./miniperl -Ilib -Mbytes -le 'print ord chr -0.1' give the wrong
answer?

./miniperl -Ilib -Mbytes -le 'print ord chr -0.1'
0

And, the locale.t failures that I didn't previously mention are only
tangentially related to 'use bytes'
not ok 266 uc("à") in C locale (use locale '​:not_characters'; not
encoded in utf8) should be "à", got "À"
not ok 270 uc("ÿ") in C locale (use locale '​:not_characters'; not
encoded in utf8) should be "ÿ", got "Ÿ"
not ok 322 ucfirst("à") in C locale (use locale '​:not_characters'; not
encoded in utf8) should be "à", got "À"
not ok 326 ucfirst("ÿ") in C locale (use locale '​:not_characters'; not
encoded in utf8) should be "ÿ", got "Ÿ"
not ok 370 lc("À") in C locale (use locale '​:not_characters'; not
encoded in utf8) should be "À", got "à"
not ok 410 lcfirst("À") in C locale (use locale '​:not_characters'; not
encoded in utf8) should be "À", got "à"
not ok 450 fc("À") in C locale (use locale '​:not_characters'; not
encoded in utf8) should be "À", got "à"

@p5pRT
Copy link
Author

p5pRT commented Aug 8, 2013

From @cpansprout

On Tue Aug 06 11​:04​:22 2013, public@​khwilliamson.com wrote​:

On 08/06/2013 11​:40 AM, Karl Williamson wrote​:

On 08/06/2013 06​:45 AM, Father Chrysostomos via RT wrote​:

How small can you reduce chr.t while still getting the failure?
# Failed test 1 - at op/chr.t line 28
# got "\x{fffd}"
# expected "\000"
# Looks like you planned 42 tests but ran 1.
1..42
not ok 1

So, even if it is the only test that is run (commenting out the others),
it fails.

Does ./miniperl -Ilib -Mbytes -le 'print ord chr -0.1' give the wrong
answer?

./miniperl -Ilib -Mbytes -le 'print ord chr -0.1'
0

What does ./perl give for the one-liner? What does miniperl give for
the reduced chr.t? What if you eliminate test.pl? etc...

Could you try to find one tiny thing to change that causes the difference?

And, the locale.t failures that I didn't previously mention are only
tangentially related to 'use bytes'

Again, it’s probably the hints getting mixed up.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 8, 2013

From @khwilliamson

On 08/08/2013 07​:11 AM, Father Chrysostomos via RT wrote​:

On Tue Aug 06 11​:04​:22 2013, public@​khwilliamson.com wrote​:

On 08/06/2013 11​:40 AM, Karl Williamson wrote​:

On 08/06/2013 06​:45 AM, Father Chrysostomos via RT wrote​:

How small can you reduce chr.t while still getting the failure?
# Failed test 1 - at op/chr.t line 28
# got "\x{fffd}"
# expected "\000"
# Looks like you planned 42 tests but ran 1.
1..42
not ok 1

So, even if it is the only test that is run (commenting out the others),
it fails.

Does ./miniperl -Ilib -Mbytes -le 'print ord chr -0.1' give the wrong
answer?

./miniperl -Ilib -Mbytes -le 'print ord chr -0.1'
0

What does ./perl give for the one-liner? What does miniperl give for
the reduced chr.t? What if you eliminate test.pl? etc...

Could you try to find one tiny thing to change that causes the difference?

And, the locale.t failures that I didn't previously mention are only
tangentially related to 'use bytes'

Again, it’s probably the hints getting mixed up.

Its not what I thought. The problems in locale.t can be simplified to
the attached script. Its output is​:

117855.pl​: 8​: 100
117855.pl​: 12​: 104
117855.pl​: 16​: 1c020904
117855.pl​: 21​: 1c020900
376
255

What this means is that the hints is getting set properly at compile
time, but the eval is losing the 'unicode_strings', as the proper
uc(\xff) is 376 in both cases.

Does this help in diagnosis?

@p5pRT
Copy link
Author

p5pRT commented Aug 8, 2013

From @khwilliamson

117855.pl

@p5pRT
Copy link
Author

p5pRT commented Aug 9, 2013

From @cpansprout

On Thu Aug 08 09​:58​:45 2013, public@​khwilliamson.com wrote​:

Does this help in diagnosis?

Yes. Does this patch fix it?

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 9, 2013

From @cpansprout

Inline Patch
diff --git a/op.c b/op.c
index d10ea86..a0be225 100644
--- a/op.c
+++ b/op.c
@@ -3308,6 +3308,7 @@ S_fold_constants(pTHX_ OP *o)
     /* Verify that we don't need to save it:  */
     assert(PL_curcop == &PL_compiling);
     StructCopy(&PL_compiling, &not_compiling, COP);
+    not_compiling.cop_hints = PL_hints;
     PL_curcop = &not_compiling;
     /* The above ensures that we run with all the correct hints of the
        currently compiling COP, but that IN_PERL_RUNTIME is not true. */

@p5pRT
Copy link
Author

p5pRT commented Aug 9, 2013

From @khwilliamson

On 08/08/2013 07​:21 PM, Father Chrysostomos via RT wrote​:

On Thu Aug 08 09​:58​:45 2013, public@​khwilliamson.com wrote​:

Does this help in diagnosis?

Yes. Does this patch fix it?

Yes!

@p5pRT
Copy link
Author

p5pRT commented Aug 10, 2013

From @cpansprout

On Thu Aug 08 20​:22​:00 2013, public@​khwilliamson.com wrote​:

On 08/08/2013 07​:21 PM, Father Chrysostomos via RT wrote​:

On Thu Aug 08 09​:58​:45 2013, public@​khwilliamson.com wrote​:

Does this help in diagnosis?

Yes. Does this patch fix it?

Yes!

Thank you for the reduced case. It fails for me regardless of compiler,
version, or configuration.

It turns out to be unrelated to c82ecf3 per se. That commit just
exposed an existing bug.

I applied a variation of that patch in commit a547fd2, which
contains a detailed explanation.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 10, 2013

From @cpansprout

On Mon Aug 05 02​:43​:26 2013, sprout wrote​:

On Sun Aug 04 05​:12​:08 2013, timbo wrote​:

On Sat, Aug 03, 2013 at 10​:52​:13PM -0700, Father Chrysostomos via RT
wrote​:

The other idea is to use HEKs for storing the file name in the
cop, both
for threads and no threads.

Currently threads use separately allocated PVs, while non-threaded
builds use GV pointers. The latter are optimised for the debugger
while
the former are not really optimised for anything.

CopFILE(GV) is used for error reporting, __FILE__, and the debugger.

Using HEKs will optimise for compilation time, make no noticeable
difference to __FILE__ and error reporting, while slowing down the
debugger a tiny bit on non-threaded builds.

If nobody objects I will go ahead and use HEKs for that.

Sounds good to me. Thanks!

I mistakenly though that HEKs were shared between threads. They are
not. Adding a pad to store the HEKs would still be necessary under
threads. And then we would need an API function for B​::C to use, but
HEKs are not part of the API. I didn’t want to dig any further.

So I committed my patch (with fixes) as c82ecf3.

It turns out to be fundamentally flawed, as you can’t have a location
shared between threads holding reference counts on things specific to
threads. See for instance,
<CADED=K63DHae9RJqRKnejb-3CZ43Wa=Gsus-DhdYXDXmhgPi2g@​mail.gmail.com>.

So I reverted it in commit 1dc74fd.

To fix this, we would have to implement a new shared string table,
shared between threads. That would be a lot more work.

--

Father Chrysostomos

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant