New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CvFILE on threaded, and gv_fetchfile on all perls, burn mad memory #14725
Comments
From @bulk88Created by @bulk88commit a636914 "Re: CvFILE corruption Using "perl -e"system pause; require Test::More; system 'pause'"" as a unthreaded took 11,819,415-10,832,263=987,152 bytes of memory. 1,159,184-987,152=172,032 bytes of memory. So threaded takes 172KB or 17% more memory to load the same Perl module. Reverting ("#define CvFILE_set_from_cop(sv, cop) (CvFILE(sv) = ----------------------------------------------------------------- Dr Memory on op/stash.t shows failures like CvFILE is either a pointer to malloc freed sentinel pattern, or some CvFILE always comes from CopFILE(), but CopFILE() is very different on threaded Another idea proposed in https://metacpan.org/source/MLEHMANN/Devel-FindRef-1.44/FindRef.xs#L204 Another idea is, change CvFILE slot in the CV body to be HEK * owned or That also brings up, why does gv_fetchfile exist to create typeglobs Or Even worse, these GVs have a SV * inside, with YET another unique copy C:\Documents and Settings\Owner>perl -e"require Test::Harness; require C:\Documents and Settings\Owner> Now note, I have not said there are any memory leaks, it is just that Perl Info
|
From @bulk88On Sat May 30 16:29:09 2015, bulk88 wrote: attachments, probably won't make it to the ML due to 250 KB limit -- |
From @bulk88 |
From @bulk88+C:/perl521/srcnewb4opt/lib/Test/More.pm |
From @bulk88 |
From @bulk88C:/perl521/srcnewb4opt/lib/Test/Builder.pm |
From @bulk88Since noone has had any comments, here is a concept patch to remove CvFILE as suggested by http://www.nntp.perl.org/group/perl.perl5.porters/2001/05/msg36781.html This patch doesn't address CopFILE on threaded and its bloat. Test Summary Report ../lib/B/Deparse.t (Wstat: 0 Tests: Deparse failures. ok 224 - no extra output when deparsing foo() Zefram on IRC said those 2 above are functionally identical. ok 228 - CORE::use after my sub use The failures are all related to the todo sub in Deparse.pm. Sub todo is called like this, but I dont understand its purpise enough to say what it does. B::Deparse::todo(B::Deparse=HASH(0xad8ac4), B::CV=SCALAR(0xc06d34), 0) c The failures (specifically I am talking about test 226 " [perl \#121050] prototypes with whitespace") seem to be caused by CVs without GVs. Commit http://perl5.git.perl.org/perl.git/commitdiff/b290562ef436d5316a2f75513def7f4f18c4ef34 "Allow CVs to point to HEKs rather than GVs" by Father C created the scenario where some CVs dont have GVs anymore. Test 226 is such a CV. In test 226 when $cv->GV executes in B, it vivifys the GV, and the GV gets the FILE and LINE from inside sub todo from Deparse.pm. I think that B::CV::GV should not be vivifying things. A getter should not be changing state. Before ->GV is called, CVf_NAMED is on, after ->GV() it is off. Since the GV doesn't exist for test 226, I got the idea, if the CV is a CVf_NAMED CV, use the FILE from the GV that holds the stash HV, it should usually be correct (unless you are create subs from a different .pm than the formal .pm for that package). I did a svref_2object(\*{$cv->STASH->NAME.'::'}->FILE but test 226 still failed since FILE was empty string and not "-e" AKA $O. C:\perl521\srcnewb4opt\t>perl -I../lib -MB -E" say B::svref_2object(\*{'main'})-
C:\perl521\srcnewb4opt\t>perl -I../lib -MB -E" say B::svref_2object(\*{'main::'} But notice main:: has no FILE, that is because PL_curcop->cop_file/cop_filegv is NULL at the time main is created in S_init_main_stash. I guess S_parse_body could be modified to change the creation location file of main to curcop, but I stopped at this point before investing more time. Another conceptual flaw with CvFILE is, where is the line number where the CV was created? What about PP? What about XS (too late to change newXS?) Is it a waste to store line number? Is it a waste to store CvFILE? Like with GVs, if you will store the file, you have to store the line. CvFILE is almost useless as a debugging aid otherwise. Should CVf_NAMED and related code be removed since it removes the GVs which store creation location? -- |
From @bulk880001-WIP-remove-CvFILE.patchFrom 6c737e3bd8b7626a5caf5401db7279421bfdbdcc Mon Sep 17 00:00:00 2001
From: Daniel Dragan <bulk88@hotmail.com>
Date: Tue, 7 Jul 2015 01:43:39 -0400
Subject: [PATCH] WIP remove CvFILE
---
cv.h | 6 +++++-
dump.c | 4 ++--
ext/B/B.xs | 4 ++--
ext/Devel-Peek/t/Peek.t | 34 +++++++++++++++-------------------
gv.c | 2 +-
lib/B/Deparse.pm | 6 ++++--
op.c | 47 +++++++++++++++++++++++++++++------------------
pad.c | 5 +++++
perl.c | 29 ++++++++++++++++-------------
sv.c | 2 ++
sv.h | 1 -
t/op/stash.t | 8 +-------
t/uni/stash.t | 4 +---
universal.c | 2 ++
14 files changed, 85 insertions(+), 69 deletions(-)
diff --git a/cv.h b/cv.h
index d4dfd2d..31b88d9 100644
--- a/cv.h
+++ b/cv.h
@@ -52,6 +52,7 @@ See L<perlguts/Autoloading with XSUBs>.
#define CvGV(sv) S_CvGV(aTHX_ (CV *)(sv))
#define CvGV_set(cv,gv) Perl_cvgv_set(aTHX_ cv, gv)
#define CvHASGV(cv) cBOOL(SvANY(cv)->xcv_gv_u.xcv_gv)
+/*
#define CvFILE(sv) ((XPVCV*)MUTABLE_PTR(SvANY(sv)))->xcv_file
#ifdef USE_ITHREADS
# define CvFILE_set_from_cop(sv, cop) \
@@ -61,6 +62,7 @@ See L<perlguts/Autoloading with XSUBs>.
(CvFILE(sv) = CopFILE(cop), CvDYNFILE_off(sv))
#endif
#define CvFILEGV(sv) (gv_fetchfile(CvFILE(sv)))
+*/
#define CvDEPTH(sv) (*S_CvDEPTHp((const CV *)sv))
/* For use when you only have a XPVCV*, not a real CV*.
Must be assert protected as in S_CvDEPTHp before use. */
@@ -129,7 +131,7 @@ See L<perlguts/Autoloading with XSUBs>.
#ifdef PERL_CORE
# define CVf_SLABBED 0x0800 /* Holds refcount on op slab */
#endif
-#define CVf_DYNFILE 0x1000 /* The filename isn't static */
+/*# define CVf_DYNFILE 0x1000 /* The filename isn't static */
#define CVf_AUTOLOAD 0x2000 /* SvPVX contains AUTOLOADed sub name */
#define CVf_HASEVAL 0x4000 /* contains string eval */
#define CVf_NAMED 0x8000 /* Has a name HEK */
@@ -200,9 +202,11 @@ See L<perlguts/Autoloading with XSUBs>.
# define CvSLABBED_off(cv) (CvFLAGS(cv) &= ~CVf_SLABBED)
#endif
+/*
#define CvDYNFILE(cv) (CvFLAGS(cv) & CVf_DYNFILE)
#define CvDYNFILE_on(cv) (CvFLAGS(cv) |= CVf_DYNFILE)
#define CvDYNFILE_off(cv) (CvFLAGS(cv) &= ~CVf_DYNFILE)
+*/
#define CvAUTOLOAD(cv) (CvFLAGS(cv) & CVf_AUTOLOAD)
#define CvAUTOLOAD_on(cv) (CvFLAGS(cv) |= CVf_AUTOLOAD)
diff --git a/dump.c b/dump.c
index c4d4018..33bb53a 100644
--- a/dump.c
+++ b/dump.c
@@ -1345,7 +1345,7 @@ const struct flag_to_name cv_flags_names[] = {
{CVf_METHOD, "METHOD,"},
{CVf_WEAKOUTSIDE, "WEAKOUTSIDE,"},
{CVf_CVGV_RC, "CVGV_RC,"},
- {CVf_DYNFILE, "DYNFILE,"},
+/* {CVf_DYNFILE, "DYNFILE,"}, */
{CVf_AUTOLOAD, "AUTOLOAD,"},
{CVf_HASEVAL, "HASEVAL,"},
{CVf_SLABBED, "SLABBED,"},
@@ -1960,7 +1960,7 @@ Perl_do_sv_dump(pTHX_ I32 level, PerlIO *file, SV *sv, I32 nest, I32 maxnest, bo
Perl_dump_indent(aTHX_ level, file, " NAME = \"%s\"\n",
HEK_KEY(CvNAME_HEK((CV *)sv)));
else do_gvgv_dump(level, file, " GVGV::GV", CvGV(sv));
- Perl_dump_indent(aTHX_ level, file, " FILE = \"%s\"\n", CvFILE(sv));
+/* Perl_dump_indent(aTHX_ level, file, " FILE = \"%s\"\n", CvFILE(sv)); */
Perl_dump_indent(aTHX_ level, file, " DEPTH = %"IVdf"\n", (IV)CvDEPTH(sv));
Perl_dump_indent(aTHX_ level, file, " FLAGS = 0x%"UVxf"\n", (UV)CvFLAGS(sv));
Perl_dump_indent(aTHX_ level, file, " OUTSIDE_SEQ = %"UVuf"\n", (UV)CvOUTSIDE_SEQ(sv));
diff --git a/ext/B/B.xs b/ext/B/B.xs
index 348a60a..c5eeae6 100644
--- a/ext/B/B.xs
+++ b/ext/B/B.xs
@@ -1594,7 +1594,7 @@ MODULE = B PACKAGE = B::IV
#else
# define PVCV_gv_ix sv_SVp | STRUCT_OFFSET(struct xpvcv, xcv_gv)
#endif
-#define PVCV_file_ix sv_char_pp | STRUCT_OFFSET(struct xpvcv, xcv_file)
+# /* #define PVCV_file_ix sv_char_pp | STRUCT_OFFSET(struct xpvcv, xcv_file) */
#define PVCV_outside_ix sv_SVp | STRUCT_OFFSET(struct xpvcv, xcv_outside)
#define PVCV_outside_seq_ix sv_U32p | STRUCT_OFFSET(struct xpvcv, xcv_outside_seq)
#define PVCV_flags_ix sv_U32p | STRUCT_OFFSET(struct xpvcv, xcv_flags)
@@ -1641,7 +1641,7 @@ IVX(sv)
B::IO::IoFLAGS = PVIO_flags_ix
B::AV::MAX = PVAV_max_ix
B::CV::STASH = PVCV_stash_ix
- B::CV::FILE = PVCV_file_ix
+# B::CV::FILE = PVCV_file_ix
B::CV::OUTSIDE = PVCV_outside_ix
B::CV::OUTSIDE_SEQ = PVCV_outside_seq_ix
B::CV::CvFLAGS = PVCV_flags_ix
diff --git a/ext/Devel-Peek/t/Peek.t b/ext/Devel-Peek/t/Peek.t
index 56522af..6d56462 100644
--- a/ext/Devel-Peek/t/Peek.t
+++ b/ext/Devel-Peek/t/Peek.t
@@ -298,19 +298,18 @@ do_test('reference to anon sub with empty prototype',
RV = $ADDR
SV = PVCV\\($ADDR\\) at $ADDR
REFCNT = 2
- FLAGS = \\($PADMY,POK,pPOK,ANON,WEAKOUTSIDE,CVGV_RC\\) # $] < 5.015 || !thr
- FLAGS = \\($PADMY,POK,pPOK,ANON,WEAKOUTSIDE,CVGV_RC,DYNFILE\\) # $] >= 5.015 && thr
+ FLAGS = \\($PADMY,POK,pPOK,ANON,WEAKOUTSIDE,CVGV_RC\\) # $] < 5.015 || !thr || $] >= 5.023001
+ FLAGS = \\($PADMY,POK,pPOK,ANON,WEAKOUTSIDE,CVGV_RC,DYNFILE\\) # $] >= 5.015 && thr && $] < 5.023001
PROTOTYPE = ""
COMP_STASH = $ADDR\\t"main"
START = $ADDR ===> \\d+
ROOT = $ADDR
GVGV::GV = $ADDR\\t"main" :: "__ANON__[^"]*"
- FILE = ".*\\b(?i:peek\\.t)"
DEPTH = 0(?:
MUTEXP = $ADDR
OWNER = $ADDR)?
- FLAGS = 0x490 # $] < 5.015 || !thr
- FLAGS = 0x1490 # $] >= 5.015 && thr
+ FLAGS = 0x490 # $] < 5.015 || !thr || $] >= 5.023001
+ FLAGS = 0x1490 # $] >= 5.015 && thr && $] < 5.023001
OUTSIDE_SEQ = \\d+
PADLIST = $ADDR
PADNAME = $ADDR\\($ADDR\\) PAD = $ADDR\\($ADDR\\)
@@ -324,14 +323,13 @@ do_test('reference to named subroutine without prototype',
RV = $ADDR
SV = PVCV\\($ADDR\\) at $ADDR
REFCNT = (3|4)
- FLAGS = \\((?:HASEVAL(?:,NAMED)?)?\\) # $] < 5.015 || !thr
- FLAGS = \\(DYNFILE(?:,HASEVAL(?:,NAMED)?)?\\) # $] >= 5.015 && thr
+ FLAGS = \\((?:HASEVAL(?:,NAMED)?)?\\) # $] < 5.015 || !thr || $] >= 5.023001
+ FLAGS = \\(DYNFILE(?:,HASEVAL(?:,NAMED)?)?\\) # $] >= 5.015 && $] < 5.023001 && thr
COMP_STASH = $ADDR\\t"main"
START = $ADDR ===> \\d+
ROOT = $ADDR
NAME = "do_test" # $] >=5.021004
GVGV::GV = $ADDR\\t"main" :: "do_test" # $] < 5.021004
- FILE = ".*\\b(?i:peek\\.t)"
DEPTH = 1(?:
MUTEXP = $ADDR
OWNER = $ADDR)?
@@ -664,8 +662,8 @@ do_test('constant subroutine',
RV = $ADDR
SV = PVCV\\($ADDR\\) at $ADDR
REFCNT = (2)
- FLAGS = \\(POK,pPOK,CONST,ISXSUB\\) # $] < 5.015
- FLAGS = \\(POK,pPOK,CONST,DYNFILE,ISXSUB\\) # $] >= 5.015
+ FLAGS = \\(POK,pPOK,CONST,ISXSUB\\) # $] < 5.015 || $] >= 5.023001
+ FLAGS = \\(POK,pPOK,CONST,DYNFILE,ISXSUB\\) # $] >= 5.015 && $] < 5.023001
PROTOTYPE = ""
COMP_STASH = 0x0 # $] < 5.021004
COMP_STASH = $ADDR "main" # $] >=5.021004
@@ -680,13 +678,12 @@ do_test('constant subroutine',
LEN = \\d+
COW_REFCNT = 0
GVGV::GV = $ADDR\\t"main" :: "const"
- FILE = ".*\\b(?i:peek\\.t)"
DEPTH = 0(?:
MUTEXP = $ADDR
OWNER = $ADDR)?
FLAGS = 0xc00 # $] < 5.013
- FLAGS = 0xc # $] >= 5.013 && $] < 5.015
- FLAGS = 0x100c # $] >= 5.015
+ FLAGS = 0xc # ($] >= 5.013 && $] < 5.015) || $] >= 5.023001
+ FLAGS = 0x100c # $] >= 5.015 && $] < 5.023001
OUTSIDE_SEQ = 0
PADLIST = 0x0 # $] < 5.021006
HSCXT = $ADDR # $] >= 5.021006
@@ -734,19 +731,18 @@ do_test('FORMAT',
RV = $ADDR
SV = PVFM\\($ADDR\\) at $ADDR
REFCNT = 2
- FLAGS = \\(\\) # $] < 5.015 || !thr
- FLAGS = \\(DYNFILE\\) # $] >= 5.015 && thr
+ FLAGS = \\(\\) # $] < 5.015 || !thr || $] >= 5.023001
+ FLAGS = \\(DYNFILE\\) # $] >= 5.015 && thr && $] < 5.023001
(?: PV = 0
)? COMP_STASH = 0x0
START = $ADDR ===> \\d+
ROOT = $ADDR
- GVGV::GV = $ADDR\\t"main" :: "PIE"
- FILE = ".*\\b(?i:peek\\.t)"(?:
+ GVGV::GV = $ADDR\\t"main" :: "PIE"(?:
DEPTH = 0)?(?:
MUTEXP = $ADDR
OWNER = $ADDR)?
- FLAGS = 0x0 # $] < 5.015 || !thr
- FLAGS = 0x1000 # $] >= 5.015 && thr
+ FLAGS = 0x0 # $] < 5.015 || !thr || $] >= 5.023001
+ FLAGS = 0x1000 # $] >= 5.015 && thr && $] < 5.023001
OUTSIDE_SEQ = \\d+
LINES = 0 # $] < 5.017_003
PADLIST = $ADDR
diff --git a/gv.c b/gv.c
index c9058a7..8ec6401 100644
--- a/gv.c
+++ b/gv.c
@@ -612,7 +612,7 @@ S_maybe_add_coresub(pTHX_ HV * const stash, GV *gv,
/* XSUBs can't be perl lang/perl5db.pl debugged
if (PERLDB_LINE || PERLDB_SAVESRC)
(void)gv_fetchfile(file); */
- CvFILE(cv) = (char *)file;
+/* CvFILE(cv) = (char *)file; */
/* XXX This is inefficient, as doing things this order causes
a prototype check in newATTRSUB. But we have to do
it this order as we need an op number before calling
diff --git a/lib/B/Deparse.pm b/lib/B/Deparse.pm
index d4c6f60..fbe4ab8 100644
--- a/lib/B/Deparse.pm
+++ b/lib/B/Deparse.pm
@@ -46,7 +46,7 @@ use B qw(class main_root main_start main_cv svref_2object opnumber perlstring
MDEREF_SHIFT
);
-$VERSION = '1.35';
+$VERSION = '1.36';
use strict;
use vars qw/$AUTOLOAD/;
use warnings ();
@@ -477,7 +477,9 @@ sub null {
sub todo {
my $self = shift;
my($cv, $is_form, $name) = @_;
- my $cvfile = $cv->FILE//'';
+ #if this a CVf_NAMED CV, GV is being reified on the next line
+ #and GV's GP's FILE is Deparse.pm :-(
+ my $cvfile = $cv->GV->FILE//'';
return unless ($cvfile eq $0 || exists $self->{files}{$cvfile});
my $seq;
if ($cv->OUTSIDE_SEQ) {
diff --git a/op.c b/op.c
index f67b631..76bb8e0 100644
--- a/op.c
+++ b/op.c
@@ -8187,7 +8187,7 @@ Perl_newMYSUB(pTHX_ I32 floor, OP *o, OP *proto, OP *attrs, OP *block)
}
else {
cv = MUTABLE_CV(newSV_type(SVt_PVCV));
- CvFILE_set_from_cop(cv, PL_curcop);
+ /* CvFILE_set_from_cop(cv, PL_curcop); */
CvSTASH_set(cv, PL_curstash);
*spot = cv;
}
@@ -8240,9 +8240,11 @@ Perl_newMYSUB(pTHX_ I32 floor, OP *o, OP *proto, OP *attrs, OP *block)
CvFLAGS(compcv) &= ~(CVf_SLABBED|CVf_WEAKOUTSIDE);
CvFLAGS(compcv) |= other_flags;
+/*
if (CvFILE(cv) && CvDYNFILE(cv)) {
Safefree(CvFILE(cv));
}
+*/
/* inner references to compcv must be fixed up ... */
pad_fixup_inner_anons(CvPADLIST(cv), compcv, cv);
@@ -8277,7 +8279,7 @@ Perl_newMYSUB(pTHX_ I32 floor, OP *o, OP *proto, OP *attrs, OP *block)
}
if (const_sv) goto clone;
- CvFILE_set_from_cop(cv, PL_curcop);
+ /*CvFILE_set_from_cop(cv, PL_curcop);*/
CvSTASH_set(cv, PL_curstash);
if (ps) {
@@ -8695,10 +8697,12 @@ Perl_newATTRSUB_x(pTHX_ I32 floor, OP *o, OP *proto, OP *attrs,
CvFLAGS(PL_compcv) &= ~(CVf_SLABBED|CVf_WEAKOUTSIDE);
CvFLAGS(PL_compcv) |= other_flags;
+/*
if (CvFILE(cv) && CvDYNFILE(cv)) {
Safefree(CvFILE(cv));
}
CvFILE_set_from_cop(cv, PL_curcop);
+*/
CvSTASH_set(cv, PL_curstash);
/* inner references to PL_compcv must be fixed up ... */
@@ -8745,7 +8749,9 @@ Perl_newATTRSUB_x(pTHX_ I32 floor, OP *o, OP *proto, OP *attrs,
: (SSize_t)namlen,
hash));
}
+/*
CvFILE_set_from_cop(cv, PL_curcop);
+*/
CvSTASH_set(cv, PL_curstash);
}
@@ -9119,22 +9125,23 @@ Perl_newXS_len_flags(pTHX_ const char *name, STRLEN len,
}
CvGV_set(cv, gv);
- if(filename) {
- /* XSUBs can't be perl lang/perl5db.pl debugged
- if (PERLDB_LINE || PERLDB_SAVESRC)
- (void)gv_fetchfile(filename); */
- assert(!CvDYNFILE(cv)); /* cv_undef should have turned it off */
- if (flags & XS_DYNAMIC_FILENAME) {
- CvDYNFILE_on(cv);
- CvFILE(cv) = savepv(filename);
- } else {
- /* NOTE: not copied, as it is expected to be an external constant string */
- CvFILE(cv) = (char *)filename;
- }
- } else {
- assert((flags & XS_DYNAMIC_FILENAME) == 0 && PL_xsubfilename);
- CvFILE(cv) = (char*)PL_xsubfilename;
- }
+ //if(filename) {
+ // /* XSUBs can't be perl lang/perl5db.pl debugged
+ // if (PERLDB_LINE || PERLDB_SAVESRC)
+ // (void)gv_fetchfile(filename); */
+ // assert(!CvDYNFILE(cv)); /* cv_undef should have turned it off */
+ // if (flags & XS_DYNAMIC_FILENAME) {
+ // CvDYNFILE_on(cv);
+ // CvFILE(cv) = savepv(filename);
+ // } else {
+ // /* NOTE: not copied, as it is expected to be an external constant string */
+ // CvFILE(cv) = (char *)filename;
+ // }
+ //} else {
+ // assert((flags & XS_DYNAMIC_FILENAME) == 0 && PL_xsubfilename);
+ // CvFILE(cv) = (char*)PL_xsubfilename;
+ //}
+
CvISXSUB_on(cv);
CvXSUB(cv) = subaddr;
#ifndef PERL_IMPLICIT_CONTEXT
@@ -9172,7 +9179,9 @@ Perl_newSTUB(pTHX_ GV *gv, bool fake)
}
else cvgv = gv;
CvGV_set(cv, cvgv);
+/*
CvFILE_set_from_cop(cv, PL_curcop);
+*/
CvSTASH_set(cv, PL_curstash);
GvMULTI_on(gv);
return cv;
@@ -9215,7 +9224,9 @@ Perl_newFORM(pTHX_ I32 floor, OP *o, OP *block)
cv = PL_compcv;
GvFORM(gv) = (CV *)SvREFCNT_inc_simple_NN(cv);
CvGV_set(cv, gv);
+/*
CvFILE_set_from_cop(cv, PL_curcop);
+*/
pad_tidy(padtidy_FORMAT);
diff --git a/pad.c b/pad.c
index f5ce5f5..4e54239 100644
--- a/pad.c
+++ b/pad.c
@@ -316,12 +316,14 @@ Perl_cv_undef_flags(pTHX_ CV *cv, U32 flags)
PTR2UV(cv), PTR2UV(PL_comppad))
);
+/*
if (CvFILE(&cvbody)) {
char * file = CvFILE(&cvbody);
CvFILE(&cvbody) = NULL;
if(CvDYNFILE(&cvbody))
Safefree(file);
}
+*/
/* CvSLABBED_off(&cvbody); *//* turned off below */
/* release the sub's body */
@@ -2270,8 +2272,11 @@ S_cv_clone(pTHX_ CV *proto, CV *cv, CV *outside, HV *cloned)
|CVf_SLABBED);
CvCLONED_on(cv);
+/*
CvFILE(cv) = CvDYNFILE(proto) ? savepv(CvFILE(proto))
: CvFILE(proto);
+*/
+
if (CvNAMED(proto))
CvNAME_HEK_set(cv, share_hek_hek(CvNAME_HEK(proto)));
else CvGV_set(cv,CvGV(proto));
diff --git a/perl.c b/perl.c
index cbb66e0..0dd5afe 100644
--- a/perl.c
+++ b/perl.c
@@ -2144,6 +2144,22 @@ S_parse_body(pTHX_ char **env, XSINIT_t xsinit)
assert (!TAINT_get);
init_perllib();
+ CopFILE_free(PL_curcop); /* this should be asserted NULL here and the free removed */
+ CopFILE_set(PL_curcop, "perl.c"); /* register perma-XSUBs's GVs as being from core, not -e or initial script */
+
+ boot_core_PerlIO();
+ boot_core_UNIVERSAL();
+ boot_core_mro();
+ newXS("Internals::V", S_Internals_V, __FILE__);
+
+ if (xsinit)
+ (*xsinit)(aTHX); /* in case linked C routines want magical variables */
+#ifndef PERL_MICRO
+#if defined(VMS) || defined(WIN32) || defined(DJGPP) || defined(__CYGWIN__) || defined(SYMBIAN)
+ init_os_extras();
+#endif
+#endif
+
{
bool suidscript = FALSE;
@@ -2193,19 +2209,6 @@ S_parse_body(pTHX_ char **env, XSINIT_t xsinit)
PL_isarev = newHV();
- boot_core_PerlIO();
- boot_core_UNIVERSAL();
- boot_core_mro();
- newXS("Internals::V", S_Internals_V, __FILE__);
-
- if (xsinit)
- (*xsinit)(aTHX); /* in case linked C routines want magical variables */
-#ifndef PERL_MICRO
-#if defined(VMS) || defined(WIN32) || defined(DJGPP) || defined(__CYGWIN__) || defined(SYMBIAN)
- init_os_extras();
-#endif
-#endif
-
#ifdef USE_SOCKS
# ifdef HAS_SOCKS5_INIT
socks5_init(argv[0]);
diff --git a/sv.c b/sv.c
index b4a36e5..d2a641f 100644
--- a/sv.c
+++ b/sv.c
@@ -13607,7 +13607,9 @@ S_sv_dup_common(pTHX_ const SV *const sstr, CLONE_PARAMS *const param)
sv_dup_inc((const SV *)CvXSUBANY(dstr).any_ptr, param);
}
assert(!CvSLABBED(dstr));
+/*
if (CvDYNFILE(dstr)) CvFILE(dstr) = SAVEPV(CvFILE(dstr));
+*/
if (CvNAMED(dstr))
SvANY((CV *)dstr)->xcv_gv_u.xcv_hek =
hek_dup(CvNAME_HEK((CV *)sstr), param);
diff --git a/sv.h b/sv.h
index 61d5275..d8ed612 100644
--- a/sv.h
+++ b/sv.h
@@ -644,7 +644,6 @@ typedef U32 cv_flags_t;
GV * xcv_gv; \
HEK * xcv_hek; \
} xcv_gv_u; \
- char * xcv_file; \
union { \
PADLIST * xcv_padlist; \
void * xcv_hscxt; \
diff --git a/t/op/stash.t b/t/op/stash.t
index 151b729..d0e383a 100644
--- a/t/op/stash.t
+++ b/t/op/stash.t
@@ -7,7 +7,7 @@ BEGIN {
BEGIN { require "./test.pl"; }
-plan( tests => 51 );
+plan( tests => 49 );
# Used to segfault (bug #15479)
fresh_perl_like(
@@ -215,10 +215,6 @@ SKIP: {
my $br = B::svref_2object($r);
is ($br->STASH->NAME, 'bloop',
'stub records the package it was compiled in');
- # Arguably this shouldn't quite be here, but it's easy to add it
- # here, and tricky to figure out a different good place for it.
- like ($br->FILE, qr/stash/i,
- 'stub records the file it was compiled in');
# We need to take this reference "late", after the subroutine is
# defined.
@@ -227,8 +223,6 @@ SKIP: {
is ($br->STASH->NAME, 'main',
'definition overrides the package it was compiled in');
- like ($br->FILE, qr/eval/,
- 'definition overrides the file it was compiled in');
}
}
diff --git a/t/uni/stash.t b/t/uni/stash.t
index 31d6c9d..680d900 100644
--- a/t/uni/stash.t
+++ b/t/uni/stash.t
@@ -13,7 +13,7 @@ BEGIN {
use utf8;
use open qw( :utf8 :std );
-plan( tests => 49 );
+plan( tests => 48 );
#These come from op/my_stash.t
{
@@ -208,8 +208,6 @@ plan( tests => 49 );
is ($br->STASH->NAME, 'main',
'definition overrides the package it was compiled in');
- like ($br->FILE, qr/eval/,
- 'definition overrides the file it was compiled in');
}
}
diff --git a/universal.c b/universal.c
index 9b34df9..2050997 100644
--- a/universal.c
+++ b/universal.c
@@ -1110,11 +1110,13 @@ Perl_boot_core_UNIVERSAL(pTHX)
{
CV * const cv =
newCONSTSUB(get_hv("Regexp::", GV_ADD), "DESTROY", NULL);
+/*
char ** cvfile = &CvFILE(cv);
char * oldfile = *cvfile;
CvDYNFILE_off(cv);
*cvfile = (char *)file;
Safefree(oldfile);
+*/
}
}
--
1.7.9.msysgit.0
|
From @rurbanOn 07/07/2015 09:08 AM, bulk88 via RT wrote:
As Sarathy already mentioned I need $cv->FILE in B::C to check if the Sarathy was wrong with the dynaloader bit: I use now $gv->FILE, not the I agree with the bloat, and CvFILE causes many more headaches. The main problem I see is with lexical subs and the unNAMED subs.
You get the main->FILE via
Hmm [Disclaimer: The views expressed in this letter are my own, and do not Working towards a true Modern Perl. |
The RT System itself - Status changed from 'new' to 'open' |
From @bulk88On Tue Jul 07 00:08:49 2015, bulk88 wrote:
A much more rough patch I earlier worked on where instead of deleting CvFILE, CvFILE was extracted from the first op of the PP CV, which usually a nextstate op. This avoids the savepv() and additional alloc, since the storage in pp_nextstate is like a C literal string for XSUBs, it will be around until the sub is redefined and at that time a new CvFILE is put in. In one place in testing I found if Perl_pmruntime is the caller of Perl_newATTRSUB_x, OP * start is a pp_qr not a pp_nextstate, IDK enough about the optree to say if this is a bug. -- |
From @bulk880001-dont-report-a-exception-with-uninitialized-s-message.patchFrom c43b488191f961a0a77462f931b04e4915b215af Mon Sep 17 00:00:00 2001
From: Daniel Dragan <bulk88@hotmail.com>
Date: Tue, 7 Jul 2015 12:59:01 -0400
Subject: [PATCH] dont report a $@ exception with uninitialized $!'s message
in IPC::Open3
Commit a24d8dfd08 "Make IPC::Open3 work without fork()" from 5.003 created
an eval block, and if that eval block threw an exception, instead of
propagating $@, the code propagated $!, even though no system call was done
and $! is effectivly unintialized data. In one case for me, a taint
exception inside system was turned into open3() throwing an exception
about "Inappropriate I/O control operation" or "Bad file descriptor", which
had nothing to do with the real fault which was a Perl C level croak with
the message "Insecure $ENV{PATH} while running with -T switch at ..."
which was called as Perl_pp_system->Perl_taint_env->Perl_taint_proper->
Perl_croak->Perl_vcroak. This patch does not try to fix the ambiguity of
the error messages between the !DO_SPAWN and IO::Pipe
branches/implementations of _open3.
---
ext/IPC-Open3/lib/IPC/Open3.pm | 8 +++++-
ext/IPC-Open3/t/IPC-Open3.t | 42 +++++++++++++++++++++++++++++++++++++++-
pod/perldelta.pod | 7 +++++-
3 files changed, 53 insertions(+), 4 deletions(-)
diff --git a/ext/IPC-Open3/lib/IPC/Open3.pm b/ext/IPC-Open3/lib/IPC/Open3.pm
index 7c7e9b5..273f205 100644
--- a/ext/IPC-Open3/lib/IPC/Open3.pm
+++ b/ext/IPC-Open3/lib/IPC/Open3.pm
@@ -9,7 +9,7 @@ require Exporter;
use Carp;
use Symbol qw(gensym qualify);
-$VERSION = '1.18';
+$VERSION = '1.19';
@ISA = qw(Exporter);
@EXPORT = qw(open3);
@@ -412,7 +412,11 @@ sub spawn_with_handles {
} else {
$pid = eval { system 1, @_ }; # 1 == P_NOWAIT
}
- push @errs, "IO::Pipe: Can't spawn-NOWAIT: $!" if !$pid || $pid < 0;
+ if($@) {
+ push @errs, "IO::Pipe: Can't spawn-NOWAIT: $@";
+ } elsif(!$pid || $pid < 0) {
+ push @errs, "IO::Pipe: Can't spawn-NOWAIT: $!";
+ }
}
# Do this in reverse, so that STDERR is restored first:
diff --git a/ext/IPC-Open3/t/IPC-Open3.t b/ext/IPC-Open3/t/IPC-Open3.t
index fcaecef..25cfdfb 100644
--- a/ext/IPC-Open3/t/IPC-Open3.t
+++ b/ext/IPC-Open3/t/IPC-Open3.t
@@ -14,7 +14,7 @@ BEGIN {
}
use strict;
-use Test::More tests => 44;
+use Test::More tests => 45;
use IO::Handle;
use IPC::Open3;
@@ -165,6 +165,46 @@ $pid = eval { open3 'WRITE', '', 'ERROR', '/non/existent/program'; };
like($@, qr/^open3: Modification of a read-only value attempted at /,
'open3 faults read-only parameters correctly') or do {waitpid $pid, 0};
+package NoFetch;
+
+my $fetchcount = 1;
+
+sub TIESCALAR {
+ my $class = shift;
+ my $instance = shift || undef;
+ return bless \$instance => $class;
+}
+
+sub FETCH {
+ my $cmd; #dont let "@args = @DB::args;" in Carp::caller_info fire this die
+ #fetchcount may need to be increased to 2 if this code is being stepped with
+ #a perl debugger
+ if($fetchcount == 1 && (caller(1))[3] ne 'Carp::caller_info') {
+ #Carp croak reports the errors as being in IPC-Open3.t, so it is
+ #unacceptable for testing where the FETCH failure occured, we dont want
+ #it failing in a $foo = $_[0]; #later# system($foo), where the failure
+ #is supposed to be triggered in the inner most syscall, aka system()
+ my ($package, $filename, $line, $subroutine) = caller(2);
+
+ die("FETCH not allowed in ".((caller(1))[3])." in ".((caller(2))[3])."\n");
+ } else {
+ $fetchcount++;
+ return tie($cmd, 'NoFetch');
+ }
+}
+
+package main;
+
+{
+ my $cmd;
+ tie($cmd, 'NoFetch');
+
+ $pid = eval { open3 'WRITE', 'READ', 'ERROR', $cmd; };
+ like($@, qr/^(?:open3: IO::Pipe: Can't spawn-NOWAIT: FETCH not allowed in \(eval\) (?x:
+ )in IPC::Open3::spawn_with_handles|FETCH not allowed in \(eval\) in IPC::Open3::_open3)/,
+ 'dieing inside Tied arg propagates correctly') or do {waitpid $pid, 0};
+}
+
foreach my $handle (qw (DUMMY STDIN STDOUT STDERR)) {
local $::{$handle};
my $out = IO::Handle->new();
diff --git a/pod/perldelta.pod b/pod/perldelta.pod
index d985359..a1c6730 100644
--- a/pod/perldelta.pod
+++ b/pod/perldelta.pod
@@ -148,7 +148,12 @@ XXX
=item *
-L<XXX> has been upgraded from version A.xx to B.yy.
+L<IPC::Open3> has been upgraded from version 1.18 to 1.19.
+
+If a Perl exception was thrown from inside this module, the exception
+C<IPC::Open3> threw to the callers of C<open3> would have an irrelavent
+message derived from C<$!> which was in an undefined state, instead of the
+C<$@> message which triggers the failure path inside C<open3>.
=back
--
1.7.9.msysgit.0
|
From @bulk88On Tue Jul 07 00:08:49 2015, bulk88 wrote:
****WRONG PATCH ABOVE************* A much more rough patch I earlier worked on where instead of deleting CvFILE, CvFILE was extracted from the first op of the PP CV, which usually a nextstate op. This avoids the savepv() and additional alloc, since the storage in pp_nextstate is like a C literal string for XSUBs, it will be around until the sub is redefined and at that time a new CvFILE is put in. In one place in testing I found if Perl_pmruntime is the caller of Perl_newATTRSUB_x, OP * start is a pp_qr not a pp_nextstate, IDK enough about the optree to say if this is a bug. -- |
From @bulk880001-wip-CvFILE-mem-waste.patchFrom 116483b264e69135efcd1a639655caca36d1e1db Mon Sep 17 00:00:00 2001
From: Daniel Dragan <bulk88@hotmail.com>
Date: Mon, 1 Jun 2015 15:35:49 -0400
Subject: [PATCH] wip CvFILE mem waste
---
cv.h | 7 +------
ext/Devel-Peek/t/Peek.t | 24 ++++++++++++------------
op.c | 28 ++++++++++++++++++++++++----
3 files changed, 37 insertions(+), 22 deletions(-)
diff --git a/cv.h b/cv.h
index d4dfd2d..5ba8133 100644
--- a/cv.h
+++ b/cv.h
@@ -53,13 +53,8 @@ See L<perlguts/Autoloading with XSUBs>.
#define CvGV_set(cv,gv) Perl_cvgv_set(aTHX_ cv, gv)
#define CvHASGV(cv) cBOOL(SvANY(cv)->xcv_gv_u.xcv_gv)
#define CvFILE(sv) ((XPVCV*)MUTABLE_PTR(SvANY(sv)))->xcv_file
-#ifdef USE_ITHREADS
-# define CvFILE_set_from_cop(sv, cop) \
- (CvFILE(sv) = savepv(CopFILE(cop)), CvDYNFILE_on(sv))
-#else
-# define CvFILE_set_from_cop(sv, cop) \
+#define CvFILE_set_from_cop(sv, cop) \
(CvFILE(sv) = CopFILE(cop), CvDYNFILE_off(sv))
-#endif
#define CvFILEGV(sv) (gv_fetchfile(CvFILE(sv)))
#define CvDEPTH(sv) (*S_CvDEPTHp((const CV *)sv))
/* For use when you only have a XPVCV*, not a real CV*.
diff --git a/ext/Devel-Peek/t/Peek.t b/ext/Devel-Peek/t/Peek.t
index 56522af..e31d729 100644
--- a/ext/Devel-Peek/t/Peek.t
+++ b/ext/Devel-Peek/t/Peek.t
@@ -298,8 +298,8 @@ do_test('reference to anon sub with empty prototype',
RV = $ADDR
SV = PVCV\\($ADDR\\) at $ADDR
REFCNT = 2
- FLAGS = \\($PADMY,POK,pPOK,ANON,WEAKOUTSIDE,CVGV_RC\\) # $] < 5.015 || !thr
- FLAGS = \\($PADMY,POK,pPOK,ANON,WEAKOUTSIDE,CVGV_RC,DYNFILE\\) # $] >= 5.015 && thr
+ FLAGS = \\($PADMY,POK,pPOK,ANON,WEAKOUTSIDE,CVGV_RC\\) # $] < 5.015 || !thr || $] >= 5.022
+ FLAGS = \\($PADMY,POK,pPOK,ANON,WEAKOUTSIDE,CVGV_RC,DYNFILE\\) # $] >= 5.015 && thr && $] < 5.022
PROTOTYPE = ""
COMP_STASH = $ADDR\\t"main"
START = $ADDR ===> \\d+
@@ -309,8 +309,8 @@ do_test('reference to anon sub with empty prototype',
DEPTH = 0(?:
MUTEXP = $ADDR
OWNER = $ADDR)?
- FLAGS = 0x490 # $] < 5.015 || !thr
- FLAGS = 0x1490 # $] >= 5.015 && thr
+ FLAGS = 0x490 # $] < 5.015 || !thr || $] >= 5.022
+ FLAGS = 0x1490 # $] >= 5.015 && thr && $] < 5.022
OUTSIDE_SEQ = \\d+
PADLIST = $ADDR
PADNAME = $ADDR\\($ADDR\\) PAD = $ADDR\\($ADDR\\)
@@ -324,8 +324,8 @@ do_test('reference to named subroutine without prototype',
RV = $ADDR
SV = PVCV\\($ADDR\\) at $ADDR
REFCNT = (3|4)
- FLAGS = \\((?:HASEVAL(?:,NAMED)?)?\\) # $] < 5.015 || !thr
- FLAGS = \\(DYNFILE(?:,HASEVAL(?:,NAMED)?)?\\) # $] >= 5.015 && thr
+ FLAGS = \\((?:HASEVAL(?:,NAMED)?)?\\) # $] < 5.015 || !thr || $] >= 5.022
+ FLAGS = \\(DYNFILE(?:,HASEVAL(?:,NAMED)?)?\\) # $] >= 5.015 && thr && $] < 5.022
COMP_STASH = $ADDR\\t"main"
START = $ADDR ===> \\d+
ROOT = $ADDR
@@ -335,8 +335,8 @@ do_test('reference to named subroutine without prototype',
DEPTH = 1(?:
MUTEXP = $ADDR
OWNER = $ADDR)?
- FLAGS = 0x(?:[c4]00)?0 # $] < 5.015 || !thr
- FLAGS = 0x[cd145]000 # $] >= 5.015 && thr
+ FLAGS = 0x(?:[c4]00)?0 # $] < 5.015 || !thr || $] >= 5.022
+ FLAGS = 0x[cd145]000 # $] >= 5.015 && thr && $] < 5.022
OUTSIDE_SEQ = \\d+
PADLIST = $ADDR
PADNAME = $ADDR\\($ADDR\\) PAD = $ADDR\\($ADDR\\)
@@ -734,8 +734,8 @@ do_test('FORMAT',
RV = $ADDR
SV = PVFM\\($ADDR\\) at $ADDR
REFCNT = 2
- FLAGS = \\(\\) # $] < 5.015 || !thr
- FLAGS = \\(DYNFILE\\) # $] >= 5.015 && thr
+ FLAGS = \\(\\) # $] < 5.015 || !thr || $] >= 5.022
+ FLAGS = \\(DYNFILE\\) # $] >= 5.015 && thr && $] < 5.022
(?: PV = 0
)? COMP_STASH = 0x0
START = $ADDR ===> \\d+
@@ -745,8 +745,8 @@ do_test('FORMAT',
DEPTH = 0)?(?:
MUTEXP = $ADDR
OWNER = $ADDR)?
- FLAGS = 0x0 # $] < 5.015 || !thr
- FLAGS = 0x1000 # $] >= 5.015 && thr
+ FLAGS = 0x0 # $] < 5.015 || !thr || $] >= 5.022
+ FLAGS = 0x1000 # $] >= 5.015 && thr && $] < 5.022
OUTSIDE_SEQ = \\d+
LINES = 0 # $] < 5.017_003
PADLIST = $ADDR
diff --git a/op.c b/op.c
index cab214a..a3d43c9 100644
--- a/op.c
+++ b/op.c
@@ -8690,10 +8690,10 @@ Perl_newATTRSUB_x(pTHX_ I32 floor, OP *o, OP *proto, OP *attrs,
CvFLAGS(PL_compcv) &= ~(CVf_SLABBED|CVf_WEAKOUTSIDE);
CvFLAGS(PL_compcv) |= other_flags;
- if (CvFILE(cv) && CvDYNFILE(cv)) {
+ if (CvFILE(cv) && CvDYNFILE(cv))
Safefree(CvFILE(cv));
- }
- CvFILE_set_from_cop(cv, PL_curcop);
+
+ //CvFILE_set_from_cop(cv, PL_curcop);
CvSTASH_set(cv, PL_curstash);
/* inner references to PL_compcv must be fixed up ... */
@@ -8769,6 +8769,22 @@ Perl_newATTRSUB_x(pTHX_ I32 floor, OP *o, OP *proto, OP *attrs,
slab = (OPSLAB *)CvSTART(cv);
#endif
CvSTART(cv) = start;
+ {
+ COP * startcop = (COP*) start;
+ //assert(startcop->op_type == OP_NEXTSTATE);
+ //if Perl_pmruntime is the caller of Perl_newATTRSUB_x, OP * start is a pp_qr not a pp_nextstate
+ //do {
+ // # In pre-5.9.5 world we have to do dirty tricks.
+ // # (we use 'our' rather than 'my' here, due to the rather complex and buggy
+ // # behaviour of lexicals with qr// and (??{$lex}) )
+ // our $trick1; # yes, cannot our and assign at the same time.
+ // $trick1 = qr{ \( (?: (?> [^()]+ ) | (??{ $trick1 }) )* \) }x;
+ // our $trick2 = qr{ (?> (?! \d) \w+ (?:$trick1)? ) (?:\s*\:\s*|\s+(?!\:)) }x;
+ // qr{ \s* : \s* (?: $trick2 )* }x;
+ //};
+ if(startcop->op_type == OP_NEXTSTATE)
+ CvFILE_set_from_cop(cv, startcop);
+ }
CALL_PEEP(start);
finalize_optree(CvROOT(cv));
S_prune_chain_head(&CvSTART(cv));
@@ -9165,7 +9181,11 @@ Perl_newSTUB(pTHX_ GV *gv, bool fake)
}
else cvgv = gv;
CvGV_set(cv, cvgv);
- CvFILE_set_from_cop(cv, PL_curcop);
+// CvFILE_set_from_cop(cv, PL_curcop); //DYNFILE here?
+
+ CvFILE(cv) = savepv(CopFILE(PL_curcop));
+ CvDYNFILE_on(cv);
+
CvSTASH_set(cv, PL_curstash);
GvMULTI_on(gv);
return cv;
--
1.7.9.msysgit.0
|
From @bulk88Father C reported the same issue in https://rt.perl.org/Public/Bug/Display.html?id=117855 -- |
From @bulk88On Wed Jul 08 22:10:42 2015, bulk88 wrote:
I push a concept of the more permanent solution to CvFILE/GvFILE/CopFILE memory burn in http://perl5.git.perl.org/perl.git/shortlog/refs/heads/smoke-me/bulk88/rt125296-wip-CopFILE-on-threads . Comments appreciated. Currently I am calling the new data type a CHEK (compiling hek), as in "check". The other name I thought of was a FHEK (filename hash entry key), but it doesn't sound good to pronounce as in "FHEK off you motherFHEKers". PHEK (path hash entry key) isn't any better than FHEK to pronounce. PHKS (path hash key shared) also was rejected. SPHEK (shared path hash entry) sounds like "spec" and cause confusion in speech. I discovered term SHEK (shared HEK) is already kindda take by the HEKs stored in PL_strtab and would could very much confusion between a "Shared HEK" and a "shared HEK". I am debating changing Perl_gv_fetchfile_hek to take a CHEK "char *" instead of a HEK * since Perl_gv_fetchfile_hek is so specialized and not public it might as well be more specialized and save the +/- asm ops backing up the char * to a HEK *, in the caller's CopFILEGV/CopFILESV/CopFILEAV macros. I am also thinking of removing GvFILE_HEK would also be mathomed or outright removed or PERL_CORE-ed since it would return HEK * with the "_<" prefix. As a very inefficient back compat, it would be made non-lvalue and be #define GvFILE_HEK(gv) SvSHARED_HEK_FROM_PV(SvPVX_const(sv_2mortal(newSVpvn_share(HEK_KEY(GvFILE_HEK2(gv)+2, HEK_LEN(GvFILE_HEK2(gv))-2, 0)) About PL_strtab, I think that making PL_strtab shared between ithreads would cause perf problems due to the high volume of lock activity but IDK how to prove that but clone/psuedofork would be slightly faster since all HEKs would just be refcnt++ed during the clone/psuedofork and not looked up in the ptr equivelence hash between the 2 interps. Also slightly more memory shared between the 2 interps (almost all hash key names would be shared). Deleting %INC and reloading a module is so rare that I think its pointless to de-dup the CHEKs in a big central DB. One CHEK * set by S_open_script/pp_require spreads to all the COPs and CVs (and in future GVs) created by that src code file. In theory I think pp_entereval/S_incline need de-duping but "#line" is very rarely used. But with my patch the worst case scenario (1 mem alloc for file name per COP) with pp_entereval/S_incline is identical to the current situation. -- |
From @iabynOn Tue, Feb 16, 2016 at 01:43:07PM -0800, bulk88 via RT wrote:
I feel I would better be able to provide feedback if you were to give an How does your solution work, and how (if at all) does it interact with or -- |
From @bulk88On Wed Feb 17 01:28:34 2016, davem wrote:
http://perl5.git.perl.org/perl.git/commitdiff/345d4464184a234301341596f73ee2d3550c9799#patch8 +/* A CHEK is a derived class of HEK, and struct shared_he is also a derived
There are no changes to public API except removal of GvFILE_HEK macro. HEKs are not public API, and have never been mentioned in perlguts. Therefore CHEKs are not public API. I did not touch any of the structs of unthreaded perl on purpose. Perhaps one day the threaded CHEK system will be used on unthreaded perl, without any lock acquire/release cycles done on the refcount of the CHEK, and the CHEK being allocated in per interp memory with unthreaded perl obviously. PP __FILE__, CopFILE, CvFILE on const subs, GvFILE on GP *s, and PP caller all use CHEK *s now in threaded perl. In a perl proc memory dump, you wont find a "_<"-less path string in the process that is identical to a "_<" prefixed path string. I caught all of them and converted them to CHEKs. CHEKs DO NOT live in PL_strtab due my fears over lock contention between different ithreads. I'd have to lock all operations with the PL_strtab hash or have many fine scope lockable HvARRAY elements and then a master https://msdn.microsoft.com/en-us/library/windows/desktop/aa904937%28v=vs.85%29.aspx (and port that MS code to unix) lock to control hash splitting. Some OSes, especially unixy ones do all locks with kernel mode transitions, user mode locks using atomic CPU features are rare or unheard of unlike MS land. There is no de-duping of CHEKs from a PV arg and LEN arg like with SHEKs. http://perl5.git.perl.org/perl.git/shortlog/refs/heads/smoke-me/bulk88/rt125296-wip-COPFILE-threads branch is now almost, or is feature complete (last issue is static or static const HEKs and CHEKs for B::CC style compiled code I need to bring up in a separate P5P ML thread and that involves discussion P5P's hash seed design and I dont think CHEKs or seed changes will get in for 5.24 since there is a code freeze for user visible (but CHEKs aren't visible, maybe there is hope), but seed changes will have drama). The branch is missing all abstraction macros, and proper names for the APIs and macros. My CS names are poor everywhere so I dont want to name things just for it to be suggested to something else. When I cant think of var names, I start naming macros and vars after my cats. ppl feel free to suggest API names and macro names. Also should the code be "#ifdef USE_ITHREADS" or a different name all together, so you can use it on threaded and unthreaded as a build option if you choose. I still dont know whether to call my API a CHEK, FHEK, PHEK, PHKS, or SPHEK. -- |
From @bulk88On Fri Feb 26 04:01:10 2016, bulk88 wrote:
test prog: perl -MTest::Harness -MTest::More -e"sleep 1000" summary: 8-12 ms of CPU are saved in PP compile time and 280 KB of heap memory raw data: after (head of my branch) 5216 KB Private Bytes (heap-ish memory on Win32) C:\p523\src>timeit -f t.dat perl -MTest::Harness -MTest::More -e"0" Version Number: Windows NT 6.1 (Build 7601) C:\p523\src>timeit -f t.dat perl -MTest::Harness -MTest::More -e"0" Version Number: Windows NT 6.1 (Build 7601) C:\p523\src>timeit -f t.dat perl -MTest::Harness -MTest::More -e"0" Version Number: Windows NT 6.1 (Build 7601) C:\p523\src> b4 (last blead commit) 5496 KB Private Bytes (heap-ish memory on Win32) C:\p523\src>timeit -f t.dat perl -MTest::Harness -MTest::More -e"0" Version Number: Windows NT 6.1 (Build 7601) C:\p523\src>timeit -f t.dat perl -MTest::Harness -MTest::More -e"0" Version Number: Windows NT 6.1 (Build 7601) C:\p523\src>timeit -f t.dat perl -MTest::Harness -MTest::More -e"0" Version Number: Windows NT 6.1 (Build 7601) C:\p523\src> -- |
From @bulk88I am still waiting for feedback, or everyone agree with the design so they are staying quiet? Unthreaded perl, which my branch doesn't touch the bahavior of, also has a duplicate .pm paths in memory problem, but not as big as threaded. But it is big enough that I think the CHEK API needs to also be on unthreaded. I've attached a fitlered strings dump of an unthreaded perl process "perl -MTest::Harness -MTest::More -e"sleep 1000"". I suspect __FILE__ and CvFILE are the sources of the file paths. Fixing __FILE__ to COW means implementing the "ptr & ~0x3" thing on unthreaded to strip the _< from the CHEK COW SV, there is no way around that because of CopFILE's design on unthreaded, even if the COP still stores a GV * instead of a CHEK * on unthreaded in the potential future design, "ptr & ~0x3" thing must be implemented. # define CopFILE(c) (CopFILEGV(c) \ CvFILE is the "each const sub use CvDYNFILE/unique Newx path" problem. Storing a CHEK * or a HEK * in CvFILE instead of Newx * is the only way to fix that. So it really seems like the CHEK API needs to be implemented on unthreaded, and toss larry's "GV * cop_filegv;" member forever from 5.0 alpha 2 at the expensive of making the perl debugger every slightly slower than before (or downgrade the "speed" of the unthreaded perl debugger to the threaded perl debugger? does speed even matter for the perl debugger since it waits for human responses most of the time?). Perl 4's COP struct was struct cmd { I believe a STAB is now called a GV and STR is now a SV. -- |
From @bulk88C:/p523/src/lib/TAP/Parser/Aggregator.pm |
From @bulk88On Wed Mar 02 13:35:22 2016, bulk88 wrote:
I added a commit to the branch that uses CHEK API on unthreaded. perl -MTest::Harness -MTest::More -e"sleep 1000" b4 4812 KB So 8 KB is saved. It is a little, not very much, but it is something. Its not as big as a savings as I thought it would be. -- |
From @bulk88On Fri Mar 04 01:05:24 2016, bulk88 wrote:
There were some optimizations I forgot to do in the unthreaded build, so 20KB are saved, not 8KB, 5140 kb b4 -- |
From @rjbsThis ticket could sure use some replies from technical eyes. My understanding — and please correct me if I'm wrong, bulk88 — is that http://perl5.git.perl.org/perl.git/commitdiff/345d4464184a234301341596f73ee2d3550c9799 (roughly) gets us a few kilobytes of heap saved per file loaded. To me, this seems to multiply the total number of entity types in the implementation without sufficiently benefiting from the increase in complexity. That said, the implementation is a place where I am much more likely to listen to outside opinions. ;) So, please, readers: speak up. -- |
From @tonycozOn Wed Mar 02 13:35:22 2016, bulk88 wrote:
I think it makes sense, at least for threads. I do think the ref counting functions need names that indicate what they're incing/deccing: p |void |chek_inc |NN CHEK * chek maybe chek_ref_(inc|dec). This code from newchek() seems like it could be more efficient: /* was alloca */ Rather than creating an SV then growing it, why not just allocate For use in non-threads, I suspect code that uses a lot of #line directives (generated code, eg. by Mason) could result in an increase in memory I don't think the changes belong in 5.24. Tony |
From @tonycozOn Tue Mar 08 21:23:04 2016, tonyc blathered:
Forgot to mention, this introduced new warnings to the build: n file included from perl.h:3904:0, |
From @iabynOn Fri, Feb 26, 2016 at 04:01:11AM -0800, bulk88 via RT wrote:
I'm sorry, I'm finding your descriptions *really* hard to follow. What I would really appreciate is an overall "big picture" description As I understand it, the issue is that at the moment on threaded builds, So you have a solution that aims to reduce this duplication, by sharing This sharing is done using a structure called a CHEK, which is basically So some questions which immediately occur to me: * How is the de-duplication done? * how is change of filename handled, e.g. C<#line 100 "foo">? * HEK's are normally stored in hashes. Are CHEKs stored in hashes too? * If so, what hash, and where is this hash stored? * Also if so, how is this hash accessed across threads? * If not, why is it a CHEK rather than something unassociated with hashes? -- |
From @bulk88On Wed Mar 09 08:40:44 2016, davem wrote:
Correct.
Correct.
Yes. It is derived from a HEK. It is smaller/lighter weight than the existing "struct shared_he" HEK and was inspired by the "struct shared_he" HEK.
Ref counting increase of an existing resource (CHEK) pointer. There is no 100% deduping because PL_strtab is NOT used (but PL_strtab isn't shared between ithreads today anyway). There is a way to defeat the current deduping code, but I feel it is pointless to fix this since this is so rare, and the worst case scenario is a regression to the current 1 malloc-ed path per COP situation today. To defeat the deduping, "require Foo;", then delete the %INC entry for Foo, then replace Foo.pm on disk with a different contents/different sub names file, also named Foo.pm, then "require Foo;" again. Now there are 2 or more COPs in the process, with identical filenames, but different CHEK *s.
Existing CopFILE_free/CopFILE_set/CopFILE_setn calls in core modifying PL_curcop or PL_compiling.
No. PL_strtab is not used. A CHEK * will never be in a HE *. A CHEK * does know its hash number for faster *{'_<Foo.pm'} aka gv_fetchfile* execution, but the CHEK never gets store in a HE *. Perl core already has a HEK that is not stored in PL_strtab called HVhek_UNSHARED flag. I took inspiration from HVhek_UNSHARED to create a CHEK.
My wild guess is putting a mutex around PL_strtab would cause performance problems, that is why I didn't try to make PL_strtab shared between ithreads. All ithreads inherit the hash seed of their parent, so hash numbers for 1 particular string can be shared between 2 ithreads, PL_hash_seed_set/PL_hash_seed are per-interp globals, but they behave almost like per-process globals (I can only see 2 different hash seeds in 1 process under the following situation, 2 different shared libraries both embedding perl starting up 2 interps with perl_alloc/perl_parse/perl_run, but in that case, no pointers will ever cross between the 2 perl interps.
"struct refcounted_he" is a strange little ****SERIALIZATION**** format for serializing a HV* into a linked list structure shared between threads. "struct refcounted_he" can't be COWed or the string buffers stored in in SV *s. It is too bloated (storing IVs, etc, we only need strings), too specialized (linked list stuff). So "struct refcounted_he" just doesn't work. HEKs are an immutable string type and fit the bill. Once a HEK * is created, it will never change. A HEK knows its length, and knows its hash number (gives a lil perf boost to for gv_fetchfile, which is now gv_fetchfile_hek). struct shared_he, which contains an inline HEK, had a reference count member. So a CHEK * is a cut down "struct shared_he *". The CHEK fits nicely for the design requirements of the CopFILE problem. Design Requirements: -must know its hash number for gv_fetchfile
To speed up the indirection that gv_fetchfile/CopFILEGV/CopFILESV/CopFILEAV function/macro calls use under ithreads since GV *s (and SV*s) can't be stored in the optree. On threads, constant SVs are also stored with indirection, with pad offsets in the OP struct, instead of an SV * in an OP struct (as on unthreaded). -- |
From @tonycozOn Wed Mar 09 13:33:31 2016, bulk88 wrote:
You can get multiple CHEKs for the same name with #line directives too, which was the reason I suggested it was a bad idea for non-threaded builds. It's still a big improvement for threaded builds in the general case. The attached code demonstrates the different CHEKs for the same name. While this is unlikely for most use written code, it's a problem for code generated by a pre-processor or translator, common for templating tools. Perhaps adding a per-thread hash of the CHEKs would reduce that duplication. Tony |
From @tonycoz |
From @timbunceOn Tue, Mar 08, 2016 at 06:17:54PM -0800, Ricardo SIGNES via RT wrote:
And a saving per string eval? Presumably related... this is something that dismays me everytime I run strings on a core file: $ perl -MMoose -e 'system("gcore $$ && strings core.$$")'| grep Moose | sort | uniq -c | sort -rn | head Why are there hundreds of copies of those paths? Tim. p.s. That was perl 5, version 20, subversion 2 (v5.20.2) built for x86_64-linux-thread-multi p.p.s. I'm not picking on Moose here, it's just a handy example. $ perl -MMoose -e 'system("gcore $$ && strings core.$$")'| sort | uniq -c | sort -rn |head 2832 copies of "UTF-8"?! |
From @bulk88On Tue Mar 08 21:23:04 2016, tonyc wrote:
Wouldn't "chek_ref_inc" means a CHEK **?
I rewrote Perl_newchek() in the branch into its final version. Only 1 buffer, the final CHEK *, is ever allocated as you suggested.
On Wed Mar 09 15:36:14 2016, tonyc wrote:
I fixed most of that #line problem in commit "dont realloc CopFILE if filename stays the same in "#line 1234 "file.pm""" use Inline C => <<'EOS'; dump_cc_file(); #line 10 "copfileheks.pl" dump_cc_file(); #line 15 "copfileheks.pl" #line 16 "copfileheks.pl" #line 17 "second.pl" #line 18 "copfileheks.pl" cop 00E0F278 0036CEEA t.pl "Perhaps adding a per-thread hash of the CHEKs would reduce that duplication." If I/someone really wants to implement that, there is a way. Current without The other choice for aggressive deduping for random filenames in #line is modify PL_strtab so the HE and HEK aren't the same malloc block anymore, or stuff a flag (turn hent_refcount into a bitfield and steal a bit) or something into the HE that the HEK doesn't follow the HE for that strtab key/hash entry (perhaps NULL for the HEK). This also means that CHEKs can wind up in regular hash HE structs now. There would also have to be a mechanism for upgrading a SHEK in PL_strtab to a CHEK, or detaching a SHEK from PL_strtab and putting a CHEK in its place. The SHEK * could potentially have references from existing HEs, so it can't be dealloced. The HEK * to HEK * comparison HV linked list search code loop in Perl_hv_common might make it impossible to have any "upgrade" concept of a SHEK * to a CHEK *. Another idea is all SHEKs become CHEKs allocated in shared memory pool, SHEKs as they are known today are eliminated on threaded perl. Atomic operations are used on the refcount of the CHEK to avoid contention on OP_REFCNT_LOCK/OP_REFCNT_UNLOCK mutex on every hash entry create/delete operation. PL_strtab stays per interp but the CHEK and PL_strtab HE are 2 different mem blocks now. ithreads would take less memory when they copy the world since hash keys CHEKs are just ++ed during the clone, not new malloc blocks. I like choice 1 if aggressive deduping is a design requirement (every #line in the file has a different filename than the last #line) and I can see it easily being implemented. -- |
From @bulk88On Thu Mar 10 06:45:07 2016, timbo wrote:
Are you using threaded or unthreaded perl? If you are using threaded perl, you are now the 2nd reporter of the "bug" described in this ticket after me ;) -- |
From @timbunceOn Fri, Mar 11, 2016 at 12:05:30PM -0800, bulk88 via RT wrote:
I recall reporting something like this in 2012 around the time I worked on Devel::SizeMe I thought that the specific issue shown up by the Devel::SizeMe Either way, there's still clearly a problem. Tim. |
From @cpansproutOn Fri Mar 11 12:32:20 2016, timbo wrote:
It was ‘fixed’ for a while in blead, but some smokers started crashing. I realised my approach was fundamentally flawed and reverted it. I believe your ticket it still open, but I do not remember the number offhand. -- Father Chrysostomos |
From @bulk88On Thu Mar 10 06:45:07 2016, timbo wrote:
On my CopFILE branch perl (currently SHA-1: 304db02, with commit "dont realloc CopFILE if filename stays the same in "#line 1234 "file.pm"""), threaded, with prog "perl -MMoose -e "sleep 1000"", core dump of it, strings -n16, then grep and sort for all lines containing ".pm", I got the following file which I attached. It seems Moose's Eval::Closure dep is using the filepath part of #line to store arbitrary data (wrong prototype warning message) :-/ It looks like there are 2-4 identical CHEKs ("_<" prefixed strings) being created, and 1 missed COW opportunity for each Eval::Closure sub. I will need to investigate this a bit more. -- |
From @bulk88 -V[:variable] print configuration summary (or a single Config.pm variable) |
From @bulk88On Fri Mar 11 20:30:24 2016, bulk88 wrote:
Not a problem anymore. There are only 2 identical _< strings. One is the CHEK and the other is a SHEK in PL_strtab. I can separate them in a hex core dump by looking at HEK_FLAGS byte, 0x00 is SHEK, and 0x0C (I think) is a CHEK.
Fixed in commit "#line 1234 "bar_filename"'s dbging glob's scalar should be a CHEK COW SV". -- |
From @iabynOn Wed, Mar 02, 2016 at 01:35:22PM -0800, bulk88 via RT wrote:
Ok, I spent most of yesterday looking at the rt125296-wip-COPFILE-threads My biggest problem with the way its currently implemented is that Perl's hash implementation is already notoriously complex, and I would IIUC, most of the filename duplication was caused by copies of names being It's the stuff where "<_foo.pm" is stored as a GV and a GvSV in symbol I would like (if its possible) for the GvSV be be created using *copy* If we end up with two or three copies of a filename stored per compiled If the same implementation can be used on both threaded and non-threaded If CHEKs do end up only being used a refererence-counted shared Also, I would prefer that struct fields like cop_file are declared as Finally, thanks for working on this, and I'm sorry that I disagree with -- |
From @bulk88On Wed Mar 23 04:01:22 2016, davem wrote:
There is no way to introduce a new COW type into SVPVs. The SVPV head and body have no free bits/free permutations/storage to encode anything more. AFAIK SVf_IsCOW && SvLEN == 0 = HEK So using the HEK case is the only way to add a new COW type.
Right. I could try to "clean up" the perl debugger infrastructure and remove "unused" features but I didnt, because Im too afraid to do it due to a lack of familiarity and ability to unit/human test the perl debugging infrastructure after I change it.
That sounds inefficient memory usage wise. __FILE__ badly needs COW CHEKs. caller(), %INC values, and GvSV of the glob also benefit.
With the CHEK design, there will be only 2 copies of the filepath per perl process, one a CHEK and one a SHEK in PL_strtab (the debug glob's HE in main::).
commit "WIP CHEKs on unthreaded perl" does that, I need test unthreaded with CHEKs with -MMoose and IDK what else to try as a benchmark or measurement criteria. There are some provisions that if GCC/POSIX/Whatever atomic APIs are added to perl, some of the current mutexes would be replaces by atomic operations, and then CHEKs and SHEKs are absolutely identical in ++ and -- operations (always do the atomic CPU instruction whether the HEK * is from shared mem pool or per interp pool), only at freeing time does it make a difference which pool to return the ptr to.
I'd still keep the hash number stored in them with binary compatibility with a HEK * but it won't be obvious anymore (IDC, unless you are debugging the CHEK code, you wont know or care where that ptr came from, you just wanna know where perl crashed). Do you have a better name than "CHEK"? It needs to be kinda pronounceable if you say it on stage at a podium. I need the abbreviation and the full length expansion too obviously.
Are you sure you really want to see that as a perl C dev? Its more things to type or click everytime in a C debugger UI to "open" the struct pointer vs just reading the char []. Unthreaded builds require 3 extra clicks to figure out what the PP filename is (GV *) in a crashed/suspended perl proc (curcop), threaded builds are more convenient that its just a char * in the COP vs drilling through a GV head and into the GV body. I can change it so all the perl structs are declared with CHEK *s and not char *s if everyone else wants it that way. -- |
From @timbunceOn Wed, Mar 23, 2016 at 09:41:10AM -0700, bulk88 via RT wrote:
FWIW, I'd be happy enough with reducing hundreds or thousands of copies Tim. |
From @iabynOn Wed, Mar 23, 2016 at 09:41:10AM -0700, bulk88 via RT wrote:
I'm saying that (I think) there's no need to embed SHEKs into SVs: copy Running this code: perl -MMath::BigInt -e'dump' and greppping for the string "lib/Math/BigInt.pm" in the core file, gives 1532 blead I think that counts as good enough, and avoids having to pollute all the
Remember that compiling a source file creates thousands of OPs, SVs, GVs
Pronouncability isn't a major concern. I'd suggest perhaps. struct cop_filestr. It's used so rarely that it probably doesn't even need a typedef. If at some point in the future the need for a ref-counted shared string
-- |
From @iabyn0001-sv_sethek-newSVhek-copy-rather-than-embed.patchFrom e545037d18e839484d1f62b58e8fb71bef43d501 Mon Sep 17 00:00:00 2001
From: David Mitchell <davem@iabyn.com>
Date: Thu, 24 Mar 2016 09:11:23 +0000
Subject: [PATCH] sv_sethek,newSVhek copy rather than embed
---
sv.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/sv.c b/sv.c
index f7c1495..a5f3f95 100644
--- a/sv.c
+++ b/sv.c
@@ -4715,6 +4715,7 @@ Perl_sv_setsv_flags(pTHX_ SV *dstr, SV* sstr, const I32 flags)
HEK * hek = SvSHARED_HEK_FROM_PV(SvPVX_const(sstr));
U8 offset = (Size_t)SvPVX_const(sstr) & 0x3;
SvPV_set(dstr, HEK_KEY(hek)+offset);
+ assert(!(HEK_FLAGS(hek) & HVhek_COMPILING));
if(HEK_FLAGS(hek) & HVhek_COMPILING) {
CHEK * chek = FNPV2CHEK(HEK2FNPV(hek));
chek_inc(chek);
@@ -4837,6 +4838,7 @@ Perl_sv_setsv_cow(pTHX_ SV *dstr, SV *sstr)
{
HEK * hek = SvSHARED_HEK_FROM_PV(SvPVX_const(sstr));
U8 offset = (Size_t)SvPVX_const(sstr) & 0x3;
+ assert(!(HEK_FLAGS(hek) & HVhek_COMPILING));
if (HEK_FLAGS(hek) & HVhek_COMPILING) {
CHEK * chek = FNPV2CHEK(HEK2FNPV(hek));
chek_inc(chek);
@@ -5009,8 +5011,9 @@ Perl_sv_sethek(pTHX_ SV *const sv, const HEK *hek)
sv_usepvn_flags(sv, as_utf8, utf8_len, SV_HAS_TRAILING_NUL);
SvUTF8_on(sv);
return;
- } else if ((flags & (HVhek_COMPILING | HVhek_UNSHARED)) == HVhek_UNSHARED) {
- sv_setpvn(sv, HEK_KEY(hek), HEK_LEN(hek));
+ } else if (flags & HVhek_UNSHARED) {
+ Size_t off = (flags & HVhek_COMPILING) ? 2 : 0; /* skip "_<" */
+ sv_setpvn(sv, HEK_KEY(hek) + off, HEK_LEN(hek) - off);
if (HEK_UTF8(hek))
SvUTF8_on(sv);
else SvUTF8_off(sv);
@@ -5020,6 +5023,7 @@ Perl_sv_sethek(pTHX_ SV *const sv, const HEK *hek)
SV_CHECK_THINKFIRST_COW_DROP(sv);
SvUPGRADE(sv, SVt_PV);
SvPV_free(sv);
+ assert(!(flags & HVhek_COMPILING));
if (flags & HVhek_COMPILING) {
CHEK * chek = FNPV2CHEK(HEK2FNPV(hek));
chek_inc(chek);
@@ -9319,12 +9323,13 @@ Perl_newSVhek(pTHX_ const HEK * hek)
sv_usepvn_flags(sv, as_utf8, utf8_len, SV_HAS_TRAILING_NUL);
SvUTF8_on (sv);
return sv;
- } else if ((flags & (HVhek_COMPILING | HVhek_UNSHARED)) == HVhek_UNSHARED) {
+ } else if (flags & HVhek_UNSHARED) {
/* A hash that isn't using shared hash keys has to have
the flag in every key so that we know not to try to call
share_hek_hek on it. */
- SV * const sv = newSVpvn (HEK_KEY(hek), HEK_LEN(hek));
+ Size_t off = (flags & HVhek_COMPILING) ? 2 : 0; /* skip "_<" */
+ SV * const sv = newSVpvn (HEK_KEY(hek) + off, HEK_LEN(hek) - off);
if (HEK_UTF8(hek))
SvUTF8_on (sv);
return sv;
@@ -9337,6 +9342,7 @@ Perl_newSVhek(pTHX_ const HEK * hek)
new_SV(sv);
sv_upgrade(sv, SVt_PV);
+ assert(!(flags & HVhek_COMPILING));
if (flags & HVhek_COMPILING) {
CHEK * chek = FNPV2CHEK(HEK2FNPV(hek));
chek_inc(chek);
--
2.4.3
|
From @bulk88On Thu Mar 24 05:07:40 2016, davem wrote:
Your patch increased mem usage from 10848 KB to 10944 KB Win32 private bytes for "perl -MMoose -e"sleep 1000"". That is 96 KB more memory to load Moose. I disagree with it, the SHEK must be storable in a SV. 672 KB is "perl -e"sleep 1000"", so (10944-676)/(10848-676)=1.00943767204, you patch causes a 1% increase in memory usage of loading moose.
The copies are permanent. the SV behind __FILE__ token for example. Its 96 KB for Moose to have these extra copies in unique malloc storage.
gv_fetchfile will fetch the same exact glob over and over. This design isn't changing.
I think it does CFSTR (sefster) Cop Filepath String, or COPSTR Cop String?
-- |
This comment has been minimized.
This comment has been minimized.
From @iabynOn Sun, Mar 27, 2016 at 10:01:03PM -0700, bulk88 via RT wrote:
The attached proof-of-concept patch (fails a couple tests) reduces
I have no idea what any of that sentence means. -- |
From @iabyn0001-dedup-__FILE__.patchFrom d6f66bc3abdca6902cefe1e3cb4877452754ef1e Mon Sep 17 00:00:00 2001
From: David Mitchell <davem@iabyn.com>
Date: Mon, 28 Mar 2016 17:33:15 +0100
Subject: [PATCH] dedup __FILE__
---
toke.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/toke.c b/toke.c
index 9f78d5a..f81c51d 100644
--- a/toke.c
+++ b/toke.c
@@ -7206,9 +7206,12 @@ Perl_yylex(pTHX)
}
case KEY___FILE__:
- FUN0OP(
- (OP*)newSVOP(OP_CONST, 0, newSVhek((HEK*)((Size_t)FNPV2HEK(CopFILE(PL_curcop))+2)))
- );
+ {
+ GV *gv = gv_fetchfile_hek((HEK*)FNPV2HEK(CopFILE(PL_curcop)));
+ SV *sv = GvSV(gv);
+ OP *o = (OP*)newSVOP(OP_CONST, 0, SvREFCNT_inc_simple_NN(sv));
+ FUN0OP(o);
+ }
case KEY___LINE__:
FUN0OP(
--
2.4.3
|
From mcarey@ucar.eduOn Mon, Mar 28, 2016 at 10:42 AM, Dave Mitchell <davem@iabyn.com> wrote:
Attempted translation: I think it does need a typedef, e.g. CFSTR (pronounced "sefster"), stands for "Cop Filepath String" or COPSTR, stands for "Cop String" |
Dave Mitchell disagrees with the new COW type being storable inside SVs because "I dont want to see HV code changed (because bulk88's primarily filename COW type is a derivative of a shared hash key infrastructure) because dont fully understand it myself and Ill understand it even less if there are modifications to it". Dave wants, if I understand correct, that the filename buffers, if they are exposed in PP, they will be copied into new SV * as 1 off malloc buffers, rather than any attempt at COW or many SV*s and many not PP exposed structs hold refcnt ownership on the same unique filename malloc buffer. My argument is basing the COWed filename strings off SVs directly is impossible because SVs can't encode any more COW or non-1 to 1 malloced string buffers. I decided on some of the struct names but I am bad at naming structs and wanted someone to agree or disagree on my suggested struct names. Since nobody but me and Dave ever read this ticket, or understands perl guts to comment on it, it stalled. I should finish it one day, its is the fastest to implement and largest memory win in the interp I can think of off the top of my head, other than dropping a member from a core high frequency used struct (automatic 4/8/16 byte win). I stopped working on it, because I didn't understand myself the impact the patch would have on B::CC and cperl and a thought I had for a long time of writing out CV optrees to memory mapped files (B::CC is really badly designed, it generates C files which must be compiled with a platform specific CC, rather than dumping optree linked lists straight to disk with a pointer relocation list) and I didn't understand myself if the SHEK/CHEK struct format can be stored in RO memory or not. AFAIK any attempts to memory map file PP subs/optrees to disk files would require non random hash seeds for largest shared between perl procs memory win. Otherwise all the hash es stored on disk have to have new hash keys written into HEKs and linked lists and HV arrays reoraganized on "serialized module load" into a perl proc. |
Much of what you describe related to serialization is already available via cPanel’s B::C https://github.com/cpanel/perl-compiler. It is not currently available on CPAN primarily because it requires an unthreaded perl with several patches to Perl related to memory. Both of these can be overcome if there is interest from someone besides cPanel. At this point we have made the strings for sheks static but all hash arrays have to be allocated and initialized on startup in order for hash randomization to work correctly. Our conclusion at this point is that B::CC will never be achievable without essentially re-implementing libperl. What would be more useful is if a module at CHECK did peephole-ish optimizations to the op tree before it was handed to B::C. It could take as long as it likes since the code is serialized for rapid re-run later. This would also be super useful for daemonized processes like catalyst, dancer, mojo, etc. who don’t care (within reason) how slow a process is to compile. We are available on IRC if you would like to learn more about what has already been done for B::C |
Migrated from rt.perl.org#125296 (status was 'open')
Searchable as RT125296$
The text was updated successfully, but these errors were encountered: