Skip Menu |

Subject: document/publicize THINKFIRST
Date: Mon, 25 Feb 2013 11:05:50 -0500
To: perlbug [...] perl.org
From: Ricardo Signes <rjbs [...] cpan.org>
Download (untitled) / with headers
text/plain 440b
A number of things broken by the new COW work were answered with, "By the way, you need to use THINKFIRST there." This, in turn, was followed by, "What? That's undocumented and how was I supposed to know that?" The third line in this exchange, if any, was "Welcome to the perl guts!" It's true enough, but if it's going to be needed, let's get this documented so we can point people at the documentation when the code breaks. -- rjbs
Download signature.asc
application/pgp-signature 490b

Message body not shown because it is not plain text.

Subject: Re: [perl #116925] document/publicize THINKFIRST
Date: Mon, 25 Feb 2013 10:32:53 -0600
To: perl5-porters [...] perl.org
From: Jesse Luehrs <doy [...] tozt.net>
Download (untitled) / with headers
text/plain 676b
On Mon, Feb 25, 2013 at 08:06:11AM -0800, Ricardo SIGNES wrote: Show quoted text
> A number of things broken by the new COW work were answered with, "By the way, > you need to use THINKFIRST there." This, in turn, was followed by, "What? > That's undocumented and how was I supposed to know that?" The third line in > this exchange, if any, was "Welcome to the perl guts!" > > It's true enough, but if it's going to be needed, let's get this documented so > we can point people at the documentation when the code breaks.
Also, the core typemap for char* (not const char*) needs to be updated to use THINKFIRST, since apparently people expect it to produce a writable string buffer. -doy
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 1019b
On Mon Feb 25 08:33:29 2013, doy@tozt.net wrote: Show quoted text
> > Also, the core typemap for char* (not const char*) needs to be updated > to use THINKFIRST, since apparently people expect it to produce a > writable string buffer. > > -doy >
perlxs and perlxstypemap needs to be fixed. It uses "bool_t rpcb_gettime(const char *host, time_t *timep);" as an example, but in the XS code it uses "char *" everywhere. It needs to be written in perlguts or perlxs to advise the reader to always use const char * and SvPV_const() if possible for performance reasons. Also a new T-* type needs to be created for non-de-cowed PVs. I randomly throw out T_CPV as a suggestion. I am not sure what exactly needs to be added to perlguts. As a separate issue, "Memory Allocation" in perlguts should mention sv_usepvn_flags exists, and the pod for sv_usepvn_flags needs to say it only takes "Newx" memory never any kind of other memory like "malloc" memory. This ticket also needs to be linked to #116569. -- bulk88 ~ bulk88 at hotmail.com
CC: bugs-bitbucket [...] rt.perl.org
Subject: Re: [perl #116925] document/publicize THINKFIRST
Date: Thu, 21 Mar 2013 13:15:00 +0100
To: perl5-porters [...] perl.org
From: Leon Timmermans <fawaka [...] gmail.com>
Download (untitled) / with headers
text/plain 793b
On Mon, Feb 25, 2013 at 5:06 PM, Ricardo SIGNES <perlbug-followup@perl.org> wrote: Show quoted text
> A number of things broken by the new COW work were answered with, "By the way, > you need to use THINKFIRST there." This, in turn, was followed by, "What? > That's undocumented and how was I supposed to know that?" The third line in > this exchange, if any, was "Welcome to the perl guts!" > > It's true enough, but if it's going to be needed, let's get this documented so > we can point people at the documentation when the code breaks.
Father C already documented SV_THINKFIRST in a11eaecb, though I think he was too optimistic (it's not just needed on SvPVX, but on any SvPV* write) and not marked as API. SV_CHECK_THINKFIRST and SV_CHECK_THINKFIRST_COW_DROP still need to be documented though. Leon
CC: perl5-porters [...] perl.org, bugs-bitbucket [...] rt.perl.org
Subject: Re: [perl #116925] document/publicize THINKFIRST
Date: Fri, 22 Mar 2013 16:56:26 +0000
To: Leon Timmermans <fawaka [...] gmail.com>
From: Dave Mitchell <davem [...] iabyn.com>
On Thu, Mar 21, 2013 at 01:15:00PM +0100, Leon Timmermans wrote: Show quoted text
> On Mon, Feb 25, 2013 at 5:06 PM, Ricardo SIGNES > <perlbug-followup@perl.org> wrote:
> > A number of things broken by the new COW work were answered with, "By the way, > > you need to use THINKFIRST there." This, in turn, was followed by, "What? > > That's undocumented and how was I supposed to know that?" The third line in > > this exchange, if any, was "Welcome to the perl guts!" > > > > It's true enough, but if it's going to be needed, let's get this documented so > > we can point people at the documentation when the code breaks.
> > Father C already documented SV_THINKFIRST in a11eaecb, though I think > he was too optimistic (it's not just needed on SvPVX, but on any SvPV* > write) and not marked as API. SV_CHECK_THINKFIRST and > SV_CHECK_THINKFIRST_COW_DROP still need to be documented though.
Also, Its not just adding API documentation. The xs*.pod and/or perlguts docs need updating so people know they need to use these. -- You never really learn to swear until you learn to drive.
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 252b
I have moved this from 5.18 to 5.20 blockers. It sounds like this needs more than mere documentation, and we're not going to gain much by documenting it now, encouraging its uptake, then changing how one must use it or its future analogue. -- rjbs
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 460b
On Tue Mar 26 13:55:32 2013, rjbs wrote: Show quoted text
> I have moved this from 5.18 to 5.20 blockers. > > It sounds like this needs more than mere documentation, and we're not > going to gain much by > documenting it now, encouraging its uptake, then changing how one must > use it or its future > analogue.
Also, if I’m not mistaken, sv_force_normal *is* already in the API, so we don’t actually *need* to ‘expose’ SV_CHECK_THINKFIRST. -- Father Chrysostomos
CC: perl5-porters [...] perl.org
Subject: Re: [perl #116925] document/publicize THINKFIRST
Date: Wed, 27 Mar 2013 01:35:24 +0100
To: perlbug-followup [...] perl.org
From: Leon Timmermans <fawaka [...] gmail.com>
Download (untitled) / with headers
text/plain 376b
On Tue, Mar 26, 2013 at 9:55 PM, Ricardo SIGNES via RT <perlbug-followup@perl.org> wrote: Show quoted text
> I have moved this from 5.18 to 5.20 blockers. > > It sounds like this needs more than mere documentation, and we're not going to gain much by > documenting it now, encouraging its uptake, then changing how one must use it or its future > analogue.
But what to do about the typemap?
Download (untitled) / with headers
text/plain 833b
On Tue Mar 26 17:13:22 2013, sprout wrote: Show quoted text
> > Also, if I’m not mistaken, sv_force_normal *is* already in the API, so > we don’t actually *need* to ‘expose’ SV_CHECK_THINKFIRST. >
And why unconditionally call a function that isn't needed? Show quoted text
______________________________________________________________________ #if PERL_BCDVERSION >= 0x5007001 # define PREP_SV_SET(sv) if(SvTHINKFIRST((sv))) sv_force_normal_flags((sv), SV_COW_DROP_PV) #else # define PREP_SV_SET(sv) if(SvTHINKFIRST((sv))) sv_force_normal((sv)) #endif
______________________________________________________________________ later on
______________________________________________________________________ PREP_SV_SET(TARG); SvOK_off(TARG);
______________________________________________________________________ -- bulk88 ~ bulk88 at hotmail.com
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 2.7k
On Tue Mar 26 20:53:12 2013, bulk88 wrote: Show quoted text
> On Tue Mar 26 17:13:22 2013, sprout wrote:
> > > > Also, if I’m not mistaken, sv_force_normal *is* already in the API, so > > we don’t actually *need* to ‘expose’ SV_CHECK_THINKFIRST. > >
> > And why unconditionally call a function that isn't needed? > > ______________________________________________________________________ > #if PERL_BCDVERSION >= 0x5007001 > # define PREP_SV_SET(sv) if(SvTHINKFIRST((sv))) > sv_force_normal_flags((sv), SV_COW_DROP_PV) > #else > # define PREP_SV_SET(sv) if(SvTHINKFIRST((sv))) sv_force_normal((sv)) > #endif > ______________________________________________________________________ > later on > ______________________________________________________________________ > PREP_SV_SET(TARG); > SvOK_off(TARG); > ______________________________________________________________________
^ Accidental post /|\ | And why unconditionally call a function that isn't needed? This is what I figured out is how to make an existing SV undef-enough to write directly into. SvUPGRADE will also be required but isn't listed here. Show quoted text
______________________________________________________________________ #if PERL_BCDVERSION >= 0x5007001 # define PREP_SV_SET(sv) if(SvTHINKFIRST((sv))) sv_force_normal_flags((sv), SV_COW_DROP_PV) #else # define PREP_SV_SET(sv) if(SvTHINKFIRST((sv))) sv_force_normal((sv)) #endif
______________________________________________________________________ later on
______________________________________________________________________ PREP_SV_SET(sv); SvOK_off(sv);
______________________________________________________________________ Some dont care about the contents of a COW SV and want to release it. Others want to use the contents but also overwrite it. There are 2 different features. Also, SvOK_off may call sv_backoff(), (force_normal wont) which is also confusion, since OOK is another form of trickery. In a number of places SV_CHECK_THINKFIRST_COW_DROP is followed exactly by, or often (not all branches), SvOK_off. SvOK_off, SV_CHECK_THINKFIRST_COW_DROP, and SvUPGRADE are all flag test and maybe func call macros. Maybe the 3 should be merged into 1 or 2 flag tests (2 meaning flags and sv type num), and 1 func with a flags bitfield (extend force_normal to also do upgrades and mathom sv_upgrade(). All 3 macros would call (made up name) sv_force_anywayuwant_flags under the hood with a flags argument (1 char devoted to new requested type, 0x00 (SVt_NULL, can't upgrade to that right?) or 0xff mean no upgrade happens, other 3 bytes control want to write to PV, want to replace, de-cow, cow drop, de-offset, get/set magic calls during emptying of SV, de-rv/glob/version, die on non-COW RO, etc (my list is too large to be sane and some things are the same)). -- bulk88 ~ bulk88 at hotmail.com
CC: perl5-porters [...] perl.org
Subject: Re: [perl #116925] document/publicize THINKFIRST
Date: Mon, 6 May 2013 16:11:36 +0100
To: Father Chrysostomos via RT <perlbug-followup [...] perl.org>
From: Dave Mitchell <davem [...] iabyn.com>
Download (untitled) / with headers
text/plain 1.3k
On Tue, Mar 26, 2013 at 05:13:22PM -0700, Father Chrysostomos via RT wrote: Show quoted text
> On Tue Mar 26 13:55:32 2013, rjbs wrote:
> > I have moved this from 5.18 to 5.20 blockers. > > > > It sounds like this needs more than mere documentation, and we're not > > going to gain much by > > documenting it now, encouraging its uptake, then changing how one must > > use it or its future > > analogue.
> > Also, if I’m not mistaken, sv_force_normal *is* already in the API, so > we don’t actually *need* to ‘expose’ SV_CHECK_THINKFIRST.
Also, since the thing that triggered this discussion originally was "how to make it safe under COW to directly modify the SvPVX buffer", note that SV_CHECK_THINKFIRST etc don't actually guarantee you a writeable SvPVX; for that you need SvPV_force (which does the equivalent of SV_CHECK_THINKFIRST, but then in addition converts a IV/NV/RV etc into an equivalent string value). Since SvPV_force is already API, and is already documented (since 2000 or earlier) as "You want force if you are going to update the SvPVX directly", I think that's enough to keep us going till 5.18. See also my recent comments in ticket RT #116407 about improving the docs for SvPV to make it clear you can't just trample on PVX, and need SvPV_force instead. -- Art is anything that has a label (especially if the label is "untitled 1")
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 2.5k
On Tue Mar 26 13:55:32 2013, rjbs wrote: Show quoted text
> I have moved this from 5.18 to 5.20 blockers. > > It sounds like this needs more than mere documentation, and we're not > going to gain much by > documenting it now, encouraging its uptake, then changing how one must > use it or its future > analogue.
Beating this dead horse which is a blocker for 5.20: What might need to be done: - neither perlxs nor perlxstut discuss modifying (or even reading) the PV part of a scalar directly, but since they're about *XS* I'm not sure they should - L<perlguts/Working with SVs> should probably discuss how to work with the SV's buffer, including SvPV_force(), SvPVbyte_force(), SvPVutf8_force(), SvGROW(). I'm not sure if this belongs after the SvGROW discussion, or after the sv_cat*() discussion. - do we need a typemap entry? I can think of two cases where you're writing back to the SV: a) the caller provides a fixed length buffer, in which case you need your own typemap entry (or inline code) to verify/force the buffer size b) the caller provides a variable length buffer For case b) in most cases I think you'd also need the length (remember, it's writable, no-one does gets() type interfaces anymore, right?), which requires inline code. So the only case that doesn't need the length (and hence inline code or a custom typemap) is a gets() style interface, for which I'm inclined to let people shoot their own feet off. The only real way I could see doing it would be to provide our own custom typedefs: typedef unsigned char WRITABLE_BYTE; typedef unsigned char WRITABLE_UTF8; typedef unsigned char WRITABLE_DONTCARE; /* for the wild ones */ and create typemaps for WRITABLE_BYTE etc: WRITEABLE_BYTE * T_WRITEABLE_BYTE WRITEABLE_UTF8 * T_WRITEABLE_UTF8 WRITEABLE_DONTCARE * T_WRITEABLE_DONTCARE ... T_WRITEABLE_BYTE $var = SvPVbyte_force_nolen($arg); // no such macro T_WRITEABLE_UTF8 $var = SvPVutf8_force_nolen($arg); // no such macro T_WRITEABLE_BYTE $var = SvPV_force_nolen($arg); What doesn't need to be done: - SvPV() documents that you should use SvPV_force() if you want to write to the PV string. Possibly this could be made clearer (making the mention of SvPV_force() a new paragraph might do it) - SvPV_force() documents that the PV string is writable after calling it. - under Copy on Write, perlguts documents that you need sv_force_normal() or SvPV_force_nolen() to make the SV non-COW. Hopefully that makes some sense. Tony
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 3.2k
On Tue Feb 04 20:24:36 2014, tonyc wrote: Show quoted text
> On Tue Mar 26 13:55:32 2013, rjbs wrote:
> > I have moved this from 5.18 to 5.20 blockers. > > > > It sounds like this needs more than mere documentation, and we're not > > going to gain much by > > documenting it now, encouraging its uptake, then changing how one must > > use it or its future > > analogue.
> > Beating this dead horse which is a blocker for 5.20: > > What might need to be done: > > - neither perlxs nor perlxstut discuss modifying (or even > reading) the PV part of a scalar directly, but since they're > about *XS* I'm not sure they should > > - L<perlguts/Working with SVs> should probably discuss how to > work with the SV's buffer, including SvPV_force(), > SvPVbyte_force(), SvPVutf8_force(), SvGROW(). I'm not sure if > this belongs after the SvGROW discussion, or after the > sv_cat*() discussion. > > - do we need a typemap entry? I can think of two cases where > you're writing back to the SV: > > a) the caller provides a fixed length buffer, in which case > you need your own typemap entry (or inline code) to > verify/force the buffer size > > b) the caller provides a variable length buffer > > For case b) in most cases I think you'd also need the > length (remember, it's writable, no-one does gets() type > interfaces anymore, right?), which requires inline code. > > So the only case that doesn't need the length (and hence > inline code or a custom typemap) is a gets() style interface, > for which I'm inclined to let people shoot their own feet off. > > The only real way I could see doing it would be to provide our > own custom typedefs: > > typedef unsigned char WRITABLE_BYTE; > typedef unsigned char WRITABLE_UTF8; > typedef unsigned char WRITABLE_DONTCARE; /* for the wild ones */ > > and create typemaps for WRITABLE_BYTE etc: > > WRITEABLE_BYTE * T_WRITEABLE_BYTE > WRITEABLE_UTF8 * T_WRITEABLE_UTF8 > WRITEABLE_DONTCARE * T_WRITEABLE_DONTCARE > ... > T_WRITEABLE_BYTE > $var = SvPVbyte_force_nolen($arg); // no such macro > > T_WRITEABLE_UTF8 > $var = SvPVutf8_force_nolen($arg); // no such macro > > T_WRITEABLE_BYTE > $var = SvPV_force_nolen($arg); > > What doesn't need to be done: > > - SvPV() documents that you should use SvPV_force() if you want > to write to the PV string. Possibly this could be made > clearer (making the mention of SvPV_force() a new paragraph > might do it) > > - SvPV_force() documents that the PV string is writable after > calling it. > > - under Copy on Write, perlguts documents that you need > sv_force_normal() or SvPV_force_nolen() to make the SV > non-COW. > > Hopefully that makes some sense. > > Tony
The problem is SvPV_force_nolen and SvPVbyte_force_nolen do something totally useless if you just want to write into them. If I am going to just write, why do I want to A. copy the COW contents to a new non-COW block, B. convert encodings, then write over it? I want a uninitialized PV. sv_force_normal() copies the COW data over. Some people want R/W buffers to manipulate in place the data too. -- bulk88 ~ bulk88 at hotmail.com
To: perl5-porters [...] perl.org
Date: 7 Feb 2014 00:06:42 -0000
Subject: Re: [perl #116925] document/publicize THINKFIRST
From: Father Chrysostomos <sprout [...] cpan.org>
Download (untitled) / with headers
text/plain 326b
Tony Cook wrote: Show quoted text
> Beating this dead horse which is a blocker for 5.20: > > What might need to be done:
The Copy on Write section added to perlguts was meant to address this. I think it is sufficient that this ticket no longer needs to be a blocker, but I do not mean to imply that your suggestions should not be followed,
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 715b
On Wed Feb 05 01:57:24 2014, bulk88 wrote: Show quoted text
> The problem is SvPV_force_nolen and SvPVbyte_force_nolen do something > totally useless if you just want to write into them. If I am going to > just write, why do I want to A. copy the COW contents to a new non-COW > block, B. convert encodings, then write over it? I want a > uninitialized PV. sv_force_normal() copies the COW data over. Some > people want R/W buffers to manipulate in place the data too.
I think: sv_force_normal_flags(sv, SV_COW_DROP_PV); SvUPGRADE(sv, SVt_PV); // SvGROW etc is the answer there, though the name of the flag leaves something to desire, since it doesn't just drop the PV for COW, but also for REGEXP and GLOB arguments. Tony
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 351b
On Tue Feb 04 20:24:36 2014, tonyc wrote: Show quoted text
> - L<perlguts/Working with SVs> should probably discuss how to > work with the SV's buffer, including SvPV_force(), > SvPVbyte_force(), SvPVutf8_force(), SvGROW(). I'm not sure if > this belongs after the SvGROW discussion, or after the > sv_cat*() discussion.
Attached a potential patch. Tony
Subject: 0001-perl-116925-discuss-modifying-an-SV-s-buffer.patch
From 25598ee4bac31536cab2363f4bc3c70c716f1d8e Mon Sep 17 00:00:00 2001 From: Tony Cook <tony@develop-help.com> Date: Wed, 26 Mar 2014 16:18:08 +1100 Subject: [perl #116925] discuss modifying an SV's buffer in the same place we otherwise talk about SVs --- pod/perlguts.pod | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/pod/perlguts.pod b/pod/perlguts.pod index 3fb5137..db8548a 100644 --- a/pod/perlguts.pod +++ b/pod/perlguts.pod @@ -159,6 +159,38 @@ decrease, the allocated memory of an SV and that it does not automatically add space for the trailing NUL byte (perl's own string functions typically do C<SvGROW(sv, len + 1)>). +If you want to write to an existing SV's buffer and set its value to a +string, use SvPV_force() or one of its variants to force the SV to be +a PV. This will remove any of various types of non-stringness from +the SV, including copy-on-write and being a reference. This can be +used to implement something like read(): + + (void)SvPVbyte_force(sv, len); + s = SvGROW(sv, offset + needlen + 1); + /* something that modifies up to needlen bytes at s, but modifies newlen + eg. newlen = read(fd, s + offset. needlen); + ignoring errors for these examples + */ + s[offset + newlen] = '\0'; + SvCUR_set(sv, offset + newlen); + SvUTF8_off(sv); + SvSETMAGIC(sv); + +If you don't need the existing content of the SV, you can avoid some +copying with: + + if (SvTHINKFIRST(sv)) + sv_force_normal_flags(sv, SV_COW_DROP_PV); + SvUPGRADE(sv, SVt_PV); + s = SvGROW(sv, needlen + 1); + /* something that modifies up to needlen bytes at s, but modifies newlen + eg. newlen = read(fd, s. needlen); + */ + s[newlen] = '\0'; + SvCUR_set(sv, newlen); + SvPOK_only(sv); /* also clears SVf_UTF8 */ + SvSETMAGIC(sv); + If you have an SV and want to know what kind of data Perl thinks is stored in it, you can use the following macros to check the type of SV you have. -- 1.7.10.4
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 110b
I'd like to see this reviewed and (if approved) applied, but I've removed this ticket from blocking. -- rjbs
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 437b
On Mon Feb 25 10:38:49 2013, bulk88 wrote: Show quoted text
> > As a separate issue, "Memory Allocation" in perlguts should mention > sv_usepvn_flags exists, and the pod for sv_usepvn_flags needs to say it > only takes "Newx" memory never any kind of other memory like "malloc" > memory. >
This pod issue is still a problem in blead. I was 1/4 done writing a new ticket for it before I found this old post of mine. -- bulk88 ~ bulk88 at hotmail.com
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 2.4k
On Tue Mar 25 22:19:43 2014, tonyc wrote: Show quoted text
> On Tue Feb 04 20:24:36 2014, tonyc wrote:
> > - L<perlguts/Working with SVs> should probably discuss how to > > work with the SV's buffer, including SvPV_force(), > > SvPVbyte_force(), SvPVutf8_force(), SvGROW(). I'm not sure if > > this belongs after the SvGROW discussion, or after the > > sv_cat*() discussion.
> > Attached a potential patch. > > Tony
+If you want to write to an existing SV's buffer and set its value to a +string, use SvPV_force() or one of its variants to force the SV to be +a PV. This will remove any of various types of non-stringness from +the SV, including copy-on-write and being a reference. This can be +used to implement something like read(): + + (void)SvPVbyte_force(sv, len); + s = SvGROW(sv, offset + needlen + 1); WHy the "(void)SvPVbyte_force(sv, len);" line? Won't SvGROW drop the COW and RVs? +If you don't need the existing content of the SV, you can avoid some +copying with: + + if (SvTHINKFIRST(sv)) + sv_force_normal_flags(sv, SV_COW_DROP_PV); + SvUPGRADE(sv, SVt_PV); + s = SvGROW(sv, needlen + 1); Maybe the API needs to be reworks. WOuldn't sv_force_normal_flags create a SVPV without a buffer already? So why the SvUPGRADE? I Kno the SvUPGRADE is protection so the SvGROW doesn't segv because its an SVIV with no body, only a head. Instread of SvUPGRADE(sv, SVt_PV); SvGROW(sv, needlen + 1); try this ---------------------- if(SvTYPE(sv) >= SVt_PV) { if( SvIsCOW(sv) ) { needgrow: buffer = sv_grow(sv,len); } else if (SvLEN(sv) < (len) { goto needgrow; } else { buffer = SvPVX(sv); } } else { goto needgrow; } ----------------------- +If you want to write to an existing SV's buffer and set its value to a +string, use SvPV_force() or one of its variants to force the SV to be +a PV. This will remove any of various types of non-stringness from +the SV, including copy-on-write and being a reference. This can be +used to implement something like read(): Is this for read buffer, then you can then write to it? Where the buffer will be read, and conditionally written to in the same C library call? If so it needs to explain better that this produces the Perl level contents for you to read and write to. And this does not supply uninit memory, but only initiated memory. Use "If you want to read and write to an string" . also mention " If you are ONLY WRITTING, see the next paragraph for how to do it faster." -- bulk88 ~ bulk88 at hotmail.com
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 2.7k
On Wed Apr 16 02:47:11 2014, bulk88 wrote: Show quoted text
> On Tue Mar 25 22:19:43 2014, tonyc wrote: > +If you want to write to an existing SV's buffer and set its value to > a > +string, use SvPV_force() or one of its variants to force the SV to be > +a PV. This will remove any of various types of non-stringness from > +the SV, including copy-on-write and being a reference. This can be > +used to implement something like read(): > + > + (void)SvPVbyte_force(sv, len); > + s = SvGROW(sv, offset + needlen + 1); > > WHy the "(void)SvPVbyte_force(sv, len);" line? Won't SvGROW drop the > COW and RVs?
SvGROW() cleans those up, but not some other forms of specialness. sv_force_normal_flags() called by SvPV_force() variants will also check SvREADONLY(). Show quoted text
> +If you don't need the existing content of the SV, you can avoid some > +copying with: > + > + if (SvTHINKFIRST(sv)) > + sv_force_normal_flags(sv, SV_COW_DROP_PV); > + SvUPGRADE(sv, SVt_PV); > + s = SvGROW(sv, needlen + 1); > > Maybe the API needs to be reworks. WOuldn't sv_force_normal_flags > create a SVPV without a buffer already? So why the SvUPGRADE? I Kno > the SvUPGRADE is protection so the SvGROW doesn't segv because its an > SVIV with no body, only a head. Instread of SvUPGRADE(sv, SVt_PV); > SvGROW(sv, needlen + 1); try this > > > ---------------------- > if(SvTYPE(sv) >= SVt_PV) { > if( SvIsCOW(sv) ) { > needgrow: > buffer = sv_grow(sv,len); > } else if (SvLEN(sv) < (len) { > goto needgrow; > } else { > buffer = SvPVX(sv); > } > } else { > goto needgrow; > } > -----------------------
sv_force_normal_flags() only removes specialness, for unspecial SVs (like an undef SV) it does nothing. So replacing those 3 lines with your if statement would be incorrect. This code was based on Perl_sv_setpvn(). Show quoted text
> +If you want to write to an existing SV's buffer and set its value to > a > +string, use SvPV_force() or one of its variants to force the SV to be > +a PV. This will remove any of various types of non-stringness from > +the SV, including copy-on-write and being a reference. This can be > +used to implement something like read(): > > > Is this for read buffer, then you can then write to it? Where the > buffer will be read, and conditionally written to in the same C > library call? If so it needs to explain better that this produces the > Perl level contents for you to read and write to. And this does not > supply uninit memory, but only initiated memory. Use "If you want to > read and write to an string" . also mention " If you are ONLY > WRITTING, see the next paragraph for how to do it faster."
You're right, it needs to explain that the content is readable. As to only writing, I think the proximity with the other paragraph does that. Tony
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 5.4k
On Wed Apr 16 15:17:05 2014, tonyc wrote: Show quoted text
> On Wed Apr 16 02:47:11 2014, bulk88 wrote:
> > On Tue Mar 25 22:19:43 2014, tonyc wrote: > > +If you want to write to an existing SV's buffer and set its value to > > a > > +string, use SvPV_force() or one of its variants to force the SV to > > be > > +a PV. This will remove any of various types of non-stringness from > > +the SV, including copy-on-write and being a reference. This can be > > +used to implement something like read(): > > + > > + (void)SvPVbyte_force(sv, len); > > + s = SvGROW(sv, offset + needlen + 1); > > > > WHy the "(void)SvPVbyte_force(sv, len);" line? Won't SvGROW drop the > > COW and RVs?
> > SvGROW() cleans those up, but not some other forms of specialness. > sv_force_normal_flags() called by SvPV_force() variants will also > check > SvREADONLY().
Ok, yor're right about RO flaged. Another issue, why "SvPVbyte_force" and not "SvPV_force"? Show quoted text
> > sv_force_normal_flags() only removes specialness, for unspecial SVs > (like an undef SV) it does nothing. > > So replacing those 3 lines with your if statement would be incorrect. >
There was a misunderstanding or I wasn't clear. I'll rephrase it. Replace if (SvTHINKFIRST(sv)) sv_force_normal_flags(sv, SV_COW_DROP_PV); SvUPGRADE(sv, SVt_PV); s = SvGROW(sv, needlen + 1); with if (SvTHINKFIRST(sv)) sv_force_normal_flags(sv, SV_COW_DROP_PV); if(SvTYPE(sv) >= SVt_PV) { if( SvIsCOW(sv) ) { needgrow: buffer = sv_grow(sv,len); } else if (SvLEN(sv) < (len) { goto needgrow; } else { buffer = SvPVX(sv); } } else { goto needgrow; } 1 remaining issue, that I dont think POD/docs can fix is, something like Show quoted text
> > + (void)SvPVbyte_force(sv, len); > > + s = SvGROW(sv, offset + needlen + 1);
can perform redundant ascii/utf8 conversion and stringifys on a buffer that will be tossed. Here is an unnecessary stringify example with SvPV_force. ---------------------------- void t1(SV* sv) PPCODE: STRLEN len; SvPV_force(sv, len); ---------------------------- C:\sources\>perl -MDevel::Peek=Dump -ML::XS -E"my $i; my $a = \$i; L::XS::t1($a); Dump($a); SV = PVIV(0x8ff9ec) at 0x8fc40c REFCNT = 1 FLAGS = (PADMY,POK,pPOK) IV = 0 PV = 0x8fe86c "SCALAR(0x8fc45c)"\0 CUR = 16 LEN = 20 ---------------------------- perl519.dll!Perl_sv_2pv_flags(interpreter * my_perl=0x00364e9c, sv * const sv=0x00000006, unsigned int * const lp=0x0012fc68, const long flags=0) Line 2885 C perl519.dll!Perl_sv_pvn_force_flags(interpreter * my_perl=0x00364e9c, sv * const sv=0x008fc404, unsigned int * const lp=0x0012fc80, const long flags=2) Line 9498 + 0xf C XS.dll!XS_L__XS_t1(interpreter * my_perl=0x00364e9c, cv * cv=0x0093ff14) Line 2957 + 0x1d C perl519.dll!Perl_pp_entersub(interpreter * my_perl=0x00000000) Line 2794 C perl519.dll!Perl_runops_standard(interpreter * my_perl=0x00364e9c) Line 42 + 0x4 C perl519.dll!S_run_body(interpreter * my_perl=0x00000000, long oldscope=1) Line 2449 + 0xa C perl519.dll!perl_run(interpreter * my_perl=0x00364e9c) Line 2365 + 0x8 C perl519.dll!RunPerl(int argc=3, char * * argv=0x01364c68, char * * env=0x00363328) Line 259 + 0x6 C perl.exe!mainCRTStartup() Line 398 + 0xe C kernel32.dll!_BaseProcessStart@4() + 0x23 ---------------------------- Newx(buffer, len, char); buffer_end = retval = buffer + len; /* Working backwards */ *--retval = '\0'; *--retval = ')';<<<<<<<<<<<<<<<<<<<<<<<<LINE 2885 in sv.c do { *--retval = PL_hexdigit[addr & 15]; } while (addr >>= 4); *--retval = 'x'; *--retval = '0'; *--retval = '('; retval -= typelen; ------------------------------------- This stringifying by SvPV_force matches its docs. ------------------------------------- /* =for apidoc Am|char*|SvPV_force|SV* sv|STRLEN len Like C<SvPV> but will force the SV into containing a string (C<SvPOK>), and only a string (C<SvPOK_only>), by hook or by crook. You need force if you are going to update the C<SvPVX> directly. Processes get magic. Note that coercing an arbitrary scalar into a plain PV will potentially strip useful data from it. For example if the SV was C<SvROK>, then the referent will have its reference count decremented, and the SV itself may be converted to an C<SvPOK> scalar with a string buffer containing a value such as C<"ARRAY(0x1234)">. ------------------------------------- Then again this proposed wording \|/ Show quoted text
> > On Tue Mar 25 22:19:43 2014, tonyc wrote: > > +If you want to write to an existing SV's buffer and set its value to > > a > > +string, use SvPV_force() or one of its variants to force the SV to > > be > > +a PV. This will remove any of various types of non-stringness from > > +the SV, including copy-on-write and being a reference. This can be > > +used to implement something like read(): > > + > > + (void)SvPVbyte_force(sv, len); > > + s = SvGROW(sv, offset + needlen + 1);
is about having valid read data. I think having s = SvGROW(sv, offset + needlen + 1); after (void)SvPVbyte_force(sv, len); is strange and should be removed for the "we want read and write data" instructions. Wouldnt SvGROW have added uninit memory to the end of the readable valid data, and depending if the user/dev is bright enough realize what he did (mixing up expression "len" with "offset + needlen + 1"), tell some C library to parse/translate uninit memory at the end of the post-COW buffer? -- bulk88 ~ bulk88 at hotmail.com
From: Tony Cook <tony [...] develop-help.com>
To: bulk88 via RT <perlbug-followup [...] perl.org>
Date: Thu, 17 Apr 2014 11:50:41 +1000
Subject: Re: [perl #116925] document/publicize THINKFIRST
CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 4.4k
On Wed, Apr 16, 2014 at 04:13:24PM -0700, bulk88 via RT wrote: Show quoted text
> On Wed Apr 16 15:17:05 2014, tonyc wrote:
> > On Wed Apr 16 02:47:11 2014, bulk88 wrote:
> > > On Tue Mar 25 22:19:43 2014, tonyc wrote: > > > +If you want to write to an existing SV's buffer and set its value to > > > a > > > +string, use SvPV_force() or one of its variants to force the SV to > > > be > > > +a PV. This will remove any of various types of non-stringness from > > > +the SV, including copy-on-write and being a reference. This can be > > > +used to implement something like read(): > > > + > > > + (void)SvPVbyte_force(sv, len); > > > + s = SvGROW(sv, offset + needlen + 1); > > > > > > WHy the "(void)SvPVbyte_force(sv, len);" line? Won't SvGROW drop the > > > COW and RVs?
> > > > SvGROW() cleans those up, but not some other forms of specialness. > > sv_force_normal_flags() called by SvPV_force() variants will also > > check > > SvREADONLY().
> > Ok, yor're right about RO flaged. > > Another issue, why "SvPVbyte_force" and not "SvPV_force"?
That, and the other UTF8 mentions are there to get the reader to think about about unicode handling. Using a simple SvPV_force() with my commented out example case would simply have been incorrect, since read(1) returns octets, with little reason to expect that it's utf-8 encoded. (Which is why using :utf8 is bad.) Show quoted text
> > > > sv_force_normal_flags() only removes specialness, for unspecial SVs > > (like an undef SV) it does nothing. > > > > So replacing those 3 lines with your if statement would be incorrect. > >
> > There was a misunderstanding or I wasn't clear. I'll rephrase it.
I looked at your quote of my code instead of your prose, my fault. Show quoted text
> Replace > > if (SvTHINKFIRST(sv)) > sv_force_normal_flags(sv, SV_COW_DROP_PV); > SvUPGRADE(sv, SVt_PV); > s = SvGROW(sv, needlen + 1); > > with > > if (SvTHINKFIRST(sv)) > sv_force_normal_flags(sv, SV_COW_DROP_PV); > if(SvTYPE(sv) >= SVt_PV) { > if( SvIsCOW(sv) ) {
It can't be COW at this point, sv_force_normal_flags() removes it. Show quoted text
> needgrow: > buffer = sv_grow(sv,len); > } else if (SvLEN(sv) < (len) { > goto needgrow; > } else { > buffer = SvPVX(sv); > } > } else { > goto needgrow; > }
I could see something like that as a new macro or function. I don't think I'd want to ask XS authors to inline that code manually though. Show quoted text
> > 1 remaining issue, that I dont think POD/docs can fix is, something like >
> > > + (void)SvPVbyte_force(sv, len); > > > + s = SvGROW(sv, offset + needlen + 1);
> > can perform redundant ascii/utf8 conversion and stringifys on a buffer that will be tossed. Here is an unnecessary stringify example with SvPV_force.
If the caller supplies a reference or undef or something else that isn't a string the author could add code to reject them. If they don't want any part of the the content, they can skip the SvPV_force() and use the alternate mechanism. The SvGROW() may end up copying that generated content, but I don't think I can avoid it without making the sample even more complex. Show quoted text
> Then again this proposed wording \|/ >
> > > On Tue Mar 25 22:19:43 2014, tonyc wrote: > > > +If you want to write to an existing SV's buffer and set its value to > > > a > > > +string, use SvPV_force() or one of its variants to force the SV to > > > be > > > +a PV. This will remove any of various types of non-stringness from > > > +the SV, including copy-on-write and being a reference. This can be > > > +used to implement something like read(): > > > + > > > + (void)SvPVbyte_force(sv, len); > > > + s = SvGROW(sv, offset + needlen + 1);
> > is about having valid read data. I think having > > s = SvGROW(sv, offset + needlen + 1); > > after > > (void)SvPVbyte_force(sv, len); > > is strange and should be removed for the "we want read and write data" instructions. Wouldnt SvGROW have added uninit memory to the end of the readable valid data, and depending if the user/dev is bright enough realize what he did (mixing up expression "len" with "offset + needlen + 1"), tell some C library to parse/translate uninit memory at the end of the post-COW buffer?
That's a good point. I can see two cases: 1) offset <= SvCUR() - if the wrapped function does read needlen bytes, some of the data is going to be uninitialized. 2) offset > SvCUR() - the wrapped function is going to receive uninitialized content, but the caller is also going to receiver uninitialized data. Tony
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 802b
On Wed Apr 16 18:51:10 2014, tonyc wrote: Show quoted text
> 1) offset <= SvCUR() - if the wrapped function does read needlen > bytes, some of the data is going to be uninitialized. > > 2) offset > SvCUR() - the wrapped function is going to receive > uninitialized content, but the caller is also going to receiver > uninitialized data.
I went through some drafts trying to address this, but I think examples like this need to stay simple rather than trying to cover every possible use of the APIs. So all I've added is a check that offset <= len so we don't return undefined content back to the caller. I don't think it's necessary to document here that the extra memory allocated by SvGROW() is undefined, since we don't say what it is initialized it, the caller shouldn't be making an assumptions about it. Tony
Subject: 0001-perl-116925-discuss-modifying-an-SV-s-buffer.patch
From ad5fd95262bcbe0faa8f380328c461fb88be4941 Mon Sep 17 00:00:00 2001 From: Tony Cook <tony@develop-help.com> Date: Mon, 21 Apr 2014 11:54:08 +1000 Subject: [perl #116925] discuss modifying an SV's buffer in the same place we otherwise talk about SVs --- pod/perlguts.pod | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/pod/perlguts.pod b/pod/perlguts.pod index 3fb5137..165e44e 100644 --- a/pod/perlguts.pod +++ b/pod/perlguts.pod @@ -159,6 +159,42 @@ decrease, the allocated memory of an SV and that it does not automatically add space for the trailing NUL byte (perl's own string functions typically do C<SvGROW(sv, len + 1)>). +If you want to write to an existing SV's buffer and set its value to a +string, use SvPV_force() or one of its variants to force the SV to be +a PV. This will remove any of various types of non-stringness from +the SV while preserving the content of the SV in the PV. This can be +used to implement something like read(): + + (void)SvPVbyte_force(sv, len); + if (offset > len) + croak("Supplied offset beyond the end of the string"); + s = SvGROW(sv, offset + needlen + 1); + /* something that modifies up to needlen bytes at s+offset, but + modifies newlen bytes + eg. newlen = read(fd, s + offset. needlen); + ignoring errors for these examples + */ + s[offset + newlen] = '\0'; + SvCUR_set(sv, offset + newlen); + SvUTF8_off(sv); + SvSETMAGIC(sv); + +If you don't need the existing content of the SV, you can avoid some +copying with: + + if (SvTHINKFIRST(sv)) + sv_force_normal_flags(sv, SV_COW_DROP_PV); + SvUPGRADE(sv, SVt_PV); + s = SvGROW(sv, needlen + 1); + /* something that modifies up to needlen bytes at s, but modifies + newlen bytes + eg. newlen = read(fd, s. needlen); + */ + s[newlen] = '\0'; + SvCUR_set(sv, newlen); + SvPOK_only(sv); /* also clears SVf_UTF8 */ + SvSETMAGIC(sv); + If you have an SV and want to know what kind of data Perl thinks is stored in it, you can use the following macros to check the type of SV you have. -- 1.7.10.4
Date: Tue, 6 May 2014 15:00:11 +0100
To: Tony Cook via RT <perlbug-followup [...] perl.org>
Subject: Re: [perl #116925] document/publicize THINKFIRST
CC: perl5-porters [...] perl.org
From: Dave Mitchell <davem [...] iabyn.com>
Download (untitled) / with headers
text/plain 5.2k
On Sun, Apr 20, 2014 at 07:04:29PM -0700, Tony Cook via RT wrote: Show quoted text
> I went through some drafts trying to address this, but I think examples like > this need to stay simple rather than trying to cover every possible use of > the APIs. > > So all I've added is a check that offset <= len so we don't return undefined content back to the caller. > > I don't think it's necessary to document here that the extra memory allocated by SvGROW() is undefined, since we don't say what it is initialized it, the caller shouldn't be making an assumptions about it. >
Show quoted text
> pod/perlguts.pod | 36 ++++++++++++++++++++++++++++++++++++ > 1 file changed, 36 insertions(+) > > diff --git a/pod/perlguts.pod b/pod/perlguts.pod > index 3fb5137..165e44e 100644 > --- a/pod/perlguts.pod > +++ b/pod/perlguts.pod > @@ -159,6 +159,42 @@ decrease, the allocated memory of an SV and that it does not automatically > add space for the trailing NUL byte (perl's own string functions typically do > C<SvGROW(sv, len + 1)>). > > +If you want to write to an existing SV's buffer and set its value to a > +string, use SvPV_force() or one of its variants to force the SV to be > +a PV. This will remove any of various types of non-stringness from > +the SV while preserving the content of the SV in the PV. This can be > +used to implement something like read(): > + > + (void)SvPVbyte_force(sv, len); > + if (offset > len) > + croak("Supplied offset beyond the end of the string"); > + s = SvGROW(sv, offset + needlen + 1); > + /* something that modifies up to needlen bytes at s+offset, but > + modifies newlen bytes > + eg. newlen = read(fd, s + offset. needlen); > + ignoring errors for these examples > + */ > + s[offset + newlen] = '\0'; > + SvCUR_set(sv, offset + newlen); > + SvUTF8_off(sv); > + SvSETMAGIC(sv); > + > +If you don't need the existing content of the SV, you can avoid some > +copying with: > + > + if (SvTHINKFIRST(sv)) > + sv_force_normal_flags(sv, SV_COW_DROP_PV); > + SvUPGRADE(sv, SVt_PV); > + s = SvGROW(sv, needlen + 1); > + /* something that modifies up to needlen bytes at s, but modifies > + newlen bytes > + eg. newlen = read(fd, s. needlen); > + */ > + s[newlen] = '\0'; > + SvCUR_set(sv, newlen); > + SvPOK_only(sv); /* also clears SVf_UTF8 */ > + SvSETMAGIC(sv); > + > If you have an SV and want to know what kind of data Perl thinks is stored > in it, you can use the following macros to check the type of SV you have.
I've been reading and re-reading this thread and the two related ones (#116569, #116407) and keep ending up confused and undecided. My gut reaction to your proposed doc patch is uneasiness. I don't like the read() example at at all; it's not immediately obvious what it does, and what all the values like offset and needlen represent. My current understanding is that there are four types of string buffer activity: * get-only This is already covered in perlguts: basically it's SvPV() and variants. Perl-level equivalent: stringification: e.g. "$var". * get-and-set This is where you modify an existing value by directly modifying the SPVX buffer. Use SvPV_force or variant. If the new string needs to be larger than the old one, then use SvGROW and SvCUR_set. Perl-level equivalent: e.g. substr($var, 1, 3, "abcd"); Perhaps a simple string prepend would make a clearer example? I think here we should should also point out that the various sv_cat*() functions are more convenient for a simple append than messing about with SvGROW. * set-only This is where we don't care about the old value; we just want to set the var to a new string value. Perl-level equivalent: e.g. $var = "new string"; This is where it seems to me, that most of the difficultly lies. This is where there's lots of discussion, with a possible need to expose previously un- or under-documented and/or non-API functions, such as SvTHINKFIRST() and sv_force_normal_flags() This is a special case of get-and-set, only we're trying to be more efficient and avoiding unnecessarily copy string buffers, or converting an NV into a PV, only to discard the result. My feeling here is that most of the time people should just use sv_setpv() and variants if you already know the string value you wish to assign; for example, len = read(fh, buf, 1024); sv_setpvn(sv, buf, len); sv_setpvn() will handle all the de-COWing etc, in an efficient manner. In those cases where you just want to allocate a particular sized buffer, but then modify it directly, sv_setpvn(sv,"",0); SvGROW(sv, len); ... does all the sv_force_normal_flags(), SvUPGRADE() etc stuff for you at the expense of temporarily mallocing a 1-byte (+ rounding up) buffer. * replace buffer This is a particular use-case that Leon T mentioned, where you want to do the equivalent of free(SvPVX(sv)); SvPVX(sv) = newptr; but safely. I could be wrong, but doesn't sv_usepvn_flags () and similar do this? -- The Enterprise is involved in a bizarre time-warp experience which is in some way unconnected with the Late 20th Century. -- Things That Never Happen in "Star Trek" #14
CC: Tony Cook via RT <perlbug-followup [...] perl.org>, Perl5 Porters <perl5-porters [...] perl.org>
Subject: Re: [perl #116925] document/publicize THINKFIRST
Date: Wed, 7 May 2014 00:41:28 +0200
To: Dave Mitchell <davem [...] iabyn.com>
From: Leon Timmermans <fawaka [...] gmail.com>
Download (untitled) / with headers
text/plain 2.1k
On Tue, May 6, 2014 at 4:00 PM, Dave Mitchell <davem@iabyn.com> wrote:
Show quoted text
My gut reaction to your proposed doc patch is uneasiness. I don't like the
read() example at at all; it's not immediately obvious what it does, and
what all the values like offset and needlen represent.

I agree it's not optimal, but that may be inherent to the issue.
 
Show quoted text
* set-only

    This is where we don't care about the old value; we just want to set
    the var to a new string value.

    Perl-level equivalent: e.g. $var = "new string";

    This is where it seems to me, that most of the difficultly lies.
    This is where there's lots of discussion, with a possible need to
    expose previously un- or under-documented and/or non-API functions,
    such as SvTHINKFIRST() and sv_force_normal_flags()

    This is a special case of get-and-set, only we're trying to be more
    efficient and avoiding unnecessarily copy string buffers, or
    converting an NV into a PV, only to discard the result.

    My feeling here is that most of the time people should just use
    sv_setpv() and variants if you already know the string value you wish
    to assign; for example,

        len = read(fh, buf, 1024);
        sv_setpvn(sv, buf, len);

    sv_setpvn() will handle all the de-COWing etc, in an efficient manner.

    In those cases where you just want to allocate a particular sized
    buffer, but then modify it directly,

        sv_setpvn(sv,"",0);
        SvGROW(sv, len);
        ...

    does all the sv_force_normal_flags(), SvUPGRADE() etc stuff for you at
    the expense of temporarily mallocing a 1-byte (+ rounding up) buffer.

I would welcome new API functions that would make this easier.
 
Show quoted text
* replace buffer

    This is a particular use-case that Leon T mentioned, where you
    want to do the equivalent of

        free(SvPVX(sv));
        SvPVX(sv) = newptr;

    but safely.

    I could be wrong, but doesn't sv_usepvn_flags () and similar do this?

sv_usepvn_flags assumed the buffer is malloc()ed, while in my use-case it's mmap()ed. We could add an "un-owned" flag to sv_usepvn_flags for this sort of situation though.

Leon
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 1.4k
On Tue May 06 07:01:16 2014, davem wrote: Show quoted text
> > My feeling here is that most of the time people should just use > sv_setpv() and variants if you already know the string value you wish > to assign; for example, > > len = read(fh, buf, 1024); > sv_setpvn(sv, buf, len); > > sv_setpvn() will handle all the de-COWing etc, in an efficient manner. >
That usually results in a redundant double copy, unless the * came from a foreign allocator. Direct writing into the PV is always better. Show quoted text
> In those cases where you just want to allocate a particular sized > buffer, but then modify it directly, > > sv_setpvn(sv,"",0); > SvGROW(sv, len); > ... > > does all the sv_force_normal_flags(), SvUPGRADE() etc stuff for you at > the expense of temporarily mallocing a 1-byte (+ rounding up) buffer. >
true Show quoted text
> * replace buffer > > This is a particular use-case that Leon T mentioned, where you > want to do the equivalent of > > free(SvPVX(sv)); > SvPVX(sv) = newptr; > > but safely. > > I could be wrong, but doesn't sv_usepvn_flags () and similar do this?
Yes, but only for Newx blocks and only if SV_HAS_TRAILING_NUL is used. Never pass a non-Newx (AKA malloc) ptr to sv_usepvn_flags. On Tue May 06 15:42:16 2014, LeonT wrote: Show quoted text
> sv_usepvn_flags assumed the buffer is malloc()ed, while in my use-case > it's > mmap()ed. We could add an "un-owned" flag to sv_usepvn_flags for this > sort > of situation though.
See paragraph above. -- bulk88 ~ bulk88 at hotmail.com
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 5.8k
On Tue May 06 07:01:16 2014, davem wrote: Show quoted text
> On Sun, Apr 20, 2014 at 07:04:29PM -0700, Tony Cook via RT wrote:
> > I went through some drafts trying to address this, but I think > > examples like > > this need to stay simple rather than trying to cover every possible > > use of > > the APIs. > > > > So all I've added is a check that offset <= len so we don't return > > undefined content back to the caller. > > > > I don't think it's necessary to document here that the extra memory > > allocated by SvGROW() is undefined, since we don't say what it is > > initialized it, the caller shouldn't be making an assumptions about > > it. > >
>
> > pod/perlguts.pod | 36 ++++++++++++++++++++++++++++++++++++ > > 1 file changed, 36 insertions(+) > > > > diff --git a/pod/perlguts.pod b/pod/perlguts.pod > > index 3fb5137..165e44e 100644 > > --- a/pod/perlguts.pod > > +++ b/pod/perlguts.pod > > @@ -159,6 +159,42 @@ decrease, the allocated memory of an SV and that > > it does not automatically > > add space for the trailing NUL byte (perl's own string functions > > typically do > > C<SvGROW(sv, len + 1)>). > > > > +If you want to write to an existing SV's buffer and set its value to > > a > > +string, use SvPV_force() or one of its variants to force the SV to > > be > > +a PV. This will remove any of various types of non-stringness from > > +the SV while preserving the content of the SV in the PV. This can > > be > > +used to implement something like read(): > > + > > + (void)SvPVbyte_force(sv, len); > > + if (offset > len) > > + croak("Supplied offset beyond the end of the string"); > > + s = SvGROW(sv, offset + needlen + 1); > > + /* something that modifies up to needlen bytes at s+offset, but > > + modifies newlen bytes > > + eg. newlen = read(fd, s + offset. needlen); > > + ignoring errors for these examples > > + */ > > + s[offset + newlen] = '\0'; > > + SvCUR_set(sv, offset + newlen); > > + SvUTF8_off(sv); > > + SvSETMAGIC(sv); > > + > > +If you don't need the existing content of the SV, you can avoid some > > +copying with: > > + > > + if (SvTHINKFIRST(sv)) > > + sv_force_normal_flags(sv, SV_COW_DROP_PV); > > + SvUPGRADE(sv, SVt_PV); > > + s = SvGROW(sv, needlen + 1); > > + /* something that modifies up to needlen bytes at s, but > > modifies > > + newlen bytes > > + eg. newlen = read(fd, s. needlen); > > + */ > > + s[newlen] = '\0'; > > + SvCUR_set(sv, newlen); > > + SvPOK_only(sv); /* also clears SVf_UTF8 */ > > + SvSETMAGIC(sv); > > + > > If you have an SV and want to know what kind of data Perl thinks is > > stored > > in it, you can use the following macros to check the type of SV you > > have.
> > I've been reading and re-reading this thread and the two related ones > (#116569, #116407) and keep ending up confused and undecided. > > My gut reaction to your proposed doc patch is uneasiness. I don't like > the > read() example at at all; it's not immediately obvious what it does, > and > what all the values like offset and needlen represent.
I'm not fond of using read() in the first example, though it's based on a simple implementation of perl's read() function. I'm using read() as the function since I believe (hope!) that most of the people digging this deep into guts to make their code efficient understand the behaviour of the POSIX read() function. Perhaps I should be doing a simple appended read() instead, since it avoid the extra parameter and some of the checks. Show quoted text
> My current understanding is that there are four types of string buffer > activity: >
... Show quoted text
> * get-and-set > > This is where you modify an existing value by directly modifying the > SPVX buffer. Use SvPV_force or variant. If the new string needs to be > larger than the old one, then use SvGROW and SvCUR_set. > > Perl-level equivalent: e.g. substr($var, 1, 3, "abcd"); > > Perhaps a simple string prepend would make a clearer example? > > I think here we should should also point out that the various > sv_cat*() > functions are more convenient for a simple append than messing about > with SvGROW.
If you're doing a string prepend, you should probably using sv_insert(), which I should probably mention. I've simplified the get-and-set to an append, and added notes about sv_cat*() and sv_insert(). Show quoted text
> * set-only > > This is where we don't care about the old value; we just want to set > the var to a new string value. > > Perl-level equivalent: e.g. $var = "new string"; > > This is where it seems to me, that most of the difficultly lies. > This is where there's lots of discussion, with a possible need to > expose previously un- or under-documented and/or non-API functions, > such as SvTHINKFIRST() and sv_force_normal_flags() > > This is a special case of get-and-set, only we're trying to be more > efficient and avoiding unnecessarily copy string buffers, or > converting an NV into a PV, only to discard the result. > > My feeling here is that most of the time people should just use > sv_setpv() and variants if you already know the string value you wish > to assign; for example, > > len = read(fh, buf, 1024); > sv_setpvn(sv, buf, len); > > sv_setpvn() will handle all the de-COWing etc, in an efficient manner.
I've added a reference to sv_setpvn(). Show quoted text
> In those cases where you just want to allocate a particular sized > buffer, but then modify it directly, > > sv_setpvn(sv,"",0); > SvGROW(sv, len); > ... > > does all the sv_force_normal_flags(), SvUPGRADE() etc stuff for you at > the expense of temporarily mallocing a 1-byte (+ rounding up) buffer.
I've changed the example to use that. Show quoted text
> * replace buffer > > This is a particular use-case that Leon T mentioned, where you > want to do the equivalent of > > free(SvPVX(sv)); > SvPVX(sv) = newptr; > > but safely. > > I could be wrong, but doesn't sv_usepvn_flags () and similar do this?
I've added a brief discussion of that. Tony
Subject: 0001-perl-116925-discuss-modifying-an-SV-s-buffer.patch
From 0751e326fb39ab12055583509812e6047ce20e1e Mon Sep 17 00:00:00 2001 From: Tony Cook <tony@develop-help.com> Date: Tue, 27 May 2014 12:01:14 +1000 Subject: [perl #116925] discuss modifying an SV's buffer in the same place we otherwise talk about SVs --- pod/perlguts.pod | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/pod/perlguts.pod b/pod/perlguts.pod index 3fb5137..cf3df89 100644 --- a/pod/perlguts.pod +++ b/pod/perlguts.pod @@ -159,6 +159,58 @@ decrease, the allocated memory of an SV and that it does not automatically add space for the trailing NUL byte (perl's own string functions typically do C<SvGROW(sv, len + 1)>). +If you want to write to an existing SV's buffer and set its value to a +string, use SvPV_force() or one of its variants to force the SV to be +a PV. This will remove any of various types of non-stringness from +the SV while preserving the content of the SV in the PV. This can be +used, for example, to append data from an API function to a buffer +without extra copying: + + (void)SvPVbyte_force(sv, len); + s = SvGROW(sv, len + needlen + 1); + /* something that modifies up to needlen bytes at s+len, but + modifies newlen bytes + eg. newlen = read(fd, s + len, needlen); + ignoring errors for these examples + */ + s[len + newlen] = '\0'; + SvCUR_set(sv, len + newlen); + SvUTF8_off(sv); + SvSETMAGIC(sv); + +If you already have the data in memory or if you want to keep your +code simple, you can use one of the sv_cat*() variants, such as +sv_catpvn(). If you want to insert anywhere in the string you can use +sv_insert() or sv_insert_flags(). + +If you don't need the existing content of the SV, you can avoid some +copying with: + + sv_setpvn(sv, "", 0); + s = SvGROW(sv, needlen + 1); + /* something that modifies up to needlen bytes at s, but modifies + newlen bytes + eg. newlen = read(fd, s. needlen); + */ + s[newlen] = '\0'; + SvCUR_set(sv, newlen); + SvPOK_only(sv); /* also clears SVf_UTF8 */ + SvSETMAGIC(sv); + +Again, if you already have the data in memory or want to avoid the +complexity of the above, you can use sv_setpvn(). + +If you have a buffer allocated with Newx() and want to set that as the +SV's value, you can use sv_usepvn_flags(). That has some requirements +if you want to avoid perl re-allocating the buffer to fit the trailing +NUL: + + Newx(buf, somesize+1, char); + /* ... fill in buf ... */ + buf[somesize] = '\0'; + sv_usepvn_flags(sv, buf, somesize, SV_SMAGIC | SV_HAS_TRAILING_NUL); + /* buf now belongs to perl, don't release it */ + If you have an SV and want to know what kind of data Perl thinks is stored in it, you can use the following macros to check the type of SV you have. -- 1.7.10.4
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 148b
On Mon May 26 19:02:51 2014, tonyc wrote: ... Show quoted text
> I've added a brief discussion of that.
Applied as 21134f66d28b7acf8dde08e536e8b4ee120ea3af. Tony


This service is sponsored and maintained by Best Practical Solutions and runs on Perl.org infrastructure.

For issues related to this RT instance (aka "perlbug"), please contact perlbug-admin at perl.org