New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SvPVutf8 != SvPVX, and sv_2pvutf8 #12703
Comments
From @bulk88Created by @bulk88I recently found a bug in a core module, where a loop was done over a http://perl5.git.perl.org/perl.git/blob/a8c6ff7b8e8c6037333c21f9b3f6b38b9278df4f:/sv.c#l3060 A mortal copy was made of the RO SV was made and returned as the result All that SvPVutf8's docs say is "Like C<SvPV>, but converts sv to utf8 SvPV says "The SV may cache the stringified version becoming C<SvPOK>." This ticket might be related to Perl Info
|
From @tonycozOn Tue, Jan 15, 2013 at 01:24:29PM -0800, bulk88 wrote:
Perhaps: Like C<SvPV>, but returns always returns a utf8 string. char* SvPVutf8(SV* sv, STRLEN len) This may return a pointer to a temporary instead of C<SvPVX(sv)>. The other SvPVutf8* macros should be updated too. Tony |
The RT System itself - Status changed from 'new' to 'open' |
From @bulk88On Tue Jan 15 15:37:17 2013, tonyc wrote:
Does anyone else have comments on this bug report or a wording proposal? -- |
From @iabynOn Wed, Jan 16, 2013 at 10:36:12AM +1100, Tony Cook wrote:
Perhaps we just need to clarify SvPV and rely on the fact that all the I'd suggest adding the following to SvPV: Note that there is no guarantee that that the return value of And for SvPV_force, which currently says: Like C<SvPV> but will force the SV into containing a string I'd suggest s/You want force/You need force/ (to make it stronger), and Note that coercing an arbitrary scalar into a plain PV will Then make the SvPV_force_nomg() doc refer to SvPV_force() rather than Note that updating SvPV* docs like this will also help with the goal of -- |
From @iabynOn Mon, May 06, 2013 at 03:59:20PM +0100, Dave Mitchell wrote:
In the absence of any feedback, I've applied the following commit: commit fd14238 clarify SvPV* and SvPV_force* docs Affected files ... Differences ... Inline Patchdiff --git a/sv.h b/sv.h
index fd3da2d..4bfbf9c 100644
--- a/sv.h
+++ b/sv.h
@@ -852,8 +852,8 @@ Set the actual length of the string which is in the SV. See C<SvIV_set>.
=for apidoc Am|U32|SvUTF8|SV* sv
Returns a U32 value indicating the UTF-8 status of an SV. If things are set-up
properly, this indicates whether or not the SV contains UTF-8 encoded data.
-Call this after SvPV() in case any call to string overloading updates the
-internal flag.
+You should use this I<after> a call to SvPV() or one of its variants, in
+case any call to string overloading updates the internal flag.
=for apidoc Am|void|SvUTF8_on|SV *sv
Turn on the UTF-8 status of an SV (the data is not changed, just the flag).
@@ -1471,13 +1471,17 @@ attention to precisely which outputs are influenced by which inputs.
/*
=for apidoc Am|char*|SvPV_force|SV* sv|STRLEN len
Like C<SvPV> but will force the SV into containing a string (C<SvPOK>), and
-only a string (C<SvPOK_only>), by hook or by crook. You want force if you are
+only a string (C<SvPOK_only>), by hook or by crook. You need force if you are
going to update the C<SvPVX> directly. Processes get magic.
+Note that coercing an arbitrary scalar into a plain PV will potentially
+strip useful data from it. For example if the SV was C<SvROK>, then the
+referent will have its reference count decremented, and the SV itself may
+be converted to an C<SvPOK> scalar with a string buffer containing a value
+such as C<"ARRAY(0x1234)">.
+
=for apidoc Am|char*|SvPV_force_nomg|SV* sv|STRLEN len
-Like C<SvPV> but will force the SV into containing a string (C<SvPOK>), and
-only a string (C<SvPOK_only>), by hook or by crook. You want force if you are
-going to update the C<SvPVX> directly. Doesn't process get magic.
+Like C<SvPV_force>, but doesn't process get magic.
=for apidoc Am|char*|SvPV|SV* sv|STRLEN len
Returns a pointer to the string in the SV, or a stringified form of
@@ -1485,6 +1489,13 @@ the SV if the SV does not contain a string. The SV may cache the
stringified version becoming C<SvPOK>. Handles 'get' magic. See also
C<SvPVx> for a version which guarantees to evaluate sv only once.
+Note that there is no guarantee that the return value of C<SvPV()> is
+equal to C<SvPVX(sv)> (or even that C<SvPVX(sv)> contains valid data), due
+to the way that things like overloading and Copy-On-Write are handled. In
+these cases, the return value may point to a temporary buffer or similar.
+If you absolutely need the SvPVX field to be valid (for example, if you
+intend to write to it), then see L</SvPV_force>.
+
=for apidoc Am|char*|SvPVx|SV* sv|STRLEN len
A version of C<SvPV> which guarantees to evaluate C<sv> only once.
Only use this if C<sv> is an expression with side effects, otherwise use the
@@ -1494,9 +1505,7 @@ more efficient C<SvPV>.
Like C<SvPV> but doesn't process magic.
=for apidoc Am|char*|SvPV_nolen|SV* sv
-Returns a pointer to the string in the SV, or a stringified form of
-the SV if the SV does not contain a string. The SV may cache the
-stringified form becoming C<SvPOK>. Handles 'get' magic.
+Like C<SvPV> but doesn't set a length variable.
=for apidoc Am|char*|SvPV_nomg_nolen|SV* sv
Like C<SvPV_nolen> but doesn't process magic.
-- Counsellor Troi states something other than the blindingly obvious. |
@iabyn - Status changed from 'open' to 'resolved' |
From @nwc10On Thu, May 09, 2013 at 03:52:15PM +0100, Dave Mitchell wrote:
Sorry, I'm one of about 600 people guilty of this.
Thanks for stressing this further.
This also looks useful. Is there a good way to add (without clutter) the That has caught out some code before, including in the core. Nicholas Clark |
From @iabynOn Thu, May 09, 2013 at 03:56:49PM +0100, Nicholas Clark wrote:
Is this good enough for yuz? commit fe02ddb further tweak SvPV() docs Affected files ... Differences ... Inline Patchdiff --git a/sv.h b/sv.h
index 4bfbf9c..7baef34 100644
--- a/sv.h
+++ b/sv.h
@@ -1490,11 +1490,13 @@ stringified version becoming C<SvPOK>. Handles 'get' magic. See also
C<SvPVx> for a version which guarantees to evaluate sv only once.
Note that there is no guarantee that the return value of C<SvPV()> is
-equal to C<SvPVX(sv)> (or even that C<SvPVX(sv)> contains valid data), due
-to the way that things like overloading and Copy-On-Write are handled. In
-these cases, the return value may point to a temporary buffer or similar.
-If you absolutely need the SvPVX field to be valid (for example, if you
-intend to write to it), then see L</SvPV_force>.
+equal to C<SvPVX(sv)>, or that C<SvPVX(sv)> contains valid data, or that
+successive calls to C<SvPV(sv)) will return the same pointer value each
+time. This is due to the way that things like overloading and
+Copy-On-Write are handled. In these cases, the return value may point to
+a temporary buffer or similar. If you absolutely need the SvPVX field to
+be valid (for example, if you intend to write to it), then see
+L</SvPV_force>.
-- "I do not resent criticism, even when, for the sake of emphasis, |
From @bulk88On Thu May 09 09:07:02 2013, davem wrote:
"(for example, if you intend to write to it), then see L</SvPV_force>." SvPV_force runs get magic, so if you are "writing to SvPVX", you just at -- |
From @iabynOn Thu, May 09, 2013 at 11:54:47AM -0700, bulk88 via RT wrote:
I have no idea what point you're trying to make here.
Currently near the top of my things to do prior to 5.18 is to add some -- |
From @iabynOn Fri, May 10, 2013 at 12:13:38PM +0100, Dave Mitchell wrote:
It won't be in time for perl 5.18.0 now, but in terms of what we need to add Given an arbitrary SV, 1) if you want to *read* its string value, use SvPV (or one of it's many Note that this preserves the original content of the SV. For example if As a shortcut, if you've already done any get magic processing that was 2) If you want to *modify* the string value of an SV by writing into its Note that this may destroy the original content of the SV. For example if After modifying the buffer, you will need to do SvSETMAGIC for the As a shortcut, if you've already done any get magic processing that was 3) if an XS sub is declared as having a 'char *' or similar argument, -- |
From @iabynOn Fri, May 17, 2013 at 03:14:16PM +0100, Dave Mitchell wrote:
No-ones provided any feedback yet, but I think I've already spotted an
[snip]
I still think that's true
[snip]
I don't think that's true. COW vars have SvPOK set, but you still $ perl5163 -MDevel::Peek -e '%h=qw(a 1); Dump $_ for keys %h' So, I *suspect* that the correct answer is that if POK is set, you still -- |
From @cpansproutOn Wed May 22 11:36:42 2013, davem wrote:
Currently sv_force_normal is already part of the API, but anything Since SvPV_force will itself call sv_force_normal, we can simply If the XS author wants to use the existing buffer (for speed) but drop Now, do we want to add THINKFIRST to the API? I really don’t know. But -- Father Chrysostomos |
From @LeontOn Fri, May 17, 2013 at 4:14 PM, Dave Mitchell <davem@iabyn.com> wrote:
My use-case is related but now quite the same. I want to get rid of the current buffer to replace it with my own (in Currently, I'm doing something like: SV_CHECK_THINKFIRST_COW_DROP(sv), I do think that's a valid, if unusual, use-case. Leon |
From @bulk88On Sun Jun 09 11:21:54 2013, sprout wrote:
XS requires a read-only buffer, read/write buffer, and write only -- |
Migrated from rt.perl.org#116407 (status was 'resolved')
Searchable as RT116407$
The text was updated successfully, but these errors were encountered: