Report information
Id: 132782
Status: open
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: pali [at] cpan.org
Cc:
AdminCc:

Operating System: (no value)
PatchStatus: (no value)
Severity: low
Type: unknown
Perl Version: (no value)
Fixed In: (no value)

Attachments
0001-Implement-sv_utf8_downgrade_nomg.patch
0002-Fix-do_vecget-and-do_vecset-to-process-GET-magic-onl.patch
0003-Implement-SvPVutf8_nomg-and-SvPVbyte_nomg.patch
0004-Implement-SvPV-_or_null.patch



Date: Mon, 29 Jan 2018 16:30:04 +0100
From: pali [...] cpan.org
Subject: Missing SvPV* utf8/byte nomg macro variants
To: perlbug [...] perl.org
Hi!

Perl currently lacks SvPVutf8_nomg and SvPVbyte_nomg macros: equivalents of SvPVutf8 and SvPVbyte that do not process get magic.

To write an XS module correctly without being affected by Perl's "Unicode Bug", it is easier to use the SvPVutf8 (resp. SvPVbyte) macros than a combination of SvPV + SvUTF8 with manual conversion from Latin1 to utf8. But if a function implemented in XS needs to distinguish between undef and a string, SvPVutf8 cannot be used, because it throws a warning when the scalar is undef. I think that an API accepting either undef or a string is a common requirement, therefore SvPVutf8_nomg would be really useful.

Currently, an API which accepts undef or a string requires something like this:

    void
    function(arg)
        SV *arg
    PREINIT:
        SV *tmp;
        char *str;
        STRLEN len;
    INIT:
        SvGETMAGIC(arg);
    CODE:
        if (SvOK(arg)) {
            str = SvPV_nomg(arg, len);
            if (!SvUTF8(arg)) {
                if (SvGMAGICAL(arg))
                    tmp = sv_2mortal(newSVpvn(str, len));
                else
                    tmp = arg;
                str = SvPVutf8(tmp, len);
            }
        } else {
            str = NULL;
            len = 0;
        }
        ... now str/len contains either NULL or utf8 representation of arg ...

This is really non-intuitive and hard for a novice to write from scratch, as such a (very common) example is entirely missing from the perl documentation.

With SvPVutf8_nomg the code would reduce to just:

    void
    function(arg)
        SV *arg
    PREINIT:
        char *str;
        STRLEN len;
    INIT:
        SvGETMAGIC(arg);
    CODE:
        if (SvOK(arg)) {
            str = SvPVutf8_nomg(arg, len);
        } else {
            str = NULL;
            len = 0;
        }
        ... now str/len contains either NULL or utf8 representation of arg ...

Maybe some SvPV* macro which returns NULL without a warning for an undefined value would be useful too, to simplify that code even more.

Also, the perlapi documentation should suggest using SvPVutf8 (resp. SvPVbyte) instead of SvPV, because code that uses SvPV without checking SvUTF8() is affected by Perl's Unicode Bug.
Also, to avoid processing get magic more than once, an XS function needs to call get magic exactly one time, ideally with SvGETMAGIC(), and then use only the *_nomg functions/macros.
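For readers unfamiliar with the "manual conversion from Latin1 to utf8" mentioned above, here is a minimal standalone sketch of that step in plain C. It does not use the Perl API, and `latin1_to_utf8` is a hypothetical helper name chosen for illustration; it only shows the byte-level transformation that is conceptually needed when a scalar's buffer does not have the UTF8 flag set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the Perl API.
 * Converts a Latin-1 buffer to a freshly malloc'ed, NUL-terminated
 * UTF-8 buffer. The caller must free() the result.
 * Returns NULL if memory allocation fails. */
static char *latin1_to_utf8(const char *src, size_t len, size_t *out_len)
{
    assert(src != NULL || len == 0);

    /* Worst case: every input byte becomes two output bytes, plus NUL. */
    char *dst = malloc(2 * len + 1);
    size_t i, j = 0;

    if (dst == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];
        if (c < 0x80) {
            /* ASCII is encoded identically in Latin-1 and UTF-8. */
            dst[j++] = (char)c;
        } else {
            /* U+0080..U+00FF become a two-byte UTF-8 sequence. */
            dst[j++] = (char)(0xC0 | (c >> 6));
            dst[j++] = (char)(0x80 | (c & 0x3F));
        }
    }
    dst[j] = '\0';
    *out_len = j;
    return dst;
}
```

For example, the four Latin-1 bytes of "caf\xE9" come out as five UTF-8 bytes, with the final character encoded as 0xC3 0xA9. Code that reads a buffer with SvPV and skips this step when SvUTF8() is false would pass the raw 0xE9 byte along, which is the kind of mismatch the Unicode Bug refers to.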
RT-Send-CC: perl5-porters [...] perl.org
Patches welcome :) -- Karl Williamson
To: perlbug-followup [...] perl.org
Date: Tue, 6 Feb 2018 10:36:35 +0100
From: pali [...] cpan.org
Subject: Re: [perl #132782] Missing SvPV* utf8/byte nomg macro variants
On Wednesday 31 January 2018 06:48:21 Karl Williamson via RT wrote:
> Patches welcome :)
What about something like this?

    #ifndef SvPVutf8_nomg
    PERL_STATIC_INLINE char *
    SvPVutf8_nomg(pTHX_ SV *sv, STRLEN *len)
    {
        char *buf = SvPV_nomg(sv, *len);
        if (SvUTF8(sv))
            return buf;
        if (SvGMAGICAL(sv))
            sv = sv_2mortal(newSVpvn(buf, *len));
        /* There is sv_utf8_upgrade_nomg(), but it is broken prior to Perl version 5.13.10 */
        return SvPVutf8(sv, *len);
    }
    #define SvPVutf8_nomg(sv, len) SvPVutf8_nomg(aTHX_ (sv), &(len))
    #endif

    #ifndef SvPVbyte_nomg
    PERL_STATIC_INLINE char *
    SvPVbyte_nomg(pTHX_ SV *sv, STRLEN *len)
    {
        char *buf = SvPV_nomg(sv, *len);
        if (!SvUTF8(sv))
            return buf;
        if (SvGMAGICAL(sv)) {
            sv = sv_2mortal(newSVpvn(buf, *len));
            SvUTF8_on(sv);
        }
        return SvPVbyte(sv, *len);
    }
    #define SvPVbyte_nomg(sv, len) SvPVbyte_nomg(aTHX_ (sv), &(len))
    #endif
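The SvPVbyte_nomg proposal relies on SvPVbyte to downgrade a UTF-8 buffer to bytes, which can only succeed when every character fits in a single byte; otherwise Perl reports a wide-character error. The standalone sketch below illustrates that constraint in plain C, without the Perl API. `utf8_to_latin1` is a hypothetical helper name, and the sketch assumes well-formed input; it is not how Perl implements the downgrade.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Perl API.
 * Downgrades UTF-8 text to Latin-1 bytes. dst must have room for at
 * least len bytes. Returns 0 on success, or -1 if the input contains
 * a character above U+00FF or an unexpected byte sequence. */
static int utf8_to_latin1(const char *src, size_t len,
                          char *dst, size_t *out_len)
{
    size_t i = 0, j = 0;

    assert(src != NULL && dst != NULL && out_len != NULL);

    while (i < len) {
        unsigned char c = (unsigned char)src[i];
        if (c < 0x80) {
            /* ASCII passes through unchanged. */
            dst[j++] = (char)c;
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < len &&
                   ((unsigned char)src[i + 1] & 0xC0) == 0x80) {
            /* Only lead bytes 0xC2 and 0xC3 encode U+0080..U+00FF. */
            unsigned char c2 = (unsigned char)src[i + 1];
            dst[j++] = (char)(((c & 0x03) << 6) | (c2 & 0x3F));
            i += 2;
        } else {
            /* Wide character (or malformed input): cannot downgrade. */
            return -1;
        }
    }
    *out_len = j;
    return 0;
}
```

With this sketch, the five UTF-8 bytes of "café" downgrade to four bytes ending in 0xE9, while the three-byte encoding of the euro sign (0xE2 0x82 0xAC) is rejected because U+20AC does not fit in one byte.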
To: perlbug-followup [...] perl.org
Date: Sun, 11 Feb 2018 12:44:31 +0100
From: pali [...] cpan.org
Subject: Re: [perl #132782] Missing SvPV* utf8/byte nomg macro variants
Attached are RFC patches for new functions/macros:

    sv_utf8_downgrade_flags()
    sv_utf8_downgrade_nomg()
    sv_2pvbyte_flags()
    sv_2pvutf8_flags()
    SvPVutf8_nomg()
    SvPVbyte_nomg()
    SvPV_or_null()
    SvPV_or_null_nomg()
    SvPVutf8_or_null()
    SvPVutf8_or_null_nomg()
    SvPVbyte_or_null()
    SvPVbyte_or_null_nomg()

What about them?

Message body is not shown because sender requested not to inline it.


