Missing XSUB functions for UTF-8 char* buffers #17079

Open
p5pRT opened this issue Jul 4, 2019 · 9 comments

Comments

p5pRT commented Jul 4, 2019

Migrated from rt.perl.org#134262 (status was 'open')

Searchable as RT134262$

p5pRT commented Jul 4, 2019

From @pali

Hi! Currently there are XSUB functions/macros that take only a Latin1
buffer, e.g. XST_mPV(), XSRETURN_PV(), POPpbytex, PUSHp(), XPUSHp(),
etc.

Would it be possible to also add UTF-8 variants of these functions/macros,
e.g. XST_mPVutf8, XSRETURN_PVUTF8, POPputf8x, PUSHputf8, ...?

That would simplify working with UTF-8 char* strings, as currently the
only way is to use the XSRETURN_SV / POPs / PUSHs macros and construct
the SV* from UTF-8 manually via newSVpvn_utf8().

UTF-8 char* strings are needed to handle Unicode Perl strings correctly,
as Latin1 char* strings can store only the Unicode code points
U+00 .. U+FF.
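
To illustrate the last point, here is a minimal standalone sketch in plain C (it does not use the Perl API). It shows why a one-byte-per-character Latin1 buffer cannot hold a code point above U+FF, while a UTF-8 buffer can. The helper names `encode_utf8`, `fits_in_latin1` and `utf8_demo` are hypothetical and chosen only for this example, and the encoder follows strict standard UTF-8, which is an assumption here and not a description of how perl itself encodes strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as standard UTF-8.
 * `out` must have room for 4 bytes. Returns the number of bytes
 * written, or 0 for surrogates and values above U+10FFFF. */
static size_t encode_utf8(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;               /* surrogates are not valid scalars */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Latin1 stores one code point per byte, so only U+00 .. U+FF fit. */
static int fits_in_latin1(uint32_t cp)
{
    return cp <= 0xFF;
}

/* Small self-check: U+00E9 fits in Latin1, U+20AC (EURO SIGN) does not,
 * but both can be represented in a UTF-8 char* buffer. */
static void utf8_demo(void)
{
    unsigned char buf[4];

    assert(fits_in_latin1(0x00E9));
    assert(encode_utf8(0x00E9, buf) == 2);   /* bytes C3 A9 */

    assert(!fits_in_latin1(0x20AC));
    assert(encode_utf8(0x20AC, buf) == 3);   /* bytes E2 82 AC */
}
```

In an XSUB, a buffer produced this way is what one would currently have to wrap by hand with newSVpvn_utf8(), as described above; the requested macros would remove that manual step.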

p5pRT commented Jul 4, 2019

From @jkeenan

Karl, would this be related to https://rt-archive.perl.org/perl5/Ticket/Display.html?id=134142 ?

--
James E Keenan (jkeenan@cpan.org)

p5pRT commented Jul 4, 2019

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Jul 6, 2019

From @khwilliamson

No.

pali, do you have patches?

p5pRT commented Jul 6, 2019

From @pali

No, I have not written anything for this.

p5pRT commented Oct 14, 2019

From @pali

So should I prepare some of them?

p5pRT commented Oct 14, 2019

From @khwilliamson

Sure

@toddr
Member

toddr commented Oct 21, 2019

@pali we take pull requests now

@p5pRT p5pRT added the khw label Oct 25, 2019
@toddr toddr removed the khw label Oct 25, 2019
@xenu xenu removed the Severity Low label Dec 29, 2021
@khwilliamson
Contributor

@pali
Patches welcome