Skip Menu |
Report information
Id: 132665
Status: new
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: LeonT <fawaka [at] gmail.com>
Cc:
AdminCc:

Operating System: Linux
PatchStatus: (no value)
Severity: low
Type: library
Perl Version: 5.24.0
Fixed In: (no value)

Attachments
0001-Enforce-STOP_AT_PARTIAL-in-PerlIO-encoding-fallback.patch
0001-PerlIO-encoding-has-a-fallback-variable-that-allows-.patch
0002-Disallow-coderef-in-PerlIO-encoding-fallback.patch



To: perlbug [...] perl.org
Date: Fri, 29 Dec 2017 14:11:10 +0100
Subject: Always use STOP_AT_PARTIAL in PerlIO::encoding
From: Leon Timmermans <fawaka [...] gmail.com>
Download (untitled) / with headers
text/plain 4.2k
This is a bug report for perl from fawaka@gmail.com, generated with the help of perlbug 1.40 running under perl 5.24.0. ----------------------------------------------------------------- [Please describe your issue here] PerlIO::encoding has a $fallback variable that allows one to set the behavior on a encoding/decoding error, for example to make it throw an exception on error. What is not documented (actually the example in the documentation is even missing this) is that PerlIO::encoding needs the (equally undocumented) Encode::STOP_AT_PARTIAL flag to be set, otherwise a multi-byte character spanning buffer boundaries will be interpreted as two invalid byte sequences. I could have fixed the documentation, but instead I fixed the code to always pass this flag to Encode, simplifying the use and making the current documentation correct again. [Please do not change anything below this line] ----------------------------------------------------------------- --- Flags: category=library severity=low module=PerlIO::encoding --- Site configuration information for perl 5.24.0: Configured by leont at Thu Jun 9 10:14:26 CEST 2016. Summary of my perl5 (revision 5 version 24 subversion 0) configuration: Platform: osname=linux, osvers=4.5.4-1-arch, archname=x86_64-linux-thread-multi uname='linux leon-laptop 4.5.4-1-arch #1 smp preempt wed may 11 22:21:28 cest 2016 x86_64 gnulinux ' config_args='-de -Dprefix=/home/leont/perl5/perlbrew/perls/perl-5.24.0 -Dusethreads -Aeval:scriptdir=/home/leont/perl5/perlbrew/perls/perl-5.24.0/bin' hint=recommended, useposix=true, d_sigaction=define useithreads=define, usemultiplicity=define use64bitint=define, use64bitall=define, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2', cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include' ccversion='', gccversion='6.1.1 20160501', gccosandvers='' intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3 ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='cc', ldflags =' -fstack-protector-strong -L/usr/local/lib' libpth=/usr/local/lib /usr/lib/gcc/x86_64-pc-linux-gnu/6.1.1/include-fixed /usr/lib /lib/../lib /usr/lib/../lib /lib /lib64 /usr/lib64 libs=-lpthread -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc libc=libc-2.23.so, so=so, useshrplib=false, libperl=libperl.a gnulibc_version='2.23' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong' Locally applied patches: Devel::PatchPerl 1.40 --- @INC for perl 5.24.0: /home/leont/perl5/perlbrew/perls/perl-5.24.0/lib/site_perl/5.24.0/x86_64-linux-thread-multi /home/leont/perl5/perlbrew/perls/perl-5.24.0/lib/site_perl/5.24.0 /home/leont/perl5/perlbrew/perls/perl-5.24.0/lib/5.24.0/x86_64-linux-thread-multi /home/leont/perl5/perlbrew/perls/perl-5.24.0/lib/5.24.0 . --- Environment for perl 5.24.0: HOME=/home/leont LANG=en_US.UTF-8 LANGUAGE=en_US.UTF-8 LC_MEASUREMENT=en_IE.UTF-8 LC_MESSAGES=POSIX LC_MONETARY=en_IE.UTF-8 LC_PAPER=en_IE.UTF-8 LC_TIME=en_IE.UTF-8 LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/home/leont/perl5/perlbrew/bin:/home/leont/perl5/perlbrew/perls/perl-5.24.0/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl PERLBREW_HOME=/home/leont/.perlbrew PERLBREW_MANPATH=/home/leont/perl5/perlbrew/perls/perl-5.24.0/man PERLBREW_PATH=/home/leont/perl5/perlbrew/bin:/home/leont/perl5/perlbrew/perls/perl-5.24.0/bin PERLBREW_PERL=perl-5.24.0 PERLBREW_ROOT=/home/leont/perl5/perlbrew PERLBREW_SHELLRC_VERSION=0.82 PERLBREW_VERSION=0.82 PERL_BADLANG (unset) SHELL=/bin/bash

Message body is not shown because sender requested not to inline it.

RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 1.1k
On Fri, 29 Dec 2017 05:11:38 -0800, LeonT wrote: Show quoted text
> PerlIO::encoding has a $fallback variable that allows one to set the > behavior on a encoding/decoding error, for example to make it throw an > exception on error. What is not documented (actually the example in > the > documentation is even missing this) is that PerlIO::encoding needs the > (equally undocumented) Encode::STOP_AT_PARTIAL flag to be set, > otherwise > a multi-byte character spanning buffer boundaries will be interpreted > as > two invalid byte sequences. I could have fixed the documentation, but > instead I fixed the code to always pass this flag to Encode, > simplifying > the use and making the current documentation correct again.
It turned out that Encode allows one to pass a coderef instead of a set of flags to handle. This however doesn't allow one to pass STOP_AT_PARTIAL, which means it has always been buggy on buffer boundaries. With my new automatic STOP_AT_PARTIAL passing this would result in an unpredictable value (based on the pointer value of the CV), this is clearly undesirable. Instead we now disallow it in PerlIO::encoding. This could be changed at some point in the future once Encode adds support for it. Leon
Subject: 0001-Enforce-STOP_AT_PARTIAL-in-PerlIO-encoding-fallback.patch
From 35d99902af4832a40c4aa9d88895f98aa0b22755 Mon Sep 17 00:00:00 2001 From: Leon Timmermans <fawaka@gmail.com> Date: Thu, 28 Dec 2017 19:23:03 +0100 Subject: [PATCH 1/2] Enforce STOP_AT_PARTIAL in $PerlIO::encoding::fallback PerlIO::encoding has a $fallback variable that allows one to set the behavior on a encoding/decoding error, for example to make it throw an exception on error. What is not documented (actually the example in the documentation is even missing this) is that PerlIO::encoding needs the (equally undocumented) Encode::STOP_AT_PARTIAL flag to be set, otherwise a multi-byte character spanning buffer boundaries will be interpreted as two invalid byte sequences. I could have fixed the documentation, but instead I fixed the code to always pass this flag to Encode, simplifying the use and making the current documentation correct again. --- ext/PerlIO-encoding/encoding.pm | 5 ++--- ext/PerlIO-encoding/encoding.xs | 15 +++++++++++++++ 2 files changed, 17 insertions(+), 3 deletions(-) diff --git a/ext/PerlIO-encoding/encoding.pm b/ext/PerlIO-encoding/encoding.pm index 08d2df4713..c3b36df2ee 100644 --- a/ext/PerlIO-encoding/encoding.pm +++ b/ext/PerlIO-encoding/encoding.pm @@ -1,7 +1,7 @@ package PerlIO::encoding; use strict; -our $VERSION = '0.25'; +our $VERSION = '0.26'; our $DEBUG = 0; $DEBUG and warn __PACKAGE__, " called by ", join(", ", caller), "\n"; @@ -13,8 +13,7 @@ $DEBUG and warn __PACKAGE__, " called by ", join(", ", caller), "\n"; require XSLoader; XSLoader::load(); -our $fallback = - Encode::PERLQQ()|Encode::WARN_ON_ERR()|Encode::STOP_AT_PARTIAL(); +our $fallback = Encode::PERLQQ()|Encode::WARN_ON_ERR(); 1; __END__ diff --git a/ext/PerlIO-encoding/encoding.xs b/ext/PerlIO-encoding/encoding.xs index bb4754f3d9..39ac40f182 100644 --- a/ext/PerlIO-encoding/encoding.xs +++ b/ext/PerlIO-encoding/encoding.xs @@ -5,6 +5,10 @@ #define U8 U8 #define OUR_DEFAULT_FB "Encode::PERLQQ" +#define OUR_STOP_AT_PARTIAL "Encode::STOP_AT_PARTIAL" + +/* This will be set during BOOT */ +static unsigned int encode_stop_at_partial = 0; #if defined(USE_PERLIO) @@ -164,6 +168,7 @@ PerlIOEncode_pushed(pTHX_ PerlIO * f, const char *mode, SV * arg, PerlIO_funcs * } e->chk = newSVsv(get_sv("PerlIO::encoding::fallback", 0)); + SvUV_set(e->chk, SvUV(e->chk) | encode_stop_at_partial); e->inEncodeCall = 0; FREETMPS; @@ -685,6 +690,16 @@ BOOT: } SPAGAIN; sv_setsv(chk, POPs); + + PUSHMARK(sp); + PUTBACK; + if (call_pv(OUR_STOP_AT_PARTIAL, G_SCALAR) != 1) { + /* should never happen */ + Perl_die(aTHX_ "%s did not return a value", OUR_STOP_AT_PARTIAL); + } + SPAGAIN; + encode_stop_at_partial = POPu; + PUTBACK; #ifdef PERLIO_LAYERS PerlIO_define_layer(aTHX_ PERLIO_FUNCS_CAST(&PerlIO_encode)); -- 2.15.0-291-g0d8980c
Subject: 0002-Disallow-coderef-in-PerlIO-encoding-fallback.patch
From f4e72dbfa4b0d957dd2dcfcf63b615ddd38ae786 Mon Sep 17 00:00:00 2001 From: Leon Timmermans <fawaka@gmail.com> Date: Thu, 4 Jan 2018 19:56:03 +0100 Subject: [PATCH 2/2] Disallow coderef in $PerlIO::encoding::fallback Encode allows one to pass a coderef instead of a set of flags to handle. This however doesn't allow one to pass STOP_AT_PARTIAL, which means it has always been buggy on buffer boundaries. With my new automatic STOP_AT_PARTIAL passing this would result in an unpredictable value. Instead we now disallow it in PerlIO::encoding. --- ext/PerlIO-encoding/encoding.xs | 2 ++ 1 file changed, 2 insertions(+) diff --git a/ext/PerlIO-encoding/encoding.xs b/ext/PerlIO-encoding/encoding.xs index 39ac40f182..66728734d6 100644 --- a/ext/PerlIO-encoding/encoding.xs +++ b/ext/PerlIO-encoding/encoding.xs @@ -168,6 +168,8 @@ PerlIOEncode_pushed(pTHX_ PerlIO * f, const char *mode, SV * arg, PerlIO_funcs * } e->chk = newSVsv(get_sv("PerlIO::encoding::fallback", 0)); + if (SvROK(e->chk)) + Perl_croak(aTHX_ "PerlIO::encoding::fallback must be an integer"); SvUV_set(e->chk, SvUV(e->chk) | encode_stop_at_partial); e->inEncodeCall = 0; -- 2.15.0-291-g0d8980c


This service is sponsored and maintained by Best Practical Solutions and runs on Perl.org infrastructure.

For issues related to this RT instance (aka "perlbug"), please contact perlbug-admin at perl.org