encoding error in UTF-8 locales #8385
Comments
From vincent-perl@vinc17.net
Created by vincent@vinc17.org

Consider the following script:

    #!/usr/bin/env perl
    use strict;
    my $parser = XML::LibXML->new();
    open OUT, '>:encoding(iso-8859-1)', 'out.xml' or die "$!";

where file.xml is:

    <?xml version="1.0" encoding="iso-8859-1"?>

I get the following error:

    $ LC_CTYPE="en_US.UTF-8" ./encoding-bug

Without the "$string =~ s/xml//s;", there are no errors.

The problem

Perl Info
From @nwc10
On Wed, Mar 29, 2006 at 02:51:36PM -0800, Vincent Lefevre wrote:

Removing the non-core dependency gives:

    use strict;
    open OUT, '>:encoding(iso-8859-1)', 'out.xml' or die "$!";

With the output

    SV = PV(0x8147080) at 0x81e0260

The 0xfffd characters are coming as a side effect of the regular expression

I'm not sure what's at fault. Possibly the documentation.

Nicholas Clark
The RT System itself - Status changed from 'new' to 'open' |
From vincent-perl@vinc17.net
On 2006-03-30 02:49:37 -0800, Nicholas Clark via RT wrote:

It seems that you significantly changed the script, since with

still

    SV = PV(0x1844754) at 0x1861cc8

On more complex files, this leads to a panic: sv_setpvn called with negative strlen. So I wouldn't say this is just the documentation.
From vincent-perl@vinc17.net
On 2006-03-30 15:06:28 +0200, Vincent Lefevre wrote:

Here's a simple example:

    #!/usr/bin/env perl
    use strict;
    my @t = qw/230 13 90 65 34 239 86 15 8 26 181 25 305 123 22 139 111 6 3
From BQW10602@nifty.com
On Thu, 30 Mar 2006 17:03:02 +0200, Vincent Lefevre <vincent@vinc17.org> wrote

It can be simplified more; it smells of some buffer of 1024 bytes.

    use strict;

SADAHIRO Tomoyuki
p5p@spam.wizbit.be - Status changed from 'open' to 'stalled' |
From @nwc10
On Fri Mar 31 18:15:11 2006, BQW10602@nifty.com wrote:

For reference, this test case:

    use strict;
    my @t = qw/230 13 90 65 34 239 86 15 8 26 181 25 305 123 22 139 111 6 3

which Karl created in an attempt to bisect to find the commit which

So figuring out those may shed more light on this.

Nicholas Clark
The RT System itself - Status changed from 'stalled' to 'open' |
From @khwilliamson
On Fri Jun 21 14:08:58 2013, nicholas wrote:

I worked a little more on this and have come to believe that the fault
From @khwilliamson
On Sat Jun 22 09:03:47 2013, khw wrote:

I already got a response to that ticket!

=============
https://rt-archive.perl.org/perl5//Public/Bug/Display.html?id=38812
and found if it is due to the hard-coded buffer size of 1024 @

Dan the Encode Maintainer

So my belief was wrong that it was Encode alone. I'm hoping someone
From dankogai@dan.co.jp

Though this is not a solution, we'd better increase the buffer size of PerlIO::Encode. I don't quite remember why it is 1024, but it is terribly low for today's use cases.

Also note that the use of the encoding pragma is not recommended when there is a chance of an encoding error, since PerlIO has no room for error detection and recovery (besides emitting errors). In that case you should decode after read and encode before print.

Dan the Encode Maintainer

On 23 Jun 2013, at 03:14 , "Karl Williamson via RT" <perlbug-followup@perl.org> wrote:
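The "decode after read and encode before print" advice in the comment above can be sketched roughly as follows. This is a minimal illustration, not code from the ticket: the helper names `read_text` and `write_text` are hypothetical, the file handles are opened with `:raw` so that no `:encoding(...)` layer is involved, and `Encode::FB_CROAK` is assumed to be the desired error policy (die on data that cannot be converted).

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: read raw bytes, then decode them explicitly.
# FB_CROAK makes decode() die on malformed input instead of silently
# substituting U+FFFD; LEAVE_SRC keeps the byte buffer unmodified.
sub read_text {
    my ($path, $encoding) = @_;
    open my $in, '<:raw', $path or die "open $path: $!";
    local $/;                      # slurp the whole file
    my $bytes = <$in>;
    close $in;
    return decode($encoding, $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
}

# Hypothetical helper: encode explicitly, then write raw bytes.
# encode() dies here if $text holds a character the target encoding
# cannot represent, so the error is caught before anything is written.
sub write_text {
    my ($path, $encoding, $text) = @_;
    my $bytes = encode($encoding, $text, Encode::FB_CROAK | Encode::LEAVE_SRC);
    open my $out, '>:raw', $path or die "open $path: $!";
    print {$out} $bytes;
    close $out or die "close $path: $!";
}
```

With helpers like these, the original script's flow would become: `read_text('file.xml', 'iso-8859-1')`, then the `s/xml//s` substitution on the decoded string, then `write_text('out.xml', 'iso-8859-1', $string)`. Wrapping either call in `eval { ... }` gives the caller a place to detect and recover from conversion errors, which is the part the PerlIO layer does not offer.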
From @cpansprout
On Sat Jun 22 11:14:04 2013, khw wrote:

Interesting. Could this be related to #115262? I don’t know much about

In the process of fixing it, I noticed that it might pass partial

If it is, then the real solution (as mentioned in #115262) is to cache the

-- Father Chrysostomos
From @cpansprout

I removed the link to the encoding.pm ticket, since the bug was actually coming from PerlIO::encoding. The use of encoding.pm was unrelated.

-- Father Chrysostomos
From @khwilliamson

This has been fixed by many changes to locale handling over the years.
@khwilliamson - Status changed from 'open' to 'resolved' |
Migrated from rt.perl.org#38812 (status was 'resolved')
Searchable as RT38812$