multiple encodings causes "panic: sv_setpvn called with negative strlen" #8166

Closed
p5pRT opened this issue Oct 25, 2005 · 12 comments

Comments

p5pRT commented Oct 25, 2005

Migrated from rt.perl.org#37526 (status was 'open')

Searchable as RT37526$

p5pRT commented Oct 25, 2005

From kenhirsch@ftml.net

Created by kenhirsch@ftml.net

use encoding 'iso-8859-1';
binmode STDOUT, ':encoding(iso-8859-1)' or die "binmode:$!";
print "\xe1", "\n";
__END__
output is:
"\x{1280}" does not map to iso-8859-1.
panic: sv_setpvn called with negative strlen.

The same result is obtained with this program:
open OUTFILE, ">", "testout.txt" or die "open: $!";
binmode OUTFILE, ':encoding(iso-8859-1)' or die "binmode 1:$!";
binmode OUTFILE, ':encoding(iso-8859-1)' or die "binmode 2:$!";
print OUTFILE "\xe1", "\n";

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl v5.8.0:

Configured by Ken at Wed Mar  5 21:45:35  2003.

Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration:
  Platform:
    osname=cygwin, osvers=1.3.12(0.5432), archname=cygwin
    uname='cygwin_nt-5.1 dxhirx1 1.3.12(0.5432) 2002-07-06 02:16 i686 unknown '
    config_args='-d'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=y, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-DPERL_USE_SAFE_PUTENV -fno-strict-aliasing -I/usr/local/include',
    optimize='-O2',
    cppflags='-DPERL_USE_SAFE_PUTENV -fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='2.95.3-5 (cygwin special)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=4
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='ld2', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib /lib
    libs=-lgdbm -lcrypt -lutil
    perllibs=-lcrypt -lutil
    libc=/usr/lib/libc.a, so=dll, useshrplib=true, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' -L/usr/local/lib'

Locally applied patches:
    ACTIVEPERL_LOCAL_PATCHES_ENTRY


@INC for perl v5.8.0:
    /usr/local/lib/perl5/5.8.0/cygwin
    /usr/local/lib/perl5/5.8.0
    /usr/local/lib/perl5/site_perl/5.8.0/cygwin
    /usr/local/lib/perl5/site_perl/5.8.0
    /usr/local/lib/perl5/site_perl
    .


Environment for perl v5.8.0:
    CYGWIN=tty
    HOME=/home/Ken
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/Ken/bin:/perl/bin:/usr/lib/subversion/bin:/usr/X11R6/bin:/usr/local/bin:/usr/bin:/bin:/Program Files/Microsoft.NET/FrameworkSDK/Bin/:/Program Files/Microsoft Visual Studio .NET/Common7/IDE/:/Program Files/Microsoft Visual Studio .NET/Vc7/bin/:/J2SDK_Forte/jdk1.4.0/bin:/Python21/:/usr/bin:/usr/bin/msnetcurr:/vim/current:/WINDOWS/system32:/WINDOWS:/WINDOWS/System32/Wbem:/Program Files/Common Files/Adaptec Shared/System:/Program Files/Microsoft SDK/Bin/:/Program Files/Microsoft SDK/Bin/WinNT/:/MSSQL7/BINN:/Program Files/Reflection/:/cygdrive/e/Program Files/Microsoft Visual Studio 8/Team Tools/Performance Tools/:/Program Files/Microsoft SQL Server/90/Tools/binn/:/Program Files/Microsoft Visual Studio/Common/Tools/WinNT:/Program Files/Microsoft Visual Studio/Common/MSDev98/Bin:/Program Files/Microsoft Visual Studio/Common/Tools:/Program Files/Microsoft Visual Studio/VC98/bin:/Program Files/Microsoft SDK/Bin/:/Program Files/Microsoft SDK/Bin/WinNT/:.:/PROGRA~1/COMMON~1/MUVEET~1/030625:.
    PERL_BADLANG (unset)
    PERL_MAILERS=smtp:c:\perl\bin\perl.exe
    SHELL (unset)

p5pRT commented Nov 9, 2005

From @smpeters

[ken_hirsch - Tue Oct 25 03:19:49 2005]:

This is a bug report for perl from kenhirsch@ftml.net,
generated with the help of perlbug 1.34 running under perl v5.8.0.

-----------------------------------------------------------------
[Please enter your report here]

use encoding 'iso-8859-1';
binmode STDOUT, ':encoding(iso-8859-1)' or die "binmode:$!";
print "\xe1", "\n";
__END__
output is:
"\x{1280}" does not map to iso-8859-1.
panic: sv_setpvn called with negative strlen.

The same result is obtained with this program:
open OUTFILE, ">", "testout.txt" or die "open: $!";
binmode OUTFILE, ':encoding(iso-8859-1)' or die "binmode 1:$!";
binmode OUTFILE, ':encoding(iso-8859-1)' or die "binmode 2:$!";
print OUTFILE "\xe1", "\n";

I was able to replicate this problem in Perl-5.8.6, but I could not get the
panic in bleadperl. I'm not sure what change fixed this problem, though.

./perl -Ilib -Mencoding=iso-8859-1 -wle'binmode STDOUT, ":encoding(iso-8859-1)" or die "binmode:$!"; print "\xe1", "\n";'
"\x{128a}" does not map to iso-8859-1 at -e line 1.
\x{128a}

p5pRT commented Nov 9, 2005

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Nov 9, 2005

@smpeters - Status changed from 'open' to 'resolved'

@p5pRT p5pRT closed this as completed Nov 9, 2005
p5pRT commented Apr 13, 2006

@sciurius - Status changed from 'resolved' to 'open'

p5pRT commented Apr 13, 2006

From @sciurius

The original test program:

  use encoding 'iso-8859-1';
  binmode STDOUT, ':encoding(iso-8859-1)' or die "binmode:$!";
  print "\xe1", "\n";

prints _nothing_ with 5.8.8.

Changing the last line to

  print "x\xe1", "\n";

yields:

  "\x{1280}" does not map to iso-8859-1 at x2.pl line 3.
  panic: sv_setpvn called with negative strlen at x2.pl line 3.
  "\x{12b8}" does not map to iso-8859-1.
  x\x{12b8}

(note: no final newline)

p5pRT commented Jul 6, 2012

From @doy

In 5.16.0, this program:

  use encoding 'iso-8859-1';
  binmode STDOUT, ':encoding(iso-8859-1)' or die "binmode:$!";
  print "\xe1", "\n";

prints:

  "\x{fffd}" does not map to iso-8859-1 at test7.pl line 3.
  \x{fffd}

and changing the last line to:

  print "x\xe1", "\n";

yields:

  "\x{fffd}" does not map to iso-8859-1 at test7.pl line 3.
  x\x{fffd}

Is this the expected behavior? I'm not entirely sure why it would be
printing a literal '\x{fffd}'.

-doy

p5pRT commented Jul 10, 2012

From @nwc10

On Fri, Jul 06, 2012 at 03:01:55PM -0700, Jesse Luehrs via RT wrote:

In 5.16.0, this program:

use encoding 'iso-8859-1';
binmode STDOUT, ':encoding(iso-8859-1)' or die "binmode:$!";
print "\xe1", "\n";

prints:

"\x{fffd}" does not map to iso-8859-1 at test7.pl line 3.
\x{fffd}

and changing the last line to:

print "x\xe1", "\n";

yields:

"\x{fffd}" does not map to iso-8859-1 at test7.pl line 3.
x\x{fffd}

Is this the expected behavior? I'm not entirely sure why it would be
printing a literal '\x{fffd}'.

It's not the behaviour I would expect. Particularly given that the value
printed seems to be internally well formed:

$ ./perl -Ilib -MDevel::Peek -e 'use encoding "iso-8859-1"; Dump "\xe1"'
SV = PV(0x1008010a0) at 0x100811c88
  REFCNT = 1
  FLAGS = (POK,READONLY,pPOK,UTF8)
  PV = 0x10030a870 "\303\241"\0 [UTF8 "\x{e1}"]
  CUR = 2
  LEN = 16

Nicholas Clark

p5pRT commented Jul 10, 2012

From @ikegami

On Fri, Jul 6, 2012 at 6:01 PM, Jesse Luehrs via RT <perlbug-followup@perl.org> wrote:

Is this the expected behavior? I'm not entirely sure why it would be
printing a literal '\x{fffd}'.

Double-encoding U+00E1 with iso-8859-1 should result in E1, as shown below:

$ perl -MEncode=encode -E'print encode("iso-8859-1", encode("iso-8859-1", "\xE1"));' | od -t x1
0000000 e1
0000001
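
As a side note on why the result above is E1: ISO-8859-1 maps code points 0x00..0xFF one-to-one onto bytes, so encoding an already-encoded string a second time changes nothing, whereas a multi-byte encoding such as UTF-8 would change the bytes. The following is only an illustrative sketch, not something from the thread; it assumes the core Encode module, and the helper name `hex_bytes` is made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex pairs.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', unpack '(H2)*', $bytes;
}

my $char = "\xE1";    # the single character U+00E1

# ISO-8859-1: the second encode() sees the character U+00E1 again,
# so the output is the same single byte.
my $latin1_once  = encode('iso-8859-1', $char);
my $latin1_twice = encode('iso-8859-1', $latin1_once);

# UTF-8: the second encode() sees two characters (U+00C3, U+00A1)
# and expands each of them to two bytes.
my $utf8_once  = encode('UTF-8', $char);
my $utf8_twice = encode('UTF-8', $utf8_once);

print 'latin1 once:  ', hex_bytes($latin1_once),  "\n";    # e1
print 'latin1 twice: ', hex_bytes($latin1_twice), "\n";    # e1
print 'utf8 once:    ', hex_bytes($utf8_once),    "\n";    # c3 a1
print 'utf8 twice:   ', hex_bytes($utf8_twice),   "\n";    # c3 83 c2 a1
```

This only covers calling `encode` twice by hand; it says nothing about how stacked PerlIO layers behave, which is the open question in this ticket.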

p5pRT commented Jul 12, 2012

From @ap

* Jesse Luehrs via RT <perlbug-followup@perl.org> [2012-07-07 00:05]:

In 5.16.0, this program:

use encoding 'iso-8859-1';
binmode STDOUT, ':encoding(iso-8859-1)' or die "binmode:$!";
print "\xe1", "\n";

prints:

"\x{fffd}" does not map to iso-8859-1 at test7.pl line 3.
\x{fffd}

You used encoding.pm. You lose.

Lose that line and watch it work.

and changing the last line to:

print "x\xe1", "\n";

yields:

"\x{fffd}" does not map to iso-8859-1 at test7.pl line 3.
x\x{fffd}

Is this the expected behavior? I'm not entirely sure why it would be
printing a literal '\x{fffd}'.

It is trying to output a replacement character, finding it cannot, and
falling back to printing it as an escape. The reaction of printing it as
an escape seems the sane behaviour once you have gotten yourself into
the insanity that led to there being a replacement character in the
first place. Why exactly that happens is the real question here. Yet
frankly I am not sure I even care, because it involves encoding.pm in
some way, so an attempt to answer this is going to be roughly analogous
to descending into hell by your own volition then trying to exorcise the
ghosts that haunt your cell. Don’t Do That Then is a better answer.

* Nicholas Clark <nick@ccl4.org> [2012-07-10 13:20]:

It's not the behaviour I would expect. Particularly given that the
value printed seems to be internally well formed:

$ ./perl -Ilib -MDevel::Peek -e 'use encoding "iso-8859-1"; Dump "\xe1"'
SV = PV(0x1008010a0) at 0x100811c88
REFCNT = 1
FLAGS = (POK,READONLY,pPOK,UTF8)
PV = 0x10030a870 "\303\241"\0 [UTF8 "\x{e1}"]
CUR = 2
LEN = 16

He is using encoding.pm so that’s not saying much.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
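
The suggestion in this comment to drop the `use encoding` line can be sketched roughly as follows. This is an illustration rather than a tested reproduction of the reporter's setup: it writes to an in-memory buffer instead of STDOUT (an assumption made only so the resulting bytes can be inspected), and the variable names are invented.

```perl
use strict;
use warnings;

# Write to an in-memory buffer instead of STDOUT so the bytes can be checked.
my $buffer = '';
open my $out, '>', \$buffer or die "open: $!";

# A single :encoding layer, added explicitly; encoding.pm is not loaded.
binmode $out, ':encoding(iso-8859-1)' or die "binmode: $!";

# Without encoding.pm, "\xe1" is simply the character U+00E1.
print {$out} "\xe1", "\n";
close $out or die "close: $!";

# Show the bytes that were written, as hex.
print unpack('H*', $buffer), "\n";    # e10a
```

With only one encoding layer in play, U+00E1 is written as the single byte E1 followed by the newline.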

p5pRT commented Jul 12, 2012

From @Leont

On Sat, Jul 7, 2012 at 1:01 AM, Jesse Luehrs via RT <perlbug-followup@perl.org> wrote:

In 5.16.0, this program:

use encoding 'iso-8859-1';
binmode STDOUT, ':encoding(iso-8859-1)' or die "binmode:$!";
print "\xe1", "\n";

prints:

"\x{fffd}" does not map to iso-8859-1 at test7.pl line 3.
\x{fffd}

and changing the last line to:

print "x\xe1", "\n";

yields:

"\x{fffd}" does not map to iso-8859-1 at test7.pl line 3.
x\x{fffd}

Is this the expected behavior? I'm not entirely sure why it would be
printing a literal '\x{fffd}'.

: perl -Mencoding=iso-8859-1 -E 'binmode STDOUT, ":encoding(iso-8859-1)"; say for PerlIO::get_layers(\*STDOUT)'
unix
perlio
encoding(iso-8859-1)
utf8
encoding(iso-8859-1)
utf8

Having two encoding layers on top of each other isn't going to work,
really. encoding.pm already adds it, you shouldn't add it yourself
too. Probably that should give a warning/exception (or better said
:encoding on top of any utf8 layers should).

Leon
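
Until something like the warning proposed above exists, a caller could check the layer stack before pushing another encoding layer. The sketch below is one possible way to do that and is not part of Perl or of this ticket: `add_encoding_once` is a hypothetical helper, and it only looks for layers whose name starts with `encoding(`, which is a simplification (it ignores a bare `:utf8` layer, for example). An in-memory handle is used so the written bytes can be inspected.

```perl
use strict;
use warnings;

# Hypothetical helper: push an :encoding layer only if the handle has none yet.
# Returns 1 if a layer was added, 0 if an encoding layer was already present.
sub add_encoding_once {
    my ($fh, $encoding) = @_;
    my @layers = PerlIO::get_layers($fh, output => 1);
    return 0 if grep { /^encoding\(/ } @layers;
    binmode $fh, ":encoding($encoding)" or die "binmode: $!";
    return 1;
}

my $buffer = '';
open my $out, '>', \$buffer or die "open: $!";

# The second call is expected to notice the existing layer and do nothing.
my $first  = add_encoding_once($out, 'iso-8859-1');
my $second = add_encoding_once($out, 'iso-8859-1');

print {$out} "\xe1", "\n";
close $out or die "close: $!";

print "first=$first second=$second bytes=", unpack('H*', $buffer), "\n";
```

The same check could be applied to STDOUT in a program that also loads encoding.pm, although, as the rest of the thread suggests, not using encoding.pm at all is probably the simpler option.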

p5pRT commented Jul 12, 2012

From @ikegami

On Thu, Jul 12, 2012 at 4:55 AM, Leon Timmermans <fawaka@gmail.com> wrote:

Is this the expected behavior? I'm not entirely sure why it would be
printing a literal '\x{fffd}'.

: perl -Mencoding=iso-8859-1 -E 'binmode STDOUT, ":encoding(iso-8859-1)"; say for PerlIO::get_layers(\*STDOUT)'
unix
perlio
encoding(iso-8859-1)
utf8
encoding(iso-8859-1)
utf8

Having two encoding layers on top of each other isn't going to work,
really.

It's probably not something you *should* do, but why isn't it going to
work? Like I said earlier, double-encoding U+00E1 with iso-8859-1 should
result in E1, as shown below:

$ perl -MEncode=encode -E'print encode("iso-8859-1", encode("iso-8859-1", "\xE1"));' | od -t x1
0000000 e1
0000001
