PerlIO::encoding infinite loop when trying to decode UTF-8 as ISO-2022-JP #10258

Open
p5pRT opened this issue Mar 25, 2010 · 7 comments

@p5pRT

p5pRT commented Mar 25, 2010

Migrated from rt.perl.org#73826 (status was 'open')

Searchable as RT73826$

@p5pRT
Author

p5pRT commented Mar 25, 2010

From jgreely@dotclue.org

Created by jgreely@dotclue.org

When the ISO-2022-JP decoder is used by PerlIO::encoding, invalid
data can send it into an infinite loop. The following three short
scripts demonstrate the problem:

#step1.pl - create a file containing Unicode 201d in UTF-8
# (right double quotation mark)
open(Out,">foo");
print Out "\xE2\x80\x9D";
close(Out);

#step2.pl - this script exits successfully
use Encode;
open(In,"foo");
print decode("iso-2022-jp",<In>);

#step3.pl - this script goes into an infinite decoding loop
open(In,"<:encoding(iso-2022-jp)","foo");
print <In>;

The behavior is identical on the Apple-supplied 5.10.0 and on
Strawberry Perl 5.10.1.
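
Since step2.pl completes while step3.pl hangs, one way to sidestep the hang is to leave the :encoding layer off the handle and decode the bytes explicitly. The following is only a minimal sketch of that approach, assuming the whole file fits in memory; the temporary file and the variable names are illustrative and not part of the original report.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempfile);

# Recreate the input from step1.pl: U+201D encoded as UTF-8 (three bytes).
# A temporary file stands in for "foo" (an assumption of this sketch).
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);
print {$out} "\xE2\x80\x9D";
close($out) or die "close failed: $!";

# Read raw bytes with no :encoding layer, so PerlIO::encoding is not involved.
open(my $in, '<:raw', $path) or die "cannot open $path: $!";
my $bytes = do { local $/; <$in> };
close($in);

# Decode explicitly, as step2.pl does. The input is not valid ISO-2022-JP,
# so the result is whatever the decoder's default fallback produces.
my $text = decode('iso-2022-jp', $bytes);
printf "read %d byte(s); decode() returned without looping\n", length($bytes);
```

This avoids the hang only by not using the layer; it does not fix the underlying behavior in PerlIO::encoding.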

Perl Info
---
Flags:
    category=library
    severity=low
    module=PerlIO::encoding
---
Site configuration information for perl 5.10.1:

Configured by win32-vanilla at Wed Oct 21 13:53:59 2009.

Summary of my perl5 (revision 5 version 10 subversion 1) configuration:
   
  Platform:
    osname=MSWin32, osvers=5.1, archname=MSWin32-x86-multi-thread
    uname='Win32 strawberryperl 5.10.1.0 #1 30 i386'
    config_args='undef'
    hint=recommended, useposix=true, d_sigaction=undef
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags =' -s -O2 -DWIN32 -DHAVE_DES_FCRYPT  -DUSE_SITECUSTOMIZE -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -fno-strict-aliasing -DPERL_MSVCRT_READFIX',
    optimize='-s -O2',
    cppflags='-DWIN32'
    ccversion='', gccversion='3.4.5', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='long long', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='g++', ldflags ='-s -L"C:\strawberry\perl\lib\CORE" -L"C:\strawberry\c\lib"'
    libpth=C:\strawberry\c\lib
    libs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32
    perllibs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32
    libc=, so=dll, useshrplib=true, libperl=libperl510.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags='-mdll -s -L"C:\strawberry\perl\lib\CORE" -L"C:\strawberry\c\lib"'

Locally applied patches:
    

---
@INC for perl 5.10.1:
    C:/strawberry/perl/lib
    C:/strawberry/perl/site/lib
    C:\strawberry\perl\vendor\lib
    .

---
Environment for perl 5.10.1:
    HOME (unset)
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=C:\windows\system32;C:\windows;C:\windows\System32\Wbem;C:\windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Lenovo\Bluetooth Software\;C:\Program Files\QuickTime\QTSystem\;C:\Program Files\GNU\GnuPG\pub;C:\strawberry\c\bin;C:\strawberry\perl\bin;C:\Program Files\Mercurial;C:\emacs\bin;C:\gnuwin32\bin;C:\Program Files\OpenVPN\bin
    PERL_BADLANG (unset)
    SHELL (unset)

@p5pRT
Author

p5pRT commented Mar 27, 2010

From @iabyn

still loops in blead

@p5pRT
Author

p5pRT commented Feb 17, 2017

@jkeenan - Status changed from 'new' to 'open'

@p5pRT
Author

p5pRT commented Feb 17, 2017

From @jkeenan

Reproduced with perl-5.24.1 on Linux:

$ perl -e 'open(OUT, q|>|, q|foo|); print OUT qq|\xE2\x80\x9D|; close OUT'

$ perl -MEncode -e 'open(IN, q|foo|); print decode("iso-2022-jp", <IN>);'

# below loops indefinitely
$ perl -e 'open(IN, q|<:encoding(iso-2022-jp)|, q|foo|); print <IN>;'

--
James E Keenan (jkeenan@cpan.org)

@p5pRT
Author

p5pRT commented Mar 4, 2017

From @Leont

iso-2022 is a very problematic encoding because it's escape-based, as explained in Encode::PerlIO. iso-2022-jp is currently allowed in PerlIO (unlike iso-2022-kr) because apparently well-formed iso-2022-jp can be handled easily enough, but your example shows that less well-formed input can be quite problematic.

In this particular case I suspect it's fixable (by handling EOF more intelligently); in many other cases it probably isn't.

Leon
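
Encode::PerlIO documents a perlio_ok check for asking whether an encoding claims to work as a PerlIO layer. A short sketch of querying it follows; the list of encoding names is only an example, and, as this report shows, a true result does not guarantee that malformed input is handled gracefully.

```perl
use strict;
use warnings;
use Encode ();

# Ask Encode whether each encoding claims to work as a PerlIO layer.
# The names below are examples chosen for this sketch.
my %ok;
for my $name (qw(iso-2022-jp iso-2022-kr utf-8)) {
    $ok{$name} = Encode::perlio_ok($name) ? 1 : 0;
    printf "%-12s %s\n", $name, $ok{$name} ? 'perlio ok' : 'not perlio ok';
}
```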

@p5pRT
Author

p5pRT commented Mar 4, 2017

From @khwilliamson

Can we detect we are in a loop?
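
One possible shape for such a check, sketched at the Perl level rather than inside PerlIO::encoding itself: decode in passes and stop as soon as a pass consumes no input. The helper name decode_until_stalled is hypothetical, and the sketch relies on the documented Encode behavior that, with FB_QUIET, the unprocessed remainder is left in the source buffer.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: decode $bytes in passes and stop as soon as a pass
# consumes no input. Returns the decoded text and the unconsumed bytes.
sub decode_until_stalled {
    my ($enc_name, $bytes) = @_;
    my $enc = Encode::find_encoding($enc_name)
        or die "unknown encoding: $enc_name";
    my $text = '';
    while (length $bytes) {
        my $before = length $bytes;
        # With FB_QUIET, decode() stops at the first problem and leaves the
        # unprocessed remainder in $bytes.
        my $chunk = $enc->decode($bytes, Encode::FB_QUIET());
        $text .= $chunk if defined $chunk;
        last if length($bytes) == $before;    # no progress: stalled
    }
    return ($text, $bytes);
}

# Example with strict UTF-8 and one invalid byte in the middle.
my ($text, $rest) = decode_until_stalled('UTF-8', "abc\xFFdef");
printf "decoded %d char(s), %d byte(s) left unconsumed\n",
    length($text), length($rest);
```

Whether the same test would catch the iso-2022-jp case from this report, and where it would belong in PerlIO::encoding, is not established here; the sketch only illustrates the "no progress" idea.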

@khwilliamson
Contributor

This still exists in 5.37.12
