$PerlIO::encoding::fallback = FB_DEFAULT leads to duplicated output #7309

Closed
p5pRT opened this issue May 19, 2004 · 11 comments
Labels
Closable? We might be able to close this ticket, but we need to check with the reporter type-core type-PerlIO

Comments

@p5pRT

p5pRT commented May 19, 2004

Migrated from rt.perl.org#29720 (status was 'open')

Searchable as RT29720$

@p5pRT
Author

p5pRT commented May 19, 2004

From aa29@mail.ru

To: perlbug@perl.org
Subject: $PerlIO::encoding::fallback = FB_DEFAULT leads to duplicated output
Reply-To: aa29@mail.ru
Message-Id: <5.8.4_2160_1084980780@INFORMED>

This is a bug report for perl from aa29@mail.ru,
generated with the help of perlbug 1.35 running under perl v5.8.4.

$PerlIO::encoding::fallback = FB_DEFAULT leads to duplicated output.

It is possible to change check-mode via $PerlIO::encoding::fallback:

use Encode qw(:fallback_all);
use encoding 'utf8';

$PerlIO::encoding::fallback = FB_DEFAULT;

binmode(STDERR, ":encoding(cp866)");
warn "foobar";

This code gives four messages instead of one:

foobar at 6.pl line 7.
foobar at 6.pl line 7.
foobar at 6.pl line 7.
foobar at 6.pl line 7.

With STDERR redirected to a file, it gives three messages:

foobar at 6.pl line 7.
foobar at 6.pl line 7.
foobar at 6.pl line 7.

Further investigation shows that there is no duplication if
$PerlIO::encoding::fallback = FB_DEFAULT | FB_PERLQQ; # or FB_(HT|X)MLCREF

Looking into ext\Encode\Encode.xs, I found this code
(ext\Encode\Encode.xs, line 229):

  if (check && !(check & ENCODE_LEAVE_SRC)){
      sdone = SvCUR(src) - (slen+sdone);
      if (sdone) {
          sv_setpvn(src, (char*)s+slen, sdone);
      }
      SvCUR_set(src, sdone);
  }

If check is set to FB_DEFAULT (which is 0) and no other fallback is
defined, the guard above is false, exactly as if ENCODE_LEAVE_SRC were
set: the buffer is not truncated, and it is then flushed several times.
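
The guard quoted above can be modelled in Perl to see why FB_DEFAULT falls through. This is only a sketch of the condition, not the Encode.xs code itself; the helper name source_is_consumed is hypothetical, and the constants are the ones exported by Encode.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: mirrors the C guard
#   check && !(check & ENCODE_LEAVE_SRC)
# and returns 1 when the source buffer would be truncated.
sub source_is_consumed {
    my ($check) = @_;
    return ($check && !($check & Encode::LEAVE_SRC())) ? 1 : 0;
}

# FB_DEFAULT is 0, so the first half of the guard is already false.
printf "FB_DEFAULT = 0x%04x, source consumed: %d\n",
    Encode::FB_DEFAULT(), source_is_consumed(Encode::FB_DEFAULT());

# A non-zero check value without the LEAVE_SRC bit passes the guard.
printf "FB_QUIET   = 0x%04x, source consumed: %d\n",
    Encode::FB_QUIET(), source_is_consumed(Encode::FB_QUIET());
```

A zero check value is indistinguishable from "LEAVE_SRC requested" in this condition, which matches the behaviour described in the report.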


Flags:
  category=core
  severity=medium


Site configuration information for perl v5.8.4:

Configured by aa29 at Mon May 17 17:59:46 2004.

Summary of my perl5 (revision 5 version 8 subversion 4) configuration:
  Platform:
  osname=MSWin32, osvers=4.0, archname=MSWin32-x86-multi-thread
  uname=''
  config_args='undef'
  hint=recommended, useposix=true, d_sigaction=undef
  usethreads=undef use5005threads=undef useithreads=define
usemultiplicity=define
  useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
  use64bitint=undef use64bitall=undef uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler:
  cc='cl', ccflags
='-nologo -Gf -W3 -MD -Zi -DNDEBUG -O1 -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE
_DES_FCRYPT -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPERL
_MSVCRT_READFIX',
  optimize='-MD -Zi -DNDEBUG -O1',
  cppflags='-DWIN32'
  ccversion='', gccversion='', gccosandvers=''
  intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
  d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=10
  ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='__int64',
lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries:
  ld='link', ldflags
'-nologo -nodefaultlib -debug -opt:ref,icf -libpath:"c:\perl\lib\CORE" -ma
chine:x86'
  libpth=\lib
  libs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib
comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib
uuid.lib wsock32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib
msvcrt.lib
  perllibs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib
comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib
uuid.lib wsock32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib
msvcrt.lib
  libc=msvcrt.lib, so=dll, useshrplib=yes, libperl=perl58.lib
  gnulibc_version='undef'
  Dynamic Linking:
  dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
  cccdlflags=' ',
lddlflags='-dll -nologo -nodefaultlib -debug -opt:ref,icf -libpath:"c:\perl
\lib\CORE" -machine:x86'

Locally applied patches:


@INC for perl v5.8.4:
  C:/Perl/lib
  C:/Perl/site/lib
  .


Environment for perl v5.8.4:
  HOME (unset)
  LANG (unset)
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)

PATH=C:\cygwin\bin;C:\Tcl\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\Syst
em32\Wbem;C:\Perl\bin;C:\Program Files\Support Tools;D:\src\lib;C:\Program
Files\Microsoft Visual Studio\Common\Tools\WinNT;C:\Program Files\Microsoft
Visual Studio\Common\MSDev98\Bin;C:\Program Files\Microsoft Visual
Studio\Common\Tools;C:\Program Files\Microsoft Visual
Studio\VC98\bin;C:\Arc;C:\Program Files\Utils;C:\Mysql\bin;C:\Program
Files\Debugging Tools for Windows;C:\Tcl\bin;D:\Linda\XML\fop;C:\Program
Files\GNU\WinCvs 1.2;D:\src\bin;C:\Program Files\Far
  PERL_BADLANG (unset)
  SHELL (unset)

aa29

@p5pRT
Author

p5pRT commented Nov 18, 2011

From ambrus@math.bme.hu

On Wed May 19 08:34:56 2004, aa29 wrote:

$PerlIO::encoding::fallback = FB_DEFAULT leads to duplicated output.

This old bug still exists in bleadperl (as of a few days ago). It also
exists in all of perl 5.14.2, 5.12.3, 5.10.1. I have tested on
amd64-linux with this command.

$ ~/local/perlblead/bin/perl5.15.4 -we 'use Encode; use
PerlIO::encoding; $PerlIO::encoding::fallback = Encode::FB_XMLCREF();
binmode STDOUT, "encoding(iso-8859-2)" or die; print
"\x{e9}l\x{151}.u\x{ef} \x{2203}t\n";'
élő.u&#xef; &#x2203;t
élő.u&#xef; &#x2203;t
élő.u&#xef; &#x2203;t
$

@p5pRT
Author

p5pRT commented Nov 18, 2011

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Author

p5pRT commented Dec 2, 2011

From ambrus@math.bme.hu

Besides printing duplicate output, a filehandle with an encoding layer
with fallback set also usually raises an exception "Close with partial
character" when you try to close it. This error message is not
documented in either perldiag or PerlIO::encoding, and, in any case,
there shouldn't be an error.

I attach a test script that tests whether this bug is still present:
it tests for both correct output and no exception when you close the
file.
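
The attached script is not reproduced in this thread. As a rough idea of what such a check could look like, here is a minimal sketch that writes one line through an encoding layer into an in-memory buffer, then counts how many copies arrived and records any warning or exception raised around close. It is not unencodable.pl; the variable names and the choice of an in-memory filehandle are assumptions made for illustration.

```perl
use strict;
use warnings;
use Encode ();
use PerlIO::encoding;

# Set the fallback after loading PerlIO::encoding, as in the reports above.
$PerlIO::encoding::fallback = Encode::FB_XMLCREF();

my $buffer = '';
my @problems;
local $SIG{__WARN__} = sub { push @problems, "warning: $_[0]" };

my $ok = eval {
    # In-memory filehandle, so the check needs no temporary file.
    open(my $fh, '>:encoding(iso-8859-2)', \$buffer)
        or die "open failed: $!";
    print {$fh} "\x{2203}t\n";    # U+2203 cannot be encoded in iso-8859-2
    close($fh) or die "close failed: $!";
    1;
};
push @problems, "exception: $@" unless $ok;

# Count how many times the single printed line reached the buffer.
my $copies = () = $buffer =~ /t\n/g;
print "copies of the line: $copies\n";
print "problems seen: ", scalar(@problems), "\n";
```

On an affected perl one would expect more than one copy or a recorded problem; on a fixed perl, one copy and no problems.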

Ambrus

@p5pRT
Author

p5pRT commented Dec 2, 2011

From ambrus@math.bme.hu

unencodable.pl

@toddr
Member

toddr commented Mar 5, 2020

@Leont does that mean you're looking at this issue then?

@Leont
Contributor

Leont commented Mar 7, 2020

I have some ideas, but it may require some work on the Encode side too; FB_DEFAULT having a double meaning is inconvenient.

@toddr
Member

toddr commented Mar 8, 2020

OK. I’m gonna put your name on it so we know who is involved

@osir3z

osir3z commented Jul 28, 2022

Thanks to @Leont this issue should be resolved in Perl 5.34.0 and later.

The solution allows you to set whatever value you like for $PerlIO::encoding::fallback, but every time you use :encoding(...), that value is sanitised (using the same logic as the workaround below) before it is actually used by the encoder/decoder.

Workaround

For versions before Perl 5.34.0, always clear the LEAVE_SRC bit and set the STOP_AT_PARTIAL bit when setting $PerlIO::encoding::fallback, e.g.:

$PerlIO::encoding::fallback = (($fallback) & ~Encode::LEAVE_SRC()) | Encode::STOP_AT_PARTIAL();

(tested with Perl 5.30.2 on Windows 10, Perl 5.30.3 on Ubuntu 20.04 LTS for WSL2, and Perl 5.28.1 on Debian Buster)
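
The one-liner above can be wrapped in a small helper so the same masking is applied wherever the fallback is set. This is a sketch of the workaround only, assuming the helper name sanitise_fallback (hypothetical) and an iso-8859-2 output layer chosen purely for illustration.

```perl
use strict;
use warnings;
use Encode ();
use PerlIO::encoding;

# Hypothetical helper: clear LEAVE_SRC and set STOP_AT_PARTIAL,
# the same masking as the workaround one-liner.
sub sanitise_fallback {
    my ($fallback) = @_;
    return ($fallback & ~Encode::LEAVE_SRC()) | Encode::STOP_AT_PARTIAL();
}

# Set the variable after PerlIO::encoding is loaded, then push the layer.
$PerlIO::encoding::fallback = sanitise_fallback(Encode::FB_DEFAULT());

binmode(STDOUT, ':encoding(iso-8859-2)') or die "binmode failed: $!";
print "foobar\n";
```

Because the masked value always has STOP_AT_PARTIAL set, it is never zero, so even FB_DEFAULT no longer looks like "LEAVE_SRC requested" to the encoder.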

Background

When I encountered this issue a couple of days ago, I was trying to set $PerlIO::encoding::fallback to FB_DEFAULT because I was unhappy with the qq-style output and the warnings I got when I used :encoding(...). Obviously, as per this issue, that resulted in duplicated output.

After some experimentation I discovered that clearing the LEAVE_SRC bit resolved the duplicated output for all but FB_DEFAULT. But that's because LEAVE_SRC is only honored when $PerlIO::encoding::fallback is set (see Encode#LEAVE_SRC). Testing showed that by forcing an "unused" bit (e.g. 0x8000) to be set, the clear LEAVE_SRC bit would be honored and everything appeared to work.

Unhappy at having to hack a solution with an "unused" bit that may someday get used, I dug into the code for PerlIO::encoding on MetaCPAN and found @Leont's code which sanitised $PerlIO::encoding::fallback according to the logic in the workaround above (see PerlIO-encoding/encoding.xs#L175). I assumed that it wasn't working for some reason, but it turns out that it was just the latest version of the code, which wasn't included in the versions of Perl that I was testing on.

Looking at various versions of Perl going back through the years, it is clear that the default value for $PerlIO::encoding::fallback always has a clear LEAVE_SRC bit and a set STOP_AT_PARTIAL bit. Obviously @Leont came to the same conclusion. Thankfully, using this combination means I avoid using a hack, and also likely avoid some errors I hadn't yet encountered.
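
The observation about the default value is easy to check on whatever perl is at hand. A minimal sketch, assuming only that PerlIO::encoding sets $PerlIO::encoding::fallback when it is loaded; the variable names are illustrative.

```perl
use strict;
use warnings;
use Encode ();
use PerlIO::encoding;

# Read the default before anything else has a chance to change it.
my $default = $PerlIO::encoding::fallback;

my $leave_src       = ($default & Encode::LEAVE_SRC())       ? 'set' : 'clear';
my $stop_at_partial = ($default & Encode::STOP_AT_PARTIAL()) ? 'set' : 'clear';

printf "default fallback: 0x%04x\n", $default;
print  "LEAVE_SRC bit:       $leave_src\n";
print  "STOP_AT_PARTIAL bit: $stop_at_partial\n";
```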

@toddr toddr added the Closable? We might be able to close this ticket, but we need to check with the reporter label Jul 28, 2022
@jkeenan
Contributor

jkeenan commented Sep 21, 2022

Thanks to @Leont this issue should be resolved in Perl 5.34.0 and later.

@Leont, do you concur?

@Leont Leont closed this as completed Sep 21, 2022
@Leont
Contributor

Leont commented Sep 21, 2022

@Leont, do you concur?

Yeah, this is solved.


6 participants