pod2html generates illegal UTF-8 #11976

Closed
p5pRT opened this issue Feb 29, 2012 · 7 comments
Labels
distro-openbsd ext/Pod-Html issues in the blead-upstream Pod-Html distribution

Comments

p5pRT commented Feb 29, 2012

Migrated from rt.perl.org#111446 (status was 'resolved')

Searchable as RT111446$

p5pRT commented Feb 29, 2012

From tchrist@perl.com

pod2html generates illegal UTF-8 because it creates HTML pages that
claim to be UTF-8:

  <meta http-equiv="content-type" content="text/html; charset=utf-8" />

But then generates strings afflicted with the Unicode bug. Code points 128-255
come out as simple illegal bytes, unless there's a larger code point in them.

The right fix is to binmode the output handle to :utf8.
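
For illustration, here is a minimal sketch of the failure mode and of the fix suggested above. It is not the Pod::Html code or the eventual patch: the sample string, the in-memory filehandles, and the `hex_bytes` helper are assumptions made up for the demo, and it assumes PERL_UNICODE and -C are not set.

```perl
use strict;
use warnings;

# Hypothetical sample text: every code point is below 256.
my $text = "caf\x{E9}";    # U+00E9 LATIN SMALL LETTER E WITH ACUTE

# Helper for the demo only: render a byte string as hex pairs.
sub hex_bytes { return join ' ', map { sprintf '%02X', ord $_ } split //, shift }

# No encoding layer: U+00E9 is written as the lone byte 0xE9, which is
# not valid UTF-8. No "Wide character" warning is raised.
open my $raw, '>', \my $raw_bytes or die "open: $!";
print {$raw} $text;
close $raw or die "close: $!";

# Same string, but with the output handle binmoded to :utf8 first.
open my $enc, '>', \my $utf8_bytes or die "open: $!";
binmode $enc, ':utf8' or die "binmode: $!";
print {$enc} $text;
close $enc or die "close: $!";

print "no layer: ", hex_bytes($raw_bytes),  "\n";
print ":utf8:    ", hex_bytes($utf8_bytes), "\n";
```

Under those assumptions the first line ends in the single byte E9 and the second ends in the two-byte sequence C3 A9, which is the UTF-8 encoding of U+00E9.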

Here's a list of pages to test. Note that you won't get a wide char warning
if it is only 128-255; you'll simply get illegal output.

  perlebcdic.pod
  perlgit.pod
  perlhist.pod
  perlpodspec.pod
  perlthrtut.pod

  perl588delta.pod
  perl5100delta.pod
  perl5120delta.pod
  perl5121delta.pod
  perl5122delta.pod
  perl5123delta.pod
  perl5124delta.pod
  perl5140delta.pod
  perl5141delta.pod
  perl5142delta.pod
  perl5150delta.pod
  perl5151delta.pod
  perl5152delta.pod
  perl5153delta.pod
  perl5154delta.pod
  perl5156delta.pod
  perl5157delta.pod
  perl5158delta.pod

  perlcn.pod
  perljp.pod
  perlko.pod
  perltw.pod

Notice also that you get differently wrong answers running with PERL_UNICODE
set to 0 vs to SD. The program should not be sensitive to whether
that variable is set, because it knows the encodings of its input and
output, and should set things accordingly.

--tom

Summary of my perl5 (revision 5 version 14 subversion 0) configuration:

  Platform:
    osname=openbsd, osvers=4.4, archname=OpenBSD.i386-openbsd
    uname='openbsd chthon 4.4 generic#0 i386 '
    config_args='-des'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=y, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='3.3.5 (propolice)', gccosandvers='openbsd4.4'
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags ='-Wl,-E -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib
    libs=-lgdbm -lm -lutil -lc
    perllibs=-lm -lutil -lc
    libc=/usr/lib/libc.so.48.0, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-DPIC -fPIC ', lddlflags='-shared -fPIC -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl):
  Compile-time options: MYMALLOC PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP
                        PERL_PRESERVE_IVUV USE_LARGE_FILES USE_PERLIO
                        USE_PERL_ATOF
  Built under openbsd
  Compiled at Jun 11 2011 11:48:28
  %ENV:
    PERL_UNICODE="SA"
  @INC:
    /usr/local/lib/perl5/site_perl/5.14.0/OpenBSD.i386-openbsd
    /usr/local/lib/perl5/site_perl/5.14.0
    /usr/local/lib/perl5/5.14.0/OpenBSD.i386-openbsd
    /usr/local/lib/perl5/5.14.0
    /usr/local/lib/perl5/site_perl/5.12.3
    /usr/local/lib/perl5/site_perl/5.11.3
    /usr/local/lib/perl5/site_perl/5.10.1
    /usr/local/lib/perl5/site_perl/5.10.0
    /usr/local/lib/perl5/site_perl/5.8.7
    /usr/local/lib/perl5/site_perl/5.8.0
    /usr/local/lib/perl5/site_perl/5.6.0
    /usr/local/lib/perl5/site_perl/5.005
    /usr/local/lib/perl5/site_perl
    .

p5pRT commented May 4, 2012

From @smpeters

On Wed Feb 29 12:57:02 2012, tom christiansen wrote:

> pod2html generates illegal UTF-8 because it creates HTML pages that
> claim to be UTF-8:
>
>   <meta http-equiv="content-type" content="text/html; charset=utf-8" />
>
> But then generates strings afflicted with the Unicode bug. Code points 128-255
> come out as simple illegal bytes, unless there's a larger code point in them.
>
> The right fix is to binmode the output handle to :utf8.
>
> Here's a list of pages to test. Note that you won't get a wide char warning
> if it is only 128-255; you'll simply get illegal output.
>
>   perlebcdic.pod
>   perlgit.pod
>   perlhist.pod
>   perlpodspec.pod
>   perlthrtut.pod
>
>   perl588delta.pod
>   perl5100delta.pod
>   perl5120delta.pod
>   perl5121delta.pod
>   perl5122delta.pod
>   perl5123delta.pod
>   perl5124delta.pod
>   perl5140delta.pod
>   perl5141delta.pod
>   perl5142delta.pod
>   perl5150delta.pod
>   perl5151delta.pod
>   perl5152delta.pod
>   perl5153delta.pod
>   perl5154delta.pod
>   perl5156delta.pod
>   perl5157delta.pod
>   perl5158delta.pod
>
>   perlcn.pod
>   perljp.pod
>   perlko.pod
>   perltw.pod
>
> Notice also that you get differently wrong answers running with PERL_UNICODE
> set to 0 vs to SD. The program should not be sensitive to whether
> that variable is set, because it knows the encodings of its input and
> output, and should set things accordingly.
>
> --tom

I have a fix in a local post-5.16 branch to be pushed out after the
release unless I hear this should go to blead now.

Steve

p5pRT commented May 4, 2012

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented May 24, 2012

From @cpansprout

On Fri May 04 07:23:44 2012, stmpeters wrote:

> I have a fix in a local post-5.16 branch to be pushed out after the
> release unless I hear this should go to blead now.

The release has happened. :-)

--

Father Chrysostomos

p5pRT commented Jun 4, 2012

From @smpeters

On Thu May 24 00:47:34 2012, sprout wrote:

> On Fri May 04 07:23:44 2012, stmpeters wrote:
>
> > I have a fix in a local post-5.16 branch to be pushed out after the
> > release unless I hear this should go to blead now.
>
> The release has happened. :-)

This fix is now in blead.

p5pRT closed this as completed Jun 4, 2012

p5pRT commented Jun 4, 2012

@smpeters - Status changed from 'open' to 'resolved'

p5pRT commented Jun 4, 2012

From @cpansprout

On Mon Jun 04 07:59:46 2012, stmpeters wrote:

> On Thu May 24 00:47:34 2012, sprout wrote:
>
> > On Fri May 04 07:23:44 2012, stmpeters wrote:
> >
> > > I have a fix in a local post-5.16 branch to be pushed out after the
> > > release unless I hear this should go to blead now.
> >
> > The release has happened. :-)
>
> This fix is now in blead.

Ouch! You didn’t rebase. :-)

--

Father Chrysostomos

jkeenan added the ext/Pod-Html issues in the blead-upstream Pod-Html distribution label Mar 15, 2021