poor argument parsing in pragma 'use open' #15848

Closed
p5pRT opened this issue Jan 29, 2017 · 14 comments

Comments

p5pRT commented Jan 29, 2017

Migrated from rt.perl.org#130668 (status was 'rejected')

Searchable as RT130668$

p5pRT commented Jan 29, 2017

From sur98ke@gmail.com

This is a bug report for perl from sur98ke@​gmail.com,
generated with the help of perlbug 1.40 running under perl 5.22.3.

Today I tried to work with UCS-2-LE-BOM encoded data using the ARGV and
STDOUT filehandles.

To do so I had to use binmode (STDOUT, ...) for STDOUT
and use open 'IN', ... for ARGV.

What I found is that the argument format for the "use open" pragma is
much stricter than the format accepted by the open and binmode functions, e.g.:

binmode(STDOUT, "raw pop encoding(ucs-2le) crlf utf8"); # works well
binmode(STDOUT, " raw pop encoding(ucs-2le) crlf utf8"); # works well
binmode(STDOUT, ":raw:pop:encoding(ucs-2le):crlf:utf8"); # works well

open (F, '<encoding(ucs-2le) crlf utf8', 'vlc-assoc.reg'); # works well
open (F, '< encoding(ucs-2le) crlf utf8', 'vlc-assoc.reg'); # works well
open (F, '<:encoding(ucs-2le):crlf:utf8', 'vlc-assoc.reg'); # works well

use open ('IN' , 'encoding(ucs-2le) crlf utf8'); # works well
use open ('IN' , ' encoding(ucs-2le) crlf utf8'); # Unknown PerlIO layer '' at myscript.pl line ...
use open ('IN' , ':encoding(ucs-2le):crlf:utf8'); # Unknown PerlIO layer 'encoding(ucs-2le):crlf:utf8' at myscript.pl line ...

This difference in how the arguments are interpreted IS NOT mentioned at
http://perldoc.perl.org/open.html
http://perldoc.perl.org/functions/binmode.html
http://perldoc.perl.org/functions/open.html
http://perldoc.perl.org/PerlIO.html

I couldn't even find an example of proper "use open" usage on the
internet. I had to look into the file "open.pm" itself, where I found line 73:

https://perl5.git.perl.org/perl.git/blob/HEAD:/lib/open.pm#l73
foreach my $layer (split(/\s+/,$dscp)) {

which splits the argument on whitespace.
In both distributions on my machine, the files:
<install path>\Strawberry\perl\lib\open.pm
<install path>\cygwin64\lib\perl5\5.22\open.pm
are identical to
https://perl5.git.perl.org/perl.git/blob_plain/HEAD:/lib/open.pm
as of 29 Jan 2017.
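
The effect of that split can be reproduced in isolation. The following is a minimal sketch, not the actual open.pm code: it assumes only the split(/\s+/, ...) call quoted above, plus the removal of one leading colon per token (an assumption inferred from the layer name shown in the second error message). The helper name layer_tokens is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of open.pm): imitates how a layer
# description appears to be tokenised, based on the split quoted above.
sub layer_tokens {
    my ($dscp) = @_;
    my @tokens;
    foreach my $layer (split(/\s+/, $dscp)) {
        # Assumption: one leading colon is dropped per token, which would
        # match the layer name reported in the second error message.
        $layer =~ s/^://;
        push @tokens, $layer;
    }
    return @tokens;
}

# The three argument forms from the report.
for my $dscp ('encoding(ucs-2le) crlf utf8',
              ' encoding(ucs-2le) crlf utf8',
              ':encoding(ucs-2le):crlf:utf8') {
    my @tokens = layer_tokens($dscp);
    printf "'%s' => %s\n", $dscp,
        join(', ', map { sprintf "'%s'", $_ } @tokens);
}
```

With a leading space, split(/\s+/, ...) yields an empty first token, and with colon separators the whole string stays a single token. Both results are consistent with the two "Unknown PerlIO layer" messages shown above.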

Could you please fix this so that the examples:
use open ('IN' , ' encoding(ucs-2le) crlf utf8');
use open ('IN' , ':encoding(ucs-2le):crlf:utf8');
work as well?

Or, at least, add an explanation of this limitation to the perldoc page:
http://perldoc.perl.org/open.html

Thank you.

Best regards,
Artur Mansurov.


Flags​:
  category=library
  severity=low
  module=open


Site configuration information for perl 5.22.3​:

Configured by ASSI at Sun Jan 15 13​:05​:43 CET 2017.

Summary of my perl5 (revision 5 version 22 subversion 3) configuration​:
  Platform​:
  osname=cygwin, osvers=2.6.1(0.30553), archname=cygwin-thread-multi
  uname='cygwin_nt-6.3 cygwin 2.6.1(0.30553) 2016-12-16 11​:55 x86_64
cygwin '
  config_args='-des -Dprefix=/usr -Dmksymlinks
-Darchname=x86_64-cygwin-threads -Dlibperl=cygperl5_22.dll -Dcc=gcc
-Dld=g++ -Accflags=-ggdb -O2 -pipe -Wimplicit-function-declaration
-fdebug-prefix-map=/mnt/share/maint/perl.x86_64/build=/usr/src/debug/perl-5.22.3-1
-fdebug-prefix-map=/mnt/share/maint/perl.x86_64/src/perl-5.22.3=/usr/src/debug/perl-5.22.3-1
-fwrapv'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=define, usemultiplicity=define
  use64bitint=define, use64bitall=define, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='gcc', ccflags ='-DPERL_USE_SAFE_PUTENV -D_GNU_SOURCE
-U__STRICT_ANSI__ -ggdb -O2 -pipe -Wimplicit-function-declaration
-fdebug-prefix-map=/mnt/share/maint/perl.x86_64/build=/usr/src/debug/perl-5.22.3-1
-fdebug-prefix-map=/mnt/share/maint/perl.x86_64/src/perl-5.22.3=/usr/src/debug/perl-5.22.3-1
-fwrapv -fno-strict-aliasing -fstack-protector-strong -D_FORTIFY_SOURCE=2',
  optimize='-O3',
  cppflags='-DPERL_USE_SAFE_PUTENV -D_GNU_SOURCE -U__STRICT_ANSI__
-ggdb -O2 -pipe -Wimplicit-function-declaration
-fdebug-prefix-map=/mnt/share/maint/perl.x86_64/build=/usr/src/debug/perl-5.22.3-1
-fdebug-prefix-map=/mnt/share/maint/perl.x86_64/src/perl-5.22.3=/usr/src/debug/perl-5.22.3-1
-fwrapv -fno-strict-aliasing -fstack-protector-strong'
  ccversion='', gccversion='5.4.0', gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678,
doublekind=3
  d_longlong=define, longlongsize=8, d_longdbl=define,
longdblsize=16, longdblkind=3
  ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries​:
  ld='g++', ldflags =' -Wl,--enable-auto-import
-Wl,--export-all-symbols -Wl,--enable-auto-image-base
-fstack-protector-strong'
  libpth=/usr/lib
  libs=-lpthread -lgdbm -ldb -ldl -lcrypt -lgdbm_compat
  perllibs=-lpthread -ldl -lcrypt
  libc=/usr/lib/libcygwin.a, so=dll, useshrplib=true,
libperl=cygperl5_22.dll
  gnulibc_version=''
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
  cccdlflags=' ', lddlflags=' --shared -Wl,--enable-auto-import
-Wl,--export-all-symbols -Wl,--enable-auto-image-base
-fstack-protector-strong'


@​INC for perl 5.22.3​:
  /usr/lib/perl5/site_perl/5.22/x86_64-cygwin-threads
  /usr/lib/perl5/site_perl/5.22
  /usr/lib/perl5/vendor_perl/5.22/x86_64-cygwin-threads
  /usr/lib/perl5/vendor_perl/5.22
  /usr/lib/perl5/5.22/x86_64-cygwin-threads
  /usr/lib/perl5/5.22


Environment for perl 5.22.3​:
  CYGWIN_HOME=c​:\home\sur
  HOME=/home/sur
  LANG=ru_RU.UTF-8
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)

PATH=/usr/local/bin​:/usr/bin​:/win/c/Windows/system32​:/win/c/Windows​:/win/c/Windows/System32/Wbem​:/win/c/Windows/System32/WindowsPowerShell/v1.0​:/win/c/bin​:/win/c/prog32/SysinternalsSuite​:/win/c/Program
Files/Microsoft/Web Platform Installer​:/win/c/Program Files
(x86)/Microsoft ASP.NET/ASP.NET Web Pages/v1.0​:/win/c/Program Files
(x86)/Windows Kits/8.0/Windows Performance Toolkit​:/win/c/Program
Files/Microsoft SQL
Server/110/Tools/Binn​:/win/c/prog32/MATLAB/R2014a/runtime/win32​:/win/c/prog32/MATLAB/R2014a/bin​:/win/c/prog32/MATLAB/R2014a/polyspace/bin​:/win/c/prog64/Python34​:/win/c/prog64/Python34/Scripts​:/win/c/prog32/Borland/CBUILD1/Bin​:/win/c/prog32/Borland/CBUILD1/Projects/Bpl​:/win/c/Program
Files (x86)/Microsoft SDKs/Windows/v8.0A/bin/NETFX 4.0
Tools​:/win/c/Program Files (x86)/Microsoft SQL
Server/110/Tools/Binn​:/win/c/Program Files/Microsoft SQL
Server/110/DTS/Binn​:/win/c/Program Files (x86)/Microsoft SQL
Server/110/DTS/Binn​:/win/c/prog32/GnuPG/pub​:/win/c/home/ProgramData/Oracle/Java/javapath​:/win/c/prog64/Strawberry/c/bin​:/win/c/prog64/Strawberry/perl/site/bin​:/win/c/prog64/Strawberry/perl/bin​:/win/c/prog32/Skype/Phone​:/win/c/prog64/Java/jdk1.8.0_45​:/win/c/Program
Files/TortoiseGit/bin​:/win/c/home/sur/scripts​:/win/c/home/sur/bin​:/win/c/prog32/GMT4/bin​:/media/pri1/intelFPGA_lite/16.1/modelsim_ase/win32aloem​:/usr/lib/lapack
  PERL_BADLANG (unset)
  SHELL=/bin/bash

p5pRT commented Feb 3, 2017

From @jkeenan

Would you be able to provide a file attachment with several lines of text in the ucs-2le encoding for which your work-around was necessary?

That would enable us to examine the problem on multiple platforms.

Thank you very much.
Jim Keenan

--
James E Keenan (jkeenan@​cpan.org)

p5pRT commented Feb 3, 2017

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Feb 3, 2017

From sur98ke@gmail.com

OK, I am including a sample UCS-2LE file.

But I don't think you will really need it, because the problem happens at the module import stage ('use open' ...), before my script actually starts executing and accessing any files.
I'm sure that the source of the problem is the weak argument parsing code in the standard 'open.pm' module, and this weakness is not mentioned in the official documentation.

p5pRT commented Feb 3, 2017

From sur98ke@gmail.com

UCS-2LE
Test File

Content

p5pRT commented Feb 3, 2017

From sur98ke@gmail.com

I'm sorry, but some part of the message-delivery software distorted the sample file from the previous message (its encoding was changed).
PLEASE DON'T USE THE FILE FROM THE PREVIOUS MESSAGE! Use the packed file attached to this message.

p5pRT commented Feb 3, 2017

From sur98ke@gmail.com

UCS-2LE-test.zip

p5pRT commented Feb 4, 2017

From @jkeenan

On Fri, 03 Feb 2017 09​:48​:43 GMT, sur98ke@​gmail.com wrote​:

Fri, 03 Feb 2017 01​:44​:54 -0800, sur98ke@​gmail.com писал​:

Thu, 02 Feb 2017 17​:26​:34 -0800, jkeenan писал​:

Would you be able to provide a file attachment with several lines of text in the ucs-2le encoding for which your work-around was necessary?

That would enable us to examine the problem on multiple platforms.

Thank you very much.
Jim Keenan

OK, I include a sample UCS-2LE file.

But I don't think you will really need it, because the problem happens at the module import stage ('use open' ...), before my script actually starts executing and accessing any files.
I'm sure that the source of the problem is poor argument-parsing code in the standard 'open.pm' module. And this weakness is not mentioned in the official documentation.

Thank you for supplying the .zip file. As I expected, the file within was useful to me in setting up a diagnostic.

I think this is at most a documentation problem.

But first let me advise that perldoc.perl.org is a site maintained by people other than the Perl 5 Porters. Occasionally it is out-of-date with respect to recent Perl releases. I don't think that's relevant here, but when seeking information on a Perl pragma, your best bet is to first say:

#####
perldoc open
#####

When I did that, I was able to find sufficient documentation to set up the diagnostic program, pragma-handle-ucs-2le.pl, attached.

I tried 3 variants of that program by commenting/uncommenting pairs of 'use open' statements. Each variant produced exactly the same output, attached as 130668-output.txt.

The only difference was that when I used the first variant:

#####
use open ('IN', 'encoding(ucs-2le) crlf utf8');
use open ('OUT', 'encoding(ucs-2le) crlf utf8');
#####

... the program ran without warnings. When I used the second variant:

#####
use open ('IN', ' encoding(ucs-2le) crlf utf8');
use open ('OUT', ' encoding(ucs-2le) crlf utf8');
#####

... I got these warnings:

#####
Unknown PerlIO layer '' at pragma-handle-ucs-2le.pl line 8.
Unknown PerlIO layer '' at pragma-handle-ucs-2le.pl line 9.
#####

... and I got similar warnings with the third variant. The cases where I got warnings were the same cases where you got warnings.

IMO there is no reason to assume that 'use open' should be as permissive in its syntax as 'binmode' or the second argument to 'open'. I wouldn't be surprised if the 'open' pragma were implemented at a much later date than 'binmode' or the 'open' built-in function and therefore implemented to throw warnings at imprecisions in the spelling of its arguments.

Hence, I think the documentation for 'perldoc open' is satisfactory as is -- but I would be willing to consider/write a one-sentence patch indicating that its syntax will warn where comparable syntax in 'binmode' and the 'open' function does not.

Thank you very much.
Jim Keenan

--
James E Keenan (jkeenan@cpan.org)

p5pRT commented Feb 4, 2017

From @jkeenan

UCS-2LE
Test File

Content

p5pRT commented Feb 4, 2017

From @jkeenan

pragma-handle-ucs-2le.pl

p5pRT commented Feb 11, 2017

From @jkeenan

As I stated last week, I don't feel any need to change any documentation here. If anyone else feels strongly otherwise, please contribute a patch. Otherwise I will close this ticket within 7 days.

Thank you very much.
Jim Keenan

--
James E Keenan (jkeenan@cpan.org)

p5pRT commented Feb 16, 2017

From sur98ke@gmail.com

Sorry for the late answer.

First, sorry once again: when I posted this bug I didn't notice that, despite printing warnings, all three variants of "use open" still work. When I looked more closely at the source of open.pm, I noticed that it just splits the argument string on whitespace into separate layers, checks each one, and then CONCATENATES THEM BACK and puts the result into the ${^OPEN} special variable. AFAIU the PerlIO implementation parses this variable and is very liberal about how it is formatted.

OK, I agree that the valid format for multiple layers is "space separated, each layer name starting with a colon".
BUT I've just read "perldoc perlfunc" for "open" and "binmode", and I've read "perldoc open", and I HAVEN'T FOUND ANY hint of that format. The only place I found the phrase "space separated" was "perldoc PerlIO", and I don't think that is an obvious place for a casual Perl user to look.
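
For readers following along, here is a minimal sketch of the tokenizing step discussed in this ticket. It is not the code from open.pm: `tokenize_layers` is a hypothetical helper that reproduces only the `split(/\s+/, $dscp)` line quoted above, plus the removal of one leading colon per token, which is an assumption inferred from the text of the reported warnings. The real pragma goes on to look each name up among the known PerlIO layers, which this sketch does not attempt.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of open.pm): mimics only the tokenizing
# step quoted in this ticket. Dropping one leading colon per token is an
# assumption inferred from the warning messages in the report.
sub tokenize_layers {
    my ($dscp) = @_;
    my @layers;
    for my $layer (split /\s+/, $dscp) {
        $layer =~ s/^://;    # drop a single leading colon, if present
        push @layers, $layer;
    }
    return @layers;
}

# The three argument strings from the report.
for my $arg ('encoding(ucs-2le) crlf utf8',
             ' encoding(ucs-2le) crlf utf8',
             ':encoding(ucs-2le):crlf:utf8') {
    my @layers = tokenize_layers($arg);
    printf "%-32s => %s\n", "'$arg'", join(', ', map { "'$_'" } @layers);
}
```

With a pattern such as `/\s+/`, `split` keeps a leading empty field, so the second string yields an empty first token, consistent with the reported `Unknown PerlIO layer ''`. The third string contains no whitespace, so it stays a single token, `encoding(ucs-2le):crlf:utf8`, which is the name that appears in the other warning.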

p5pRT commented Feb 17, 2017

From @jkeenan

No patches submitted; therefore, closing this ticket.

Thank you very much.

--
James E Keenan (jkeenan@cpan.org)

@p5pRT p5pRT closed this as completed Feb 17, 2017

p5pRT commented Feb 17, 2017

@jkeenan - Status changed from 'open' to 'rejected'
