Skip Menu |
Report information
Id: 121734
Status: resolved
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors:
Cc:
AdminCc:

Operating System: (no value)
PatchStatus: (no value)
Severity: low
Type: unknown
Perl Version: (no value)
Fixed In: (no value)



To: perlbug [...] perl.org
Date: Fri, 25 Apr 2014 14:33:15 -0500
From: Reini Urban <rurban [...] cpanel.net>
Subject: cygwin: setlocale(LC_ALL, "ja_JP.utf8") panic: strxfrm() gets absurd Message-Id: <5.14.2_2132_1398453260@win> Reply-To: rurban@cpanel.net To: perlbug@perl.org From: rurban@cpanel.net This is a bug report for perl from rurban@x-ray.at, generated with the help of perlbug 1.39 running under perl 5.14.2. ----------------------------------------------------------------- cygwin 5.18.2 $ ./perl -MPOSIX -e'setlocale(LC_ALL, "ja_JP.utf8")' panic: strxfrm() gets absurd - a => 100, ab => 100 at -e line 1. all other locales before ja_JP.utf8 in lib/locale.t do work fine. I haven't checked if that's the problem fixed in blead with setlocale by kwh, or a cygwin bug. ----------------------------------------------------------------- --- Flags: category=library severity=high module=POSIX --- This perlbug was built using Perl 5.14.2 - Thu Jul 12 13:58:56 CDT 2012 It is being executed now by Perl 5.14.2 - Mon Sep 17 12:58:11 CDT 2012. Site configuration information for perl 5.14.2: Configured by rurban at Mon Sep 17 12:58:11 CDT 2012. Summary of my perl5 (revision 5 version 14 subversion 2) configuration: Platform: osname=cygwin, osvers=1.7.17s(0.26253), archname=cygwin-thread-multi-64int uname='cygwin_nt-5.1 win 1.7.17s(0.26253) 20120801 11:02:01 i686 cygwin ' config_args='-de -Dlibperl=cygperl5_14.dll -Dcc=gcc-4 -Dld=g++-4 -Darchname=i686-cygwin-threads-64int -Dmksymlinks -Dusethreads -Accflags=-g' hint=recommended, useposix=true, d_sigaction=define useithreads=define, usemultiplicity=define useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef use64bitint=define, use64bitall=undef, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc-4', ccflags ='-DPERL_USE_SAFE_PUTENV -U__STRICT_ANSI__ -g -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include', optimize='-O3', cppflags='-DPERL_USE_SAFE_PUTENV -U__STRICT_ANSI__ -g -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include' ccversion='', gccversion='4.5.3', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='g++-4', ldflags =' -Wl,--enable-auto-import -Wl,--export-all-symbols -Wl,--enable-auto-image-base -fstack-protector -L/usr/local/lib' libpth=/usr/local/lib /usr/lib /lib libs=-lgdbm -ldb -ldl -lcrypt -lgdbm_compat perllibs=-ldl -lcrypt libc=/usr/lib/libc.a, so=dll, useshrplib=true, libperl=cygperl5_14.dll gnulibc_version='' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' ' cccdlflags=' ', lddlflags=' --shared -Wl,--enable-auto-import -Wl,--export-all-symbols -Wl,--enable-auto-image-base -L/usr/local/lib -fstack-protector' Locally applied patches: Bug#55162 File::Spec::case_tolerant performance CYG07 $vendorarch/auto/.rebase CYG15 static Win32CORE CYG17 cyg-1.7 paths-utf8 0c612ce82 Fix building static extensions on cygwin, -UUSEIMPORTLIB 1bac5ecc1 Fix 64-bit threading sv.c: S_anonymise_cv_maybe Cygwin::sync_winenv added --- @INC for perl 5.14.2: /usr/lib/perl5/site_perl/5.14/i686-cygwin-threads-64int /usr/lib/perl5/site_perl/5.14 /usr/lib/perl5/vendor_perl/5.14/i686-cygwin-threads-64int /usr/lib/perl5/vendor_perl/5.14 /usr/lib/perl5/5.14/i686-cygwin-threads-64int /usr/lib/perl5/5.14 /usr/lib/perl5/site_perl/5.10 /usr/lib/perl5/vendor_perl/5.10 /usr/lib/perl5/site_perl/5.8 . --- Environment for perl 5.14.2: HOME=/home/rurban LANG=en_US.UTF-8 LANGUAGE (unset) LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/usr/bin:/usr/local/bin:/home/rurban/bin:/usr/sbin/:/usr/local/bin:/usr/bin:/cygdrive/c/windows/system32:/cygdrive/c/windows:/cygdrive/c/windows/System32/Wbem:/bin:/home/rurban/bin:/usr/lib/lapack PERL_BADLANG (unset) SHELL=/bin/bash
Date: Fri, 25 Apr 2014 14:02:40 -0600
To: perl5-porters [...] perl.org
Subject: Re: [perl #121734]
From: Karl Williamson <public [...] khwilliamson.com>
Download (untitled) / with headers
text/plain 249b
On 04/25/2014 01:33 PM, rurban@cpanel.net (via RT) wrote: Show quoted text
> ./perl -MPOSIX -e'setlocale(LC_ALL, "ja_JP.utf8")'
It works on my Linux; I notice that there is no -I parameter in your example. Are you sure you are getting the correct library path?
Subject: Re: [perl #121734]
Date: Fri, 25 Apr 2014 15:10:22 -0500
To: perlbug-followup [...] perl.org
From: Reini Urban <reini [...] cpanel.net>
On 04/25/2014 03:02 PM, karl williamson via RT wrote: Show quoted text
> On 04/25/2014 01:33 PM, rurban@cpanel.net (via RT) wrote:
>> ./perl -MPOSIX -e'setlocale(LC_ALL, "ja_JP.utf8")'
> > It works on my Linux; I notice that there is no -I parameter in your > example. Are you sure you are getting the correct library path?
yes, sorry. This fails only on cygwin, only tested on 32bit. It fails with most eastern locales, but some eastern locales do pass. I needed to this patch to pass the locale tests: --- perl-5.18.2/lib/locale.t~ 2014-01-06 16:46:45.000000000 -0600 +++ perl-5.18.2/lib/locale.t 2014-04-25 14:28:31.455279000 -0500 @@ -528,7 +528,9 @@ sub trylocale { my $locale = shift; return if grep { $locale eq $_ } @Locale; - return unless setlocale(LC_ALL, $locale); + # warn "# setlocale(LC_ALL, $locale)\n"; + return if ($^O eq 'cygwin' and $locale =~ /^(ja_JP\.utf8|japanese\.sjis|ko_KR|korean|zh_)/) #RT 121734 + or !setlocale(LC_ALL, $locale); my $badutf8; { local $SIG{__WARN__} = sub { -- Reini
To: Reini Urban <reini [...] cpanel.net>, perlbug-followup [...] perl.org
Subject: Re: [perl #121734]
From: Karl Williamson <public [...] khwilliamson.com>
Date: Wed, 04 Jun 2014 18:30:14 -0600
Download (untitled) / with headers
text/plain 813b
On 04/25/2014 02:10 PM, Reini Urban wrote: Show quoted text
> On 04/25/2014 03:02 PM, karl williamson via RT wrote:
>> On 04/25/2014 01:33 PM, rurban@cpanel.net (via RT) wrote:
>>> ./perl -MPOSIX -e'setlocale(LC_ALL, "ja_JP.utf8")'
>> >> It works on my Linux; I notice that there is no -I parameter in your >> example. Are you sure you are getting the correct library path?
> > yes, sorry. > > This fails only on cygwin, only tested on 32bit. > It fails with most eastern locales, but some eastern locales do pass. >
I had a look at the code. It's from these two lines in locale.c: /* 50: surely no system expands a char more. */ #define XFRMBUFSIZE (2 * 50) That '50' looks like a guess. You could try bumping it up until everything passes, and we could then use that instead of having to skip locales in the tests.
Date: Wed, 04 Jun 2014 21:22:49 -0400
From: Jarkko Hietaniemi <jhi [...] iki.fi>
Subject: Re: [perl #121734]
To: Karl Williamson <public [...] khwilliamson.com>, Reini Urban <reini [...] cpanel.net>, perlbug-followup [...] perl.org
On Wednesday-201406-04, 20:30, Karl Williamson wrote: Show quoted text
> On 04/25/2014 02:10 PM, Reini Urban wrote:
>> On 04/25/2014 03:02 PM, karl williamson via RT wrote:
>>> On 04/25/2014 01:33 PM, rurban@cpanel.net (via RT) wrote:
>>>> ./perl -MPOSIX -e'setlocale(LC_ALL, "ja_JP.utf8")'
>>> >>> It works on my Linux; I notice that there is no -I parameter in your >>> example. Are you sure you are getting the correct library path?
>> >> yes, sorry. >> >> This fails only on cygwin, only tested on 32bit. >> It fails with most eastern locales, but some eastern locales do pass. >>
> > I had a look at the code. It's from these two lines in locale.c: > /* 50: surely no system expands a char more. */ > #define XFRMBUFSIZE (2 * 50) > > That '50' looks like a guess. You could try bumping it up until > everything passes, and we could then use that instead of having to skip > locales in the tests.
Yes, a guess, by yours truly, from the time when mammoths and sabertooths roamed the lands. I've got a better idea than futzing with the futz factor. Just a minute. Show quoted text
>
Date: Wed, 04 Jun 2014 21:47:18 -0400
From: Jarkko Hietaniemi <jhi [...] iki.fi>
Subject: Re: [perl #121734]
To: Karl Williamson <public [...] khwilliamson.com>, Reini Urban <reini [...] cpanel.net>, perlbug-followup [...] perl.org
Download (untitled) / with headers
text/plain 753b
On Wednesday-201406-04, 21:22, Jarkko Hietaniemi wrote: Show quoted text
> I've got a better idea than futzing with the futz factor. Just a minute.
So try the attached patch. Instead of baking in a futz factor (which some system apparently just out-futzed) this tries divining the appropriate numbers using the NULL-buffer-zero-count feature of strxfrm(), where no actual output is done, only the required length is returned. NOTE 1: if the expansion factor of 50 really is not enough, this means that for strxfrm() of input of N bytes, Perl will reserve a buffer of more than 50 N... NOTE 2: now that I think of it, if the strxfrm() count-only feature really works (see above), then the PL_collxfrm_base and PL_collxfrm_mult can be deprecated as such.
Download locale.pat
text/plain 1.2k

Message body is not shown because sender requested not to inline it.

RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 2.1k
On Fri Apr 25 12:33:27 2014, rurban@cpanel.net wrote: Show quoted text
> $ ./perl -MPOSIX -e'setlocale(LC_ALL, "ja_JP.utf8")' > panic: strxfrm() gets absurd - a => 100, ab => 100 at -e line 1. > > all other locales before ja_JP.utf8 in lib/locale.t do work fine. > I haven't checked if that's the problem fixed in blead with setlocale > by > kwh, or a cygwin bug.
I don't see this panic with current cygwin: tony@saturn ~/dev/perl/git/perl $ ./perl -Ilib -MPOSIX -e 'setlocale(LC_ALL, "ja_JP.utf8")' tony@saturn ~/dev/perl/git/perl $ uname -a CYGWIN_NT-6.1-WOW64 saturn 1.7.30(0.272/5/3) 2014-05-23 10:36 i686 Cygwin Tested with blead and 5.18.2 (with that minimal fix to cygwin/cygwin.c). I do however see a bug in cygwin's strxfrm() implementation, here's the code from src/winsup/cygwin/nlsfuncs.cc: extern "C" size_t strxfrm (char *__restrict s1, const char *__restrict s2, size_t sn) { size_t ret; size_t n2; wchar_t *ws2; tmp_pathbuf tp; if (!collate_lcid) return strlcpy (s1, s2, sn); /* The ANSI version of LCMapString uses the default charset of the lcid, so we must use the Unicode version. */ n2 = lc_mbstowcs (collate_mbtowc, collate_charset, NULL, s2, 0) + 1; ws2 = (n2 > NT_MAX_PATH ? (wchar_t *) malloc (n2 * sizeof (wchar_t)) : tp.w_get ()); lc_mbstowcs (collate_mbtowc, collate_charset, ws2, s2, n2); /* The sort key is a NUL-terminated byte string. */ ret = LCMapStringW (collate_lcid, LCMAP_SORTKEY, ws2, -1, (PWCHAR) s1, sn); if (n2 > NT_MAX_PATH) free (ws2); if (ret == 0) { if (GetLastError () != ERROR_INSUFFICIENT_BUFFER) set_errno (EINVAL); return sn; } /* LCMapStringW returns byte count including the terminating NUL character. strxfrm is supposed to return length excluding the NUL. */ return ret - 1; } when the key string received doesn't fit in the buffer, LCMapStringW() returns 0 and sets GetLastError() to ERROR_INSUFFICIENT_BUFFER, at which point strxfrm() returns the size of the output buffer - *not* the size of the sort key as required by ANSI C. We also have a couple of bugs in POSIX::strxfrm(), but that's not used by the code that produces the panic above. Tony
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 510b
On Wed Jun 04 23:58:26 2014, tonyc wrote: Show quoted text
> I do however see a bug in cygwin's strxfrm() implementation, here's > the code from src/winsup/cygwin/nlsfuncs.cc: >
... Show quoted text
> > when the key string received doesn't fit in the buffer, LCMapStringW() > returns 0 and sets GetLastError() to ERROR_INSUFFICIENT_BUFFER, at > which point strxfrm() returns the size of the output buffer - *not* > the size of the sort key as required by ANSI C.
Reported to cygwin at https://cygwin.com/ml/cygwin/2016-04/msg00232.html Tony
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 816b
On Mon Apr 11 22:10:38 2016, tonyc wrote: Show quoted text
> On Wed Jun 04 23:58:26 2014, tonyc wrote:
> > I do however see a bug in cygwin's strxfrm() implementation, here's > > the code from src/winsup/cygwin/nlsfuncs.cc: > >
> ...
> > > > when the key string received doesn't fit in the buffer, LCMapStringW() > > returns 0 and sets GetLastError() to ERROR_INSUFFICIENT_BUFFER, at > > which point strxfrm() returns the size of the output buffer - *not* > > the size of the sort key as required by ANSI C.
> > Reported to cygwin at https://cygwin.com/ml/cygwin/2016-04/msg00232.html
Fixed in nightlies: https://cygwin.com/ml/cygwin/2016-04/msg00247.html https://cygwin.com/ml/cygwin/2016-04/msg00279.html Again, this doesn't seem to be the cause of the problem Reini originally reported (which I haven't reprotduced.) Tony
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 276b
On Wed Jun 04 23:58:26 2014, tonyc wrote: Show quoted text
> We also have a couple of bugs in POSIX::strxfrm(), but that's not used > by the code that produces the panic above.
Which I fixed in 6ec5f825e65b3170c6b870804ad6d5349c9c84d1, if anyone is wondering what I was talking about. Tony
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 2.7k
On Wed Jun 04 23:58:26 2014, tonyc wrote: Show quoted text
> On Fri Apr 25 12:33:27 2014, rurban@cpanel.net wrote:
> > $ ./perl -MPOSIX -e'setlocale(LC_ALL, "ja_JP.utf8")' > > panic: strxfrm() gets absurd - a => 100, ab => 100 at -e line 1. > > > > all other locales before ja_JP.utf8 in lib/locale.t do work fine. > > I haven't checked if that's the problem fixed in blead with setlocale > > by > > kwh, or a cygwin bug.
> > I don't see this panic with current cygwin: > > tony@saturn ~/dev/perl/git/perl > $ ./perl -Ilib -MPOSIX -e 'setlocale(LC_ALL, "ja_JP.utf8")' > > tony@saturn ~/dev/perl/git/perl > $ uname -a > CYGWIN_NT-6.1-WOW64 saturn 1.7.30(0.272/5/3) 2014-05-23 10:36 i686 > Cygwin > > Tested with blead and 5.18.2 (with that minimal fix to > cygwin/cygwin.c). > > I do however see a bug in cygwin's strxfrm() implementation, here's > the code from src/winsup/cygwin/nlsfuncs.cc: > > extern "C" size_t > strxfrm (char *__restrict s1, const char *__restrict s2, size_t sn) > { > size_t ret; > size_t n2; > wchar_t *ws2; > tmp_pathbuf tp; > > if (!collate_lcid) > return strlcpy (s1, s2, sn); > /* The ANSI version of LCMapString uses the default charset of the > lcid, > so we must use the Unicode version. */ > n2 = lc_mbstowcs (collate_mbtowc, collate_charset, NULL, s2, 0) + 1; > ws2 = (n2 > NT_MAX_PATH ? (wchar_t *) malloc (n2 * sizeof (wchar_t)) > : tp.w_get ()); > lc_mbstowcs (collate_mbtowc, collate_charset, ws2, s2, n2); > /* The sort key is a NUL-terminated byte string. */ > ret = LCMapStringW (collate_lcid, LCMAP_SORTKEY, ws2, -1, (PWCHAR) s1, > sn); > if (n2 > NT_MAX_PATH) > free (ws2); > if (ret == 0) > { > if (GetLastError () != ERROR_INSUFFICIENT_BUFFER) > set_errno (EINVAL); > return sn; > } > /* LCMapStringW returns byte count including the terminating NUL > character. > strxfrm is supposed to return length excluding the NUL. */ > return ret - 1; > } > > when the key string received doesn't fit in the buffer, LCMapStringW() > returns 0 and sets GetLastError() to ERROR_INSUFFICIENT_BUFFER, at > which point strxfrm() returns the size of the output buffer - *not* > the size of the sort key as required by ANSI C. > > We also have a couple of bugs in POSIX::strxfrm(), but that's not used > by the code that produces the panic above. > > Tony
I'm pretty sure I understand how the panic happens with this cygwin bug, and have a work-around smoking for 5.25. It seems to me though, that if a platform's strxfrm() is more badly broken that we shouldn't panic, but just not do locale-based collation for that locale, reverting to code point order, as we do for non-locale collation. Avoiding a panic is generally considered a good thing. My question is, should we raise a warning instead? -- Karl Williamson
RT-Send-CC: perl5-porters [...] perl.org
Fixed by 79f120c89a6e123770f65fd49893ef0b379cd922 -- Karl Williamson
Download (untitled) / with headers
text/plain 313b
Thank you for filing this report. You have helped make Perl better. With the release today of Perl 5.26.0, this and 210 other issues have been resolved. Perl 5.26.0 may be downloaded via: https://metacpan.org/release/XSAWYERX/perl-5.26.0 If you find that the problem persists, feel free to reopen this ticket.


This service is sponsored and maintained by Best Practical Solutions and runs on Perl.org infrastructure.

For issues related to this RT instance (aka "perlbug"), please contact perlbug-admin at perl.org