Bleadperl 2014-06-25T12:37:43Z breaks CFAERBER/Net-IDN-Encode-2.200.tar.gz #13956

p5pRT · 2014-06-25T05:04:53Z

Migrated from rt.perl.org#122179 (status was 'rejected')

Searchable as RT122179$

p5pRT · 2014-06-25T05:04:53Z

From @andk

git bisect

commit 09edd81
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Feb 20 21:59:00 2014 -0700

Use Unicode 7.0

sample fail report

http://www.cpantesters.org/cpan/report/9115cd08-f93b-11e3-bca3-b1a10a370852

perl -V

Summary of my perl5 (revision 5 version 21 subversion 1) configuration:
Commit id: 62406c8
Platform:
osname=linux, osvers=3.14-1-amd64, archname=x86_64-linux-thread-multi
uname='linux k83 3.14-1-amd64 #1 smp debian 3.14.5-1 (2014-06-05) x86_64 gnulinux '
config_args='-Dprefix=/home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1/9980 -Dmyhostname=k83 -Dinstallusrbinperl=n -Uversiononly -Dusedevel -des -Ui_db -Duseithreads -Uuselongdouble -DDEBUGGING=-g'
hint=recommended, useposix=true, d_sigaction=define
useithreads=define, usemultiplicity=define
use64bitint=define, use64bitall=define, uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O2 -g',
cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
ccversion='', gccversion='4.8.3', gccosandvers=''
intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc -lgdbm_compat
perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version='2.19'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl):
Compile-time options: HAS_TIMES MULTIPLICITY PERLIO_LAYERS
PERL_DONT_CREATE_GVSV
PERL_HASH_FUNC_ONE_AT_A_TIME_HARD
PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP
PERL_NEW_COPY_ON_WRITE PERL_PRESERVE_IVUV
PERL_USE_DEVEL USE_64_BIT_ALL USE_64_BIT_INT
USE_ITHREADS USE_LARGE_FILES USE_LOCALE
USE_LOCALE_COLLATE USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF
USE_REENTRANT_API
Built under linux
Compiled at Jun 20 2014 21:27:33
%ENV:
PERL5LIB=""
PERL5OPT=""
PERL5_CPANPLUS_IS_RUNNING="17922"
PERL5_CPAN_IS_RUNNING="17922"
PERL_MM_USE_DEFAULT="1"
@INC:
/home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1/9980/lib/site_perl/5.21.1/x86_64-linux-thread-multi
/home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1/9980/lib/site_perl/5.21.1
/home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1/9980/lib/5.21.1/x86_64-linux-thread-multi
/home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1/9980/lib/5.21.1
.
--
andreas

p5pRT · 2014-06-25T15:45:03Z

From @khwilliamson

I'm rejecting this ticket because the flaws are in the tests.

Unicode will continue to encode characters between 0 and 0x10FFFF. These tests were assuming that certain code points were unassigned, but Unicode 7 has assigned them.

The only code points that are guaranteed to never be assigned are the noncharacters.
Code points unlikely to be assigned are ones listed as <reserved> in NamesList.txt
http://www.unicode.org/Public/7.0.0/ucd/NamesList.txt
and things like U+03A2, which would be the uppercase of the Greek small letter final sigma (but there is no uppercase of that).
--
Karl Williamson
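
The distinction drawn above between unassigned code points and noncharacters can be checked against the Unicode Character Database. The following is a minimal sketch in Python, chosen only for illustration; it is not code from this ticket or from Net::IDN::Encode, and the helper names `is_noncharacter` and `status` are hypothetical. Results for recently encoded characters depend on the Unicode version bundled with the interpreter, which `unicodedata.unidata_version` reports.

```python
import unicodedata


def is_noncharacter(cp: int) -> bool:
    # The 66 noncharacters: U+FDD0..U+FDEF plus the last two code points
    # of each of the 17 planes (U+xFFFE and U+xFFFF).
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def status(cp: int) -> str:
    # Hypothetical helper, for illustration only.
    if is_noncharacter(cp):
        return "noncharacter"  # guaranteed never to be assigned
    if unicodedata.category(chr(cp)) == "Cn":
        return "unassigned"  # may still be assigned by a later Unicode version
    return "assigned"


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for cp in (0x03A2, 0xFFFF, 0x102F3):
        print(f"U+{cp:04X}: {status(cp)}")
```

With data older than Unicode 7.0, U+102F3 would be reported as unassigned, which is the kind of version dependence a test must not rely on.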

p5pRT · 2014-06-25T15:45:03Z

The RT System itself - Status changed from 'new' to 'open'

p5pRT · 2014-06-25T15:45:04Z

@khwilliamson - Status changed from 'open' to 'rejected'

p5pRT · 2014-06-29T10:11:12Z

From CFAERBER@cpan.org

The interesting part is that the tests are the tests provided with Unicode 7.0.0.

p5pRT · 2014-07-01T19:23:45Z

From @khwilliamson

On Sun Jun 29 03:11:12 2014, cfaerber wrote:

> The interesting part is that the tests are the tests provided with
> Unicode 7.0.0.

Yes, and my comments in rejecting this ticket were based on ignorance. I'm sorry. Though it still should have been rejected, as it still doesn't appear to me to be a core Perl bug. I looked in more detail at the first error in the CPAN report. It is this:

# Failed test 'to_ascii('ß。𐋳ⴌ\u1DD8') throws error P1 V6 [data/IdnaTest.txt:992]'
# at t/uts46_to_ascii-trans.t line 765.
# got: 'ss.xn--weg506dvy5n'
# expected: undef

I looked at that line in the .t file, and it is this:

𐋳ⴌ\x{1DD8}", %p)}, undef, "to_ascii$\'ß\。𐋳ⴌ\\u1DD8\'$\ throws\ error\ P1\ V6\ \[data\/IdnaTest\.txt\:992\]") or ($@ and diag($@));

(Most likely you will not have the fonts to display all of this correctly. The one character I don't have in my fonts is U+102F3, COPTIC EPACT NUMBER ONE HUNDRED, newly encoded in Unicode 7.0.) I got that far before, and just assumed the test was supposed to return undef because the code point had not been encoded before, but now is. But I was wrong. There is another reason it is supposed to be undef.

Line 992 from the Unicode 7.0 IdnaTest.txt file is this:
B; ß。𐋳ⴌ\u1DD8; [P1 V6]; [P1 V6] # ß.𐋳ⴌᷘ

(BTW, thanks for cross referencing the line number of the Unicode file in the .t test; it made this a lot easier.)

The brackets indicate that this is supposed to fail, and the codes within the brackets indicate why. I started to follow why it should fail, but it wasn't obvious without more digging than I had time for. So perhaps the test from Unicode is wrong, or the module is buggy. I see that the .t correctly gets failure with the preceding, similar tests.

--
Karl Williamson
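
For readers unfamiliar with the test file, the structure of line 992 can be sketched as below. This is an illustrative Python parser, not the code Net::IDN::Encode uses to generate its .t files; the names `unescape`, `error_codes`, and `parse_idna_test_line`, as well as the dictionary keys, are assumptions. It handles only the features visible in the quoted line: semicolon-separated fields, a trailing `#` comment, `\uXXXX` escapes, and bracketed error codes.

```python
import re

# Line 992 of the Unicode 7.0 IdnaTest.txt, as quoted above.
SAMPLE = r"B; ß。𐋳ⴌ\u1DD8; [P1 V6]; [P1 V6] # ß.𐋳ⴌᷘ"


def unescape(text):
    # Replace backslash-u escapes (four hex digits) with the character they denote.
    return re.sub(r"\\u([0-9A-Fa-f]{4})",
                  lambda m: chr(int(m.group(1), 16)), text)


def error_codes(field):
    # Return the codes listed inside [...], or None if the field is not an error.
    match = re.fullmatch(r"\[(.*)\]", field)
    return match.group(1).split() if match else None


def parse_idna_test_line(line):
    data = line.split("#", 1)[0]  # drop the trailing comment
    fields = [f.strip() for f in data.split(";")]
    test_type, source, to_unicode, to_ascii = fields[:4]
    return {
        "type": test_type,
        "source": unescape(source),
        "to_unicode_errors": error_codes(to_unicode),
        "to_ascii_errors": error_codes(to_ascii),
    }


if __name__ == "__main__":
    parsed = parse_idna_test_line(SAMPLE)
    print(parsed["type"], parsed["to_ascii_errors"])
    print([f"U+{ord(c):04X}" for c in parsed["source"]])
```

A non-None value in `to_ascii_errors` corresponds to the `undef` the .t file expects from `to_ascii`.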

p5pRT · 2014-07-04T10:39:35Z

From CFAERBER@cpan.org

Actually, I think your conclusion is correct and this is a bug in the test files supplied with Unicode 7.0.

The error codes P1 and V6 indicate that there is a character that is not 'valid' (and that cannot be 'mapped' in P1). However, U+102F3 is valid according to IdnaMappingTable.txt (line 5557):

102E1..102FB ; valid ; ; NV8 # 7.0 COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER NINE HUNDRED

It's not valid in IDNA 2008, though (indicated by "NV8"). However, the tests in IdnaTest.txt are not supposed to test for that.

If I change the module to treat all characters added in Unicode 7.0 as 'invalid', the tests for Net::IDN::Encode complete without error under bleadperl (5.21.1) and earlier perls.

I have already reported the suspected error through the form at www.unicode.org but have not yet received a response.
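
The mapping-table entry quoted above can be read mechanically. The sketch below is illustrative Python, not the module's implementation; `parse_mapping_line` and `covers` are hypothetical names, and only the four semicolon-separated fields visible in the quoted entry are handled.

```python
# Entry quoted above from the Unicode 7.0 IDNA mapping table.
SAMPLE = ("102E1..102FB ; valid ; ; NV8 "
          "# 7.0 COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER NINE HUNDRED")


def parse_mapping_line(line):
    data = line.split("#", 1)[0]  # drop the trailing comment
    fields = [f.strip() for f in data.split(";")]
    fields += [""] * (4 - len(fields))  # later fields may be absent
    code_points, status, mapping, idna2008 = fields[:4]
    first, _, last = code_points.partition("..")
    return {
        "first": int(first, 16),
        "last": int(last or first, 16),  # a single code point is its own range
        "status": status,
        "mapping": mapping,
        "idna2008": idna2008,
    }


def covers(entry, cp):
    # True if the code point falls inside the entry's range.
    return entry["first"] <= cp <= entry["last"]


entry = parse_mapping_line(SAMPLE)
print(covers(entry, 0x102F3), entry["status"], entry["idna2008"])
# prints: True valid NV8
```

The entry covers U+102F3 with status `valid`, so a UTS #46 check of that character should not raise P1 or V6; the `NV8` flag only records that the character is excluded under IDNA 2008.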

p5pRT · 2014-07-11T21:10:58Z

From @khwilliamson

On 07/04/2014 04:39 AM, Claus Färber via RT wrote:

> Actually, I think your conclusion is correct and this is a bug in the test files supplied with Unicode 7.0.
>
> The error codes P1 and V6 indicate that there is a character that is not 'valid' (and that cannot be 'mapped' in P1). However, U+102F3 is valid according to IdnaMappingTable.txt (line 5557):
>
> 102E1..102FB ; valid ; ; NV8 # 7.0 COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER NINE HUNDRED
>
> It's not valid in IDNA 2008, though (indicated by "NV8"). However, the tests in IdnaTest.txt are not supposed to test for that.
>
> If I change the module to treat all characters added in Unicode 7.0 as 'invalid', the tests for Net::IDN::Encode complete without error under bleadperl (5.21.1) and earlier perls.
>
> I have already reported the suspected error through the form at www.unicode.org but have not yet received a response.
>
> ---
> via perlbug: queue: perl5 status: rejected
> https://rt-archive.perl.org/perl5/Ticket/Display.html?id=122179

Unicode has now responded, agreeing that the test file was in error, and
creating a new one. See
http://www.unicode.org/errata/#current_errata

The announcement email credits Claus with finding the problem, but
wasn't sent to the public at large. I just sent a private email to
them suggesting they send this to their public email list.

p5pRT closed this as completed Jun 25, 2014

p5pRT added the Severity Low label Oct 19, 2019
