Bleadperl 2014-06-25T12:37:43Z breaks CFAERBER/Net-IDN-Encode-2.200.tar.gz #13956
Comments
From @andk
git bisect: commit 09edd81 Use Unicode 7.0
sample fail report: http://www.cpantesters.org/cpan/report/9115cd08-f93b-11e3-bca3-b1a10a370852
perl -V: Summary of my perl5 (revision 5 version 21 subversion 1) configuration: Characteristics of this binary (from libperl):
From @khwilliamson
I'm rejecting this ticket because the flaws are in the tests. Unicode will continue to encode characters between 0 and 0x10FFFF. These tests were assuming that certain code points were unassigned, but Unicode 7 has assigned them. The only code points that are guaranteed to never be assigned are the noncharacters.
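As an illustration of the point about noncharacters, here is a minimal Python sketch (not code from Perl core or from Net::IDN::Encode; the function name `is_noncharacter` is made up for this example). It relies only on the Unicode Standard's definition: the 32 code points U+FDD0..U+FDEF plus the last two code points of each of the 17 planes.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacters."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    # U+FDD0..U+FDEF: a contiguous run of 32 noncharacters in the BMP.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # U+nFFFE and U+nFFFF, the last two code points of every plane.
    return (cp & 0xFFFE) == 0xFFFE


# Count the noncharacters across the whole code space.
count = sum(is_noncharacter(cp) for cp in range(0x110000))
print(count)                     # 66
print(is_noncharacter(0x102F3))  # False: U+102F3 is an ordinary, assignable code point
```

Any code point for which this returns False may be assigned by a later Unicode version, so a test that depends on such a code point staying unassigned can break when the Unicode data is updated.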
The RT System itself - Status changed from 'new' to 'open'
@khwilliamson - Status changed from 'open' to 'rejected'
From CFAERBER@cpan.org
The interesting part is that the tests are the tests provided with Unicode 7.0.0.
From @khwilliamson
On Sun Jun 29 03:11:12 2014, cfaerber wrote:

Yes, and my comments in rejecting this ticket were based on ignorance. I'm sorry. It still should have been rejected, though, as it still doesn't appear to me to be a core Perl bug.

I looked in more detail at the first error in the CPAN report. It is this:

    # Failed test 'to_ascii('ß。𐋳ⴌ\u1DD8') throws error P1 V6 [data/IdnaTest.txt:992]'

I looked at that line in the .t file, and it is this:

    𐋳ⴌ\x{1DD8}", %p)}, undef, "to_ascii\(\'ß\。𐋳ⴌ\\u1DD8\'\)\ throws\ error\ P1\ V6\ \[data\/IdnaTest\.txt\:992\]") or (

(Most likely you will not have the fonts to display all of this correctly.) The one character I don't have in my fonts is U+102F3, COPTIC EPACT NUMBER ONE HUNDRED, newly encoded in Unicode 7.0. I got that far before, and just assumed the test was supposed to return undef because the code point had not been encoded before, but now is. But I was wrong. There is another reason it is supposed to be undef.

Line 992 from the Unicode 7.0 IdnaTest.txt file is this:

(BTW, thanks for cross-referencing the line number of the Unicode file in the .t test; it made this a lot easier.)

The brackets indicate that this is supposed to fail, and the codes within the brackets indicate why. I started to follow why it should fail, but it wasn't obvious without more digging than I had time for. So perhaps the test from Unicode is wrong, or the module is buggy. I see that the .t correctly gets failure with the preceding, similar tests,
From CFAERBER@cpan.org
Actually, I think your conclusion is correct and this is a bug in the test files supplied with Unicode 7.0.

The error codes P1 and V6 indicate that there is a character that is not 'valid' (and that cannot be 'mapped' in P1). However, U+102F3 is valid according to IdnaMapping.txt (line 5557):

    102E1..102FB ; valid ; ; NV8 # 7.0 COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER NINE HUNDRED

It's not valid in IDNA 2008, though (indicated by "NV8"). However, the tests in IdnaTest.txt are not supposed to test for that.

If I change the module to treat all characters added in Unicode 7.0 as 'invalid', the tests for Net::IDN::Encode complete without error under bleadperl (5.21.1) and earlier perls.

I have already reported the suspected error through the form at www.unicode.org but have not yet received a response.
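To make the lookup above concrete, here is a small Python sketch that parses the quoted mapping-table line and checks whether it covers U+102F3. It is only an illustration, not how Net::IDN::Encode reads the table; the helper name `parse_mapping_line` is hypothetical, and the field layout (code point range; status; mapping; IDNA 2008 status) is assumed from the quoted line.

```python
def parse_mapping_line(line: str):
    """Parse one mapping-table data line into (first, last, status, idna2008).

    Returns None for blank or comment-only lines.
    """
    data, _, _comment = line.partition("#")   # drop the trailing comment
    fields = [f.strip() for f in data.split(";")]
    if not fields[0]:
        return None
    first, _, last = fields[0].partition("..")
    lo = int(first, 16)
    hi = int(last, 16) if last else lo        # a single code point has no ".."
    status = fields[1] if len(fields) > 1 else ""
    idna2008 = fields[3] if len(fields) > 3 else ""
    return lo, hi, status, idna2008


# The line quoted in the comment above.
LINE = ("102E1..102FB ; valid ; ; NV8 "
        "# 7.0 COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER NINE HUNDRED")

lo, hi, status, idna2008 = parse_mapping_line(LINE)
covers = lo <= 0x102F3 <= hi
print(covers, status, idna2008)  # True valid NV8
```

The result matches the argument in the comment: the range covers U+102F3 with status `valid`, and the `NV8` flag is carried in a separate field.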
From @khwilliamson
On 07/04/2014 04:39 AM, Claus Färber via RT wrote:

Unicode has now responded, agreeing that the test file was in error, and The announcement email credits Claus with finding the problem, but
Migrated from rt.perl.org#122179 (status was 'rejected')
Searchable as RT122179$