Report information
Id: 74978
Status: resolved
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: tokuhirom <tokuhirom [at] gmail.com>
tokuhirom [at] gpath.example.org
Cc:
AdminCc:

Operating System: Linux
PatchStatus: (no value)
Severity: medium
Type:
  • core
  • regex
  • Unicode
Perl Version: 5.12.0
Fixed In: (no value)


Subject: \N{} incompatibility in 5.12+
Date: Sat, 8 May 2010 16:50:42 +0900 (JST)
To: perlbug [...] perl.org
From: tokuhirom [...] gp.ath.cx (tokuhirom)
This is a bug report for perl from tokuhirom@gpath.example.org,
generated with the help of perlbug 1.39 running under perl 5.12.0.

-----------------------------------------------------------------
[Please describe your issue here]

following one liner fails with perl 5.12.0.

    perl -e 'use charnames ":full"; /\N{FULLWIDTH LEFT PARENTHESIS}./;print "ok\n";'
    Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/\N{U+FF08} <-- HERE ./ at -e line 1.

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=medium
---
Site configuration information for perl 5.12.0:

Configured by tokuhirom at Wed Apr 28 17:18:47 JST 2010.

Summary of my perl5 (revision 5 version 12 subversion 0) configuration:

  Platform:
    osname=linux, osvers=2.6.31-17-server, archname=x86_64-linux
    uname='linux gpath 2.6.31-17-server #54-ubuntu smp thu dec 10 18:06:56 utc 2009 x86_64 gnulinux '
    config_args='-d -Dprefix=/usr/local/app/perl-5.12.0/ -Duse64bitint'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.4.1', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.10.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.10.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'

Locally applied patches:

---
@INC for perl 5.12.0:
    /usr/local/app/perl-5.12.0/lib/site_perl/5.12.0/x86_64-linux
    /usr/local/app/perl-5.12.0/lib/site_perl/5.12.0
    /usr/local/app/perl-5.12.0/lib/5.12.0/x86_64-linux
    /usr/local/app/perl-5.12.0/lib/5.12.0
    .

---
Environment for perl 5.12.0:
    HOME=/home/tokuhirom
    LANG=ja_JP.UTF-8
    LANGUAGE (unset)
    LC_DATE=C
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/tokuhirom/bin:/home/tokuhirom/local/bin/:/usr/local/bin/:/usr/local/app/perl-5.12.0/bin/:/usr/local/app/perl/bin/:/usr/local/mysql/bin/:/usr/local/bin/:/home/tokuhirom/bin:/home/tokuhirom/local/bin/:/usr/local/bin/:/usr/local/app/perl-5.12.0/bin/:/usr/local/app/perl/bin/:/usr/local/mysql/bin/:/usr/local/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/home/tokuhirom/share/dotfiles/local/bin/:/home/tokuhirom/share/dotfiles/local/bin/
    PERL_AUTOINSTALL=--defaultdeps
    PERL_BADLANG=0
    PERL_CPANM_DEV=1
    SHELL=/bin/zsh
Subject: \N{} incompatibility in perl5.12.0
Date: Sat, 8 May 2010 17:00:13 +0900
To: perlbug [...] perl.org
From: Tokuhiro Matsuno <tokuhirom [...] gmail.com>
This is a bug report for perl from tokuhirom@gmail.com,
generated with the help of perlbug 1.39 running under perl 5.12.0.

-----------------------------------------------------------------
[Please describe your issue here]

following one liner works in perl5.10.0, but it fails with perl 5.12.0

    % perl -e 'use charnames ":full"; /\N{FULLWIDTH LEFT PARENTHESIS}./;print "ok\n";'
    Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/\N{U+FF08} <-- HERE ./ at -e line 1.

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=medium
---
Site configuration information for perl 5.12.0:

Configured by tokuhirom at Wed Apr 28 17:18:47 JST 2010.

Summary of my perl5 (revision 5 version 12 subversion 0) configuration:

  Platform:
    osname=linux, osvers=2.6.31-17-server, archname=x86_64-linux
    uname='linux gpath 2.6.31-17-server #54-ubuntu smp thu dec 10 18:06:56 utc 2009 x86_64 gnulinux '
    config_args='-d -Dprefix=/usr/local/app/perl-5.12.0/ -Duse64bitint'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.4.1', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.10.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.10.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'

Locally applied patches:

---
@INC for perl 5.12.0:
    /usr/local/app/perl-5.12.0/lib/site_perl/5.12.0/x86_64-linux
    /usr/local/app/perl-5.12.0/lib/site_perl/5.12.0
    /usr/local/app/perl-5.12.0/lib/5.12.0/x86_64-linux
    /usr/local/app/perl-5.12.0/lib/5.12.0
    .

---
Environment for perl 5.12.0:
    HOME=/home/tokuhirom
    LANG=ja_JP.UTF-8
    LANGUAGE (unset)
    LC_DATE=C
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/tokuhirom/bin:/home/tokuhirom/local/bin/:/usr/local/bin/:/usr/local/app/perl-5.12.0/bin/:/usr/local/app/perl/bin/:/usr/local/mysql/bin/:/usr/local/bin/:/home/tokuhirom/bin:/home/tokuhirom/local/bin/:/usr/local/bin/:/usr/local/app/perl-5.12.0/bin/:/usr/local/app/perl/bin/:/usr/local/mysql/bin/:/usr/local/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/home/tokuhirom/share/dotfiles/local/bin/:/home/tokuhirom/share/dotfiles/local/bin/
    PERL_AUTOINSTALL=--defaultdeps
    PERL_BADLANG=0
    PERL_CPANM_DEV=1
    SHELL=/bin/zsh
Subject: Re: [perl #74982] \N{} incompatibility in perl5.12.0
Date: Sat, 08 May 2010 12:00:47 -0600
To: perl5-porters [...] perl.org
From: karl williamson <public [...] khwilliamson.com>
Tokuhiro Matsuno (via RT) wrote:
> # New Ticket Created by "Tokuhiro Matsuno"
> # Please include the string: [perl #74982]
> # in the subject line of all future correspondence about this issue.
> # <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=74982 >
>
>
>
> This is a bug report for perl from tokuhirom@gmail.com,
> generated with the help of perlbug 1.39 running under perl 5.12.0.
>
>
> -----------------------------------------------------------------
> [Please describe your issue here]
>
> following one liner works in perl5.10.0, but it fails with perl 5.12.0
>
> % perl -e 'use charnames ":full"; /\N{FULLWIDTH LEFT
> PARENTHESIS}./;print "ok\n";'
> Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE
> in m/\N{U+FF08} <-- HERE ./ at -e line 1.
>
> [Please do not change anything below this line]
> -----------------------------------------------------------------

Thanks for the bug report. I was the one who introduced the bug. I'm sorry. I will have a patch available today. In the meantime, the problem turns out to be the period just after the '}'. If you remove that, it will work.
CC: perl5-porters [...] perl.org
Subject: Re: [perl #74982] \N{} incompatibility in perl5.12.0
Date: Sat, 8 May 2010 14:47:49 -0400
To: karl williamson <public [...] khwilliamson.com>
From: Jesse Vincent <jesse [...] fsck.com>
> Thanks for the bug report. I was the one who introduced the bug.
> I'm sorry. I will have a patch available today. In the meantime,
> the problem turns out to be the period just after the '}'. If you
> remove that, it will work.

I'm going to hold 5.12.1 RC1 for this.

Best,
Jesse
Subject: PATCH: [perl #74978] \N{} incompatibility in 5.12+
Date: Sat, 08 May 2010 14:18:33 -0600
To: perl5-porters [...] perl.org, tokuhirom [...] gpath.example.org
From: karl williamson <public [...] khwilliamson.com>
Attached is a minimal patch to fix this. There are two other commits that add comments to a .t file so that someone later won't have to work as hard as I did at finding where to put the tests for something similar.
From ce65c312b89d6f851ca46d24719e07bce288ee99 Mon Sep 17 00:00:00 2001
From: Karl Williamson <khw@khw-desktop.(none)>
Date: Sat, 8 May 2010 13:12:53 -0600
Subject: [PATCH] Comment where to find file's format

---
 t/re/re_tests |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/t/re/re_tests b/t/re/re_tests
index 1807ffc..b7471d9 100644
--- a/t/re/re_tests
+++ b/t/re/re_tests
@@ -1,5 +1,5 @@
 # This stops me getting screenfulls of syntax errors every time I accidentally
-# run this file via a shell glob
+# run this file via a shell glob. Format of this file is given in regexp.t
 __END__
 abc abc y $& abc
 abc abc y $-[0] 0
-- 
1.5.6.3

From 50e44d09a829eed4eeabf9ce78d3374a5f785d4f Mon Sep 17 00:00:00 2001
From: Karl Williamson <khw@khw-desktop.(none)>
Date: Sat, 8 May 2010 13:38:27 -0600
Subject: [PATCH] Note in comment that many \N{...} tests won't work here

---
 t/re/re_tests |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/t/re/re_tests b/t/re/re_tests
index b7471d9..c550b5a 100644
--- a/t/re/re_tests
+++ b/t/re/re_tests
@@ -1,5 +1,7 @@
 # This stops me getting screenfulls of syntax errors every time I accidentally
 # run this file via a shell glob. Format of this file is given in regexp.t
+# Can't use \N{VALID NAME TEST} here because need 'use charnames'; but can use
+# \N{U+valid} here.
 __END__
 abc abc y $& abc
 abc abc y $-[0] 0
-- 
1.5.6.3

From 1bb86a94fea493dd6213e60ed8e19b51b8ceea0c Mon Sep 17 00:00:00 2001
From: Karl Williamson <khw@khw-desktop.(none)>
Date: Sat, 8 May 2010 14:06:10 -0600
Subject: [PATCH] PATCH [perl #74978] dot after } breaks \N{}

The problem is that a dot can come between the braces in \N{foo.bar},
but when searching for it, I didn't stop looking at the right brace,
so it generated an error inappropriately.

This is essentially a minimum patch; efficiency could be improved
slightly with a little more work.
---
 regcomp.c  |    8 +++-----
 t/re/pat.t |    8 +++++++-
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/regcomp.c b/regcomp.c
index f665f0b..be5acdb 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -6762,11 +6762,10 @@ S_reg_namedseq(pTHX_ RExC_state_t *pRExC_state, UV *valuep, I32 *flagp)
             | PERL_SCAN_DISALLOW_PREFIX
             | (SIZE_ONLY ? PERL_SCAN_SILENT_ILLDIGIT : 0);
 
-        char * endchar = strchr(RExC_parse, '.');
-        if (endchar) {
+        char * endchar = RExC_parse + strcspn(RExC_parse, ".}");
+        if (endchar < endbrace) {
             ckWARNreg(endchar, "Using just the first character returned by \\N{} in character class");
         }
-        else endchar = endbrace;
 
         length_of_hex = (STRLEN)(endchar - RExC_parse);
         *valuep = grok_hex(RExC_parse, &length_of_hex, &flags, NULL);
@@ -6817,8 +6816,7 @@ S_reg_namedseq(pTHX_ RExC_state_t *pRExC_state, UV *valuep, I32 *flagp)
 
         /* Code points are separated by dots. If none, there is only one
          * code point, and is terminated by the brace */
-        endchar = strchr(RExC_parse, '.');
-        if (! endchar) endchar = endbrace;
+        endchar = RExC_parse + strcspn(RExC_parse, ".}");
 
         /* The values are Unicode even on EBCDIC machines */
         length_of_hex = (STRLEN)(endchar - RExC_parse);
diff --git a/t/re/pat.t b/t/re/pat.t
index 40ae52e..7b9594c 100644
--- a/t/re/pat.t
+++ b/t/re/pat.t
@@ -23,7 +23,7 @@ BEGIN {
 }
 
 
-plan tests => 297; # Update this when adding/deleting tests.
+plan tests => 299; # Update this when adding/deleting tests.
 
 run_tests() unless caller;
 
@@ -987,6 +987,12 @@ sub run_tests {
         ok "abbbbc" =~ m/\N{3,4}/ && $& eq "abbb", '"abbbbc" =~ m/\N{3,4}/ && $& eq "abbb"';
     }
 
+    {
+        use charnames ":full";
+        local $Message = '[perl #74982] Period coming after \N{}';
+        ok "\x{ff08}." =~ m/\N{FULLWIDTH LEFT PARENTHESIS}./ && $& eq "\x{ff08}.";
+        ok "\x{ff08}." =~ m/[\N{FULLWIDTH LEFT PARENTHESIS}]./ && $& eq "\x{ff08}.";
+    }
 } # End of sub run_tests
 
 
-- 
1.5.6.3

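The difference between the two searches in the patch above can be sketched in isolation. This is only an illustration, not code from regcomp.c: the helper names field_len_old and field_len_new are hypothetical, and the input is assumed to be the text that follows "U+" in a pattern such as \N{U+FF08}. (the named character has already been rewritten to its code point by this stage, as the error message in the report shows).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the pre-patch logic: look for a '.'
 * anywhere in the rest of the pattern, and fall back to the closing
 * brace only when no '.' exists at all. A '.' that sits after the
 * brace is therefore mistaken for a code point separator. */
static size_t field_len_old(const char *parse, const char *endbrace)
{
    const char *endchar = strchr(parse, '.');
    if (!endchar)
        endchar = endbrace;
    return (size_t)(endchar - parse);
}

/* Hypothetical helper mirroring the post-patch logic: stop at
 * whichever of '.' or '}' comes first, so the search never runs
 * past the closing brace. */
static size_t field_len_new(const char *parse)
{
    return strcspn(parse, ".}");
}
```

For the pattern from the report, the text after "U+" is FF08}. with the closing brace at offset 4. field_len_old returns 5, covering FF08} (which is not a valid hexadecimal number, hence the error), while field_len_new returns 4, covering only FF08. When the dot really is inside the braces, as in 0041.0042}, both helpers return 4.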
CC: perl5-porters [...] perl.org
Subject: Re: PATCH: [perl #74978] \N{} incompatibility in 5.12+
Date: Sat, 8 May 2010 17:34:31 -0400
To: karl williamson <public [...] khwilliamson.com>
From: Jesse Vincent <jesse [...] fsck.com>
On Sat, May 08, 2010 at 02:18:33PM -0600, karl williamson wrote:
> Attached is a minimal patch to fix this. There are two other
> commits that add comments to a .t file so that someone later won't
> have to work as hard as I did at finding where to put the tests for
> something similar.

Thanks. Applied. +1 to backport the code patch for .1.

-Jesse
CC: karl williamson <public [...] khwilliamson.com>, perl5-porters [...] perl.org
Subject: Re: PATCH: [perl #74978] \N{} incompatibility in 5.12+
Date: Sat, 8 May 2010 23:44:17 -0400
To: Jesse Vincent <jesse [...] fsck.com>
From: David Golden <xdaveg [...] gmail.com>
On Sat, May 8, 2010 at 5:34 PM, Jesse Vincent <jesse@fsck.com> wrote:
> On Sat, May 08, 2010 at 02:18:33PM -0600, karl williamson wrote:
>> Attached is a minimal patch to fix this. There are two other
>> commits that add comments to a .t file so that someone later won't
>> have to work as hard as I did at finding where to put the tests for
>> something similar.
>
> Thanks. Applied. +1 to backport the code patch for .1.
>
> -Jesse

agreed. +1 to backport

