tr/// multiple transliterations #12168

Closed
p5pRT opened this issue Jun 11, 2012 · 5 comments

Comments

p5pRT commented Jun 11, 2012

Migrated from rt.perl.org#113584 (status was 'resolved')

Searchable as RT113584$

p5pRT commented Jun 11, 2012

From @tokuhirom

Created by tokuhirom@gmail.com

perlop.pod (http://perldoc.perl.org/perlop.html#Quote-Like-Operators)
says:

If multiple transliterations are given for a character, only the first one is used:

  tr/AAA/XYZ/

will transliterate any A to X.

But it seems that in perl 5.16.0 this does not hold for multi-byte characters.

TEST CODE
use strict;
use warnings;
use utf8;
use 5.010000;
use autodie;

binmode STDOUT, ':utf8';

my $x = "Perlα";
$x =~ tr/αα/βγ/;
say "$^V $x";

RESULT

Running the test code with the perl 5.14.2 binary, I got the following result.

v5.14.2 Perlβ

Running the same script with the perl 5.16.0 binary, I got the following.

v5.16.0 Perlγ
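
For comparison, the first-mapping-wins rule quoted from perlop can be emulated in plain Perl. This is only a sketch: `translit_first_wins` is a hypothetical helper, not part of perl, and it assumes that both lists have the same length and contain no ranges.

```perl
use strict;
use warnings;
use utf8;
use 5.010000;

binmode STDOUT, ':utf8';

# Hypothetical helper (not part of perl): emulates tr/// for plain
# character lists, keeping only the first mapping for each character.
# Assumes $from and $to have equal length and contain no ranges.
sub translit_first_wins {
    my ($str, $from, $to) = @_;
    my @from = split //, $from;
    my @to   = split //, $to;
    my %map;
    for my $i (0 .. $#from) {
        # Later duplicates of a search character are ignored.
        $map{ $from[$i] } = $to[$i] unless exists $map{ $from[$i] };
    }
    return join '', map { exists $map{$_} ? $map{$_} : $_ } split //, $str;
}

say translit_first_wins("AAA",   "AAA", "XYZ");   # XXX
say translit_first_wins("Perlα", "αα",  "βγ");    # Perlβ
```

The second call gives the same string as the 5.14.2 result above, which is what the documentation describes.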

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl 5.16.0:

Configured by tokuhirom at Mon Jun 11 13:06:05 JST 2012.

Summary of my perl5 (revision 5 version 16 subversion 0) configuration:
   
  Platform:
    osname=linux, osvers=3.2.0-23-generic, archname=x86_64-linux
    uname='linux www4071uf 3.2.0-23-generic #36-ubuntu smp tue apr 10 20:39:51 utc 2012 x86_64 x86_64 x86_64 gnulinux '
    config_args='-d -Dprefix=/usr/local/perl/5.16.0/'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.6.3', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/lib
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.15'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'

Locally applied patches:
    


@INC for perl 5.16.0:
    /usr/local/perl/5.16.0/lib/site_perl/5.16.0/x86_64-linux
    /usr/local/perl/5.16.0/lib/site_perl/5.16.0
    /usr/local/perl/5.16.0/lib/5.16.0/x86_64-linux
    /usr/local/perl/5.16.0/lib/5.16.0
    .


Environment for perl 5.16.0:
    HOME=/home/tokuhirom
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LC_ALL=C
    LC_CTYPE=C
    LC_DATE=C
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/local/perl/latest/bin/:/home/tokuhirom/.rvm/gems/ruby-1.9.3-p194/bin:/home/tokuhirom/.rvm/gems/ruby-1.9.3-p194@global/bin:/home/tokuhirom/.rvm/rubies/ruby-1.9.3-p194/bin:/home/tokuhirom/.rvm/bin:/home/tokuhirom/dotfiles/local/bin/:/usr/local/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11:/usr/games:/home/tokuhirom/.rvm/bin
    PERLDOC_PAGER=less -+C -R
    PERL_BADLANG (unset)
    SHELL=/bin/bash

p5pRT commented Jun 11, 2012

From @cpansprout

On Mon Jun 11 03:43:45 2012, tokuhirom wrote:

perlop.pod (http://perldoc.perl.org/perlop.html#Quote-Like-Operators)
says:

If multiple transliterations are given for a character, only the
first one is used:

  tr/AAA/XYZ/

will transliterate any A to X.

But it seems that in perl 5.16.0 this does not hold for multi-byte characters.

TEST CODE
use strict;
use warnings;
use utf8;
use 5.010000;
use autodie;

binmode STDOUT, ':utf8';

my $x = "Perlα";
$x =~ tr/αα/βγ/;
say "$^V $x";

RESULT

Running the test code with the perl 5.14.2 binary, I got the following result.

v5.14.2 Perlβ

Running the same script with the perl 5.16.0 binary, I got the following.

v5.16.0 Perlγ

A binary search leads me to this:

4de6d20 is the first bad commit
commit 4de6d20
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Jan 2 16:12:21 2012 -0700

  utf8_heavy.pl: Skip unnecessary work for official properties
 
  The tables that mktables generates are well behaved, and so the checks
  and sorting that are done for user-defined properties may be skipped.
 
  tainting needs to be preserved because $list can be passed in already
  tainted.
 
  This is also in preparation for Unicode 6.1, in which one table will
  legitimately have duplicate entries that the old code removed.

--

Father Chrysostomos

p5pRT commented Jun 11, 2012

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Jun 11, 2012

From @khwilliamson

Fixed by commit cb6d347
--
Karl Williamson
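
A check along these lines could be used to confirm the behaviour on a given perl build. It is a sketch using the core Test::More module, not the test that accompanied the fix; the expected values are taken from the perlop text and from the 5.14.2 result quoted in the report.

```perl
use strict;
use warnings;
use utf8;
use Test::More tests => 2;

# ASCII case from perlop: only the first mapping for A is used.
(my $ascii = "AAA") =~ tr/AAA/XYZ/;
is($ascii, "XXX", 'first mapping wins for ASCII');

# Non-ASCII case from the report: α should map to β, not γ.
(my $greek = "Perlα") =~ tr/αα/βγ/;
ok($greek eq "Perlβ", 'first mapping wins for non-ASCII');
```

Going by the results in the report, the second test would be expected to fail on 5.16.0 and pass on 5.14.2.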

p5pRT commented Jun 11, 2012

@khwilliamson - Status changed from 'open' to 'resolved'
