Malformed UTF-8 character warnings matching UTF-8 string against ISO-8859-1 regex #8248

Closed
p5pRT opened this issue Dec 15, 2005 · 4 comments

p5pRT commented Dec 15, 2005

Migrated from rt.perl.org#37950 (status was 'resolved')

Searchable as RT37950$


p5pRT commented Dec 15, 2005

From jgmyers@proofpoint.com

Created by jgmyers@pong.us.proofpoint.com

The following test script incorrectly generates the message:
Malformed UTF-8 character (unexpected non-continuation byte 0x00,
immediately after start byte 0xc4) in pattern match (m//) at
./demo_utf8_bug.pl line 9.

#!/usr/bin/perl

use warnings;
use strict;

my $text = " ";
utf8::upgrade($text);

$text =~ /\xC4|a/i;
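
For anyone re-running the report, the sketch below wraps the same three statements with a warning handler so the outcome can be checked programmatically. The `@warnings` array, the `$matched` variable, and the byte dump of `\xC4` are additions for illustration and are not part of the original report. The dump only shows that the character U+00C4 is stored as two bytes once a string is upgraded, whereas the raw byte 0xC4 on its own looks like a UTF-8 start byte; the report itself does not state the cause. On a perl that includes the fix, no warning should be captured.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect warnings instead of printing them (illustrative helper, not in the report).
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $text = " ";
utf8::upgrade($text);    # switches the internal representation to UTF-8; the content is unchanged

# Same match as in the report: a Latin-1 code point in a case-insensitive alternation.
my $matched = $text =~ /\xC4|a/i;

# Show how U+00C4 is encoded in UTF-8, for comparison with the bytes named in the warning.
my $encoded = "\xC4";
utf8::encode($encoded);
my $hex = join " ", map { sprintf("%02X", ord($_)) } split(//, $encoded);

print "UTF-8 flag on: ", (utf8::is_utf8($text) ? "yes" : "no"), "\n";
print "U+00C4 as UTF-8 bytes: $hex\n";
print "matched: ", ($matched ? "yes" : "no"), "\n";
print "warnings captured: ", scalar(@warnings), "\n";
print "  $_" for @warnings;
```

On an affected 5.8.7 build, the captured warning is expected to be the "Malformed UTF-8 character" message quoted above.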

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl v5.8.7:

Configured by jgmyers at Mon Jul 25 16:01:57 PDT 2005.

Summary of my perl5 (revision 5 version 8 subversion 7) configuration:
  Platform:
    osname=linux, osvers=2.4.21-32.0.1.elsmp, archname=i686-linux-thread-multi
    uname='linux pong.us.proofpoint.com 2.4.21-32.0.1.elsmp #1 smp tue may 17 17:52:23 edt 2005 i686 i686 i386 gnulinux '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -g',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.3.3', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.3.2.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.3.2'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
   


@INC for perl v5.8.7:
    /u/jgmyers/perl/lib/5.8.7/i686-linux-thread-multi
    /u/jgmyers/perl/lib/5.8.7
    /u/jgmyers/perl/lib/site_perl/5.8.7/i686-linux-thread-multi
    /u/jgmyers/perl/lib/site_perl/5.8.7
    /u/jgmyers/perl/lib/site_perl/5.8.6/i686-linux-thread-multi
    /u/jgmyers/perl/lib/site_perl/5.8.6
    /u/jgmyers/perl/lib/site_perl/5.8.5/i686-linux-thread-multi
    /u/jgmyers/perl/lib/site_perl/5.8.5
    /u/jgmyers/perl/lib/site_perl/5.8.3/i686-linux-thread-multi
    /u/jgmyers/perl/lib/site_perl/5.8.3
    /u/jgmyers/perl/lib/site_perl
    .


Environment for perl v5.8.7:
    HOME=/u/jgmyers
    LANG=en_US
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/tools/x/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/u/jgmyers/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash


p5pRT commented Dec 15, 2005

From @gisle

John Gardiner Myers (via RT) <perlbug-followup@perl.org> writes:

> The following test script incorrectly generates the message:
> Malformed UTF-8 character

Already fixed in blead and perl-5.8.8-tobe.
http://public.activestate.com/cgi-bin/perlbrowse?patch=25095

--Gisle


p5pRT commented Dec 15, 2005

The RT System itself - Status changed from 'new' to 'open'


p5pRT commented Dec 17, 2005

@smpeters - Status changed from 'open' to 'resolved'
