unicode and macosx #87

Closed
p6rt opened this issue May 19, 2008 · 11 comments
Comments

p6rt commented May 19, 2008

Migrated from rt.perl.org#54448 (status was 'resolved')

Searchable as RT54448$

p6rt commented May 19, 2008

From @cognominal

On an Intel Mac running OS X 10.5 I have a problem with Unicode:
Unicode characters are not recognized as such. See the Rakudo test below.

The configure phase reports:

Determining whether ICU is installed...................................yes.

The compile phase finishes with an error, but it apparently causes no
problems, except that I can't run 'make test' because it depends on
a successful compilation.

ar: blib/lib/libparrot.a is a fat file (use libtool(1) or lipo(1) and ar(1) on it)
ar: blib/lib/libparrot.a: Inappropriate file type or format
make: *** [blib/lib/libparrot.a] Error 1

Rakudo itself is built without problems.

But the following test fails. The string literal contains a single
pasted character, which Emacs reports as #x8a0.

my $s = " "; say $s.chars # $s == "\x8a0"
2

I expected one.

--
cognominal stef

p6rt commented May 19, 2008

From @cognominal

Parrot compiles correctly once Configure.pl is pointed at the correct ICU:
perl Configure.pl --icu-config=/usr/local/bin/icu-config
as I said in #52898.

But the problem I mentioned, "one Unicode character counted as two", remains.

p6rt commented May 19, 2008

@cognominal - Status changed from 'new' to 'open'

p6rt commented May 19, 2008

From @pmichaud

On Mon, May 19, 2008 at 10:29:29AM -0700, Stephane Payrard wrote:

But the following test fails. I pasted the content of the literal
string with a character that emacs says to be #x8a0

my $s = " "; say $s.chars # $s == "\x8a0"
2

I expected one.

Because Parrot's primary support for unicode is utf-8 encoding,
and because utf-8 greatly slows down parsing of long strings
(such as program source code), we've elected for the time being
to have rakudo use "fixed8" for its default input encoding. When
Parrot becomes faster at processing unicode strings, we'll likely
switch the default to utf8.(*)

This doesn't mean that unicode can't be used in rakudo programs,
though. One can always encode the character explicitly:

  $ ./parrot perl6.pbc
  > my $s = "€"; say $s.chars; # doesn't work
  3
  > my $s = "\x20ac"; say $s.chars; # works
  1

Also, rakudo understands the --encoding=utf8 option to specify that
the source code is coming in as UTF-8:

  $ ./parrot perl6.pbc --encoding=utf8
  > my $s = "€"; say $s.chars; # works
  1
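
The difference between the two counts above can be reproduced outside Rakudo. The following is a minimal sketch in Python, not Parrot code: `count_chars` is a hypothetical helper, and Python's "latin-1" codec stands in for "fixed8" on the assumption that fixed8 treats every byte as one character.

```python
# Illustration only: Python stand-in for the two input encodings discussed above.
# Assumption: "latin-1" models a fixed 8-bit encoding (one character per byte).

def count_chars(source_bytes: bytes, encoding: str) -> int:
    """Decode raw source bytes with the given encoding and count the characters."""
    return len(source_bytes.decode(encoding))

# The bytes a UTF-8 editor or terminal produces for the euro sign, U+20AC.
euro_bytes = "\u20ac".encode("utf-8")

print(count_chars(euro_bytes, "latin-1"))  # 3: each byte is counted as a character
print(count_chars(euro_bytes, "utf-8"))    # 1: the three bytes decode to one code point
```

The same arithmetic is consistent with the count of 2 in the original report if the pasted character was stored as a two-byte UTF-8 sequence. U+00A0 (no-break space) is one such character, but that identification is an assumption; the thread does not confirm it.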

For now I'll mark this ticket as "stalled", awaiting faster Parrot
unicode support or a decision that we're going to live with
slower parsing of source code.

Thanks!

Pm

(*) Another option we might have could be to default to utf8 and
transcode to ucs2 on platforms that have ICU present (which can be
faster), but stay at a fixed8 default for systems without ICU.
But at this stage I think consistency and explicit options are
better, otherwise people will be confused as to why a particular
program works on some systems but not others.
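
The footnote does not say why a ucs2 transcoding could be faster. One plausible reason, assumed here rather than taken from the thread, is that a fixed-width encoding allows a character to be found by offset arithmetic, while UTF-8 requires scanning from the start of the string. A rough Python sketch with hypothetical helper names, using "utf-16-le" restricted to BMP characters as a stand-in for ucs2:

```python
# Illustration only: fixed-width vs variable-width character lookup.
# Assumption: "utf-16-le" limited to BMP characters stands in for ucs2.

def nth_char_fixed_width(buf: bytes, index: int) -> str:
    """Every character occupies exactly 2 bytes, so the offset is index * 2."""
    return buf[2 * index : 2 * index + 2].decode("utf-16-le")

def nth_char_utf8(buf: bytes, index: int) -> str:
    """Scan from the start, counting lead bytes (anything that is not 10xxxxxx)."""
    seen = -1
    start = None
    for pos, byte in enumerate(buf):
        if byte & 0xC0 != 0x80:          # a lead byte starts a new character
            seen += 1
            if seen == index:
                start = pos
            elif seen == index + 1:      # next character begins: the wanted one ended
                return buf[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(index)
    return buf[start:].decode("utf-8")   # the wanted character was the last one

text = "a\u20acb"                        # "a", euro sign, "b"
print(nth_char_fixed_width(text.encode("utf-16-le"), 1) == "\u20ac")
print(nth_char_utf8(text.encode("utf-8"), 1) == "\u20ac")
```

Both lookups return the same character; the difference is that the first is constant-time and the second grows with the position in the string, which matters for something as long as program source code.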

p6rt commented May 19, 2008

@pmichaud - Status changed from 'open' to 'stalled'

p6rt commented Dec 15, 2008

From @pmichaud

Some transcoding options have been added to HLLCompiler and Rakudo since
this ticket was last addressed... could we determine if this is still an
issue for macosx? If not, I'd like to close the ticket.

Thanks!

Pm

p6rt commented Dec 15, 2008

The RT System itself - Status changed from 'stalled' to 'open'

p6rt commented Dec 16, 2008

From @cognominal

my $s = " "; say $s.chars # now returns 1

Note: the bug was reported on a 32-bit Intel Mac, which has since died.
I am now testing on a 64-bit Intel Mac.
I don't know whether that affects the test.

--
cognominal stef

p6rt commented Dec 21, 2008

From @pmichaud

Since the original bug no longer appears to be present, I'm marking this
ticket as resolved.

Thanks!

Pm

p6rt commented Dec 21, 2008

@pmichaud - Status changed from 'open' to 'resolved'

@p6rt p6rt closed this as completed Dec 21, 2008