symbol ь in regex produce "syntax error, unexpected LABEL ('unicode')" #309

Closed
p6rt opened this issue Sep 13, 2008 · 6 comments

p6rt commented Sep 13, 2008

Migrated from rt.perl.org#58820 (status was 'resolved')

Searchable as RT58820$

p6rt commented Sep 13, 2008

From @ilyabelikin

Hi,
regex Test { <alpha>+ };
"x/��ен�_09-10.txt" ~~ Test;
say $/;

Produces:

error:imcc:syntax error, unexpected LABEL ('unicode')
  in file 'EVAL_13' line 12
Null PMC access in invoke()
current instr.: 'parrot;PCT::HLLCompiler;eval' pc 806 (src/PCT/HLLCompiler.pir:481)
called from Sub 'parrot;PCT::HLLCompiler;evalfiles' pc 1078 (src/PCT/HLLCompiler.pir:610)
called from Sub 'parrot;PCT::HLLCompiler;command_line' pc 1257 (src/PCT/HLLCompiler.pir:699)
called from Sub 'parrot;Perl6::Compiler;main' pc 16035 (perl6.pir:172)

When I remove ь, it works.

I tried to reduce the test case, but when I run:
regex Test { <alpha>+ };
"��ен�" ~~ Test;
say $/;

everything works fine.


osname= linux
osvers= 2.6.15.7
arch= i486-linux-gnu-thread-multi
cc= cc
Summary of my parrot 0.7.0 (r31017) configuration:
  configdate='Fri Sep 12 11:16:58 2008 GMT'
  Platform:
  osname=linux, archname=i486-linux-gnu-thread-multi
  jitcapable=1, jitarchname=i386-linux,
  jitosname=LINUX, jitcpuarch=i386
  execcapable=1
  perl=/usr/local/bin/perl
  Compiler:
  cc='cc', ccflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-DDEBIAN -pipe -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -DHASATTRIBUTE_CONST -DHASATTRIBUTE_DEPRECATED
-DHASATTRIBUTE_MALLOC -DHASATTRIBUTE_NONNULL
-DHASATTRIBUTE_NORETURN -DHASATTRIBUTE_PURE -DHASATTRIBUTE_UNUSED
-DHASATTRIBUTE_WARN_UNUSED_RESULT -falign-functions=16
-fvisibility=hidden -maccumulate-outgoing-args -W -Wall
-Waggregate-return -Wcast-align -Wcast-qual -Wchar-subscripts
-Wcomment -Wdisabled-optimization -Wendif-labels -Wextra -Wformat
-Wformat-extra-args -Wformat-nonliteral -Wformat-security -Wformat-y2k
-Wimplicit -Wimport -Winit-self -Winline -Winvalid-pch
-Wmissing-braces -Wmissing-field-initializers
-Wno-missing-format-attribute -Wmissing-include-dirs -Wpacked
-Wparentheses -Wpointer-arith -Wreturn-type -Wsequence-point
-Wno-shadow -Wsign-compare -Wstrict-aliasing -Wstrict-aliasing=2
-Wswitch -Wswitch-default -Wtrigraphs -Wundef -Wunknown-pragmas
-Wno-unused -Wvariadic-macros -Wwrite-strings -Wbad-function-cast
-Wc++-compat -Wdeclaration-after-statement
-Wimplicit-function-declaration -Wimplicit-int -Wmain
-Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wnonnull
-DDISABLE_GC_DEBUG=1 -DNDEBUG -O2 -DHAS_GETTEXT',
  Linker and Libraries:
  ld='cc', ldflags=' -L/usr/local/lib',
  cc_ldflags='',
  libs='-ldl -lm -lpthread -lcrypt -lrt -lgmp -lreadline -lpcre -lcrypto '
  Dynamic Linking:
  share_ext='.so', ld_share_flags='-shared -L/usr/local/lib -fPIC',
  load_ext='.so', ld_load_flags='-shared -L/usr/local/lib -fPIC'
  Types:
  iv=long, intvalsize=4, intsize=4, opcode_t=long, opcode_t_size=4,
  ptrsize=4, ptr_alignment=1 byteorder=1234,
  nv=double, numvalsize=8, doublesize=8


Environment:
  HOME =/home/ihrd
  LANG =en_NZ.UTF-8
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)
  PATH =/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11:/usr/games
  SHELL =/bin/bash

Ilya
Vladivostok.pm

p6rt commented Sep 25, 2008

From @moritz

On Sat Sep 13 00:30:24 2008, ihrd wrote:

Hi,
regex Test { <alpha>+ };
"x/��ен�_09-10.txt" ~~ Test;
say $/;

This works all fine here, r31410 (and with icu linked).
Could you please try again with a recent version of parrot and rakudo,
and tell us which character encoding you are using? (I hope UTF-8, it's
what rakudo expects its source to be written in).

Also do you have libicu (or whatever it's called on your system;
official name "International Components for Unicode") installed on your
system, and detected by parrot's Configure.pl?
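
One quick way to answer the encoding question is to check whether the bytes of the source file decode cleanly as UTF-8. The following is only a minimal Python sketch: `check_utf8` is a hypothetical helper name, and KOI8-R is used purely as an example of a legacy Cyrillic encoding (the thread does not say which encoding was actually in use).

```python
def check_utf8(data: bytes) -> bool:
    """Return True if the given bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# U+044C CYRILLIC SMALL LETTER SOFT SIGN, the character named in the issue title.
soft_sign = "\u044c"

# The same character in UTF-8 (two bytes) and in KOI8-R (one byte).
# KOI8-R is only an illustrative assumption here.
utf8_bytes = soft_sign.encode("utf-8")
koi8_bytes = soft_sign.encode("koi8_r")

print(utf8_bytes.hex(), check_utf8(utf8_bytes))
print(koi8_bytes.hex(), check_utf8(koi8_bytes))
```

To check a real script, the same helper could be applied to the file's raw bytes, read with `open(path, "rb").read()`.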

p6rt commented Sep 25, 2008

The RT System itself - Status changed from 'new' to 'open'

p6rt commented Nov 7, 2008

From @pmichaud

On Thu Sep 25 10:43:10 2008, moritz wrote:

On Sat Sep 13 00:30:24 2008, ihrd wrote:

Hi,
regex Test { <alpha>+ };
"x/��ен�_09-10.txt" ~~ Test;
say $/;

This works all fine here, r31410 (and with icu linked).
Could you please try again with a recent version of parrot and rakudo,
and tell us which character encoding you are using? (I hope UTF-8, it's
what rakudo expects its source to be written in).

On my system (r32442, no libicu) I get the same error as the original
post. I suspect it's a bug in Parrot itself, as it appears to be
generating an incorrect Unicode string (see RT #60396).

When RT #60396 is fixed, I would expect this code to work properly.

Thanks!

Pm

p6rt commented Nov 8, 2008

From @chromatic

Fixed in r32444.

p6rt commented Nov 8, 2008

@chromatic - Status changed from 'open' to 'resolved'

@p6rt p6rt closed this as completed Nov 8, 2008
@p6rt p6rt added the Bug label Jan 5, 2020