Rakudo hangs or crashes on ranges of large \x character specifications in a regex #1112

Closed
p6rt opened this issue Jul 1, 2009 · 10 comments

@p6rt

p6rt commented Jul 1, 2009

Migrated from rt.perl.org#67122 (status was 'resolved')

Searchable as RT67122$

@p6rt

p6rt commented Jul 1, 2009

From @masak

<masak> rakudo: / <[\x10000..\xEFFFF]> /; say "alive"
<p6eval> rakudo 5351a3: ( no output )
<masak> pmichaud: is this a known issue?
<masak> locally, I had a bus error.
<pmichaud> masak: not known issue
* masak submits rakudobug
<pmichaud> I'm guessing parrotbug :-)
<masak> PGE-bug, possibly.
<masak> pmichaud: note that it hangs during execution, not during compilation.
<pmichaud> rakudo: / <[\x100..\xEFF]> /; say "alive"
<p6eval> rakudo 5351a3: OUTPUT«alive␤»
<pmichaud> PGE gets that one right.
<masak> and the hex numbers have to be high enough for it to trigger.
<pmichaud> right
<pmichaud> I suspect that Parrot strings are having difficulty with it.

@p6rt

p6rt commented Jul 5, 2009

From @kyleha

This is an automatically generated mail to inform you that tests are now available in t/t/spec/S05-metasyntax/charset.t

@p6rt

p6rt commented Jul 5, 2009

The RT System itself - Status changed from 'new' to 'open'

@p6rt

p6rt commented Jun 30, 2010

From @bbkr

[16:21] <bbkr> rakudo: / <[\x10000..\xEFFFF]> /; say "alive" # testing 67122
[16:21] <p6eval> rakudo aa015a: OUTPUT«alive␤»
[16:21] <bbkr> yay
[16:21] <pmichaud> that's less "yay" than might be evident at first
[16:21] <pmichaud> it doesn't fail because <[\x10000..\xeffff]> doesn't do anything at present :-|
[16:22] <pmichaud> afk, bbiab
[16:23] <bbkr> rakudo: "\x[10001]" ~~ /<[\x10000..\xEFFFF]>/
[16:23] <p6eval> rakudo aa015a: ( no output )
[16:23] <bbkr> rakudo: say "\x[10001]" ~~ /<[\x10000..\xEFFFF]>/
[16:23] <p6eval> rakudo aa015a: OUTPUT«␤»
[16:24] <bbkr> pmichaud: indeed, parses but doesn't work as expected.

@p6rt

p6rt commented Sep 11, 2011

From @bbkr

NOM

still broken (no output)

bbkr:nom bbkr$ ./perl6 -e 'say "\x[10001]" ~~ /<[\x10000..\xEFFFF]>/'

bbkr:nom bbkr$

@p6rt

p6rt commented Apr 20, 2012

From @bbkr

On 2012.04:

$ ./perl6 -e 'say "\x[10001]" ~~ /<[\x10000..\xEFFFF]>/'
===SORRY!===
Invalid character for UTF-8 encoding

@p6rt

p6rt commented Mar 30, 2013

From @coke

On Fri Apr 20 02:54:24 2012, bbkr wrote:

> On 2012.04:
>
> $ ./perl6 -e 'say "\x[10001]" ~~ /<[\x10000..\xEFFFF]>/'
> ===SORRY!===
> Invalid character for UTF-8 encoding

Isn't this correct behavior now?

--
Will "Coke" Coleda

@p6rt

p6rt commented Jul 19, 2014

From @FROGGS

On all platforms, a character range was turned into a string containing every codepoint in the given range, and when :i was in effect that string was built with both the uppercase and lowercase form of each codepoint.
Building this string is very expensive and might even crash on long ranges.

This is probably fixed for MoarVM (branch charrange in MoarVM and nqp).

About that invalid character message: it also popped up when an invalid codepoint was within the range, since every codepoint in the range was turned into a char and concatenated to a string.

The changed code would complain only if the lower and upper bounds were invalid codepoints.
I think it is good to allow ranges that include invalid codepoints: you can never match against these codepoints anyway, because a UTF-8 string cannot contain them.

Hopefully the charrange branches can be merged this weekend.
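
For readers who want the two strategies side by side, here is a minimal sketch in Python. It is not the NQP or MoarVM code, and it is not taken from the charrange branch. The function names `build_enumerated_class` and `in_range` are hypothetical, and the `:i` handling is a simplification that assumes simple one-to-one case mappings.

```python
def build_enumerated_class(lo: int, hi: int, ignorecase: bool = False) -> str:
    """Enumeration strategy: materialize every codepoint in [lo, hi].

    Cost grows linearly with the size of the range, and doubles
    when both case forms are stored for a case-insensitive match.
    """
    chars = []
    for cp in range(lo, hi + 1):
        ch = chr(cp)
        if ignorecase:
            chars.append(ch.lower())
            chars.append(ch.upper())
        else:
            chars.append(ch)
    return "".join(chars)


def in_range(ch: str, lo: int, hi: int, ignorecase: bool = False) -> bool:
    """Bounds strategy: compare the candidate's codepoint with lo and hi.

    Cost is constant regardless of how wide the range is.
    """
    candidates = {ch}
    if ignorecase:
        candidates |= {ch.lower(), ch.upper()}
    # Skip case mappings that expand to more than one character.
    return any(len(c) == 1 and lo <= ord(c) <= hi for c in candidates)


if __name__ == "__main__":
    # The small range from the original report: 0xEFF - 0x100 + 1 codepoints.
    print(len(build_enumerated_class(0x100, 0xEFF)))  # 3584
    # The large range needs no enumeration with the bounds strategy.
    print(in_range("\U00010001", 0x10000, 0xEFFFF))   # True
```

With the bounds strategy, a range such as `\x10000..\xEFFFF` costs two comparisons per candidate character instead of a string of roughly 900,000 codepoints, and only the two bounds ever have to be turned into characters.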

@p6rt

p6rt commented Jul 20, 2014

From @FROGGS

Fixed in Raku/nqp@a0842ef500

@p6rt

p6rt commented Jul 20, 2014

@FROGGS - Status changed from 'open' to 'resolved'

@p6rt p6rt closed this as completed Jul 20, 2014
@p6rt p6rt added the Bug label Jan 5, 2020