SIGSEGV compiling regexp in 5.10.0 #9512
From @salva (created by salva@ubuntu.int.qindel.com):

Compiling/parsing complex regular expressions causes perl to SIGSEGV. For instance: my $len = 6000; ...

Perl Info
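A minimal sketch of a reproducer. The report text above is cut off after `my $len = 6000;`, so the pattern itself is my reconstruction; later comments in the thread confirm deeply nested parens are the trigger.

```perl
#!/usr/bin/perl
# Hypothetical reconstruction of the crasher -- the exact pattern in the
# original report did not survive the migration.
use strict;
use warnings;

my $len = 6000;
my $re  = '(' x $len . 'x' . ')' x $len;   # 6000 levels of paren nesting
my $qr  = eval { qr/$re/ };   # SIGSEGV on 5.10.0 (eval cannot catch a crash)
print $qr ? "compiled\n" : "died: $@";
```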
From @salva:

The minimum length that causes the SIGSEGV depends on how perl is compiled. Crashing a perl compiled with the default configuration requires a larger value of $len.

- Salva
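To see the build-dependence Salva describes, here is an illustrative sketch (mine, not from the thread) that probes the smallest crashing depth on the running perl. It compiles each candidate pattern in a child process and checks whether the child died on a signal, matching the $?='11' (SIGSEGV) seen in the bisection output later in the thread.

```perl
#!/usr/bin/perl
# Probe the nesting depth at which this perl's regexp compiler crashes.
use strict;
use warnings;

sub crashes {
    my ($len) = @_;
    my $prog = "my \$re = '(' x $len . 'x' . ')' x $len; qr/\$re/";
    system $^X, '-e', $prog;
    return ($? & 127) != 0;   # low 7 bits of $? hold the fatal signal, if any
}

my ($lo, $hi) = (0, 1);       # invariant: $lo is known not to crash
while ($hi < 1_000_000 && !crashes($hi)) {
    $lo = $hi;
    $hi *= 2;
}
if (!crashes($hi)) {
    print "no crash up to depth $hi (perhaps this perl dies cleanly instead)\n";
    exit;
}
while ($lo + 1 < $hi) {       # binary search for the threshold
    my $mid = int(($lo + $hi) / 2);
    crashes($mid) ? ($hi = $mid) : ($lo = $mid);
}
print "smallest crashing depth: $hi\n";
```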
@salva - Status changed from 'new' to 'open'
From @nwc10:

Dave notes: Appears to be a 5.10.0 regression. Still present in maint, blead.
From p5p@spam.wizbit.be:

On Mon Oct 06 09:52:20 2008, salva wrote: [quote elided]

What appears to have changed between the different perl versions is the length that triggers the crash:

- If I use length 17000 then perl-5.9.4 works and perl-5.9.5 produces a segfault.
- If I use length 25000 then perl-5.6.0 and above all crash.
- If I use length 35000 then perl-5.005_62@4668 segfaults as well.
- If I use length 45000 then perl-p-5.005_58@3859 segfaults as well.

And finally the change that caused the segmentation fault (the test program and its output under ...p5eAlRt/perl-5.004_54@266/bin/perl are elided; one run ended with $?='512', the other with $?='11', i.e. SIGSEGV):

http://public.activestate.com/cgi-bin/perlbrowse/p/267 - Jumbo regexp patch applied (with minor fix-up tweaks)

Before the change the regex returned the error message 'regexp too big'. In perl@266/perl@267, regcomp.h defines REGALIGN_STRUCT while the size check in regcomp.c sits under #ifndef REGALIGN_STRUCT, which means that 'regexp too big' was never triggered. The error 'regexp too big' was itself removed when 'REG_ALIGN' was removed ("applied patch, with indentation tweaks"). Currently the code in blead: [snippet elided]

'regexp too big' is mentioned in the section 'Obsolete Diagnostics':

=item regexp too big

(F) The current implementation of regular expressions uses shorts as address offsets within a string. [...]

So the real questions: does this limit still apply? Should there be a limit at all?

Best regards,
Bram
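Bram's first question can be checked empirically. A small sketch of my own, under the assumption that pattern length roughly tracks compiled program size: compile a pattern far beyond the old 32767 short-offset ceiling and see whether anything complains.

```perl
#!/usr/bin/perl
# Does the old 32767 'regexp too big' limit still apply? Try a pattern
# far past it; on perls where the check is gone it compiles silently.
use strict;
use warnings;

my $pat = 'a' x 100_000;            # well past the old short-offset limit
my $qr  = eval { qr/$pat/ };
print $qr ? "compiled: the old size limit no longer applies\n"
          : "error: $@";
```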
From p5p@spam.wizbit.be:

Asked dmq to take a quick look at this:

[19:06] <dmq> there should not be a regex limit at all.
From @hvds:

On Mon Oct 06 09:06:13 2008, salva wrote: [quote elided]

Just took a quick look at this: it segvs in today's blead due to a C stack overflow. I have not tried to work out what would be involved in fixing it.

Hugo
From @obra:

As of today's blead, bumping the constant up to 20000, it passes. While it should never fail, it doesn't currently appear to be a regression.
From @salva:

On Tue Dec 15 19:04:05 2009, jesse wrote: [quote elided]

Can this limit be OS/architecture/perl-config dependent? Can you add the following test so it gets smoked? (The file is truncated here; a reconstructed version follows below.)

$ cat op/reg_69654.t
# See http://rt.perl.org/rt3/Ticket/Display.html?id=59654
BEGIN {
    ...
}
use Test::More;

my @sizes = (1000, 2000, 5000, 10000, 20000);
plan tests => scalar @sizes;

for my $len (@sizes) {
    ...
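The loop body of that test did not survive the migration. Here is one plausible completion (the body is my guess at the intent): compile each pattern in a child process, so a segfault fails a single test instead of killing the whole harness.

```perl
#!/usr/bin/perl
# Reconstructed test: no fatal signal may occur while compiling deeply
# nested patterns at a range of depths.
use strict;
use warnings;
use Test::More;

my @sizes = (1000, 2000, 5000, 10000, 20000);
plan tests => scalar @sizes;

for my $len (@sizes) {
    my $prog = "my \$re = '(' x $len . 'x' . ')' x $len; qr/\$re/";
    system $^X, '-e', $prog;
    ok(($? & 127) == 0, "no fatal signal compiling depth $len");
}
```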
This now dies with the error "Too many nested open parens in regex". I think that is a reasonable outcome. We're always going to have limits, and we now properly handle exceeding this limit. So I think this is closable.
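A quick sketch of the current behavior just described, assuming a recent perl: the failure mode is now a trappable compile-time error rather than a segfault.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $len = 6000;
my $re  = '(' x $len . 'x' . ')' x $len;
my $qr  = eval { qr/$re/ };              # eval can catch a die, not a crash
print $qr ? "compiled\n" : "error: $@";
# expected on recent perls: "Too many nested open parens in regex ..."
```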
I think I'd say it's a bug that we hit the C stack for this. While I don't think there are that many real-world cases where #8369 really restricts what you can do, this one seems much more likely to stop useful stuff getting done by automated construction of regexps for complex grammars. I might feel differently if it were only capturing parens that triggered this: I think it's less likely that you'd need 1000s of nested capturing parens to express a useful pattern. But non-capturing parens feel like they should be barely more costly than a no-op, and it is really useful to sprinkle them around liberally when constructing a regexp recursively.
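A hypothetical example of the construction style described here (the helper and grammar are mine, not from the thread): each recursion level wraps the pattern in exactly one non-capturing group, so nesting depth grows with grammar depth even though the groups "do nothing".

```perl
#!/usr/bin/perl
# Recursively built pattern for nested bracketed values: one (?:...) per
# grammar level, so a 200-level grammar needs 200 nested groups.
use strict;
use warnings;

sub nested_brackets {
    my ($depth) = @_;
    my $pat = '\d+';
    $pat = '\[(?:' . $pat . ')?\]' for 1 .. $depth;   # one group per level
    return $pat;
}

my $pat = nested_brackets(200);
my $re  = qr/^$pat$/;
print "ok\n" if '[[[]]]' =~ $re;    # any bracket depth up to 200 matches
```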
Regarding @hvds's last comment: I guess we hit the C stack while calling reg() over and over. This is a parser bug, not an optimizer bug, I think.
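A toy analogue in Perl (made-up code, not perl's actual reg()) of why nesting depth bounds recursion depth in a recursive-descent parser: one stack frame is spent per open paren. Perl-level recursion lives on the heap and merely warns; C's fixed-size stack overflows, hence the SIGSEGV.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub parse {
    my ($str, $pos) = @_;
    while ($pos < length $$str) {
        my $c = substr $$str, $pos++, 1;
        if    ($c eq '(') { $pos = parse($str, $pos) }   # one frame per group
        elsif ($c eq ')') { return $pos }
    }
    return $pos;
}

my $pat = '(' x 10_000 . 'x' . ')' x 10_000;
parse(\$pat, 0);    # warns "Deep recursion" but finishes; C would overflow
print "parsed\n";
```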
I am closing this ticket as fixed. |
Migrated from rt.perl.org#59654 (status was 'open')
Searchable as RT59654$