
Subject: panic: swash_fetch got swatch of unexpected bit width
After upgrading to perl 5.20.2 my code fails with message: 'panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64' Simple test case attached. On perl 5.20.1 this code works without panic. The bug is subtle and is not reproduced after small modifications to test case (like storing string in intermediate variable).
Subject: panic_swash_fetch_testcase.pl
#!/usr/bin/perl
use utf8;
use open qw/:std :encoding(utf8)/;
use Devel::Peek;

my $x = "0005 RU Москва";
$x =~ /^(.*?)\s(.*)$/;
#Dump($2);
warn lc($2);
RT-Send-CC: perl5-porters [...] perl.org
There is related issue with utf8 and regexps in perl 5.20.2

This oneliner works correctly on perl 5.20.1 and show wrong result on perl 5.20.2

perl 5.20.1 > perl -C -Mutf8 -E '$x = "Москва"; $x =~ /^(.*)$/; say lc $1'
москва
perl 5.20.2 > perl -C -Mutf8 -E '$x = "Москва"; $x =~ /^(.*)$/; say lc $1'
1400ква
RT-Send-CC: perl5-porters [...] perl.org
On Wed Mar 18 08:45:28 2015, citrin.ru wrote:
> There is related issue with utf8 and regexps in perl 5.20.2
>
> This oneliner works correctly on perl 5.20.1 and show wrong result on
> perl 5.20.2
>
> perl 5.20.1 > perl -C -Mutf8 -E '$x = "Москва"; $x =~ /^(.*)$/; say lc $1'
> москва
> perl 5.20.2 > perl -C -Mutf8 -E '$x = "Москва"; $x =~ /^(.*)$/; say lc $1'
> 1400ква
=== base/tperl5.8.9 5.008009 x86_64-linux-thread-multi-ld Unrecognized switch: -E (-h will show valid options). Exit status: 512
=== base/perl5.10.0 5.010000 x86_64-linux москва
=== base/tperl5.10.0 5.010000 x86_64-linux-thread-multi-ld москва
=== base/perl5.10.1 5.010001 x86_64-linux москва
=== base/tperl5.10.1 5.010001 x86_64-linux-thread-multi-ld москва
=== base/perl5.11.0 5.011000 x86_64-linux москва
=== base/tperl5.11.0 5.011000 x86_64-linux-thread-multi-ld москва
=== base/perl5.11.1 5.011001 x86_64-linux москва
=== base/tperl5.11.1 5.011001 x86_64-linux-thread-multi-ld москва
=== base/perl5.11.2 5.011002 x86_64-linux москва
=== base/tperl5.11.2 5.011002 x86_64-linux-thread-multi-ld москва
=== base/perl5.11.3 5.011003 x86_64-linux москва
=== base/tperl5.11.3 5.011003 x86_64-linux-thread-multi-ld москва
=== base/perl5.11.4 5.011004 x86_64-linux москва
=== base/tperl5.11.4 5.011004 x86_64-linux-thread-multi-ld москва
=== base/perl5.11.5 5.011005 x86_64-linux москва
=== base/tperl5.11.5 5.011005 x86_64-linux-thread-multi-ld москва
=== base/perl5.12.0 5.012000 x86_64-linux москва
=== base/tperl5.12.0 5.012000 x86_64-linux-thread-multi-ld москва
=== base/perl5.12.1 5.012001 x86_64-linux москва
=== base/tperl5.12.1 5.012001 x86_64-linux-thread-multi-ld москва
=== base/perl5.12.2 5.012002 x86_64-linux москва
=== base/tperl5.12.2 5.012002 x86_64-linux-thread-multi-ld москва
=== base/perl5.12.3 5.012003 x86_64-linux москва
=== base/tperl5.12.3 5.012003 x86_64-linux-thread-multi-ld москва
=== base/perl5.12.4 5.012004 x86_64-linux москва
=== base/tperl5.12.4 5.012004 x86_64-linux-thread-multi-ld москва
=== base/perl5.12.5 5.012005 x86_64-linux москва
=== base/tperl5.12.5 5.012005 x86_64-linux-thread-multi-ld москва
=== base/perl5.13.0 5.013000 x86_64-linux москва
=== base/tperl5.13.0 5.013000 x86_64-linux-thread-multi-ld москва
=== base/perl5.13.1 5.013001 x86_64-linux москва
=== base/tperl5.13.1 5.013001 x86_64-linux-thread-multi-ld москва
=== base/perl5.13.2 5.013002 x86_64-linux москва
=== base/tperl5.13.2 5.013002 x86_64-linux-thread-multi-ld москва
=== base/perl5.13.3 5.013003 x86_64-linux москва
=== base/tperl5.13.3 5.013003 x86_64-linux-thread-multi-ld москва
=== base/perl5.13.4 5.013004 x86_64-linux москва
=== base/tperl5.13.4 5.013004 x86_64-linux-thread-multi-ld москва
=== base/perl5.13.5 5.013005 x86_64-linux москва
=== base/tperl5.13.5 5.013005 x86_64-linux-thread-multi-ld москва
=== base/perl5.13.6 5.013006 x86_64-linux москва
=== base/tperl5.13.6 5.013006 x86_64-linux-thread-multi-ld москва
=== base/perl5.13.7 5.013007 x86_64-linux москва
=== base/tperl5.13.7 5.013007 x86_64-linux-thread-multi-ld москва
=== base/perl5.13.8 5.013008 x86_64-linux москва
=== base/tperl5.13.8 5.013008 x86_64-linux-thread-multi-ld москва
=== base/perl5.13.9 5.013009 x86_64-linux москва
=== base/tperl5.13.9 5.013009 x86_64-linux-thread-multi-ld москва
=== base/perl5.13.10 5.013010 x86_64-linux москва
=== base/tperl5.13.10 5.013010 x86_64-linux-thread-multi-ld москва
=== base/perl5.13.11 5.013011 x86_64-linux москва
=== base/tperl5.13.11 5.013011 x86_64-linux-thread-multi-ld москва
=== base/perl5.14.0 5.014000 x86_64-linux москва
=== base/tperl5.14.0 5.014000 x86_64-linux-thread-multi-ld москва
=== base/perl5.14.1 5.014001 x86_64-linux москва
=== base/tperl5.14.1 5.014001 x86_64-linux-thread-multi-ld москва
=== base/perl5.14.2 5.014002 x86_64-linux москва
=== base/tperl5.14.2 5.014002 x86_64-linux-thread-multi-ld москва
=== base/perl5.14.3 5.014003 x86_64-linux москва
=== base/tperl5.14.3 5.014003 x86_64-linux-thread-multi-ld москва
=== base/perl5.14.4 5.014004 x86_64-linux москва
=== base/tperl5.14.4 5.014004 x86_64-linux-thread-multi-ld москва
=== base/perl5.15.0 5.015000 x86_64-linux москва
=== base/tperl5.15.0 5.015000 x86_64-linux-thread-multi-ld москва
=== base/perl5.15.1 5.015001 x86_64-linux москва
=== base/tperl5.15.1 5.015001 x86_64-linux-thread-multi-ld москва
=== base/perl5.15.2 5.015002 x86_64-linux москва
=== base/tperl5.15.2 5.015002 x86_64-linux-thread-multi-ld москва
=== base/perl5.15.3 5.015003 x86_64-linux москва
=== base/tperl5.15.3 5.015003 x86_64-linux-thread-multi-ld москва
=== base/perl5.15.4 5.015004 x86_64-linux москва
=== base/tperl5.15.4 5.015004 x86_64-linux-thread-multi-ld москва
=== base/perl5.15.5 5.015005 x86_64-linux москва
=== base/tperl5.15.5 5.015005 x86_64-linux-thread-multi-ld москва
=== base/perl5.15.6 5.015006 x86_64-linux москва
=== base/tperl5.15.6 5.015006 x86_64-linux-thread-multi-ld москва
=== base/perl5.15.7 5.015007 x86_64-linux москва
=== base/tperl5.15.7 5.015007 x86_64-linux-thread-multi-ld москва
=== base/perl5.15.8 5.015008 x86_64-linux москва
=== base/tperl5.15.8 5.015008 x86_64-linux-thread-multi-ld москва
=== base/perl5.15.9 5.015009 x86_64-linux москва
=== base/tperl5.15.9 5.015009 x86_64-linux-thread-multi-ld москва
=== base/perl5.16.0 5.016000 x86_64-linux москва
=== base/tperl5.16.0 5.016000 x86_64-linux-thread-multi-ld москва
=== base/perl5.16.1 5.016001 x86_64-linux москва
=== base/tperl5.16.1 5.016001 x86_64-linux-thread-multi-ld москва
=== base/perl5.16.2 5.016002 x86_64-linux москва
=== base/tperl5.16.2 5.016002 x86_64-linux-thread-multi-ld москва
=== base/perl5.16.3 5.016003 x86_64-linux москва
=== base/tperl5.16.3 5.016003 x86_64-linux-thread-multi-ld москва
=== base/perl5.17.0 5.017000 x86_64-linux москва
=== base/tperl5.17.0 5.017000 x86_64-linux-thread-multi-ld москва
=== base/perl5.17.1 5.017001 x86_64-linux москва
=== base/tperl5.17.1 5.017001 x86_64-linux-thread-multi-ld москва
=== base/perl5.17.2 5.017002 x86_64-linux москва
=== base/tperl5.17.2 5.017002 x86_64-linux-thread-multi-ld москва
=== base/perl5.17.3 5.017003 x86_64-linux москва
=== base/tperl5.17.3 5.017003 x86_64-linux-thread-multi-ld москва
=== base/perl5.17.4 5.017004 x86_64-linux москва
=== base/tperl5.17.4 5.017004 x86_64-linux-thread-multi-ld москва
=== base/perl5.17.5 5.017005 x86_64-linux москва
=== base/tperl5.17.5 5.017005 x86_64-linux-thread-multi-ld москва
=== base/perl5.17.6 5.017006 x86_64-linux москва
=== base/tperl5.17.6 5.017006 x86_64-linux-thread-multi-ld москва
=== base/perl5.17.7 5.017007 x86_64-linux москва
=== base/tperl5.17.7 5.017007 x86_64-linux-thread-multi-ld москва
=== base/perl5.17.8 5.017008 x86_64-linux москва
=== base/tperl5.17.8 5.017008 x86_64-linux-thread-multi-ld москва
=== base/perl5.17.9 5.017009 x86_64-linux москва
=== base/tperl5.17.9 5.017009 x86_64-linux-thread-multi-ld москва
=== base/perl5.17.10 5.017010 x86_64-linux москва
=== base/tperl5.17.10 5.017010 x86_64-linux-thread-multi-ld москва
=== base/perl5.17.11 5.017011 x86_64-linux москва
=== base/tperl5.17.11 5.017011 x86_64-linux-thread-multi-ld москва
=== base/perl5.18.0 5.018000 x86_64-linux москва
=== base/tperl5.18.0 5.018000 x86_64-linux-thread-multi-ld москва
=== base/perl5.18.1 5.018001 x86_64-linux москва
=== base/tperl5.18.1 5.018001 x86_64-linux-thread-multi-ld москва
=== base/perl5.18.2 5.018002 x86_64-linux москва
=== base/tperl5.18.2 5.018002 x86_64-linux-thread-multi-ld москва
=== base/perl5.18.3 5.018003 x86_64-linux москва
=== base/tperl5.18.3 5.018003 x86_64-linux-thread-multi-ld москва
=== base/perl5.18.4 5.018004 x86_64-linux москва
=== base/tperl5.18.4 5.018004 x86_64-linux-thread-multi-ld москва
=== base/perl5.19.0 5.019000 x86_64-linux москва
=== base/tperl5.19.0 5.019000 x86_64-linux-thread-multi-ld москва
=== base/perl5.19.1 5.019001 x86_64-linux москва
=== base/tperl5.19.1 5.019001 x86_64-linux-thread-multi-ld москва
=== base/perl5.19.2 5.019002 x86_64-linux москва
=== base/tperl5.19.2 5.019002 x86_64-linux-thread-multi-ld москва
=== base/perl5.19.3 5.019003 x86_64-linux москва
=== base/tperl5.19.3 5.019003 x86_64-linux-thread-multi-ld москва
=== base/perl5.19.4 5.019004 x86_64-linux москва
=== base/tperl5.19.4 5.019004 x86_64-linux-thread-multi-ld москва
=== base/perl5.19.5 5.019005 x86_64-linux москва
=== base/tperl5.19.5 5.019005 x86_64-linux-thread-multi-ld москва
=== base/perl5.19.6 5.019006 x86_64-linux москва
=== base/tperl5.19.6 5.019006 x86_64-linux-thread-multi-ld москва
=== base/perl5.19.7 5.019007 x86_64-linux москва
=== base/tperl5.19.7 5.019007 x86_64-linux-thread-multi-ld москва
=== base/perl5.19.8 5.019008 x86_64-linux москва
=== base/tperl5.19.8 5.019008 x86_64-linux-thread-multi-ld москва
=== base/perl5.19.9 5.019009 x86_64-linux москва
=== base/tperl5.19.9 5.019009 x86_64-linux-thread-multi-ld москва
=== base/perl5.19.10 5.019010 x86_64-linux москва
=== base/tperl5.19.10 5.019010 x86_64-linux-thread-multi-ld москва
=== base/perl5.19.11 5.019011 x86_64-linux москва
=== base/tperl5.19.11 5.019011 x86_64-linux-thread-multi-ld москва
=== base/perl5.20.0 5.020000 x86_64-linux москва
=== base/tperl5.20.0 5.020000 x86_64-linux-thread-multi-ld москва
=== base/perl5.20.1 5.020001 x86_64-linux москва
=== base/tperl5.20.1 5.020001 x86_64-linux-thread-multi-ld москва
=== base/perl5.20.2 5.020002 x86_64-linux 1400ква
=== base/tperl5.20.2 5.020002 x86_64-linux-thread-multi-ld 1400ква
=== base/perl5.21.0 5.021000 x86_64-linux москва
=== base/tperl5.21.0 5.021000 x86_64-linux-thread-multi-ld москва
=== base/perl5.21.1 5.021001 x86_64-linux москва
=== base/tperl5.21.1 5.021001 x86_64-linux-thread-multi-ld москва
=== base/perl5.21.2 5.021002 x86_64-linux москва
=== base/tperl5.21.2 5.021002 x86_64-linux-thread-multi-ld москва
=== base/perl5.21.3 5.021003 x86_64-linux москва
=== base/tperl5.21.3 5.021003 x86_64-linux-thread-multi-ld москва
=== base/perl5.21.4 5.021004 x86_64-linux 18a0ква
=== base/tperl5.21.4 5.021004 x86_64-linux-thread-multi-ld 18a0ква
=== base/perl5.21.5 5.021005 x86_64-linux 18a0ква
=== base/tperl5.21.5 5.021005 x86_64-linux-thread-multi-ld 18a0ква
=== base/perl5.21.6 5.021006 x86_64-linux 18a0ква
=== base/tperl5.21.6 5.021006 x86_64-linux-thread-multi-ld 18a0ква
=== base/perl5.21.7 5.021007 x86_64-linux 18a0ква
=== base/tperl5.21.7 5.021007 x86_64-linux-thread-multi-ld 18a0ква
=== base/perl5.21.8 5.021008 x86_64-linux 18a0ква
=== base/tperl5.21.8 5.021008 x86_64-linux-thread-multi-ld 18a0ква
=== base/perl5.21.9 5.021009 x86_64-linux 18a0ква
=== base/tperl5.21.9 5.021009 x86_64-linux-thread-multi-ld 18a0ква
=== /usr/bin/perl 5.010001 x86_64-linux-thread-multi москва
From: Dave Mitchell <davem [...] iabyn.com>
CC: perl5-porters [...] perl.org, demerphq <demerphq [...] gmail.com>
Date: Wed, 18 Mar 2015 21:13:21 +0000
To: "Anton V. Yuzhaninov via RT" <perlbug-comment [...] perl.org>
Subject: Re: [perl #124109] panic: swash_fetch got swatch of unexpected bit width
On Wed, Mar 18, 2015 at 08:45:29AM -0700, Anton V. Yuzhaninov via RT wrote:
> There is related issue with utf8 and regexps in perl 5.20.2
>
> This oneliner works correctly on perl 5.20.1 and show wrong result on perl 5.20.2
>
> perl 5.20.1 > perl -C -Mutf8 -E '$x = "Москва"; $x =~ /^(.*)$/; say lc $1'
> москва
> perl 5.20.2 > perl -C -Mutf8 -E '$x = "Москва"; $x =~ /^(.*)$/; say lc $1'
> 1400ква
It bisects to this:

commit 2c1f00b9036a7987c714a407662651ef7da99495
Author:     Yves Orton <demerphq@gmail.com>
AuthorDate: Thu Sep 11 21:55:08 2014 +0200
Commit:     Yves Orton <demerphq@gmail.com>
CommitDate: Thu Sep 11 22:45:31 2014 +0200

    perl #122747: localize PL_curpm to null in _core_swash_init

    Set PL_curpm to null before we do any swash intialization in
    _core_swash_init(). This "hides" the current regop from the
    swash code, with the intent of prevent weird reentrancy bugs
    when the swashes are initialized.

    Long term you could argue that we should just not use the regex
    engine to initialize a swash, and then this would be unnecessary.

    Thanks to FC for the suggestion!

which was cherry-picked into 5.20.2 by
b0ed92ce4c0a1752748e14dfd31d244ff2a7a93e.

--
The Enterprise's efficient long-range scanners detect a temporal vortex
distortion in good time, allowing it to be safely avoided via a minor
course correction.
    -- Things That Never Happen in "Star Trek" #21
CC: perl5-porters [...] perl.org, demerphq <demerphq [...] gmail.com>
From: Dave Mitchell <davem [...] iabyn.com>
Date: Thu, 19 Mar 2015 15:04:08 +0000
To: "Anton V. Yuzhaninov via RT" <perlbug-comment [...] perl.org>
Subject: Re: [perl #124109] panic: swash_fetch got swatch of unexpected bit width
On Wed, Mar 18, 2015 at 09:13:21PM +0000, Dave Mitchell wrote:
> On Wed, Mar 18, 2015 at 08:45:29AM -0700, Anton V. Yuzhaninov via RT wrote:
> > There is related issue with utf8 and regexps in perl 5.20.2
> >
> > This oneliner works correctly on perl 5.20.1 and show wrong result on perl 5.20.2
> >
> > perl 5.20.1 > perl -C -Mutf8 -E '$x = "Москва"; $x =~ /^(.*)$/; say lc $1'
> > москва
> > perl 5.20.2 > perl -C -Mutf8 -E '$x = "Москва"; $x =~ /^(.*)$/; say lc $1'
> > 1400ква
>
> It bisects to this:

I'm just about to have a look at this, in case anyone's already started.

--
Music lesson: a symbiotic relationship whereby a pupil's embellishments
concerning the amount of practice performed since the last lesson are
rewarded with embellishments from the teacher concerning the pupil's
progress over the corresponding period.
Subject: Re: [perl #124109] panic: swash_fetch got swatch of unexpected bit width
To: "Anton V. Yuzhaninov via RT" <perlbug-comment [...] perl.org>
Date: Fri, 20 Mar 2015 11:55:34 +0000
From: Dave Mitchell <davem [...] iabyn.com>
CC: perl5-porters [...] perl.org, demerphq <demerphq [...] gmail.com>
On Thu, Mar 19, 2015 at 03:04:08PM +0000, Dave Mitchell wrote:
> On Wed, Mar 18, 2015 at 09:13:21PM +0000, Dave Mitchell wrote:
> > On Wed, Mar 18, 2015 at 08:45:29AM -0700, Anton V. Yuzhaninov via RT wrote:
> > > There is related issue with utf8 and regexps in perl 5.20.2
> > >
> > > This oneliner works correctly on perl 5.20.1 and show wrong result on perl 5.20.2
> > >
> > > perl 5.20.1 > perl -C -Mutf8 -E '$x = "Москва"; $x =~ /^(.*)$/; say lc $1'
> > > москва
> > > perl 5.20.2 > perl -C -Mutf8 -E '$x = "Москва"; $x =~ /^(.*)$/; say lc $1'
> > > 1400ква
> >
> > It bisects to this:
>
> I'm just about to have a look at this, in case anyone's already started.
I've now pushed the branch smoke-me/davem/save_re for smoking.
The top commit is shown below; two earlier commits revert removing
Perl_save_re_context().

For maint-5.20, I'm not sure whether it's best to revert Yves'
2c1f00b9036a, which fixes a longstanding "use re 'taint'" issue but
introduces a regression; or cherry-pick my fix, which fixes the
regression, leaves the longstanding issue fixed, but might introduce
other regressions - since it now scans %:: to decide which of $1,$2...
to localise, rather than using RX_NPARENS(PL_curpm).

commit 9c186b0ba7509de9ee21c79481df6c7123fdccb8
Author:     David Mitchell <davem@iabyn.com>
AuthorDate: Thu Mar 19 20:26:39 2015 +0000
Commit:     David Mitchell <davem@iabyn.com>
CommitDate: Fri Mar 20 11:42:13 2015 +0000

    Reinstate save_re_context()'s "local $n" action

    RT #124109.

    2c1f00b9036 localised PL_curpm to NULL when calling swash init code
    (i.e. perl-level code that is loaded and executed when something like
    "lc $large_codepoint" is executed).

    b4fa55d3f1 followed this up by gutting Perl_save_re_context(), since
    that function did, basically,

        if (PL_curpm) {
            for (i = 1; i <= RX_NPARENS(PM_GETRE(PL_curpm))) {
                do the C equivalent of the perl code "local ${i}";
            }
        }

    and now that PL_curpm was null, the code wasn't called any more.

    However, it turns out that the localisation *was* still needed, it's
    just that nothing in the test suite actually tested for it.

    In something like the following:

        $x = "\x{41c}";
        $x =~ /(.*)/;
        $s = lc $1;

    pp_lc() calls get magic on $1, which sets $1's PV value to a copy of the
    substring captured by the current pattern match.
    Then pp_lc() calls a function to convert the string to upper case, which
    triggers a swash load, which calls perl code that does a pattern match
    and, most importantly, uses the value of $1. This triggers get magic on
    $1, which overwrites $1's PV value with a new value. When control returns
    to pp_lc(), $1 now holds the wrong string value.

    Hence $1, $2 etc need localising as well as PL_curpm.

    The old way that Perl_save_re_context() used to work (localising
    $1..${RX_NPARENS}) is no longer viable, as we don't know what regex the
    caller has as its current pm any more. Instead I've made
    Perl_save_re_context() scan %:: for any entries that match
    $`,$&,$', $1, $2,.... and localise them.

--
"But Sidley Park is already a picture, and a most amiable picture too.
The slopes are green and gentle. The trees are companionably grouped at
intervals that show them to advantage. The rill is a serpentine ribbon
unwound from the lake peaceably contained by meadows on which the right
amount of sheep are tastefully arranged." -- Lady Croom, "Arcadia"
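Editorial sketch, not part of the thread: the failure described above depends on lc() operating directly on the magical $1. The original report notes that the panic disappears when the string is first stored in an intermediate variable. The snippet below illustrates that defensive pattern under the assumption that a list-context match hands back plain copies of the captures; lc_first_capture() is a hypothetical helper name, not anything from the perl source.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper: lowercase the first capture of a match without
# ever passing the magical $1 itself to lc().
sub lc_first_capture {
    my ($text) = @_;

    # A list-context match returns the captured substrings; assigning them
    # to a lexical gives us an ordinary scalar with no regex magic attached.
    my ($captured) = $text =~ /^(.*)$/
        or return;

    return lc $captured;
}

binmode STDOUT, ':encoding(UTF-8)';
print lc_first_capture("Москва"), "\n";
```

This only avoids the symptom in calling code; it does not address the underlying need to localise $1, $2, ... around swash loading.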
RT-Send-CC: perl5-porters [...] perl.org
On Fri Mar 20 04:56:03 2015, davem wrote:
> I've now pushed the branch smoke-me/davem/save_re for smoking.
> The top commit is shown below; two earlier commits revert removing
> Perl_save_re_context().
>
> For maint-5.20, I'm not sure whether it's best to revert Yves'
> 2c1f00b9036a, which fixes a longstanding "use re 'taint'" issue but
> introduces a regression; or cherry-pick my fix, which fixes the
> regression, leaves the longstanding issue fixed, but might introduce
> other regressions - since it now scans %:: to decide which of $1,$2...
> to localise, rather than using RX_NPARENS(PL_curpm).
I think for that reason it is problematic, not only for maint, but also for blead. We should not internally be resetting the iterator provided for Perl space. This would break:

    while(my($key,$val) = each %::) {
        # read $1
    }

If this is the only possible approach, then a ‘manual’ iteration that does not use the hash's built-in iterator seems to be the way to go. I think we already have code like that in dump.c.
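Editorial sketch, not part of the thread: the concern above is that each() relies on a single iterator stored in the hash itself, so anything else that restarts that iterator disturbs a loop already in progress. The snippet below shows the Perl-level analogue with keys(), which is documented to reset the iterator. A small private hash stands in for %::, and keys_resets_each() is a hypothetical name.

```perl
use strict;
use warnings;

# Returns true if calling keys() between two each() calls restarts iteration.
sub keys_resets_each {
    my %h = (a => 1, b => 2, c => 3);

    my ($first) = each %h;   # advance the hash's iterator by one entry
    my $count   = keys %h;   # side effect: resets that same iterator
    my ($again) = each %h;   # so this starts from the beginning again

    return $first eq $again;
}

print keys_resets_each() ? "iterator was reset\n" : "iterator kept its place\n";
```

A while/each loop that is interrupted this way on every pass would keep seeing the first entry, which is why an internal scan of %:: should not share the iterator that Perl code can observe.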
> pp_lc() calls get magic on $1, which sets $1's PV value to a copy of
> the substring captured by the current pattern match.
> Then pp_lc() calls a function to convert the string to upper case,
> which triggers a swash load, which calls perl code that does a
> pattern match

Is that in utf8_heavy.pl?

> and, most importantly, uses the value of $1. This triggers get magic
> on $1, which overwrites $1's PV value with a new value. When control
> returns to pp_lc(), $1 now holds the wrong string value.

Ow. That potentially affects any magical variable read by swash code. Until we find a cleaner solution, wouldn’t it be better to localise only $1, $2 and $3, which are all that utf8_heavy.pl uses? We could, as an interim measure, add a porting test (oh no!) that ensures we do not accidentally add $4 to utf8_heavy.pl without also localising it.

Also, if we handle this issue when swashes are loaded, rather than in the regexp engine, wouldn’t the localisation take place less often? Doing the localisation in a different place perhaps should wait, though, till after 5.22. After all, we *know* it works in the current spot.

> Then pp_lc() calls a function to convert the string to upper case,

Surely it converts it to lowercase.

--
Father Chrysostomos
To: Father Chrysostomos via RT <perlbug-followup [...] perl.org>
Subject: Re: [perl #124109] panic: swash_fetch got swatch of unexpected bit width
CC: perl5-porters [...] perl.org
From: Dave Mitchell <davem [...] iabyn.com>
Date: Wed, 25 Mar 2015 12:10:55 +0000
On Tue, Mar 24, 2015 at 11:28:27PM -0700, Father Chrysostomos via RT wrote:
> > pp_lc() calls get magic on $1, which sets $1's PV value to a copy of
> > the substring captured by the current pattern match.
> > Then pp_lc() calls a function to convert the string to upper case,
> > which triggers a swash load, which calls perl code that does a
> > pattern match
>
> Is that in utf8_heavy.pl?

Yes, it's the method SWASHNEW() in utf8_heavy.pl. But note that the swash mechanism loads utf8.pm, which has an AUTOLOAD method, so that when SWASHNEW() is called, utf8_heavy.pl is required, which also loads strict, warnings and re. So quite a bit of code is required and/or executed on first swash loading.

> > and, most importantly, uses the value of $1. This triggers get magic
> > on $1, which overwrites $1's PV value with a new value. When control
> > returns to pp_lc(), $1 now holds the wrong string value.
>
> Ow. That potentially affects any magical variable read by swash code.
> Until we find a cleaner solution, wouldn’t it be better to localise only
> $1, $2 and $3, which are all that utf8_heavy.pl uses? We could, as an
> interim measure, add a porting test (oh no!) that ensures we do not
> accidentally add $4 to utf8_heavy.pl without also localising it.
>
> Also, if we handle this issue when swashes are loaded, rather than in
> the regexp engine, wouldn’t the localisation take place less often?
> Doing the localisation in a different place perhaps should wait, though,
> till after 5.22. After all, we *know* it works in the current spot.
Note that it isn't just swash code. Any code that "unexpectedly" calls
into Perl from C can have similar side effects, for example ties:

    sub TIESCALAR { bless [] }
    sub FETCH {
        "xyz" =~ /(...)/ or die;
        my $x = $1;
        "foo";
    }

    my $s;
    tie $s, 'main';

    "abc" =~ /(...)/ or die;
    my $x = $1 . $s;
    print "x=[$x]\n";

This outputs "xyzfoo" rather than "abcfoo".

Note also that save_re_context() isn't called just when loading swatches,
but at other places, such as calling PL_warnhook/diehook. So just testing
utf8_heavy.pl (and the modules it loads) in porting isn't sufficient as an
interim measure. (But save_re_context() is *not* called before calling tie
methods.)

Perhaps:

* we restore the original behaviour of save_re_context(), that does
      local ${$_} for 1..RX_NPARENS()
  This makes all the other callers of save_re_context() no worse than
  5.20.

* In addition for the case where PL_curpm is null, rather than skipping
  the localisation, we localise $1..$3. This fixes the 5.21.x regression.
  We then add a porting test that checks that utf8.pm, utf8_heavy.pl and
  dependences don't contain any punctuation vars in their source apart
  from $1,$2,$3.

* Post 5.22, we try and do something more clever.
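Editorial sketch, not part of the thread: the porting test proposed above could be approximated by scanning source text for match variables other than $1, $2 and $3. The helper below is purely illustrative; its name, the exact set of variables it looks for ($&, $`, $', $+ and numbered captures), and the naive regex-based scan are all assumptions, and it will produce false positives on strings or comments that merely look like such variables. It is not the test that was eventually added to perl.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted list of match-related variables
# mentioned in $source other than $1, $2 and $3.
sub disallowed_match_vars {
    my ($source) = @_;
    my %seen;

    # Look for $<digits> (not starting with 0) and for $&, $`, $' and $+.
    while ($source =~ /\$([1-9]\d*|[&`'+])/g) {
        my $name = $1;
        next if $name =~ /\A[123]\z/;    # $1..$3 are the allowed ones
        $seen{"\$$name"} = 1;
    }

    return sort keys %seen;
}

my @bad = disallowed_match_vars('my $k = $1 . $2; my $v = $4 . $&;');
print "disallowed: @bad\n";
```

In a real porting test the source text would come from reading utf8.pm, utf8_heavy.pl and the modules they load, and a non-empty result would be reported as a failure.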
> > Then pp_lc() calls a function to convert the string to upper case,
>
> Surely it converts it to lowercase.

Damn, no wonder my code has been failing for all these years ;-)

--
"I do not resent criticism, even when, for the sake of emphasis,
it parts for the time with reality".
    -- Winston Churchill, House of Commons, 22nd Jan 1941.
RT-Send-CC: perl5-porters [...] perl.org
On Wed Mar 25 05:11:38 2015, davem wrote:
> Perhaps:
>
> * we restore the original behaviour of save_re_context(), that does
>       local ${$_} for 1..RX_NPARENS()
>   This makes all the other callers of save_re_context() no worse than
>   5.20.
> * In addition for the case where PL_curpm is null, rather than skipping
>   the localisation, we localise $1..$3. This fixes the 5.21.x regression.
>   We then add a porting test that checks that utf8.pm, utf8_heavy.pl and
>   dependences don't contain any punctuation vars in their source apart
>   from $1,$2,$3.
>
> * Post 5.22, we try and do something more clever.

I think that suggestion is good.

> > > Then pp_lc() calls a function to convert the string to upper case,
> >
> > Surely it converts it to lowercase.
>
> Damn, no wonder my code has been failing for all these years ;-)

:-)

--
Father Chrysostomos
Subject: Re: [perl #124109] panic: swash_fetch got swatch of unexpected bit width
To: Father Chrysostomos via RT <perlbug-followup [...] perl.org>
Date: Wed, 25 Mar 2015 17:21:56 +0000
From: Dave Mitchell <davem [...] iabyn.com>
CC: perl5-porters [...] perl.org
On Wed, Mar 25, 2015 at 08:45:57AM -0700, Father Chrysostomos via RT wrote:
> On Wed Mar 25 05:11:38 2015, davem wrote:
> > Perhaps:
> >
> > * we restore the original behaviour of save_re_context(), that does
> >       local ${$_} for 1..RX_NPARENS()
> >   This makes all the other callers of save_re_context() no worse than
> >   5.20.
> > * In addition for the case where PL_curpm is null, rather than skipping
> >   the localisation, we localise $1..$3. This fixes the 5.21.x regression.
> >   We then add a porting test that checks that utf8.pm, utf8_heavy.pl and
> >   dependences don't contain any punctuation vars in their source apart
> >   from $1,$2,$3.
> >
> > * Post 5.22, we try and do something more clever.
>
> I think that suggestion is good.

Now done as smoke-me/davem/save_re2

--
Music lesson: a symbiotic relationship whereby a pupil's embellishments
concerning the amount of practice performed since the last lesson are
rewarded with embellishments from the teacher concerning the pupil's
progress over the corresponding period.
Date: Mon, 30 Mar 2015 12:15:42 +0100
CC: perl5-porters [...] perl.org
From: Dave Mitchell <davem [...] iabyn.com>
Subject: Re: [perl #124109] panic: swash_fetch got swatch of unexpected bit width
To: Father Chrysostomos via RT <perlbug-followup [...] perl.org>
On Wed, Mar 25, 2015 at 05:21:56PM +0000, Dave Mitchell wrote:
> On Wed, Mar 25, 2015 at 08:45:57AM -0700, Father Chrysostomos via RT wrote:
> > On Wed Mar 25 05:11:38 2015, davem wrote:
> > > Perhaps:
> > >
> > > * we restore the original behaviour of save_re_context(), that does
> > >       local ${$_} for 1..RX_NPARENS()
> > >   This makes all the other callers of save_re_context() no worse than
> > >   5.20.
> > > * In addition for the case where PL_curpm is null, rather than skipping
> > >   the localisation, we localise $1..$3. This fixes the 5.21.x regression.
> > >   We then add a porting test that checks that utf8.pm, utf8_heavy.pl and
> > >   dependences don't contain any punctuation vars in their source apart
> > >   from $1,$2,$3.
> > >
> > > * Post 5.22, we try and do something more clever.
> >
> > I think that suggestion is good.
>
> Now done as smoke-me/davem/save_re2

Now merged into blead as

4e0341d Perl_save_re_context(): re-indent after last commit
3553f4f save_re_context(): do "local $n" with no PL_curpm
e8d8f80 Revert "Gut Perl_save_re_context"
2782061 Revert "Don’t call save_re_context"
7c6e85a Revert "Mathomise save_re_context"

--
Never do today what you can put off till tomorrow.
To: Father Chrysostomos via RT <perlbug-followup [...] perl.org>
Subject: Re: [perl #124109] panic: swash_fetch got swatch of unexpected bit width
CC: perl5-porters [...] perl.org
From: Dave Mitchell <davem [...] iabyn.com>
Date: Mon, 30 Mar 2015 12:46:22 +0100
On Mon, Mar 30, 2015 at 12:15:42PM +0100, Dave Mitchell wrote:
> Now merged into blead as
>
> 4e0341d Perl_save_re_context(): re-indent after last commit
> 3553f4f save_re_context(): do "local $n" with no PL_curpm
> e8d8f80 Revert "Gut Perl_save_re_context"
> 2782061 Revert "Don’t call save_re_context"
> 7c6e85a Revert "Mathomise save_re_context"

3553f4f (and 4e0341d) will need cherry-picking for maint-5.20

--
The Enterprise is captured by a vastly superior alien intelligence which
does not put them on trial.
    -- Things That Never Happen in "Star Trek" #10
Thank you for submitting this ticket. The issue should now be resolved with the release today of Perl v5.22, which is available at http://www.perl.org/get.html -- Karl Williamson for the Perl 5 team

