v5.25.5-184-ga5540cf breaks texinfo #15696

Closed
p5pRT opened this issue Nov 3, 2016 · 58 comments

Comments

@p5pRT

p5pRT commented Nov 3, 2016

Migrated from rt.perl.org#130010 (status was 'resolved')

Searchable as RT130010$

@p5pRT

p5pRT commented Nov 3, 2016

From @jkeenan

[Transferring to RT a report of blead breaking a downstream program
originally filed on perl.perl5.porters by Mathieu Arnold. -- jkeenan]

#####

Like every week, I updated the perl5-devel FreeBSD port (this week, from
v5.25.5-115-g4eadd82 to v5.25.6-72-g92c843f). Perl built just fine, and
make test runs just fine too.

Today, I discovered that texinfo could not be built with the new Perl
version. After a quick git bisect (git bisect is your god), I ended up
with v5.25.5-183-g8204e83 being the last good commit, and
v5.25.5-184-ga5540cf being the first bad one.
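
The bisect mentioned above can be automated with `git bisect run`, which treats exit status 0 as good and statuses 1 to 127 (except 125, which means "skip") as bad. A minimal sketch of a check script in Perl follows. The helper name `command_succeeds` and the idea of passing the reproduction command as arguments are assumptions made for illustration; the report does not say how the bisect was actually driven.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: run a command and report whether it exited cleanly.
sub command_succeeds {
    my (@cmd) = @_;
    my $status = system(@cmd);    # wait status; 0 means a clean exit
    return $status == 0 ? 1 : 0;
}

# When invoked with arguments (for example from "git bisect run"), treat
# them as the reproduction command and map the result to an exit code.
if (@ARGV) {
    exit( command_succeeds(@ARGV) ? 0 : 1 );
}
```

It might be invoked as `git bisect run perl check.pl ./reproduce.sh`, where both file names are hypothetical and `reproduce.sh` would have to rebuild perl and then try to build texinfo with it.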

The error I get in texinfo is​:

Modification of a read-only value attempted at
../tp/Texinfo/Convert/Line.pm line 309.

That line being​:

} elsif ($text =~ s/^(([^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)//) {

When that fails, $text contains " Texinfo documentation system".

Adding use diagnostics to that file, I get [pastebin.com link has
expired]

It turns out that the problem is \p{InFullwidth} (which comes from
Unicode::EastAsianWidth): its usage in this case (I don't know
texinfo at all) got broken.
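
For context, a user-defined property such as InFullwidth is an ordinary sub whose name starts with "In" or "Is" and which returns a list of code-point ranges. The sketch below defines a stand-in property and uses it in a substitution of the same shape as the failing line. The names `InDemoFullwidth` and `take_leading_word`, and the two ranges chosen, are assumptions for illustration and are not taken from Unicode::EastAsianWidth.

```perl
use strict;
use warnings;

# Hypothetical property, defined here only for illustration. It returns
# one range per line: hex start, tab, hex end.
sub InDemoFullwidth {
    return "FF01\tFF60\nFFE0\tFFE6\n";
}

# Same shape as the substitution on Line.pm line 309, with the demo
# property swapped in for InFullwidth.
sub take_leading_word {
    my ($text) = @_;
    if ($text =~ s/^(([^\s\p{InDemoFullwidth}]|[\x{202f}\x{00a0}])+)//) {
        return ($1, $text);
    }
    return (undef, $text);
}

my ($word, $rest) = take_leading_word("Texinfo documentation system");
print "word=<$word> rest=<$rest>\n";
```

On a perl that is not affected by this ticket, the substitution should strip the leading word "Texinfo" and leave the rest of the string, starting with the space, in place.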

[For reference, Texinfo​::Convert​::Line is part of the GNU texinfo
library​: http​://ftpmirror.gnu.org/texinfo/texinfo-6.1.tar.gz]

Mat​: Could you please provide the output of ./perl -lib -V from the
machine on which you detected this problem? Also, since it appears that
your original pastebin link has expired, can you provide its content in
a reply to this ticket?

Thank you very much.
Jim Keenan

[perl -V info omitted because this is not being sent from the
problematic environment.]

@p5pRT

p5pRT commented Nov 3, 2016

From @khwilliamson

As a BBC ticket, I've marked this as a blocker for 5.26
--
Karl Williamson

@p5pRT

p5pRT commented Nov 3, 2016

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

p5pRT commented Nov 3, 2016

From mat@cpan.org

Content of the pastebin​:

Modification of a read-only value attempted at ../tp/Texinfo/Convert/Line.pm
  line 309 (#1)
  (F) You tried, directly or indirectly, to change the value of a
  constant. You didn't, of course, try "2 = 1", because the compiler
  catches that. But an easy way to do the same thing is​:

  sub mod { $_[0] = 1 }
  mod(2);

  Another way is to assign to a substr() that's off the end of the string.

  Yet another way is to assign to a foreach loop VAR when VAR
  is aliased to a constant in the look LIST​:

  $x = 1;
  foreach my $n ($x, 2) {
      $n *= 2; # modifies the $x, but fails on attempt to
  }            # modify the 2

Uncaught exception from user code​:
  Modification of a read-only value attempted at ../tp/Texinfo/Convert/Line.pm line 309.
  Texinfo​::Convert​::Line​::add_text(Texinfo​::Convert​::Line=HASH(0x80872f7f8), "Texinfo documentation system") called at ../tp/Texinfo/Convert/Plaintext.pm line 1668
  Texinfo​::Convert​::Plaintext​::_convert(Texinfo​::Convert​::Info=HASH(0x808701900), HASH(0x805486360)) called at ../tp/Texinfo/Convert/Plaintext.pm line 3187
  Texinfo​::Convert​::Plaintext​::_convert(Texinfo​::Convert​::Info=HASH(0x808701900), HASH(0x808708030)) called at ../tp/Texinfo/Convert/Plaintext.pm line 705
  Texinfo​::Convert​::Plaintext​::convert_line(Texinfo​::Convert​::Info=HASH(0x808701900), HASH(0x808708030)) called at ../tp/Texinfo/Convert/Info.pm line 340
  Texinfo​::Convert​::Info​::_info_header(Texinfo​::Convert​::Info=HASH(0x808701900)) called at ../tp/Texinfo/Convert/Info.pm line 81
  Texinfo​::Convert​::Info​::output(Texinfo​::Convert​::Info=HASH(0x808701900), HASH(0x8053d33c0)) called at ../tp/texi2any line 1348
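
For comparison, the usual way to trigger this diagnostic from user code is to write through an alias to a constant, as the perldiag text above describes. The sketch below reproduces that ordinary case and traps the message with eval; the helper name `readonly_error` is hypothetical. Nothing of this kind is obviously present on line 309 of Line.pm, which may be why the failure was reported against perl rather than against texinfo.

```perl
use strict;
use warnings;

# Try to modify a literal through foreach aliasing and return the error
# text, or an empty string if no error was raised.
sub readonly_error {
    my $ok = eval {
        for my $n (1, 2) {
            $n *= 2;    # $n is aliased to a read-only literal
        }
        1;
    };
    return $ok ? '' : $@;
}

print readonly_error();
```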

@p5pRT

p5pRT commented Nov 3, 2016

From mat@cpan.org

root@​11amd64-ports-perl5-devel​:~ # perl -lib -V
-i used with no filenames on the command line, reading from STDIN.
Summary of my perl5 (revision 5 version 25 subversion 7) configuration​:

  Platform​:
  osname=freebsd
  osvers=11.0-release-p3
  archname=amd64-freebsd-thread-multi
  uname='freebsd 11amd64-ports-perl5-devel-job-02 11.0-release-p3 freebsd 11.0-release-p3 amd64 '
  config_args='-sde -Dprefix=/usr/local -Dlibperl=libperl.so.5.25.6.144 -Darchlib=/usr/local/lib/perl5/5.25/mach -Dprivlib=/usr/local/lib/perl5/5.25 -Dman3dir=/usr/local/lib/perl5/5.25/perl/man/man3 -Dman1dir=/usr/local/lib/perl5/5.25/perl/man/man1 -Dsitearch=/usr/local/lib/perl5/site_perl/mach/5.25 -Dsitelib=/usr/local/lib/perl5/site_perl -Dscriptdir=/usr/local/bin -Dsiteman3dir=/usr/local/lib/perl5/site_perl/man/man3 -Dsiteman1dir=/usr/local/lib/perl5/site_perl/man/man1 -Ui_malloc -Ui_iconv -Uinstallusrbinperl -Dusenm=n -Dcc=cc -Duseshrplib -Dinc_version_list=none -Dcf_by=mat -Dcf_email=mat@FreeBSD.org -Dcf_time=Wed Nov 02 10:24:46 UTC 2016 -Alddlflags=-L/wrkdirs/usr/ports/lang/perl5-devel/work/perl5-5.25.6-144-g26e9d72 -L/usr/local/lib/perl5/5.25/mach/CORE -lperl -Dshrpldflags=$(LDDLFLAGS:N-L/wrkdirs/usr/ports/lang/perl5-devel/work/perl5-5.25.6-144-g26e9d72:N-L/usr/local/lib/perl5/5.25/mach/CORE:N-lperl) -Wl,-soname,$(LIBPERL) -Dusedevel -Uversiononly -Doptimize=-O2 -pipe -fstack-protector -fno-strict-aliasing -Ui_gdbm -Dusemultiplicity=y -Duse64bitint -Dusethreads=y -Dusemymalloc=n'
  hint=recommended
  useposix=true
  d_sigaction=define
  useithreads=define
  usemultiplicity=define
  use64bitint=define
  use64bitall=define
  uselongdouble=undef
  usemymalloc=n
  bincompat5005=undef
  Compiler​:
  cc='cc'
  ccflags ='-DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_FORTIFY_SOURCE=2'
  optimize='-O2 -pipe -fstack-protector -fno-strict-aliasing'
  cppflags='-DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
  ccversion=''
  gccversion='4.2.1 Compatible FreeBSD Clang 3.8.0 (tags/RELEASE_380/final 262564)'
  gccosandvers=''
  intsize=4
  longsize=8
  ptrsize=8
  doublesize=8
  byteorder=12345678
  doublekind=3
  d_longlong=define
  longlongsize=8
  d_longdbl=define
  longdblsize=16
  longdblkind=3
  ivtype='long'
  ivsize=8
  nvtype='double'
  nvsize=8
  Off_t='off_t'
  lseeksize=8
  alignbytes=8
  prototype=define
  Linker and Libraries​:
  ld='cc'
  ldflags ='-lpthread -Wl,-E -fstack-protector-strong -L/usr/local/lib'
  libpth=/usr/lib /usr/local/lib /usr/bin/../lib/clang/3.8.0/lib /usr/lib
  libs=-lpthread -lm -lcrypt -lutil
  perllibs=-lpthread -lm -lcrypt -lutil
  libc=
  so=so
  useshrplib=true
  libperl=libperl.so.5.25.6.144
  gnulibc_version=''
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs
  dlext=so
  d_dlsymun=undef
  ccdlflags=' -Wl,-R/usr/local/lib/perl5/5.25/mach/CORE'
  cccdlflags='-DPIC -fPIC'
  lddlflags='-shared -L/usr/local/lib/perl5/5.25/mach/CORE -lperl -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl):

  Compile-time options:
    HAS_TIMES
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_ONE_AT_A_TIME_HARD
    PERL_IMPLICIT_CONTEXT
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_DEVEL
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
    USE_REENTRANT_API
  Built under freebsd
  @INC:
    /usr/local/lib/perl5/site_perl/mach/5.25
    /usr/local/lib/perl5/site_perl
    /usr/local/lib/perl5/5.25/mach
    /usr/local/lib/perl5/5.25
    .

@p5pRT

p5pRT commented Nov 3, 2016

From @jkeenan

On Wed Nov 02 20​:33​:30 2016, jkeen@​verizon.net wrote​:

[Transferring to RT a report of blead breaking a downstream program
originally filed on perl.perl5.porters by Mathieu Arnold. -- jkeenan]

[For reference, Texinfo​::Convert​::Line is part of the GNU texinfo
library​: http​://ftpmirror.gnu.org/texinfo/texinfo-6.1.tar.gz]

Today I downloaded this tarball and attempted to configure/build/test it following the instructions in its INSTALL.generic file.

I experienced a problem as early as './configure'. As that shell script ran, at one point I got output like this​:

#####
/bin/bash ./libtool --tag=CC --mode=link cc -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DVERSION=\"6.0\" -DXS_VERSION=\"6.0\" "-I/home/jkeenan/perl5/perlbrew/perls/perl-5.24.0/lib/5.24.0/x86_64-linux/CORE" -no-undefined -L/home/jkeenan/perl5/perlbrew/perls/perl-5.24.0/lib/5.24.0/x86_64-linux/CORE -lperl -avoid-version -module -Wl,-E -o TestXS.la -rpath /usr/local/lib/texinfo TestXS_la-TestXS.lo
libtool​: link​: cc -shared -fPIC -DPIC .libs/TestXS_la-TestXS.o -L/home/jkeenan/perl5/perlbrew/perls/perl-5.24.0/lib/5.24.0/x86_64-linux/CORE -lperl -Wl,-E -Wl,-soname -Wl,TestXS.so -o .libs/TestXS.so
/usr/bin/ld​: /home/jkeenan/perl5/perlbrew/perls/perl-5.24.0/lib/5.24.0/x86_64-linux/CORE/libperl.a(op.o)​: relocation R_X86_64_32S against `PL_opargs' can not be used when making a shared object; recompile with -fPIC
/home/jkeenan/perl5/perlbrew/perls/perl-5.24.0/lib/5.24.0/x86_64-linux/CORE/libperl.a​: error adding symbols​: Bad value
collect2​: error​: ld returned 1 exit status
Makefile​:1124​: recipe for target 'TestXS.la' failed
make​: *** [TestXS.la] Error 1
checking whether we can build Perl extension (XS) modules... no
#####

(Full typescript available upon request.)

texinfo's ./configure helpfully writes a 'config.log' file. grepping that file, I observe repeated instances of the following​:

#####
configure​: failed program was​:
| /* confdefs.h */
| #define PACKAGE_NAME "GNU Texinfo"
| #define PACKAGE_TARNAME "texinfo"
| #define PACKAGE_VERSION "6.1"
| #define PACKAGE_STRING "GNU Texinfo 6.1"
| #define PACKAGE_BUGREPORT "bug-texinfo@​gnu.org"
| #define PACKAGE_URL "http​://www.gnu.org/software/texinfo/"
| #define PACKAGE "texinfo"
| #define VERSION "6.1"
| /* end confdefs.h. */
| #include <ac_nonexistent.h>
#####

Preliminary inferences​:

1. texinfo's problems are not limited to FreeBSD.

2. texinfo's problems are not limited to that introduced at the commit point in 5.25 reported by mat. The program's ./configure appears to have problems on perl-5.24.0.

3. But since once I ran 'make' I had what appeared to be a valid 'Makefile' -- and since I know little about texinfo itself -- I'm not sure of the real importance of (1) and (2) or its relation to the blead breakage reported by mat.

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT

p5pRT commented Nov 3, 2016

From @jkeenan

On Thu Nov 03 11​:20​:48 2016, jkeenan wrote​:
[snip]

Preliminary inferences​:

1. texinfo's problems are not limited to FreeBSD.

2. texinfo's problems are not limited to that introduced at the commit
point in 5.25 reported by mat. The program's ./configure appears to
have problems on perl-5.24.0.

3. But since once I ran 'make' I had what appeared to be a valid
'Makefile' -- and since I know little about texinfo itself -- I'm not
sure of the real importance of (1) and (2) or its relation to the
blead breakage reported by mat.

I subsequently installed blead​:

#####
This is perl 5, version 25, subversion 7 (v5.25.7 (v5.25.6-172-g3e8592f)) built for x86_64-linux
(with 56 registered patches, see perl -V for more detail)
#####

... and then, before calling texinfo's './configure', I said:

#####
export PERL=/home/jkeenan/testing/blead/bin/perl
#####

I got similar errors in config.log, but, once again, they didn't seem to prevent me from getting a valid 'Makefile'.

I then called 'make check' and logged the output. texinfo's 'make check' runs quite a few shell scripts.

#####
FAIL​: test_scripts/formatting_cond_info.sh
FAIL​: test_scripts/formatting_cond_info_no-ifhtml_no-ifinfo_no-iftex.sh
FAIL​: test_scripts/formatting_cond_info_ifhtml_ifinfo_iftex.sh
FAIL​: test_scripts/formatting_direntry_dircategory_info_split.sh
FAIL​: test_scripts/formatting_split_nocopying.sh
FAIL​: test_scripts/formatting_split_nocopying_split.sh
FAIL​: test_scripts/formatting_split_nocopying_split_dev_null.sh
FAIL​: test_scripts/formatting_simplest_test_prefix_info.sh
FAIL​: test_scripts/formatting_documentlanguage_set_option_info.sh
FAIL​: test_scripts/formatting_simple_with_menu_docbook_info.sh
FAIL​: t/stdout.sh
FAIL​: t/stdout_split.sh
#####

After some digging, I came up with this ack​:

#####
$ ack -A1 '^Modification' ./tp/tests
tp/tests/formatting/out_parser/simplest_test_prefix_info/simplest.2
1​:Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
2-panic​: POPSTACK

tp/tests/formatting/out_parser/simple_with_menu_docbook_info/simple_with_menu.2
1​:Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
2-panic​: POPSTACK

tp/tests/formatting/out_parser/cond_info_no-ifhtml_no-ifinfo_no-iftex/cond.2
1​:Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
2-panic​: POPSTACK

tp/tests/formatting/out_parser/direntry_dircategory_info_split/direntry_dircategory.2
3​:Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
4-panic​: POPSTACK

tp/tests/formatting/out_parser/split_nocopying_split_dev_null/split_nocopying.2
1​:Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
2-panic​: POPSTACK

tp/tests/formatting/out_parser/cond_info_ifhtml_ifinfo_iftex/cond.2
1​:Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
2-panic​: POPSTACK

tp/tests/formatting/out_parser/split_nocopying_split/split_nocopying.2
1​:Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
2-panic​: POPSTACK

tp/tests/formatting/out_parser/split_nocopying/split_nocopying.2
1​:Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
2-panic​: POPSTACK

tp/tests/formatting/out_parser/documentlanguage_set_option_info/documentlanguage_set.2
1​:Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
2-panic​: POPSTACK

tp/tests/formatting/out_parser/cond_info/cond.2
1​:Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
2-panic​: POPSTACK
#####

This is the same kind of failure reported by mat, but reported from a different location.

Here is the relevant part of texinfo's tp/Texinfo/Convert/ParagraphNonXS.pm

#####
sub add_text($$)
{
  my $paragraph = shift;
  my $text = shift;
  $paragraph->{'end_line_count'} = 0;
  my $result = '';

  my $protect_spaces_flag = $paragraph->{'protect_spaces'};

  ##### below is line 327 #####
  my @segments = split
    /([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/,
    $text;
#####

I don't see any attempt to modify a read-only value at that line -- but, again, I know little about texinfo.
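
It may help to recall what this kind of split does in the normal case: when the pattern contains capture groups, each match contributes one slot per group to the result, with undef for the groups that did not take part. The sketch below uses a simplified stand-in for the texinfo pattern, with the \p{InFullwidth} alternative left out so that no extra module is needed. The helper name `segments` and the filtering step are assumptions for illustration, not texinfo's own code.

```perl
use strict;
use warnings;

# Simplified stand-in for the pattern at ParagraphNonXS.pm line 327.
sub segments {
    my ($text) = @_;
    my @raw = split
        /([^\S\x{202f}\x{00a0}]+)|((?:[^\s]|[\x{202f}\x{00a0}])+)/,
        $text;
    # Drop the empty fields and the undef slots of groups that did not match.
    return grep { defined($_) && length($_) } @raw;
}

print join('|', segments("This is ")), "\n";
```

Because the alternatives between them match every character of the input, the fields between delimiters are all empty and the useful data sits in the captures, so the operation only reads from $text.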

Here's the code (excerpted) from tp/Texinfo/Convert/Line.pm that is relevant to mat's original report.

#####
263 sub add_text($$)
264 {
265   my $line = shift;
266   my $text = shift;
267   $line->{'end_line_count'} = 0;
268   my $result = '';
269
270   while ($text ne '') {
271     if ($line->{'DEBUG'}) {
272       my $word = 'UNDEF';
273       $word = $line->{'word'} if (defined($line->{'word'}));
274       print STDERR "s `$line->{'space'}', w `$word'\n";
275     }
276     # \x{202f}\x{00a0} are non breaking spaces
277     if ($text =~ s/^([^\S\x{202f}\x{00a0}\n]+)//) {
...
309     } elsif ($text =~ s/^(([^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)//) {
310       my $added_word = $1;
...
#####

In each case there's some heavy-duty regex action going on. Perhaps we've recently introduced some regex changes that are interfering with texinfo's assumptions. Any ideas?

Thank you very much.
--
James E Keenan (jkeenan@​cpan.org)

@p5pRT

p5pRT commented Nov 3, 2016

From @jkeenan

On Thu Nov 03 12​:28​:06 2016, jkeenan wrote​:

After some digging, I came up with this ack​:

#####
$ ack -A1 '^Modification' ./tp/tests
tp/tests/formatting/out_parser/simplest_test_prefix_info/simplest.2
1​:Modification of a read-only value attempted at
../../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
2-panic​: POPSTACK

I can now reproduce panics at both the location reported by mat and the location I noted above.

#####
$ cat as_in_paragraphnonxs.pl
my $text = "This is ";
say STDERR "\$text: <$text>";

my @segments = split
/([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/,
  $text;
#####
$ cat as_in_line.pl
my $text = "This is ";
say STDERR "\$text: before: <$text>";

$text =~ s/^(([^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)//;
say STDERR "\$text: after: <$text>";
#####

Running each against blead​:

#####
./perl -v

This is perl 5, version 25, subversion 7 (v5.25.7 (v5.25.6-172-g3e8592f)) built for x86_64-linux
(with 56 registered patches, see perl -V for more detail)
#####
$ ./perl -Ilib texinfo/as_in_paragraphnonxs.pl
$text​: <This is >
Can't find Unicode property definition "InFullwidth" at /home/jkeenan/learn/perl/p5p/texinfo/as_in_paragraphnonxs.pl line 4.
panic​: POPSTACK
#####
$ ./perl -Ilib texinfo/as_in_line.pl
$text​: before​: <This is >
Can't find Unicode property definition "InFullwidth" at /home/jkeenan/learn/perl/p5p/texinfo/as_in_line.pl line 4.
#####

FBOW, I cannot find "InFullwidth" in the Perl 5 core distribution.

#####
$ ack -i InFullwidth .
$
#####

Hope that helps.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT

p5pRT commented Nov 3, 2016

From @khwilliamson

On 11/03/2016 02​:00 PM, James E Keenan via RT wrote​:

FBOW, I cannot find "InFullwidth" in the Perl 5 core distribution.

That's because it is a subroutine in Unicode​::EastAsianWidth
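
Since the property is supplied by a sub rather than by perl's own Unicode tables, a script can check whether a given property name resolves before relying on it. The following is a rough sketch under that assumption; the helper name `property_available` and the made-up name `InNoSuchDemoProperty` are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true if \p{$name} can be resolved from this package.
sub property_available {
    my ($name) = @_;
    my $pattern = '\p{' . $name . '}';
    # Compile and run the pattern inside eval so a lookup failure is trapped.
    return eval { my $matched = ( "x" =~ /$pattern/ ); 1 } ? 1 : 0;
}

for my $name (qw(Lu InNoSuchDemoProperty)) {
    printf "%-22s %s\n", $name,
        property_available($name) ? 'found' : 'missing';
}
```

The pattern is built from a string so that both compilation and matching happen at run time inside the eval, whichever of the two stages a given perl version uses to report an unknown property.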

@p5pRT

p5pRT commented Nov 3, 2016

From @jkeenan

On Thu Nov 03 13​:00​:51 2016, jkeenan wrote​:

On Thu Nov 03 12​:28​:06 2016, jkeenan wrote​:

[snip]

#####
$ cat as_in_paragraphnonxs.pl
my $text = "This is ";
say STDERR "\$text​: <$text>";

my @​segments = split
/([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/,
$text;
#####
$ cat as_in_line.pl
my $text = "This is ";
say STDERR "\$text​: before​: <$text>";

$text =~ s/^(([^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)//;
say STDERR "\$text​: after​: <$text>";
#####

Running each against blead​:

#####
./perl -v

This is perl 5, version 25, subversion 7 (v5.25.7 (v5.25.6-172-
g3e8592f)) built for x86_64-linux
(with 56 registered patches, see perl -V for more detail)
#####
$ ./perl -Ilib texinfo/as_in_paragraphnonxs.pl
$text​: <This is >
Can't find Unicode property definition "InFullwidth" at
/home/jkeenan/learn/perl/p5p/texinfo/as_in_paragraphnonxs.pl line 4.
panic​: POPSTACK
#####
$ ./perl -Ilib texinfo/as_in_line.pl
$text​: before​: <This is >
Can't find Unicode property definition "InFullwidth" at
/home/jkeenan/learn/perl/p5p/texinfo/as_in_line.pl line 4.
#####

FBOW, I cannot find "InFullwidth" in the Perl 5 core distribution.

#####
$ ack -i InFullwidth .
$
#####

Hope that helps.

A DDG search suggests that 'InFullwidth' is a property settable in these CPAN distributions, neither of which is part of the Perl 5 core distribution.

#####
https://metacpan.org/pod/Unicode::EastAsianWidth
https://metacpan.org/pod/Unicode::Property::XS
#####

And an 'ack' on the texinfo source code quickly shows that it has a dependency on Unicode​::EastAsianWidth​:

#####
$ ack -l 'Unicode(​::|-|\/)(EastAsianWidth)' | wc -l
49

$ ack -l 'Unicode(​::|-|\/)(Property​::XS)' | wc -l
0
#####

If, once I have built blead, I install Unicode::EastAsianWidth and then require that module in my two little scripts above, the panics go away.

#####
$ ./bin/perl /home/jkeenan/learn/perl/p5p/texinfo/as_in_paragraphnonxs.pl
$text​: <This is >

$ ./bin/perl /home/jkeenan/learn/perl/p5p/texinfo/as_in_line.pl
$text​: before​: <This is >
$text​: after​: < is >
#####

mat​: Could you explore what happens when you try to configure/make/make check texinfo in an environment where Unicode​::EastAsianWidth is already installed against what texinfo's ./configure believes to be $PERL?

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT

p5pRT commented Nov 3, 2016

From mat@cpan.org

mat​: Could you explore what happens when you try to
configure/make/make check texinfo in an environment where
Unicode​::EastAsianWidth is already installed against what texinfo's
./configure believes to be $PERL?

Mmmm, Unicode​::EastAsianWidth is bundled with the texinfo FreeBSD uses, it comes from​:

ftp​://ftp.stack.nl/pub/users/johans/texinfo/20160425/texinfo-6.1.tar.xz

(I don't quite understand why it is not the official tarball, from what I gather, there are some datafiles that are updated)

I will try to see what happens if I have Unicode​::EastAsianWidth present when building texinfo tomorrow.

@p5pRT

p5pRT commented Nov 6, 2016

From @jkeenan

On Thu, 03 Nov 2016 23​:06​:13 GMT, mat@​cpan.org wrote​:

mat​: Could you explore what happens when you try to
configure/make/make check texinfo in an environment where
Unicode​::EastAsianWidth is already installed against what texinfo's
./configure believes to be $PERL?

Mmmm, Unicode​::EastAsianWidth is bundled with the texinfo FreeBSD
uses, it comes from​:

ftp​://ftp.stack.nl/pub/users/johans/texinfo/20160425/texinfo-
6.1.tar.xz

(I don't quite understand why it is not the official tarball, from
what I gather, there are some datafiles that are updated)

I will try to see what happens if I have Unicode​::EastAsianWidth
present when building texinfo tomorrow.

mat, thanks for looking into this. texinfo is a *massive* distribution -- over 4200 files in just one of the top-level directories -- and its test suite is not easy to understand. I found that I had to insert 'print' statements for debugging, run 'make check', and then grep the distro for instances of the "Modification ..." failures we've been investigating. I can't reproduce those failures/crashes in simple Perl programs where Unicode​::EastAsianWidth is clearly 'use'd. So the challenge will be to get a reproducible failure that can be run outside of 'make check'.

The only thing I'm fairly certain of now is that this is not FreeBSD-specific.

Since texinfo is maintained upstream of you by GNU, I feel we should alert the GNU maintainers about this problem.

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT

p5pRT commented Nov 7, 2016

From mat@cpan.org

On Thu, 03 Nov 2016 13:52:10 -0700, jkeenan wrote:

mat​: Could you explore what happens when you try to
configure/make/make check texinfo in an environment where
Unicode​::EastAsianWidth is already installed against what texinfo's
./configure believes to be $PERL?

RT was down on Friday, so I could not comment, but installing Unicode::EastAsianWidth does not help.
There is only one Perl installed in the testbed: the problematic Perl 5.25, at /usr/local/bin/perl, and ./configure finds it without problems.

@p5pRT

p5pRT commented Nov 7, 2016

From @jkeenan

On Mon, 07 Nov 2016 10​:21​:33 GMT, mat@​cpan.org wrote​:

On Thu, 03 Nov 2016 13:52:10 -0700, jkeenan wrote:

mat​: Could you explore what happens when you try to
configure/make/make check texinfo in an environment where
Unicode​::EastAsianWidth is already installed against what texinfo's
./configure believes to be $PERL?

Friday, RT was down, so I could not comment, but installing
Unicode​::EastAsianWidth does not help.
There is only one Perl installed in the testbed, it is the problematic
Perl 5.25, and it is /usr/local/bin/perl, and ./configure does find it
without problems.

Agreed. In the hope that others may help with this debugging effort, I am attaching several files​:

* 130010_debugging_procedures.pod, which has step-by-step instructions as to how to reproduce the failures in the texinfo test suite.

* install_branch_for_testing.sh​: Shell script (adapted from rafl and khw) for installing a perl built at the failing commit.

* dollar-tree.txt and dollar-converter.txt​: Referenced in 130010_debugging_procedures.pod

The question is: What is it about this pattern:

#####
/([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/
#####

... that (a) as of commit C<a5540cf> but not previously; and (b) in the context of this test suite but not in isolation, perceives something to be a read-only value not subject to modification?

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT

p5pRT commented Nov 7, 2016

@p5pRT

p5pRT commented Nov 7, 2016

From @jkeenan

dollar-converter.txt

@p5pRT

p5pRT commented Nov 7, 2016

From @jkeenan

x $tree

0 HASH(0x3208208)
  'contents' => ARRAY(0x320c7e0)
  0 HASH(0x2eba650)
  'contents' => ARRAY(0x2e49ec8)
  0 HASH(0x32079e0)
  'contents' => ARRAY(0x320fc20)
  0 HASH(0x124c528)
  'contents' => ARRAY(0x2cd4108)
  0 HASH(0x2e44f20)
  'text' => '\\input texinfo @​c -*-texinfo-*-
'
  'type' => 'preamble_text'
  1 HASH(0x2e53968)
  'text' => '
'
  'type' => 'preamble_text'
  'parent' => HASH(0x32079e0)
  -> REUSED_ADDRESS
  'type' => 'preamble'
  'parent' => HASH(0x2eba650)
  -> REUSED_ADDRESS
  'type' => 'preamble_before_setfilename'
  1 HASH(0x3207e30)
  'args' => ARRAY(0x3207ec0)
  0 HASH(0x2e53a88)
  'contents' => ARRAY(0x32080a0)
  0 HASH(0x32080b8)
  'extra' => HASH(0x3207e00)
  'command' => HASH(0x3207e30)
  -> REUSED_ADDRESS
  'parent' => HASH(0x2e53a88)
  -> REUSED_ADDRESS
  'text' => ' '
  'type' => 'empty_spaces_after_command'
  1 HASH(0x3207de8)
  'parent' => HASH(0x2e53a88)
  -> REUSED_ADDRESS
  'text' => 'simplest.info'
  2 HASH(0x320c8a0)
  'parent' => HASH(0x2e53a88)
  -> REUSED_ADDRESS
  'text' => '
'
  'type' => 'spaces_at_end'
  'parent' => HASH(0x3207e30)
  -> REUSED_ADDRESS
  'type' => 'misc_line_arg'
  'cmdname' => 'setfilename'
  'extra' => HASH(0x3208178)
  'spaces_after_command' => HASH(0x32080b8)
  -> REUSED_ADDRESS
  'text_arg' => 'simplest.info'
  'line_nr' => HASH(0x3202110)
  'file_name' => 'simplest.texi'
  'line_nr' => 3
  'macro' => ''
  'parent' => HASH(0x2eba650)
  -> REUSED_ADDRESS
  2 HASH(0x320cc18)
  'parent' => HASH(0x2eba650)
  -> REUSED_ADDRESS
  'text' => '
'
  'type' => 'empty_line'
  'parent' => HASH(0x3208208)
  -> REUSED_ADDRESS
  'type' => 'text_root'
  1 HASH(0x3208010)
  'args' => ARRAY(0x320ca98)
  0 HASH(0x3208058)
  'contents' => ARRAY(0x320c960)
  0 HASH(0x3207ef0)
  'extra' => HASH(0x320c6d8)
  'command' => HASH(0x3208010)
  -> REUSED_ADDRESS
  'parent' => HASH(0x3208058)
  -> REUSED_ADDRESS
  'text' => ' '
  'type' => 'empty_spaces_after_command'
  1 HASH(0x320cb10)
  'parent' => HASH(0x3208058)
  -> REUSED_ADDRESS
  'text' => 'Top'
  2 HASH(0x320f5f0)
  'parent' => HASH(0x3208058)
  -> REUSED_ADDRESS
  'text' => '
'
  'type' => 'spaces_at_end'
  'parent' => HASH(0x3208010)
  -> REUSED_ADDRESS
  'type' => 'misc_line_arg'
  'cmdname' => 'node'
  'contents' => ARRAY(0x3208040)
  0 HASH(0x320cd38)
  'parent' => HASH(0x3208010)
  -> REUSED_ADDRESS
  'text' => '
'
  'type' => 'empty_line'
  1 HASH(0x320ca08)
  'contents' => ARRAY(0x320f9b0)
  0 HASH(0x320f830)
  'parent' => HASH(0x320ca08)
  -> REUSED_ADDRESS
  'text' => 'This is a very simple texi manual '
  1 HASH(0x320ccf0)
  'cmdname' => ' '
  'parent' => HASH(0x320ca08)
  -> REUSED_ADDRESS
  2 HASH(0x320f890)
  'parent' => HASH(0x320ca08)
  -> REUSED_ADDRESS
  'text' => ' <>.
'
  'parent' => HASH(0x3208010)
  -> REUSED_ADDRESS
  'type' => 'paragraph'
  2 HASH(0x320fb48)
  'parent' => HASH(0x3208010)
  -> REUSED_ADDRESS
  'text' => '
'
  'type' => 'empty_line'
  'extra' => HASH(0x320c708)
  'node_content' => ARRAY(0x26b2010)
  0 HASH(0x320cb10)
  -> REUSED_ADDRESS
  'nodes_manuals' => ARRAY(0x320ca20)
  0 HASH(0x320f6e0)
  'node_content' => ARRAY(0x26b2010)
  -> REUSED_ADDRESS
  'normalized' => 'Top'
  'normalized' => 'Top'
  'spaces_after_command' => HASH(0x3207ef0)
  -> REUSED_ADDRESS
  'line_nr' => HASH(0x3207ff8)
  'file_name' => 'simplest.texi'
  'line_nr' => 5
  'macro' => ''
  'node_up' => HASH(0x3207aa0)
  'extra' => HASH(0x3210070)
  'manual_content' => ARRAY(0x320fc50)
  0 HASH(0x32101c0)
  'parent' => HASH(0x2ec3c58)
  'contents' => ARRAY(0x320fab8)
  0 HASH(0x3210208)
  'parent' => HASH(0x2ec3c58)
  -> REUSED_ADDRESS
  'text' => '(dir)'
  'type' => 'root_line'
  'text' => 'dir'
  'top_node_up' => HASH(0x3208010)
  -> REUSED_ADDRESS
  'type' => 'top_node_up'
  'parent' => HASH(0x3208208)
  -> REUSED_ADDRESS
  2 HASH(0x320fc38)
  'args' => ARRAY(0x320c930)
  0 HASH(0x320cd08)
  'parent' => HASH(0x320fc38)
  -> REUSED_ADDRESS
  'text' => '
'
  'type' => 'misc_arg'
  'cmdname' => 'bye'
  'parent' => HASH(0x3208208)
  -> REUSED_ADDRESS
  'type' => 'document_root'

@p5pRT

p5pRT commented Nov 7, 2016

@p5pRT

p5pRT commented Nov 7, 2016

From @jkeenan

On Mon, 07 Nov 2016 13​:47​:55 GMT, jkeenan wrote​:

The question is: What is it about this pattern:

#####
/([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/
#####

... that (a) as of commit C<a5540cf> but not previously; and (b) in
the context of this test suite but not in isolation, perceives
something to be a read-only value not subject to modification?

My next brainstorm​: Add "use re 'debug';" to sub add_text() in tp/Texinfo/Convert/ParagraphNonXS.pm.
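
As a small illustration of that approach, `use re 'debug'` is lexically scoped, so placing it inside a sub limits the trace to patterns compiled there. The sketch below uses a much simpler pattern than texinfo's, and the sub name `split_with_trace` is hypothetical. The trace itself is written to STDERR.

```perl
use strict;
use warnings;

sub split_with_trace {
    my ($text) = @_;
    # Lexically scoped: only patterns compiled inside this sub are traced.
    use re 'debug';
    my @fields = split /(\s+)/, $text;
    return @fields;
}

my @fields = split_with_trace("This is ");
print scalar(@fields), " fields\n";
```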

When I did so and ran the debugging program found in one of my previous attachments, I got this output​:

#####
Texinfo::Convert::ParagraphNonXS::add_text(../../tp/Texinfo/Convert/ParagraphNonXS.pm:329):
329: my @segments = split
330: /([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/,
331: $text;
  DB<6> n
Matching REx "([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFull"... against "This is "
  0 <> <This is > | 0| 1:BRANCH(18)
  0 <> <This is > | 1| 2:OPEN1(4)
  0 <> <This is > | 1| 4:PLUS(16)
  | 1| ANYOF[\t\n\x0B\f\r \x85][1680 2000-200A 2028-2029 205F 3000] can match 0 times out of 2147483647...
  | 1| failed...
  0 <> <This is > | 0| 18:BRANCH(34)
  0 <> <This is > | 1| 19:OPEN2(21)
  0 <> <This is > | 1| 21:ANYOF[+utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth](32)
  | 1| failed...
  0 <> <This is > | 0| 34:BRANCH(68)
  0 <> <This is > | 1| 35:OPEN3(37)
  0 <> <This is > | 1| 37:CURLYM[0]{1,INFTY}(66)
  0 <> <This is > | 2| 39:BRANCH(51)
  0 <> <This is > | 3| 40:ANYOF[^\t\n\x0B\f\r \x85\xA0{+utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth}1680 2000-200A 2028-2029 202F 205F 3000](64)
Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 329.
at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 329.
  Texinfo::Convert::ParagraphNonXS::add_text(Texinfo::Convert::ParagraphNonXS=HASH(0x35ba938), "This is ") called at ../../tp/Texinfo/Convert/Info.pm line 308
  Texinfo::Convert::Info::_info_header(Texinfo::Convert::Info=HASH(0x35b3038)) called at ../../tp/Texinfo/Convert/Info.pm line 81
  Texinfo::Convert::Info::output(Texinfo::Convert::Info=HASH(0x35b3038), HASH(0x35ab7c0)) called at ../texi2any.pl line 1348
panic: POPSTACK
Debugged program terminated.
#####

Since I have never previously used the regex debugger, I have no idea if there are any clues to a solution in that output.

Thoughts?
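
For anyone else new to the regex debugger, here is a minimal sketch of how the pragma can be confined to a single subroutine. The sub first_word() and its pattern are hypothetical and are not taken from texinfo; the sketch assumes only that "use re 'debug'" is lexically scoped (the case since perl 5.10) and that its trace goes to STDERR.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of texinfo: returns the first run of
# non-whitespace characters in $text, or undef if there is none.
sub first_word {
    my ($text) = @_;

    # Lexically scoped: only patterns compiled from here to the end of
    # this block are traced. The trace is written to STDERR.
    use re 'debug';

    return $text =~ /(\S+)/ ? $1 : undef;
}

# This pattern sits outside the sub's block, so it is not traced.
my $has_space = "This is " =~ /\s/ ? 1 : 0;

print "first word: ", first_word("This is "), "\n";
```

Placing the pragma inside the sub keeps the trace limited to the one pattern under suspicion, which matters when the surrounding program compiles many regexes.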

Thank you very much.

--
James E Keenan (jkeenan@cpan.org)

@p5pRT
Author

p5pRT commented Nov 7, 2016

From @jkeenan

Compiling perl at what has been identified as the last good commit, and then running the test program through the debugger, I got much better output. It's quite long, so I'm posting it here:

https://gist.github.com/jkeenan/184d2aaf914e4aa0410fe2ea1f36da91

Thank you very much.

--
James E Keenan (jkeenan@cpan.org)

@p5pRT
Author

p5pRT commented Nov 7, 2016

From @khwilliamson

Top posting.

This smells like something to try valgrind on. Try

make test.valgrind

@p5pRT
Author

p5pRT commented Nov 7, 2016

From @khwilliamson

On 11/07/2016 08:34 PM, Karl Williamson wrote:

Top posting.

This smells like something to try valgrind on. Try

make test.valgrind

You can use TEST_JOBS= to make it go faster if you have multiple cores

@p5pRT
Author

p5pRT commented Nov 7, 2016

From @demerphq

On 7 November 2016 at 17:29, James E Keenan via RT
<perlbug-followup@perl.org> wrote:

Thoughts?

It looks like the pattern

[^\s\p{InFullwidth}]

causes this.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Author

p5pRT commented Nov 8, 2016

From @jkeenan

Yet another gist (actually, excerpts):

https://gist.github.com/jkeenan/faad48a7d3dfe0c40eab07684388edfb

At the first "bad" commit, I built perl with -DDEBUGGING. I reduced the invocation of the Perl script from within the texinfo test suite that I had been using to the minimum number of command-line switches that would still generate the panic. I got the output in the gist.

I got similar output when I used the Perl debugger, stepped through to the failure point and then, unlike previously, used 's' to step *into* the 'my @segments' line. But whatever debugging procedure I use, I always seem to end up with output like this:

#####
Matching REx "([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)" against "This is "
  0 <> <This is > | 0| 1:BRANCH(18)
  0 <> <This is > | 1| 2:OPEN1(4)
  0 <> <This is > | 1| 4:PLUS(16)
  | 1| ANYOF[\t\n\x0B\f\r \x85][1680 2000-200A 2028-2029 205F 3000] can match 0 times out of 2147483647...
  | 1| failed...
  0 <> <This is > | 0| 18:BRANCH(34)
  0 <> <This is > | 1| 19:OPEN2(21)
  0 <> <This is > | 1| 21:ANYOF[+utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth](32)
  | 1| failed...
  0 <> <This is > | 0| 34:BRANCH(68)
  0 <> <This is > | 1| 35:OPEN3(37)
  0 <> <This is > | 1| 37:CURLYM[0]{1,INFTY}(66)
  0 <> <This is > | 2| 39:BRANCH(51)
  0 <> <This is > | 3| 40:ANYOF[^\t\n\x0B\f\r \x85\xA0{+utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth}1680 2000-200A 2028-2029 202F 205F 3000](64)
Modification of a read-only value attempted at tp/../tp/Texinfo/Convert/ParagraphNonXS.pm line 328.
panic: POPSTACK
#####

... with no insight into what the read-only value is and where the 'modification' is being attempted.

Thank you very much.

--
James E Keenan (jkeenan@cpan.org)

@p5pRT
Author

p5pRT commented Nov 8, 2016

From @khwilliamson

My valgrind suggestion didn't work. Try this patch, which tries to force a
core dump at the time of the panic.

@p5pRT
Author

p5pRT commented Nov 8, 2016

From @jkeenan

One thing I forgot to mention earlier. If you step through the program with the Perl debugger and, when you come to this critical line:

#####
my @segments = split
  /([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/,
  $text;
#####
... type 's' rather than 'n' (which would immediately trigger the panic), you step into the mysterious world of utf8_heavy.pl and its subroutine SWASHNEW(). You eventually get to a point where you have this structure:
#####
  DB<24> x $SWASH
0 utf8=HASH(0x3b8e5f0)
  'BITS' => 1
  'EXTRAS' => '# comment
+utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth
'
  'LIST' => ''
  'NONE' => 0
  'TYPE' => ''
  'USER_DEFINED' => 1
  'utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth' => utf8=HASH(0x3b8dc90)
    'BITS' => 1
    'EXTRAS' => ''
    'LIST' => "1100\cI115F\cJ2329\cI232A\cJ2E80\cI2FFB\cJ3000\cI3000\cJ3001\cI303E\cJ3041\cI33FF\cJ3400\cI4DB5\cJ4E00\cI9FBB\cJA000\cIA4C6\cJAC00\cID7A3\cJF900\cIFAD9\cJFE10\cIFE19\cJFE30\cIFE6B\cJFF01\cIFF60\cJFFE0\cIFFE6\cJ20000\cI2A6D6\cJ2A6D7\cI2F7FF\cJ2F800\cI2FA1D\cJ2FA1E\cI2FFFD\cJ30000\cI3FFFD\cJ"
    'NONE' => 0
    'TYPE' => 'InFullwidth'
    'USER_DEFINED' => 1
#####
... and it is at the *2nd* time you arrive at 'return $SWASH' that the panic occurs.
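
For readers puzzling over the 'LIST' value in that dump: in the debugger's notation \cI is a tab and \cJ is a newline, so the string is a series of "START<TAB>END" lines of hexadecimal code points. The sketch below decodes the first four entries. parse_ranges() and in_ranges() are hypothetical helpers written for illustration; they are not part of utf8_heavy.pl and do not show how perl itself consumes the list.

```perl
use strict;
use warnings;

# First four entries of the LIST string from the dump above.
# "\cI" is a tab and "\cJ" is a newline.
my $list = "1100\cI115F\cJ2329\cI232A\cJ2E80\cI2FFB\cJ3000\cI3000\cJ";

# Hypothetical helper: turn the LIST text into [start, end] pairs.
sub parse_ranges {
    my ($text) = @_;
    my @ranges;
    for my $line (split /\n/, $text) {
        my ($start, $end) = split /\t/, $line;
        # A line with no END column denotes a single code point.
        $end = $start unless defined($end) && length($end);
        push @ranges, [ hex($start), hex($end) ];
    }
    return @ranges;
}

# Hypothetical helper: is code point $cp inside any of the ranges?
sub in_ranges {
    my ($cp, @ranges) = @_;
    for my $range (@ranges) {
        return 1 if $cp >= $range->[0] && $cp <= $range->[1];
    }
    return 0;
}

my @ranges = parse_ranges($list);
for my $cp (0x0054, 0x3000) {
    printf "U+%04X %s\n", $cp,
        in_ranges($cp, @ranges) ? "in list" : "not in list";
}
```

U+0054 is the "T" of "This is ", the string being matched when the panic occurs; it falls outside every range, which is consistent with the second branch of the pattern failing in the trace.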

--
James E Keenan (jkeenan@cpan.org)

@p5pRT
Author

p5pRT commented Nov 8, 2016

From @jkeenan

On Tue, 08 Nov 2016 01:21:54 GMT, khw wrote:

My valgrind suggestion didn't work. Try this patch to try to force a
core dump at the time of the panic

Patch apparently not attached.

--
James E Keenan (jkeenan@cpan.org)

@p5pRT
Author

p5pRT commented Nov 8, 2016

From @jkeenan

On Mon, 07 Nov 2016 22:14:43 GMT, demerphq wrote:

It looks like the pattern

[^\s\p{InFullwidth}]

causes this.

Yves

Very likely. When, at an earlier stage of debugging, I removed the two parts of the pattern that contained '\p{InFullwidth}', the panic disappeared.

However, that pattern was not panicking up until commit a5540cf. And, even with a5540cf and commits thereafter (e.g., HEAD of blead), I can write a very simple perl program that has that pattern and not get a panic.

It's something about the interaction of commit a5540cf, that pattern, and the code in texinfo that is the problem.
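
A minimal sketch of what such a standalone program might look like follows. The InFullwidth sub here is a hypothetical stand-in with only three ranges; the real property comes from Unicode::EastAsianWidth and covers far more. As described above, a program of this kind did not panic in isolation, so this illustrates the pattern and the user-defined property mechanism rather than reproducing the bug.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the InFullwidth user-defined property.
# A user-defined property is a sub whose name starts with "In" or "Is"
# and which returns lines of "START<TAB>END" hex code points.
sub InFullwidth {
    return <<"END";
1100\t115F
3000\t3000
FF01\tFF60
END
}

# Split text into whitespace runs, single fullwidth characters, and
# runs of everything else, using the pattern quoted in this ticket.
sub segments {
    my ($text) = @_;
    return grep { defined($_) && length($_) } split
      /([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/,
      $text;
}

my @segments = segments("This is ");
print join('|', map { "<$_>" } @segments), "\n";
# prints: <This>|< >|<is>|< >
```

Because \p{InFullwidth} is looked up in the package where the pattern is compiled, the trace in this ticket names it utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth; in this sketch the lookup would resolve in package main instead.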

Thanks for taking a look at this.

--
James E Keenan (jkeenan@cpan.org)

@p5pRT
Author

p5pRT commented Nov 8, 2016

From @khwilliamson

On 11/08/2016 04:37 AM, James E Keenan via RT wrote:

On Tue, 08 Nov 2016 01:21:54 GMT, khw wrote:

My valgrind suggestion didn't work. Try this patch to try to force a
core dump at the time of the panic
Patch apparently not attached.

Sorry, here it is.

@p5pRT
Author

p5pRT commented Nov 8, 2016

From @khwilliamson

0001-Add-an-assert-to-try-to-get-a-stack-trace.patch
From b76ea3d255eb3b4b8a89dd36ce3c0f1ce56f2efe Mon Sep 17 00:00:00 2001
From: Karl Williamson <khw@cpan.org>
Date: Tue, 8 Nov 2016 02:19:41 +0100
Subject: [PATCH] Add an assert to try to get a stack trace

---
 util.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/util.c b/util.c
index a69ddad..f906c8b 100644
--- a/util.c
+++ b/util.c
@@ -1877,6 +1877,7 @@ paths reduces CPU cache pressure.
 void
 Perl_croak_no_modify(void)
 {
+    assert(0);
     Perl_croak_nocontext( "%s", PL_no_modify);
 }
 
-- 
2.9.3

@p5pRT
Author

p5pRT commented Nov 8, 2016

From @jkeenan

On Tue, 08 Nov 2016 07​:38​:12 GMT, khw wrote​:

On 11/08/2016 04​:37 AM, James E Keenan via RT wrote​:

On Tue, 08 Nov 2016 01​:21​:54 GMT, khw wrote​:

My valgrind suggestion didn't work. Try this patch to try to force a
core dump at the time of the panic
Patch apparently not attached.

Sorry. but here it is.

Output is large; posted here​:

https://gist.github.com/jkeenan/c47a1938a668192c26e869423d9c1f66/raw/d7beb872f83dda578dba1bfaee9a355bce042023/post-khw-assert.txt

But I'm not sure that told us anything new. Here's the tail of the output​:

#####
Matching REx "([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)" against "This is "
  0 <> <This is > | 0| 1​:BRANCH(18)
  0 <> <This is > | 1| 2​:OPEN1(4)
  0 <> <This is > | 1| 4​:PLUS(16)
  | 1| ANYOF[\t\n\x0B\f\r \x85][1680 2000-200A 2028-2029 205F 3000] can match 0 times out of 2147483647...
  | 1| failed...
  0 <> <This is > | 0| 18​:BRANCH(34)
  0 <> <This is > | 1| 19​:OPEN2(21)
  0 <> <This is > | 1| 21​:ANYOF[+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth](32)
  | 1| failed...
  0 <> <This is > | 0| 34​:BRANCH(68)
  0 <> <This is > | 1| 35​:OPEN3(37)
  0 <> <This is > | 1| 37​:CURLYM[0]{1,INFTY}(66)
  0 <> <This is > | 2| 39​:BRANCH(51)
  0 <> <This is > | 3| 40​:ANYOF[^\t\n\x0B\f\r \x85\xA0{+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth}1680 2000-200A 2028-2029 202F 205F 3000](64)
perl​: util.c​:1880​: Perl_croak_no_modify​: Assertion `0' failed.
Aborted (core dumped)
#####

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Author

p5pRT commented Nov 8, 2016

From @jkeenan

On Tue, 08 Nov 2016 12​:20​:37 GMT, jkeenan wrote​:

On Tue, 08 Nov 2016 07​:38​:12 GMT, khw wrote​:

On 11/08/2016 04​:37 AM, James E Keenan via RT wrote​:

On Tue, 08 Nov 2016 01​:21​:54 GMT, khw wrote​:

My valgrind suggestion didn't work. Try this patch to try to
force a
core dump at the time of the panic
Patch apparently not attached.

Sorry. but here it is.

Output is large; posted here​:

For reference, I pushed a branch​:

jkeenan/a5540cf-amended

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Author

p5pRT commented Nov 8, 2016

From @khwilliamson

On 11/08/2016 01​:20 PM, James E Keenan via RT wrote​:

On Tue, 08 Nov 2016 07​:38​:12 GMT, khw wrote​:

On 11/08/2016 04​:37 AM, James E Keenan via RT wrote​:

On Tue, 08 Nov 2016 01​:21​:54 GMT, khw wrote​:

My valgrind suggestion didn't work. Try this patch to try to force a
core dump at the time of the panic
Patch apparently not attached.

Sorry. but here it is.
Output is large; posted here​:

https://gist.github.com/jkeenan/c47a1938a668192c26e869423d9c1f66/raw/d7beb872f83dda578dba1bfaee9a355bce042023/post-khw-assert.txt

But I'm not sure that told us anything new. Here's the tail of the output​:

#####
Matching REx "([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)" against "This is "
0 <> <This is > | 0| 1​:BRANCH(18)
0 <> <This is > | 1| 2​:OPEN1(4)
0 <> <This is > | 1| 4​:PLUS(16)
| 1| ANYOF[\t\n\x0B\f\r \x85][1680 2000-200A 2028-2029 205F 3000] can match 0 times out of 2147483647...
| 1| failed...
0 <> <This is > | 0| 18​:BRANCH(34)
0 <> <This is > | 1| 19​:OPEN2(21)
0 <> <This is > | 1| 21​:ANYOF[+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth](32)
| 1| failed...
0 <> <This is > | 0| 34​:BRANCH(68)
0 <> <This is > | 1| 35​:OPEN3(37)
0 <> <This is > | 1| 37​:CURLYM[0]{1,INFTY}(66)
0 <> <This is > | 2| 39​:BRANCH(51)
0 <> <This is > | 3| 40​:ANYOF[^\t\n\x0B\f\r \x85\xA0{+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth}1680 2000-200A 2028-2029 202F 205F 3000](64)
perl​: util.c​:1880​: Perl_croak_no_modify​: Assertion `0' failed.
Aborted (core dumped)

This will help if it actually dumped a core; examining the core file
should allow you to get a stack trace. Core files used to be named just
'core', but nowadays the name may be something else, like perl.core
or core.perl. I do it so infrequently that I have to look it up each time.
Be sure your perl is compiled with -DDEBUGGING, and -D'optimize=-ggdb3'
-A'optimize=-ggdb3' -A'optimize=-O0'
That will make the backtrace more readable.

It looks like the man page of gdb says you can say

gdb /path/to/perl /path/to/core

and then use the bt command to print the stack.

If this doesn't work, someone on irc can advise you

#####

Thank you very much.

@p5pRT
Author

p5pRT commented Nov 9, 2016

From @jkeenan

On Tue, 08 Nov 2016 03​:44​:25 GMT, jkeenan wrote​:

On Mon, 07 Nov 2016 22​:14​:43 GMT, demerphq wrote​:

On 7 November 2016 at 17​:29, James E Keenan via RT
<perlbug-followup@​perl.org> wrote​:

On Mon, 07 Nov 2016 13​:47​:55 GMT, jkeenan wrote​:

The question is​: What is it about this the pattern​:

#####
/([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/
#####

... that (a) as of commit C<a5540cf> but not previously; and (b)
in
the context of this test suite but not in isolation, perceives
something to be a read-only value not subject to modification?

My next brainstorm​: Add "use re 'debug';" to sub add_text() in
tp/Texinfo/Convert/ParagraphNonXS.pm.

When I did so and ran the debugging program found in one of my
previous attachments, I got this output​:

#####
Texinfo​::Convert​::ParagraphNonXS​::add_text(../../tp/Texinfo/Convert/ParagraphNonXS.pm​:329)​:
329​: my @​segments = split
330​:
/([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/,
331​: $text;
DB<6> n
Matching REx
"([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFull"...
against "This is "
0 <> <This is > | 0| 1​:BRANCH(18)
0 <> <This is > | 1| 2​:OPEN1(4)
0 <> <This is > | 1| 4​:PLUS(16)
| 1| ANYOF[\t\n\x0B\f\r \x85][1680
| 2000-200A 2028-2029 205F 3000] can
| match 0 times out of 2147483647...
| 1| failed...
0 <> <This is > | 0| 18​:BRANCH(34)
0 <> <This is > | 1| 19​:OPEN2(21)
0 <> <This is > | 1|
21​:ANYOF[+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth](32)
| 1| failed...
0 <> <This is > | 0| 34​:BRANCH(68)
0 <> <This is > | 1| 35​:OPEN3(37)
0 <> <This is > | 1| 37​:CURLYM[0]{1,INFTY}(66)
0 <> <This is > | 2| 39​:BRANCH(51)
0 <> <This is > | 3| 40​:ANYOF[^\t\n\x0B\f\r
\x85\xA0{+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth}1680
2000-200A 2028-2029 202F 205F 3000](64)
Modification of a read-only value attempted at
../../tp/Texinfo/Convert/ParagraphNonXS.pm line 329.
at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 329.
Texinfo​::Convert​::ParagraphNonXS​::add_text(Texinfo​::Convert​::ParagraphNonXS=HASH(0x35ba938),
"This is ") called at ../../tp/Texinfo/Convert/Info.pm line 308
Texinfo​::Convert​::Info​::_info_header(Texinfo​::Convert​::Info=HASH(0x35b3038))
called at ../../tp/Texinfo/Convert/Info.pm line 81
Texinfo​::Convert​::Info​::output(Texinfo​::Convert​::Info=HASH(0x35b3038),
HASH(0x35ab7c0)) called at ../texi2any.pl line 1348
panic​: POPSTACK
Debugged program terminated.
#####

Since I have never previously used the regex debugger, I have no
idea
if there are any clues to a solution in that output.

Thoughts?

It looks like the pattern

[^\s\p{InFullwidth}]

causes this.

Yves

Very likely. When, at an earlier stage of debugging, I removed the
two parts of the pattern that contained '\p{InFullwidth}', the panic
disappeared.

However, that pattern was not panicking up until commit a5540cf. And,
even with a5540cf and commits thereafter (e.g., HEAD of blead), I can
write a very simple perl program that has that pattern and not get a
panic.

It's something about the interaction of commit a5540cf, that pattern,
and the code in texinfo that is the problem.

Thanks for taking a look at this.

I am attaching a program, 'pseudo_east_asian_width.pl', which may be useful in resolving this problem.

Once we solve the problem, we have to be able to write tests that demonstrate that solution and fail in the event of regressions. To write such a test in the core distribution we won't be able to say​:

#####
use Unicode​::EastAsianWidth;
#####

... but we will need to simulate enough of that module's functionality to write a realistic test.

In the attachment, 'package gamma' (the name is purely provisional) simulates much of Unicode::EastAsianWidth. 'package main' exercises gamma's 'InPseudoFullwidth' user-defined regex property in a manner similar (I think) to what Unicode::EastAsianWidth does. If I go into the two parts of the texinfo library where mat and I have identified failures (Texinfo::Convert::Line and Texinfo::Convert::ParagraphNonXS), replace calls to Unicode::EastAsianWidth with calls to gamma, and then call the texi2any.pl program with the perl from the "bad" commit point, I can reproduce the panic condition.

I haven't yet been able to generate the panic condition outside the texi2any.pl program, though.
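
For readers without the attachment, the general shape of such a simulation can be sketched as follows. This is not the attached 'pseudo_east_asian_width.pl': the code-point ranges are an illustrative subset picked for the sketch, the helper name segments() is hypothetical, and the sketch assumes the property sub lives in the same package as the pattern. Only the name InPseudoFullwidth and the split pattern are taken from the discussion above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Unicode::EastAsianWidth's InFullwidth.
# A user-defined property is a sub whose name starts with "In" or "Is"
# and which returns lines of "START\tEND\n" hex code-point ranges.
# The ranges below are an illustrative subset, not the module's real table.
sub InPseudoFullwidth {
    return <<"END";
FF01\tFF60
FFE0\tFFE6
END
}

# Same shape as the split in Texinfo::Convert::ParagraphNonXS::add_text,
# with the property name swapped for the stand-in. Captures that did not
# participate in a match come back as undef, so they are filtered out.
sub segments {
    my ($text) = @_;
    return grep { defined($_) && length($_) }
        split /([^\S\x{202f}\x{00a0}]+)|(\p{InPseudoFullwidth})|((?:[^\s\p{InPseudoFullwidth}]|[\x{202f}\x{00a0}])+)/,
        $text;
}

my @segments = segments("This is ");
print join('|', @segments), "\n";
```

Running this with a perl built from the "bad" commit is not expected to panic by itself, which matches the observation that the failure has so far only appeared inside texi2any.pl.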

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Author

p5pRT commented Nov 9, 2016

@p5pRT
Author

p5pRT commented Nov 9, 2016

From @jkeenan

On Tue, 08 Nov 2016 22​:16​:57 GMT, khw wrote​:

On 11/08/2016 01​:20 PM, James E Keenan via RT wrote​:

On Tue, 08 Nov 2016 07​:38​:12 GMT, khw wrote​:

On 11/08/2016 04​:37 AM, James E Keenan via RT wrote​:

On Tue, 08 Nov 2016 01​:21​:54 GMT, khw wrote​:

My valgrind suggestion didn't work. Try this patch to try to
force a
core dump at the time of the panic
Patch apparently not attached.

Sorry. but here it is.
Output is large; posted here​:

https://gist.github.com/jkeenan/c47a1938a668192c26e869423d9c1f66/raw/d7beb872f83dda578dba1bfaee9a355bce042023/post-khw-assert.txt

But I'm not sure that told us anything new. Here's the tail of the
output​:

#####
Matching REx
"([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)"
against "This is "
0 <> <This is > | 0| 1​:BRANCH(18)
0 <> <This is > | 1| 2​:OPEN1(4)
0 <> <This is > | 1| 4​:PLUS(16)
| 1| ANYOF[\t\n\x0B\f\r \x85][1680
2000-200A 2028-2029 205F 3000] can match 0 times out of 2147483647...
| 1| failed...
0 <> <This is > | 0| 18​:BRANCH(34)
0 <> <This is > | 1| 19​:OPEN2(21)
0 <> <This is > | 1|
21​:ANYOF[+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth](32)
| 1| failed...
0 <> <This is > | 0| 34​:BRANCH(68)
0 <> <This is > | 1| 35​:OPEN3(37)
0 <> <This is > | 1| 37​:CURLYM[0]{1,INFTY}(66)
0 <> <This is > | 2| 39​:BRANCH(51)
0 <> <This is > | 3| 40​:ANYOF[^\t\n\x0B\f\r
\x85\xA0{+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth}1680
2000-200A 2028-2029 202F 205F 3000](64)
perl​: util.c​:1880​: Perl_croak_no_modify​: Assertion `0' failed.
Aborted (core dumped)
This will help if it actually dumped a core; examining the core file
should allow you to get a stack trace. They used to be named just
'core', but maybe it is something else besides nowadays, like
perl.core
or core.perl. I do it so infrequently I have to look it up each
time.
Be sure your perl is compiled with -DDEBUGGING, and -D'optimize=-ggdb3'
-A'optimize=-ggdb3' -A'optimize=-O0'
That will make the back trace more readable.

I followed those instructions but, once again, notwithstanding the "core dumped" message, I could not locate any such core dump file.

I issued a 'find' with a '-cmin 5' and came up with nothing.

I know the kind of file you were expecting because in our FreeBSD-11.0 lib/locale.t problems (remember them?) I get 't/perl.core' all the time.

It looks like the man page of gdb says you can say

gdb /path/to/perl /path/to/core

and then use the bt command to print the stack.

If this doesn't work, someone on irc can advise you

#####

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Author

p5pRT commented Nov 9, 2016

From @demerphq

On 7 Nov 2016 17​:29, "James E Keenan via RT" <perlbug-followup@​perl.org>
wrote​:

On Mon, 07 Nov 2016 13​:47​:55 GMT, jkeenan wrote​:

The question is​: What is it about this the pattern​:

#####
/([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/
#####

... that (a) as of commit C<a5540cf> but not previously; and (b) in
the context of this test suite but not in isolation, perceives
something to be a read-only value not subject to modification?

My next brainstorm​: Add "use re 'debug';" to sub add_text() in
tp/Texinfo/Convert/ParagraphNonXS.pm.

When I did so and ran the debugging program found in one of my previous
attachments, I got this output​:

#####
Texinfo​::Convert​::ParagraphNonXS​::add_text(../../tp/Texinfo/Convert/
ParagraphNonXS.pm​:329)​:
329​: my @​segments = split
330: /([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/,
331​: $text;
DB<6> n
Matching REx "([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFull"...
against "This is "
0 <> <This is > | 0| 1​:BRANCH(18)
0 <> <This is > | 1| 2​:OPEN1(4)
0 <> <This is > | 1| 4​:PLUS(16)
| 1| ANYOF[\t\n\x0B\f\r \x85][1680
2000-200A 2028-2029 205F 3000] can match 0 times out of 2147483647...
| 1| failed...
0 <> <This is > | 0| 18​:BRANCH(34)
0 <> <This is > | 1| 19​:OPEN2(21)
0 <> <This is > | 1| 21​:ANYOF[+utf8​::Texinfo​::
Convert​::ParagraphNonXS​::InFullwidth](32)
| 1| failed...
0 <> <This is > | 0| 34​:BRANCH(68)
0 <> <This is > | 1| 35​:OPEN3(37)
0 <> <This is > | 1| 37​:CURLYM[0]{1,INFTY}(66)
0 <> <This is > | 2| 39​:BRANCH(51)
0 <> <This is > | 3| 40​:ANYOF[^\t\n\x0B\f\r
\x85\xA0{+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth}1680
2000-200A 2028-2029 202F 205F 3000](64)
Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm
line 329.
at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 329.
Texinfo​::Convert​::ParagraphNonXS​::add_text(Texinfo​::Convert​::
ParagraphNonXS=HASH(0x35ba938), "This is ") called at
../../tp/Texinfo/Convert/Info.pm line 308
Texinfo​::Convert​::Info​::_info_header(Texinfo​::Convert​::Info=HASH(0x35b3038))
called at ../../tp/Texinfo/Convert/Info.pm line 81
Texinfo​::Convert​::Info​::output(Texinfo​::Convert​::Info=HASH(0x35b3038),
HASH(0x35ab7c0)) called at ../texi2any.pl line 1348
panic​: POPSTACK
Debugged program terminated.
#####

Since I have never previously used the regex debugger, I have no idea if
there are any clues to a solution in that output.

Thoughts?

Have you tried putting a breakpoint or assert on the code that issues the
error message about an attempt to modify a RO value, and then used gdb to
get a stack trace?

If it's a macro and you aren't sure where it gets called, put an assert that
will fail, like assert(0);, in the code where the message gets printed,
and then run under gdb.

Once you have a stack-trace we will be interested in whatever happens in or
after the call to reginclass().

Once you have that you should be able to add a sv_dump() call to the right
place to see the var that is triggering the RO. (You can do this right
before the assert(0); call).

BTW, from what I can tell from the re 'debug' output, the error comes from
trying to fulfil

[^\s\p{InFullwidth}]

which is the third item in that alternation. So it seems like the issue is
somehow related to negating a charclass, possibly in negating the return
from InFullwidth. I was unaware that \p{} could execute at run time; I
thought it would be resolved at compile time and baked into the char-class.
So I would also look for things in InFullwidth that might not work if
called twice. Does InFullwidth somehow return constant values?

Anyway, I have not managed to find a copy of InFullwidth to look at, nor
the code that calls this. But I would use Devel::Peek sv_dump(), gdb, and
the techniques above to dig into this if I were debugging it. I am not sure
where the core comes in; I have never really debugged a core dump.
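
At the Perl level, the closest counterpart to the C function sv_dump() is Devel::Peek's Dump(), which prints an SV's flags to STDERR. As a rough sketch of what to look for: the variable below is a made-up example, and Internals::SvREADONLY is used only to set the flag artificially; nothing here is claimed to be what the regex engine actually does.

```perl
use strict;
use warnings;
use Devel::Peek qw(Dump);

my $value = "swash data";            # hypothetical stand-in for the SV in question
Internals::SvREADONLY($value, 1);    # artificially mark it read-only

Dump($value);                        # the FLAGS line on STDERR should include READONLY

# An in-place modification now croaks with the same message texinfo reports.
my $ok    = eval { $value .= "!"; 1 };
my $error = $ok ? '' : $@;
print "caught: $error";
```

Inside the interpreter itself, the equivalent check would be an sv_dump() call on the suspect SV just before the assert(0).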

Another thing I would do is twiddle that charclass. If you change it to

[^ \t\n\r\p{InFullwidth}]

does the bug go away?

What happens if you remove all the \p{InFullwidth} from that pattern?

No doubt you are way beyond any of the advice here, but I thought I would
write this anyway.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Author

p5pRT commented Nov 9, 2016

From @khwilliamson

On 11/09/2016 09​:54 AM, demerphq wrote​:

On 7 Nov 2016 17​:29, "James E Keenan via RT"
<perlbug-followup@​perl.org <mailto​:perlbug-followup@​perl.org>> wrote​:

On Mon, 07 Nov 2016 13​:47​:55 GMT, jkeenan wrote​:

The question is​: What is it about this the pattern​:

#####

/([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/

#####

... that (a) as of commit C<a5540cf> but not previously; and (b) in
the context of this test suite but not in isolation, perceives
something to be a read-only value not subject to modification?

My next brainstorm​: Add "use re 'debug';" to sub add_text() in
tp/Texinfo/Convert/ParagraphNonXS.pm.

When I did so and ran the debugging program found in one of my
previous attachments, I got this output​:

#####

Texinfo​::Convert​::ParagraphNonXS​::add_text(../../tp/Texinfo/Convert/ParagraphNonXS.pm​:329)​:

329​: my @​segments = split
330​:
/([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/,
331​: $text;
DB<6> n
Matching REx
"([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFull"...
against "This is "
0 <> <This is > | 0| 1​:BRANCH(18)
0 <> <This is > | 1| 2​:OPEN1(4)
0 <> <This is > | 1| 4​:PLUS(16)
| 1| ANYOF[\t\n\x0B\f\r \x85][1680
2000-200A 2028-2029 205F 3000] can match 0 times out of 2147483647...
| 1| failed...
0 <> <This is > | 0| 18​:BRANCH(34)
0 <> <This is > | 1| 19​:OPEN2(21)
0 <> <This is > | 1|
21​:ANYOF[+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth](32)
| 1| failed...
0 <> <This is > | 0| 34​:BRANCH(68)
0 <> <This is > | 1| 35​:OPEN3(37)
0 <> <This is > | 1| 37​:CURLYM[0]{1,INFTY}(66)
0 <> <This is > | 2| 39​:BRANCH(51)
0 <> <This is > | 3| 40​:ANYOF[^\t\n\x0B\f\r
\x85\xA0{+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth}1680
2000-200A 2028-2029 202F 205F 3000](64)
Modification of a read-only value attempted at
../../tp/Texinfo/Convert/ParagraphNonXS.pm line 329.
at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 329.

Texinfo​::Convert​::ParagraphNonXS​::add_text(Texinfo​::Convert​::ParagraphNonXS=HASH(0x35ba938),
"This is ") called at ../../tp/Texinfo/Convert/Info.pm line 308

Texinfo​::Convert​::Info​::_info_header(Texinfo​::Convert​::Info=HASH(0x35b3038))
called at ../../tp/Texinfo/Convert/Info.pm line 81

Texinfo​::Convert​::Info​::output(Texinfo​::Convert​::Info=HASH(0x35b3038),
HASH(0x35ab7c0)) called at ../texi2any.pl <http​://texi2any.pl> line 1348

panic​: POPSTACK
Debugged program terminated.
#####

Since I have never previously used the regex debugger, I have no
idea if there are any clues to a solution in that output.

Thoughts?

Have you tried putting a breakpoint or assert on the code that issues
the error message about an attempt to modify a RO value, and then used
gdb to get a stack trace?

If its a macro that you arent sure where it gets called, put an assert
that will fail, like assert(0); in where the code to print the message
gets called, and then run under gdb.

Once you have a stack-trace we will be interested in whatever happens
in or after the call to reginclass().

Once you have that you should be able to add a sv_dump() call to the
right place to see the var that is triggering the RO. (You can do this
right before the assert(0); call).

BTW, from what i can tell from the re 'debug' output, the error comes
from trying to fulfil

[^\s\p{InFullwidth}]

which is the third item in that alternation. So it seems like the
issue is somehow related to negating a charclass, possibly in negating
the return from InFullWidth. I was unaware that \p{} could execute at
run time, I thought it would be resolved at compile time and baked
into the char-class. So I would also look for things in InFullWidth
that might not work if called twice. Does InFullWidth somehow return
constant values?

Anyway, I have not managed to find a copy of InFullWidth to look at,
nor the code that calls this. But I would use Devel​::Peek sv_dump(),
gdb, and the techniques above to dig into this if I were debugging it.
I am not sure where the core comes in, i have never really debugged a
core dump.

Another thing I would do is twiddle that charclass. If you change it to

[^ \t\n\r\p{InFullWidth}]

does the bug go away?

What happens if you remove all the \p{InFullWidth} from that pattern?

No doubt you are way beyond any of the advice here, but i thought i
would write this anyway.

Yves

User-defined \p properties (which InFullwidth is) that aren't available
at compile time are expanded at run time. It used to be that all \p
were done at run time. The problem here is likely that the changes in
the blamed commit are trying to modify a swash byproduct that has been
set read-only. I have looked at the code and don't see how that is
happening, which is why a stack trace is needed. So getting that
backtrace is the best way forward.
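
The compile-time versus run-time distinction can be illustrated with a small sketch. The property name InLateDefined and its range are made up for the example; the pattern is compiled before the sub exists, so the property can only be expanded once a match is attempted.

```perl
use strict;
use warnings;

# Compiled while no sub named InLateDefined exists yet, so the property
# cannot be resolved at pattern-compile time.
my $re = qr/\p{InLateDefined}/;

# Define the property afterwards, at run time. A string eval is used
# because a plain "sub" declaration would be installed at compile time.
eval q{ sub InLateDefined { return "41\t5A\n" } 1 } or die $@;

# The first match attempt is what forces the expansion (range is A-Z).
my $matched_upper = ("Q" =~ $re) ? 1 : 0;
my $matched_lower = ("q" =~ $re) ? 1 : 0;
print "upper=$matched_upper lower=$matched_lower\n";
```

In texinfo's case the property comes from Unicode::EastAsianWidth, so whether it is visible at compile time presumably depends on how and when that module's subs are imported into the calling package.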

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Author

p5pRT commented Nov 9, 2016

From @jkeenan

On Wed, 09 Nov 2016 09​:13​:47 GMT, khw wrote​:

On 11/09/2016 09​:54 AM, demerphq wrote​:

On 7 Nov 2016 17​:29, "James E Keenan via RT"
<perlbug-followup@​perl.org <mailto​:perlbug-followup@​perl.org>> wrote​:

On Mon, 07 Nov 2016 13​:47​:55 GMT, jkeenan wrote​:

The question is​: What is it about this the pattern​:

#####

/([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/

#####

... that (a) as of commit C<a5540cf> but not previously; and (b)
in
the context of this test suite but not in isolation, perceives
something to be a read-only value not subject to modification?

My next brainstorm​: Add "use re 'debug';" to sub add_text() in
tp/Texinfo/Convert/ParagraphNonXS.pm.

When I did so and ran the debugging program found in one of my
previous attachments, I got this output​:

#####

Texinfo​::Convert​::ParagraphNonXS​::add_text(../../tp/Texinfo/Convert/ParagraphNonXS.pm​:329)​:

329​: my @​segments = split
330​:
/([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/,
331​: $text;
DB<6> n
Matching REx
"([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?​:[^\s\p{InFull"...
against "This is "
0 <> <This is > | 0| 1​:BRANCH(18)
0 <> <This is > | 1| 2​:OPEN1(4)
0 <> <This is > | 1| 4​:PLUS(16)
| 1| ANYOF[\t\n\x0B\f\r \x85][1680
2000-200A 2028-2029 205F 3000] can match 0 times out of 2147483647...
| 1| failed...
0 <> <This is > | 0| 18​:BRANCH(34)
0 <> <This is > | 1| 19​:OPEN2(21)
0 <> <This is > | 1|
21​:ANYOF[+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth](32)
| 1| failed...
0 <> <This is > | 0| 34​:BRANCH(68)
0 <> <This is > | 1| 35​:OPEN3(37)
0 <> <This is > | 1| 37​:CURLYM[0]{1,INFTY}(66)
0 <> <This is > | 2| 39​:BRANCH(51)
0 <> <This is > | 3| 40​:ANYOF[^\t\n\x0B\f\r
\x85\xA0{+utf8​::Texinfo​::Convert​::ParagraphNonXS​::InFullwidth}1680
2000-200A 2028-2029 202F 205F 3000](64)
Modification of a read-only value attempted at
../../tp/Texinfo/Convert/ParagraphNonXS.pm line 329.
at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 329.

Texinfo​::Convert​::ParagraphNonXS​::add_text(Texinfo​::Convert​::ParagraphNonXS=HASH(0x35ba938),
"This is ") called at ../../tp/Texinfo/Convert/Info.pm line 308

Texinfo​::Convert​::Info​::_info_header(Texinfo​::Convert​::Info=HASH(0x35b3038))
called at ../../tp/Texinfo/Convert/Info.pm line 81

Texinfo​::Convert​::Info​::output(Texinfo​::Convert​::Info=HASH(0x35b3038),
HASH(0x35ab7c0)) called at ../texi2any.pl <http​://texi2any.pl> line
1348

panic​: POPSTACK
Debugged program terminated.
#####

Since I have never previously used the regex debugger, I have no
idea if there are any clues to a solution in that output.

Thoughts?

Have you tried putting a breakpoint or assert on the code that issues
the error message about an attempt to modify a RO value, and then
used
gdb to get a stack trace?

If its a macro that you arent sure where it gets called, put an
assert
that will fail, like assert(0); in where the code to print the
message
gets called, and then run under gdb.

Once you have a stack-trace we will be interested in whatever happens
in or after the call to reginclass().

Once you have that you should be able to add a sv_dump() call to the
right place to see the var that is triggering the RO. (You can do
this
right before the assert(0); call).

BTW, from what i can tell from the re 'debug' output, the error comes
from trying to fulfil

[^\s\p{InFullwidth}]

which is the third item in that alternation. So it seems like the
issue is somehow related to negating a charclass, possibly in
negating
the return from InFullWidth. I was unaware that \p{} could execute
at
run time, I thought it would be resolved at compile time and baked
into the char-class. So I would also look for things in InFullWidth
that might not work if called twice. Does InFullWidth somehow return
constant values?

Anyway, I have not managed to find a copy of InFullWidth to look at,
nor the code that calls this. But I would use Devel​::Peek sv_dump(),
gdb, and the techniques above to dig into this if I were debugging
it.
I am not sure where the core comes in, i have never really debugged a
core dump.

Another thing I would do is twiddle that charclass. If you change it
to

[^ \t\n\r\p{InFullWidth}]

does the bug go away?

What happens if you remove all the \p{InFullWidth} from that pattern?

No doubt you are way beyond any of the advice here, but i thought i
would write this anyway.

Yves

User-defined \p properties, which InFullWidth is, that aren't
available
at compile time are expanded at run time. It used to be that all \p
were done at run time. The problem here likely is that the changes in
the blamed commit are trying to modify a swash byproduct that has been
set read-only. I have looked at the code, and don't see how that is
happening, which is why a stack trace is needed. So, getting that
back
trace is the best way forward.

Once we're into the world of gdb and backtrace, we're at a point where people more fluent in those techniques than me should take over.

To facilitate this, I have placed a copy of the texinfo-6.1 source code on github.com​:

https://github.com/jkeenan/texinfo-p5p

In a local clone of this repository you can create branches with debugging code to your heart's content.

For suggestions as to how to proceed, see https://github.com/jkeenan/texinfo-p5p/blob/master/130010_debugging_procedure.pod.

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Author

p5pRT commented Nov 9, 2016

From @hvds

I built and installed blead@​392582f8 with './Configure -des -Dcc=gcc -Dprefix=/opt/blead-d0 -Doptimize='-g -O0' -DDEBUGGING -Dusedevel -Uversiononly'.

On attempting the texinfo build, the first point it gives the error can be simplified to this command​:
% ( cd doc && TEXINFO_DEV_SOURCE=1 top_srcdir=".." top_builddir=".." ${PERL} ../tp/texi2any -I . -o texinfo.info texinfo.texi )
Modification of a read-only value attempted at ../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
panic​: POPSTACK
%

After a quick grep for the error message (found in perl.h) and then for PL_no_modify, I rebuilt and installed the perl with this patch​:
--- a/util.c
+++ b/util.c
@@ -1877,6 +1877,7 @@ paths reduces CPU cache pressure.
 void
 Perl_croak_no_modify(void)
 {
+    assert(0);
     Perl_croak_nocontext( "%s", PL_no_modify);
 }

On reinvoking the above command, I get a core dump with the backtrace below. I'll try to cut down the repro case, but the backtrace might allow others to make some progress in the meantime.

Hugo

(gdb) where
#0 0x00007f60916d2cc9 in __GI_raise (sig=sig@​entry=6)
  at ../nptl/sysdeps/unix/sysv/linux/raise.c​:56
#1 0x00007f60916d60d8 in __GI_abort () at abort.c​:89
#2 0x00007f60916cbb86 in __assert_fail_base (
  fmt=0x7f609181c830 "%s%s%s​:%u​: %s%sAssertion `%s' failed.\n%n",
  assertion=assertion@​entry=0x7de9ce "0", file=file@​entry=0x7dcd77 "util.c",
  line=line@​entry=1880,
  function=function@​entry=0x7e25b0 <__PRETTY_FUNCTION__.15620> "Perl_croak_no_modify") at assert.c​:92
#3 0x00007f60916cbc32 in __GI___assert_fail (assertion=0x7de9ce "0",
  file=0x7dcd77 "util.c", line=1880,
  function=0x7e25b0 <__PRETTY_FUNCTION__.15620> "Perl_croak_no_modify")
  at assert.c​:101
#4 0x000000000056162e in Perl_croak_no_modify () at util.c​:1880
#5 0x00000000005ed6f1 in Perl_sv_force_normal_flags (sv=0x712f998, flags=4)
  at sv.c​:5248
#6 0x00000000005ec24f in Perl_sv_usepvn_flags (sv=0x712f998,
  ptr=0x7134ac0 "", len=328, flags=256) at sv.c​:5087
#7 0x000000000050afb6 in S_invlist_replace_list_destroys_src (dest=0x712f998,
  src=0x712f9b0) at regcomp.c​:8492
#8 0x000000000050d26e in Perl__invlist_union_maybe_complement_2nd (
  a=0x384be30, b=0x712f998, complement_b=false, output=0x7fff2ed01748)
  at regcomp.c​:9243
#9 0x00000000006fa6d9 in Perl__core_swash_init (pkg=0x7d024a "utf8",
  name=0x7c5966 "", listsv=0x384bdb8, minbits=1, none=0, invlist=0x384be30,
  flags_p=0x7fff2ed01990 "\005") at utf8.c​:3401
#10 0x000000000053b7df in Perl__get_regclass_nonbitmap_data (prog=0x38460b0,
  node=0x3854e5c, doinit=true, listsvp=0x0,
  only_utf8_locale_ptr=0x7fff2ed01ad0, output_invlist=0x0) at regcomp.c​:18080
#11 0x00000000006f0464 in S_reginclass (prog=0x38460b0, n=0x3854e5c,
  p=0x2cf63a0 "This is ", p_end=0x2cf63a1 "his is ", utf8_target=false)
  at regexec.c​:9338
#12 0x00000000006e33e6 in S_regmatch (reginfo=0x7fff2ed024a0,
  startpos=0x2cf63a0 "This is ", prog=0x3854dc0) at regexec.c​:6342
#13 0x00000000006d8d83 in S_regtry (reginfo=0x7fff2ed024a0,
  startposp=0x7fff2ed02308) at regexec.c​:3641
#14 0x00000000006d87fb in Perl_regexec_flags (rx=0x384bc80,
  stringarg=0x2cf63a0 "This is ", strend=0x2cf63a8 "",
  strbeg=0x2cf63a0 "This is ", minend=1, sv=0x384bbc0, data=0x0, flags=0)
  at regexec.c​:3498
#15 0x000000000064c198 in Perl_pp_split () at pp.c​:6022
#16 0x000000000055acf1 in Perl_runops_debug () at dump.c​:2249
#17 0x0000000000462aa2 in S_run_body (oldscope=1) at perl.c​:2526
#18 0x0000000000462093 in perl_run (my_perl=0x26cf010) at perl.c​:2449
#19 0x000000000041ef95 in main (argc=7, argv=0x7fff2ed029e8,
  env=0x7fff2ed02a28) at perlmain.c​:123

@p5pRT
Author

p5pRT commented Nov 9, 2016

From @demerphq

On 9 November 2016 at 17​:22, Hugo van der Sanden via RT
<perlbug-followup@​perl.org> wrote​:

I built and installed blead@​392582f8 with './Configure -des -Dcc=gcc -Dprefix=/opt/blead-d0 -Doptimize='-g -O0' -DDEBUGGING -Dusedevel -Uversiononly'.

On attempting the texinfo build, the first point it gives the error can be simplified to this command​:
% ( cd doc && TEXINFO_DEV_SOURCE=1 top_srcdir=".." top_builddir=".." ${PERL} ../tp/texi2any -I . -o texinfo.info texinfo.texi )
Modification of a read-only value attempted at ../tp/Texinfo/Convert/ParagraphNonXS.pm line 327.
panic​: POPSTACK
%

After a quick grep for the error message (found in perl.h) and then for PL_no_modify, I rebuilt and installed the perl with this patch​:
--- a/util.c
+++ b/util.c
@​@​ -1877,6 +1877,7 @​@​ paths reduces CPU cache pressure.
void
Perl_croak_no_modify(void)
{
+ assert(0);
Perl_croak_nocontext( "%s", PL_no_modify);
}

On reinvoking the above command, I get a core dump with the backtrace below. I'll try to cut down the repro case, but the backtrace might allow others to make some progress in the meantime.

Hugo

(gdb) where
#0 0x00007f60916d2cc9 in __GI_raise (sig=sig@​entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c​:56
#1 0x00007f60916d60d8 in __GI_abort () at abort.c​:89
#2 0x00007f60916cbb86 in __assert_fail_base (
fmt=0x7f609181c830 "%s%s%s​:%u​: %s%sAssertion `%s' failed.\n%n",
assertion=assertion@​entry=0x7de9ce "0", file=file@​entry=0x7dcd77 "util.c",
line=line@​entry=1880,
function=function@​entry=0x7e25b0 <__PRETTY_FUNCTION__.15620> "Perl_croak_no_modify") at assert.c​:92
#3 0x00007f60916cbc32 in __GI___assert_fail (assertion=0x7de9ce "0",
file=0x7dcd77 "util.c", line=1880,
function=0x7e25b0 <__PRETTY_FUNCTION__.15620> "Perl_croak_no_modify")
at assert.c​:101
#4 0x000000000056162e in Perl_croak_no_modify () at util.c​:1880
#5 0x00000000005ed6f1 in Perl_sv_force_normal_flags (sv=0x712f998, flags=4)
at sv.c​:5248
#6 0x00000000005ec24f in Perl_sv_usepvn_flags (sv=0x712f998,
ptr=0x7134ac0 "", len=328, flags=256) at sv.c​:5087
#7 0x000000000050afb6 in S_invlist_replace_list_destroys_src (dest=0x712f998,
src=0x712f9b0) at regcomp.c​:8492
#8 0x000000000050d26e in Perl__invlist_union_maybe_complement_2nd (
a=0x384be30, b=0x712f998, complement_b=false, output=0x7fff2ed01748)
at regcomp.c​:9243
#9 0x00000000006fa6d9 in Perl__core_swash_init (pkg=0x7d024a "utf8",
name=0x7c5966 "", listsv=0x384bdb8, minbits=1, none=0, invlist=0x384be30,
flags_p=0x7fff2ed01990 "\005") at utf8.c​:3401
#10 0x000000000053b7df in Perl__get_regclass_nonbitmap_data (prog=0x38460b0,
node=0x3854e5c, doinit=true, listsvp=0x0,
only_utf8_locale_ptr=0x7fff2ed01ad0, output_invlist=0x0) at regcomp.c​:18080
#11 0x00000000006f0464 in S_reginclass (prog=0x38460b0, n=0x3854e5c,
p=0x2cf63a0 "This is ", p_end=0x2cf63a1 "his is ", utf8_target=false)
at regexec.c​:9338
#12 0x00000000006e33e6 in S_regmatch (reginfo=0x7fff2ed024a0,
startpos=0x2cf63a0 "This is ", prog=0x3854dc0) at regexec.c​:6342
#13 0x00000000006d8d83 in S_regtry (reginfo=0x7fff2ed024a0,
startposp=0x7fff2ed02308) at regexec.c​:3641
#14 0x00000000006d87fb in Perl_regexec_flags (rx=0x384bc80,
stringarg=0x2cf63a0 "This is ", strend=0x2cf63a8 "",
strbeg=0x2cf63a0 "This is ", minend=1, sv=0x384bbc0, data=0x0, flags=0)
at regexec.c​:3498
#15 0x000000000064c198 in Perl_pp_split () at pp.c​:6022
#16 0x000000000055acf1 in Perl_runops_debug () at dump.c​:2249
#17 0x0000000000462aa2 in S_run_body (oldscope=1) at perl.c​:2526
#18 0x0000000000462093 in perl_run (my_perl=0x26cf010) at perl.c​:2449
#19 0x000000000041ef95 in main (argc=7, argv=0x7fff2ed029e8,
env=0x7fff2ed02a28) at perlmain.c​:123

Thanks. I did the same, and added an sv_dump to the caller of
Perl_croak_no_modify()

For some reason we are trying to process an INVLIST object, which
IIUIR should never be perl visible.

Karl, is this enough for you to dig in?

BTW, you don't need an installed perl to replicate this, I was able to
replicate with the following​:

TEXINFO_DEV_SOURCE=1 top_srcdir=".." top_builddir=".." gdb --args
/git_tree/perl/perl -I/git_tree/perl/lib ../tp/texi2any -I . -o
texinfo.info texinfo.texi

Yves

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
SV = INVLIST(0x919d00) at 0x592c4d0
  REFCNT = 1
  FLAGS = (POK,READONLY,PROTECT,pPOK)
  PV = 0x5931a60
  [0] 0x1100 .. 0x115F
  [2] 0x2329 .. 0x232A
  [4] 0x2E80 .. 0x2FFB
  [6] 0x3000 .. 0x303E
  [8] 0x3041 .. 0x4DB5
  [10] 0x4E00 .. 0x9FBB
  [12] 0xA000 .. 0xA4C6
  [14] 0xAC00 .. 0xD7A3
  [16] 0xF900 .. 0xFAD9
  [18] 0xFE10 .. 0xFE19
  [20] 0xFE30 .. 0xFE6B
  [22] 0xFF01 .. 0xFF60
  [24] 0xFFE0 .. 0xFFE6
  [26] 0x20000 .. 0x2FFFD
  [28] 0x30000 .. 0x3FFFD
  CUR = 248
  LEN = 258
perl​: util.c​:1880​: Perl_croak_no_modify​: Assertion `0' failed.
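
For anyone reading the dump who has not met inversion lists before: as I understand it, the list is a sorted array of code points in which even-indexed elements start a range that is in the set and odd-indexed elements start a range that is not, which is why the dump labels the ranges [0], [2], [4] and so on. Below is a minimal sketch of a membership test on that layout. The array and function names are made up for illustration and are not perl's internal API; the data is just the first three ranges from the dump above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy inversion list (hypothetical name): the first three ranges of the
 * dump.  Even index = first code point IN the set, odd index = first code
 * point after the range, i.e. NOT in the set. */
static const uint32_t fullwidth_head[] = {
    0x1100, 0x1160,   /* [0] 0x1100 .. 0x115F */
    0x2329, 0x232B,   /* [2] 0x2329 .. 0x232A */
    0x2E80, 0x2FFC,   /* [4] 0x2E80 .. 0x2FFB */
};

/* Return 1 if cp is in the set described by list[0..len-1], else 0.
 * Illustrative helper, not a perl core function. */
static int invlist_contains(const uint32_t *list, size_t len, uint32_t cp)
{
    size_t lo = 0, hi = len;

    assert(list != NULL || len == 0);

    /* Binary search for the number of elements that are <= cp. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid] <= cp)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* An odd count means the last boundary crossed opened an "in" range. */
    return (lo & 1) != 0;
}
```

With this layout a lookup is a binary search plus a parity check, and an empty set is simply a zero-length array.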

--
perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented Nov 9, 2016

From @demerphq

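The code responsible is this call in S_invlist_replace_list_destroys_src() (frame #7 of Hugo's backtrace, regcomp.c:8492):

    sv_usepvn_flags(dest,
                    (char *) array,
                    src_byte_len - 1,

                    /* This flag is documented to cause a copy to be avoided */
                    SV_HAS_TRAILING_NUL);

At this point dest is an INVLIST with the READONLY flag set, so sv_force_normal_flags() croaks (frames #5 and #4).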
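
To make the failure mode concrete, here is a toy model of a buffer-stealing call that refuses to touch a read-only value. It is only a sketch: toy_sv, TOY_READONLY and toy_usepvn() are invented for this comment, and the -1 return stands in for the croak that the real code raises.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_READONLY 0x1u   /* hypothetical stand-in for the READONLY flag */

/* Hypothetical, much-simplified scalar: a flag word plus an owned buffer. */
typedef struct {
    unsigned flags;
    char    *buf;
    size_t   len;
} toy_sv;

/* Make sv adopt newbuf instead of copying it.  Returns 0 on success and
 * -1 if sv is read-only (the real code croaks at this point instead). */
static int toy_usepvn(toy_sv *sv, char *newbuf, size_t newlen)
{
    assert(sv != NULL);

    if (sv->flags & TOY_READONLY)
        return -1;          /* "Modification of a read-only value attempted" */

    sv->buf = newbuf;
    sv->len = newlen;
    return 0;
}
```

The check runs before any state is changed, so a refused call leaves the destination exactly as it was.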

Karl, is this enough for you to figure it out?

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented Nov 9, 2016

From @demerphq

On 9 November 2016 at 10​:12, Karl Williamson <khw@​cpan.org> wrote​:

User-defined \p properties, which InFullWidth is, that aren't available at
compile time are expanded at run time.

I don't get it. If InFullWidth is not loaded, how can people use it?

I would just make that a fatal exception.

It used to be that all \p were done
at run time. The problem here likely is that the changes in the blamed
commit are trying to modify a swash byproduct that has been set read-only.
I have looked at the code, and don't see how that is happening, which is why
a stack trace is needed. So, getting that back trace is the best way
forward.

I don't think the regex engine should call things like this twice. It
should do it once at compile time, and then never call it again. Like,
what happens if InFullWidth returned a different table every time?

Anyway, this is just my opinion, but IMO there is little difference
between \p{} and \N{} in this regard, and we take great care to ensure
\N{} callbacks happen once, and never happen again, even if the
pattern is embedded.

Feel free to point out the flaw in my reasoning.

cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented Nov 9, 2016

From @hvds

I was able to reduce it further to a standalone case:

% cat lib/Unicode/EastAsianWidth.pm
package Unicode::EastAsianWidth;
use strict;
use base 'Exporter';

our @EXPORT = qw(InFullwidth);

sub InFullwidth {
  return <<END;
END
}

1;
__END__
% perl -Ilib -e 'A::xx(); package A { use Unicode::EastAsianWidth; sub xx { split /[^\s\p{InFullwidth}]/, "x" } }'
perl: util.c:1880: Perl_croak_no_modify: Assertion `0' failed.
Aborted (core dumped)
%

Note that if the module is C<use>d before the package declaration, the assertion is not hit.

Hugo

p5pRT commented Nov 9, 2016

From @demerphq

Oooh. ++ Even further:

./perl -Ilib -e 'A::xx(); package A; sub InFullwidth{ return "\n" }
sub xx { split /[^\s\p{InFullwidth}]/, "x" }'
SV = INVLIST(0x2391d00) at 0x2387178
  REFCNT = 1
  FLAGS = (READONLY,PROTECT)
  PV = 0x2464530
  CUR = 0
  LEN = 9
perl: util.c:1880: Perl_croak_no_modify: Assertion `0' failed.
Aborted

If I replace the \s with its logical equivalent there is no assert:

./perl -Ilib -e 'A::xx(); package A { sub InFullwidth{ return "\n" }
sub xx { split /[^ \t\r\n\p{InFullwidth}]/, "x" } }'

and if I replace it with \w there is also no assert, but with \s, \h,
\d there is.

I have been poking around a bit more, and the place the invlist gets
its readonly flag turned on is the end of Perl__swash_to_invlist().

It looks to me like it gets compiled once, then for some reason we try
to compile it again, and it blows up.

Yves

p5pRT commented Nov 9, 2016

From @khwilliamson

I have enough info to debug now.

User-defined properties are implemented by the user defining
subroutines. A subroutine need not be defined at the lexical calling
point. That's why these properties need to be deferrable until runtime.
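
A rough C sketch of what "deferrable" means here, under loudly stated assumptions: the registry, toy_pattern and every function name below are invented for illustration and do not correspond to anything in regcomp.c. The pattern records the property name when it is compiled; if no definition exists yet, the lookup is retried the first time the pattern is matched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A property definition is modelled as a function returning range text. */
typedef const char *(*prop_fn)(void);

typedef struct { const char *name; prop_fn fn; } prop_entry;

#define MAX_PROPS 8
static prop_entry registry[MAX_PROPS];   /* hypothetical symbol table */
static size_t registry_len;

/* Stand-in for "the user defines sub InFullwidth { ... }". */
static void define_property(const char *name, prop_fn fn)
{
    assert(registry_len < MAX_PROPS);
    registry[registry_len].name = name;
    registry[registry_len].fn = fn;
    registry_len++;
}

static prop_fn find_property(const char *name)
{
    size_t i;
    for (i = 0; i < registry_len; i++)
        if (strcmp(registry[i].name, name) == 0)
            return registry[i].fn;
    return NULL;
}

typedef struct { const char *prop_name; prop_fn resolved; } toy_pattern;

/* Compile time: resolve if possible, otherwise leave it for later. */
static toy_pattern compile_pattern(const char *prop_name)
{
    toy_pattern p;
    p.prop_name = prop_name;
    p.resolved = find_property(prop_name);   /* may still be NULL */
    return p;
}

/* Match time: retry the lookup once the definition may have appeared. */
static const char *match_time_definition(toy_pattern *p)
{
    if (p->resolved == NULL)
        p->resolved = find_property(p->prop_name);
    return p->resolved ? p->resolved() : NULL;
}

/* Example definition: one tab-separated hex range per line. */
static const char *in_fullwidth(void) { return "1100\t115F\n"; }
```

In this model the definition is looked up once it exists and then cached in the pattern, so later matches reuse the same resolved function.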

p5pRT commented Nov 9, 2016

From @mauke

Am 09.11.2016 um 21​:25 schrieb Karl Williamson​:

User-defined properties are implemented by the user defining
subroutines. A subroutine need not be defined at the lexical calling
point. That's why these properties need to be deferrable until runtime.

That, by itself, isn't a very convincing rationale. Attributes are also
implemented by the user defining subroutines, and they have to be
defined at the calling point.

--
Lukas Mai <plokinom@​gmail.com>

p5pRT commented Nov 9, 2016

From @khwilliamson

On 11/09/2016 09​:29 PM, Lukas Mai wrote​:

Am 09.11.2016 um 21​:25 schrieb Karl Williamson​:

User-defined properties are implemented by the user defining
subroutines. A subroutine need not be defined at the lexical calling
point. That's why these properties need to be deferrable until runtime.

That, by itself, isn't a very convincing rationale. Attributes are
also implemented by the user defining subroutines, and they have to be
defined at the calling point.

This is the way it has already worked. Changing it would break this
module, and who knows how many others.

p5pRT commented Nov 9, 2016

From @khwilliamson

On 11/09/2016 09​:32 PM, Karl Williamson wrote​:

On 11/09/2016 09​:29 PM, Lukas Mai wrote​:

Am 09.11.2016 um 21​:25 schrieb Karl Williamson​:

User-defined properties are implemented by the user defining
subroutines. A subroutine need not be defined at the lexical calling
point. That's why these properties need to be deferrable until runtime.

That, by itself, isn't a very convincing rationale. Attributes are
also implemented by the user defining subroutines, and they have to
be defined at the calling point.

This is the way it has already worked. Changing it would break this
module, and who knows how many others.

s/already/always/

p5pRT commented Nov 9, 2016

From @jkeenan

Another backtrace​:

https://gist.github.com/jkeenan/6dbf40aa5d4301511b287961d8d8f7db

--
James E Keenan (jkeenan@​cpan.org)

p5pRT commented Nov 10, 2016

From @jkeenan

khw created a new branch​: origin/smoke-me/khw-kid51 and in commit 0d87c6c provided a fix for this problem.

In that same branch I adapted Yves' test above and pushed it in commit 1f3d9c6. (I haven't written too many regex tests in core, so if it's not in the right format, please feel free to adjust.)

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

p5pRT commented Nov 14, 2016

@khwilliamson - Status changed from 'open' to 'pending release'

p5pRT commented Mar 18, 2017

From @jkeenan

Karl, on Nov 14 you marked this ticket as "Pending Release," but there's nothing in the ticket indicating when blead was patched.

Does the attached diff demonstrate that blead has been fixed?

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

p5pRT commented Mar 18, 2017

From @jkeenan

130010.diff
diff --git a/t/re/pat_advanced.t b/t/re/pat_advanced.t
index 5eb2cc5..08f4f53 100644
--- a/t/re/pat_advanced.t
+++ b/t/re/pat_advanced.t
@@ -2433,6 +2433,15 @@ EOF
                         'No segfault [perl #126886]');
     }
 
+    {
+        # [perl 130010]  Downstream application texinfo started to report panics
+        # as of commit a5540cf.
+
+        runperl( prog => 'A::xx(); package A; sub InFullwidth{ return qq|\n| } sub xx { split /[^\s\p{InFullwidth}]/, q|x| }' );
+        ok(! $?, "User-defined pattern did not cause panic [perl 130010]");
+    }
+
+
     # !!! NOTE that tests that aren't at all likely to crash perl should go
     # a ways above, above these last ones.  There's a comment there that, like
     # this comment, contains the word 'NOTE'
diff --git a/utf8.c b/utf8.c
index 0a4466f..4dbefe5 100644
--- a/utf8.c
+++ b/utf8.c
@@ -3398,6 +3398,7 @@ Perl__core_swash_init(pTHX_ const char* pkg, const char* name, SV *listsv, I32 m
 		/* Add the passed-in inversion list, which invalidates the one
 		 * already stored in the swash */
 		invlist_in_swash_is_valid = FALSE;
+                SvREADONLY_off(swash_invlist);  /* Turned on again below */
 		_invlist_union(invlist, swash_invlist, &swash_invlist);
 	    }
 	    else {
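
The utf8.c hunk above amounts to an unlock, modify, relock sequence around the union. Below is a minimal sketch of that pattern on a toy inversion list. Everything named toy_* (and invlist_union() itself) is hypothetical and much simpler than the real _invlist_union(); the only parts taken from the patch are the ideas of clearing the read-only flag before the union and turning it on again afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_READONLY 0x1u
#define TOY_CAP 16

/* Toy inversion list: even index starts an "in" range, odd index ends it. */
typedef struct {
    unsigned flags;
    uint32_t elems[TOY_CAP];
    size_t   len;
} toy_invlist;

/* Union of two inversion lists by merging boundaries and tracking how many
 * of the two inputs currently cover the position.  out needs na + nb slots. */
static size_t invlist_union(const uint32_t *a, size_t na,
                            const uint32_t *b, size_t nb, uint32_t *out)
{
    size_t ia = 0, ib = 0, n = 0;
    int depth = 0;

    while (ia < na || ib < nb) {
        int before = depth;
        uint32_t cp = (ib >= nb || (ia < na && a[ia] <= b[ib])) ? a[ia] : b[ib];

        /* Consume the boundary at cp from whichever inputs have one. */
        if (ia < na && a[ia] == cp) { depth += (ia & 1) ? -1 : 1; ia++; }
        if (ib < nb && b[ib] == cp) { depth += (ib & 1) ? -1 : 1; ib++; }

        /* Emit cp only where membership of the union actually flips. */
        if ((before > 0) != (depth > 0))
            out[n++] = cp;
    }
    return n;
}

/* Modify dest in place; refuses (-1) when dest is read-only. */
static int toy_union_into(toy_invlist *dest, const uint32_t *add, size_t nadd)
{
    uint32_t tmp[TOY_CAP];
    size_t n;

    if (dest->flags & TOY_READONLY)
        return -1;
    assert(dest->len + nadd <= TOY_CAP);

    n = invlist_union(dest->elems, dest->len, add, nadd, tmp);
    memcpy(dest->elems, tmp, n * sizeof tmp[0]);
    dest->len = n;
    return 0;
}

/* The pattern from the patch: unlock, modify, lock again. */
static int toy_swash_add(toy_invlist *cached, const uint32_t *add, size_t nadd)
{
    int rc;

    cached->flags &= ~TOY_READONLY;     /* cf. SvREADONLY_off() */
    rc = toy_union_into(cached, add, nadd);
    cached->flags |= TOY_READONLY;      /* turned on again */
    return rc;
}
```

Calling toy_union_into() directly on a cached, read-only list fails in the same way the original report did, while toy_swash_add() succeeds and leaves the list read-only afterwards.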

p5pRT commented Mar 20, 2017

From @iabyn

On Fri, Mar 17, 2017 at 05​:18​:47PM -0700, James E Keenan via RT wrote​:

Karl, on Nov 14 you marked this ticket as "Pending Release," but there's nothing in the ticket indicating when blead was patched.

Well, Karl applied the suggested fix and test as v5.25.6-208-geee4c92 and
v5.25.6-209-g553fa53, so I assume the 'Pending Release' status is valid.

--
The warp engines start playing up a bit, but seem to sort themselves out
after a while without any intervention from boy genius Wesley Crusher.
  -- Things That Never Happen in "Star Trek" #17

p5pRT commented Mar 27, 2017

From @iabyn

On Mon, Mar 20, 2017 at 01​:18​:24PM +0000, Dave Mitchell wrote​:

On Fri, Mar 17, 2017 at 05​:18​:47PM -0700, James E Keenan via RT wrote​:

Karl, on Nov 14 you marked this ticket as "Pending Release," but there's nothing in the ticket indicating when blead was patched.

Well, Karl applied the suggested fix and test as v5.25.6-208-geee4c92 and
v5.25.6-209-g553fa53, so I assume the 'Pending Release' status is valid.

so I've removed it from the blockers list

--
Never do today what you can put off till tomorrow.

p5pRT commented May 30, 2017

From @khwilliamson

Thank you for filing this report. You have helped make Perl better.

With the release today of Perl 5.26.0, this and 210 other issues have been
resolved.

Perl 5.26.0 may be downloaded via​:
https://metacpan.org/release/XSAWYERX/perl-5.26.0

If you find that the problem persists, feel free to reopen this ticket.

p5pRT commented May 30, 2017

@khwilliamson - Status changed from 'pending release' to 'resolved'
