Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiconcat breaks blead on VMS #16219

Closed
p5pRT opened this issue Nov 3, 2017 · 18 comments
Closed

multiconcat breaks blead on VMS #16219

p5pRT opened this issue Nov 3, 2017 · 18 comments

Comments

@p5pRT
Copy link

p5pRT commented Nov 3, 2017

Migrated from rt.perl.org#132390 (status was 'resolved')

Searchable as RT132390$

@p5pRT
Copy link
Author

p5pRT commented Nov 3, 2017

From @craigberry

Currently (v5.27.5-183-g78a6433791) miniperl is unable to run ConfigPM on VMS in a default configuration, which of course means blead does not build. Switching to -Duse64bitint (which is not the default) makes it work for some reason. Seems to happen with or without -DDEBUGGING. Whittling down ConfigPM to the offending line, I eventually ended up with a reproducer that looks like, in Unix parlance​:

./miniperl -Ilib -e 'my $s; $s .= q||;'

but below is how it looks on VMS with Perl's -Dt output and the VMS debugger output mixed together​:

$ MCR []dbgminiperl.exe "-I[.lib]" -"Dt" -e "my $s; $s .= q||;"
%DEBUG-W-DWNOT1PROC, the 1 process debugger cannot be run in DECwindows mode

  OpenVMS I64 Debug64 Version V8.4-001

%DEBUG-I-INITIAL, Language​: C, Module​: MINIPERLMAIN
%DEBUG-I-NOTATMAIN, Type GO to reach MAIN program

DBG> go
break at routine MINIPERLMAIN\main
182179​: PL_use_safe_putenv = FALSE;
DBG> set break/exception
DBG> go
(-e​:0) nextstate
(-e​:0) const(PV("lib/buildcustomize.pl"\0))
(-e​:0) padsv($f)
(-e​:0) sassign
(-e​:0) nextstate
(-e​:0) enter
(-e​:0) nextstate
(-e​:0) gvsv(main​::!)
(-e​:0) nextstate
(-e​:0) padsv($f)
(-e​:0) ftfile
(-e​:0) leave
(-e​:0) and
(-e​:0) padsv($f)
(-e​:0) dofile
(lib/buildcustomize.pl​:0) nextstate
(lib/buildcustomize.pl​:10) pushmark
(lib/buildcustomize.pl​:10) gv(main​::INC)
(lib/buildcustomize.pl​:10) rv2av
(lib/buildcustomize.pl​:10) const(IV(0))
(lib/buildcustomize.pl​:10) const(IV(1))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.cpan.AutoLoader.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.dist.Carp.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.dist.PathTools]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.dist.PathTools.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.cpan.ExtUtils-Install.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.cpan.ExtUtils-MakeMaker.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.cpan.ExtUtils-Manifest.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.cpan.File-Path.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.ext.re]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.dist.Term-ReadLine.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.dist.Exporter.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.ext.File-Find.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.cpan.Text-Tabs.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.dist.constant.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.cpan.version.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead.ext.VMS-Filespec.lib]"\0))
(lib/buildcustomize.pl​:10) const(PV("D0​:[CRAIG.blead]lib"\0))
(lib/buildcustomize.pl​:10) splice
(lib/buildcustomize.pl​:10) nextstate
(lib/buildcustomize.pl​:27) const(PV("VMS"\0))
(lib/buildcustomize.pl​:27) gvsv(main​::\x0f)
(lib/buildcustomize.pl​:27) sassign
(lib/buildcustomize.pl​:27) leaveeval
(-e​:0) or
(-e​:0) leavesub

EXECUTING...

(-e​:0) enter
(-e​:0) nextstate
(-e​:1) padsv($s)
(-e​:1) nextstate
(-e​:1) multiconcat("",0)
%DEBUG-I-DYNMODSET, setting module PP_HOT
%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual address=FFFFFFFF017DBAA4, PC=00000000004C5650, PS=0000001B
break on exception at PP_HOT\Perl_pp_multiconcat\%LINE 182317+112
182317​: SETTARG;
DBG> show calls
module name routine name line rel PC abs PC
*PP_HOT Perl_pp_multiconcat
  182317 000000000000CAC0 00000000004C5650
*DUMP Perl_runops_debug
  185617 00000000000205B0 00000000002E3EC0
*PERLMINI S_run_body 184837 000000000000F850 000000000004FF60
*PERLMINI perl_run 184760 000000000000E9B0 000000000004F0C0
*MINIPERLMAIN main 182219 0000000000000400 0000000000040400
*MINIPERLMAIN __main 182163 0000000000000130 0000000000040130
  AMAC$EMUL_CALL 000000000068C400 00000000006CC400
  LIB$INITIALIZE 000000000068CD90 00000000006CCD90
  FFFFFFFF80A39CB0 FFFFFFFF80A39CB0
DBG> examine targ
PP_HOT\Perl_pp_multiconcat\targ​: 29727456
DBG> examine/hex targ
PP_HOT\Perl_pp_multiconcat\targ​: 01C59AE0
DBG> set module dump
DBG> call Perl_sv_dump(targ)
SV = PV(0x17d9a40) at 0x1c59ae0
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x1c454b0 ""\0
  CUR = 0
  LEN = 10
value returned is 25004664
DBG> call Perl_sv_dump(sp)
SV = NULL(0x1c59ae0) at 0x17dbaa4
  REFCNT = 0
  FLAGS = ()
value returned is 25004664

A "reason mask" of 04 in an access violation means it's trying to write to memory to which it does not have access, in this case the stack pointer​:

DBG> examine/hex sp
PP_HOT\Perl_pp_multiconcat\sp​: 017DBAA4

I can't provide the output of perl -V since Config is not successfully built. Here are the config args​:

$ search config.sh config_args
config_args='-"Dusedevel" -"DDEBUGGING" -"Dusevmsdebug" -"des"'

I will try to think of what else to poke at over the next day or two, but something seems to have clobbered the sp pointer, which is written to via the SETTARG macro in Perl_pp_multiconcat.

________________________________________
Craig A. Berry

"... getting out of a sonnet is much more
difficult than getting in."
  Brad Leithauser

@p5pRT
Copy link
Author

p5pRT commented Nov 3, 2017

From @craigberry

On Thu, Nov 2, 2017 at 10​:57 PM, Craig A.Berry
<perlbug-followup@​perl.org> wrote​:

$ MCR []dbgminiperl.exe "-I[.lib]" -"Dt" -e "my $s; $s .= q||;"
<snip>
(-e​:1) multiconcat("",0)
%DEBUG-I-DYNMODSET, setting module PP_HOT
%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual address=FFFFFFFF017DBAA4, PC=00000000004C5650, PS=0000001B
break on exception at PP_HOT\Perl_pp_multiconcat\%LINE 182317+112
182317​: SETTARG;
DBG> show calls
module name routine name line rel PC abs PC
*PP_HOT Perl_pp_multiconcat
182317 000000000000CAC0 00000000004C5650

I started looking at Perl_pp_multiconcat and don't claim to understand
much of what's going on, but I did note that the comments say what it
does is "Concatenate one or more args," yet the number of arguments
(nargs) in the case that blows up is zero. So then I wondered about
this line​:

SP -= (nargs - 1);

which, when nargs is zero, actually adds to the stack pointer rather
than subtracting from it. Is that the intention?

@p5pRT
Copy link
Author

p5pRT commented Nov 3, 2017

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Nov 4, 2017

From @iabyn

On Fri, Nov 03, 2017 at 04​:16​:57PM -0500, Craig A. Berry wrote​:

I started looking at Perl_pp_multiconcat and don't claim to understand
much of what's going on, but I did note that the comments say what it
does is "Concatenate one or more args," yet the number of arguments
(nargs) in the case that blows up is zero. So then I wondered about
this line​:

SP -= (nargs - 1);

which, when nargs is zero, actually adds to the stack pointer rather
than subtracting from it. Is that the intention?

Yes, that's intentional. Zero args are are passed on the stack, but
one arg is returned by SETTARG, In fact there's even special-cased stack
extending in this case​:

  if (!nargs)
  /* $lex .= "const" doesn't cause anything to be pushed */
  EXTEND(SP,1);

--
The Enterprise is captured by a vastly superior alien intelligence which
does not put them on trial.
  -- Things That Never Happen in "Star Trek" #10

@p5pRT
Copy link
Author

p5pRT commented Nov 4, 2017

From @nwc10

On Thu, Nov 02, 2017 at 08​:57​:05PM -0700, Craig A.Berry wrote​:

# New Ticket Created by Craig A. Berry
# Please include the string​: [perl #132390]
# in the subject line of all future correspondence about this issue.
# <URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=132390 >

Currently (v5.27.5-183-g78a6433791) miniperl is unable to run ConfigPM on VMS in a default configuration, which of course means blead does not build. Switching to -Duse64bitint (which is not the default) makes it work for some reason. Seems to happen with or without -DDEBUGGING. Whittling down ConfigPM to the offending line, I eventually ended up with a reproducer that looks like, in Unix parlance​:

./miniperl -Ilib -e 'my $s; $s .= q||;'

Neither ASAN nor valgrind, for both x86 or x86_54, for threaded and unthreaded
showed anything for me on Linux, so it's a bit of a mystery as to what makes
VMS upset. (I think I tried most permutations, but maybe not quite all - the
x86 VM is relatively slow.)

I can't provide the output of perl -V since Config is not successfully built. Here are the config args​:

$ search config.sh config_args
config_args='-"Dusedevel" -"DDEBUGGING" -"Dusevmsdebug" -"des"'

However, I'm not familiar enough with what VMS defaults to (eg threading or
not) - could you provide a perl -V output for a fairly recent successful
build with that configuration?

Although that's more "for the benefit of others" because I don't really have
any time or insight to chase this further currently, given that it's proving
impossible to reproduce on another platform (when it feels like it should be
possible to.)

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented Nov 4, 2017

From @craigberry

On Nov 4, 2017, at 6​:52 AM, Dave Mitchell via RT <perlbug-followup@​perl.org> wrote​:

On Fri, Nov 03, 2017 at 04​:16​:57PM -0500, Craig A. Berry wrote​:

I started looking at Perl_pp_multiconcat and don't claim to understand
much of what's going on, but I did note that the comments say what it
does is "Concatenate one or more args," yet the number of arguments
(nargs) in the case that blows up is zero. So then I wondered about
this line​:

SP -= (nargs - 1);

which, when nargs is zero, actually adds to the stack pointer rather
than subtracting from it. Is that the intention?

Yes, that's intentional. Zero args are are passed on the stack, but
one arg is returned by SETTARG, In fact there's even special-cased stack
extending in this case​:

   if \(\!nargs\)
       /\* $lex \.= "const" doesn't cause anything to be pushed \*/
       EXTEND\(SP\,1\);

Thanks. I had seen that EXTEND and wondered why it didn't do anything (never calls Perl_stack_grow). It thinks it already has enough stack space​:

stepped to PP_HOT\Perl_pp_multiconcat\%LINE 181635
181635​: EXTEND(SP,1);
DBG> examine PL_stack_max, sp
PP_HOT\PL_stack_max​: 12700044
PP_HOT\Perl_pp_multiconcat\sp​: 12699536

So it appears that the stack we think we have isn't actually writable. I guess it's time for a visit to Perl_init_stacks.

@p5pRT
Copy link
Author

p5pRT commented Nov 4, 2017

From @craigberry

On Nov 4, 2017, at 7​:09 AM, Nicholas Clark via RT <perlbug-followup@​perl.org> wrote​:

On Thu, Nov 02, 2017 at 08​:57​:05PM -0700, Craig A.Berry wrote​:

# New Ticket Created by Craig A. Berry
# Please include the string​: [perl #132390]
# in the subject line of all future correspondence about this issue.
# <URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=132390 >

Currently (v5.27.5-183-g78a6433791) miniperl is unable to run ConfigPM on VMS in a default configuration, which of course means blead does not build. Switching to -Duse64bitint (which is not the default) makes it work for some reason. Seems to happen with or without -DDEBUGGING. Whittling down ConfigPM to the offending line, I eventually ended up with a reproducer that looks like, in Unix parlance​:

./miniperl -Ilib -e 'my $s; $s .= q||;'

Neither ASAN nor valgrind, for both x86 or x86_54, for threaded and unthreaded
showed anything for me on Linux, so it's a bit of a mystery as to what makes
VMS upset. (I think I tried most permutations, but maybe not quite all - the
x86 VM is relatively slow.)

Thanks for poking at this.

I can't provide the output of perl -V since Config is not successfully built. Here are the config args​:

$ search config.sh config_args
config_args='-"Dusedevel" -"DDEBUGGING" -"Dusevmsdebug" -"des"'

However, I'm not familiar enough with what VMS defaults to (eg threading or
not) - could you provide a perl -V output for a fairly recent successful
build with that configuration?

The following is perl -V from the commit immediately preceding e839e6e, "Add OP_MULTICONCAT op" but in the same configuration.

$ perl -"V"
Summary of my perl5 (revision 5 version 27 subversion 6) configuration​:
  Snapshot of​: c0acf91
  Platform​:
  osname=VMS
  osvers=V8.4
  archname=VMS_IA64
  uname='VMS alma V8.4 HP rx2660 (1.67GHz/9.0MB)'
  config_args='-"Dusedevel" -"DDEBUGGING" -"Dusevmsdebug" -"des"'
  hint=none
  useposix=false
  d_sigaction=define
  useithreads=undef
  usemultiplicity=undef
  use64bitint=undef
  use64bitall=undef
  uselongdouble=undef
  usemymalloc=undef
  default_inc_excludes_dot=define
  bincompat5005=undef
  Compiler​:
  cc='CC/DECC'
  ccflags ='/Include=[]/Standard=Relaxed_ANSI/Prefix=All/Obj=.obj /NOANSI_ALIAS/float=ieee/ieee=denorm/NAMES=(SHORTENED)/Define=_USE_STD_STAT=1'
  optimize='/List/Debug/NoOpt'
  cppflags='undef'
  ccversion='70390020'
  gccversion=''
  gccosandvers='undef'
  intsize=4
  longsize=4
  ptrsize=4
  doublesize=8
  byteorder=1234
  doublekind=3
  d_longlong=define
  longlongsize=8
  d_longdbl=define
  longdblsize=16
  longdblkind=1
  ivtype='long'
  ivsize=4
  nvtype='double'
  nvsize=8
  Off_t='off_t'
  lseeksize=8
  alignbytes=8
  prototype=define
  Linker and Libraries​:
  ld='Link/nodebug'
  ldflags ='/Debug/Trace/Map'
  libpth=/sys$share /sys$library
  libs=
  perllibs=
  libc=(DECCRTL)
  so=exe
  useshrplib=true
  libperl=undef
  gnulibc_version='undef'
  Dynamic Linking​:
  dlsrc=dl_vms.xs
  dlext=exe
  d_dlsymun=undef
  ccdlflags=''
  cccdlflags=''
  lddlflags='/Share'

Characteristics of this PERLSHR image​:
  Compile-time options​:
  DEBUGGING
  HAS_TIMES
  HAVE_INTERP_INTERN
  PERLIO_LAYERS
  PERL_COPY_ON_WRITE
  PERL_DONT_CREATE_GVSV
  PERL_EXTERNAL_GLOB
  PERL_MALLOC_WRAP
  PERL_OP_PARENT
  PERL_PRESERVE_IVUV
  PERL_USE_DEVEL
  USE_IEEE
  USE_LARGE_FILES
  USE_LOCALE
  USE_LOCALE_COLLATE
  USE_LOCALE_CTYPE
  USE_LOCALE_NUMERIC
  USE_LOCALE_TIME
  USE_PERLIO
  USE_PERL_ATOF
  VMS_DO_SOCKETS
  VMS_SHORTEN_LONG_SYMBOLS
  Built under VMS
  Compiled at Nov 4 2017 09​:02​:12
  %ENV​:
  PERLSHR="perl_root​:[000000]perlshr.exe"
  PERL_ROOT="DSA0​:[craig.c0acf911f6.]"
  @​INC​:
  /perl_root/lib/site_perl/VMS_IA64
  /perl_root/lib/site_perl
  /perl_root/lib/VMS_IA64/5_27_6
  /perl_root/lib

Although that's more "for the benefit of others" because I don't really have
any time or insight to chase this further currently, given that it's proving
impossible to reproduce on another platform (when it feels like it should be
possible to.)

I hacked miniperl to eliminate what gets loaded by buildcustomize.pl​:

Inline Patch
--- perlmini.c;-0	2017-11-01 22:40:54 -0500
+++ perlmini.c	2017-11-03 07:02:59 -0500
@@ -2247,7 +2247,7 @@ S_parse_body(pTHX_ char **env, XSINIT_t
 	AV *const inc = GvAV(PL_incgv);
 	SV **const inc0 = inc ? av_fetch(inc, 0, FALSE) : NULL;

-	if (inc0) {
+	if (0) {
             /* if lib/buildcustomize.pl exists, it should not fail. If it does,
                it should be reported immediately as a build failure.  */
 	    (void)Perl_av_create_and_unshift_one(aTHX_ &PL_preambleav,

which skips some platform-specific code, but that made no difference.  So if it's something platform-specific, I haven't found it.  I suppose it could also be some unusual combination of intsize, longsize, ptrsize, alignbytes and friends that isn't common elsewhere.

@p5pRT
Copy link
Author

p5pRT commented Nov 4, 2017

From @craigberry

On Nov 4, 2017, at 3​:24 PM, Nicholas Clark <nick@​ccl4.org> wrote​:

I did wonder - if you run ./miniperl -Dst -e 'my $s; $s .= q||;'
(ie add -Ds) on both something *nix (where it works) and VMS, does the
output diverge in any way that gives a clue?

I configured on macOS Sierra (10.12.6) like so​:

$ ./Configure -Dusedevel -DDEBUGGING -Uuse64bitint -des

(hmm, Configure ignored the -Uuse64bitint -- I guess 32-bit support is gone on this platform)

and that gives me​:

$ ./miniperl -Dst -e 'my $s; $s .= q||;'

EXECUTING...

  =>
(-e​:0) enter
  =>
(-e​:0) nextstate
  =>
(-e​:1) padsv($s)
  => UNDEF
(-e​:1) nextstate
  =>
(-e​:1) multiconcat("",0)
  => PV(""\0)
(-e​:1) leave

whereas on VMS I get​:

$ MCR []miniperl.exe -"Dst" -e "my $s; $s .= q||;"'

EXECUTING...

  =>
(-e​:0) enter
  =>
(-e​:0) nextstate
  =>
(-e​:1) padsv($s)
  => UNDEF
(-e​:1) nextstate
  =>
(-e​:1) multiconcat("",0)

which as far as I can see is exactly the same until it dies. It falls down so hard that it doesn't even signal the crash unless I run it under the debugger and explicitly ask it to break on exceptions.

@p5pRT
Copy link
Author

p5pRT commented Nov 4, 2017

From @nwc10

On Sat, Nov 04, 2017 at 10​:25​:42AM -0500, Craig A. Berry wrote​:

On Nov 4, 2017, at 7​:09 AM, Nicholas Clark via RT <perlbug-followup@​perl.org> wrote​:

On Thu, Nov 02, 2017 at 08​:57​:05PM -0700, Craig A.Berry wrote​:

./miniperl -Ilib -e 'my $s; $s .= q||;'

I hacked miniperl to eliminate what gets loaded by buildcustomize.pl​:

--- perlmini.c;-0 2017-11-01 22​:40​:54 -0500
+++ perlmini.c 2017-11-03 07​:02​:59 -0500
@​@​ -2247,7 +2247,7 @​@​ S_parse_body(pTHX_ char **env, XSINIT_t
AV *const inc = GvAV(PL_incgv);
SV **const inc0 = inc ? av_fetch(inc, 0, FALSE) : NULL;

- if (inc0) {
+ if (0) {
/* if lib/buildcustomize.pl exists, it should not fail. If it does,
it should be reported immediately as a build failure. */
(void)Perl_av_create_and_unshift_one(aTHX_ &PL_preambleav,

which skips some platform-specific code, but that made no difference. So if it's something platform-specific, I haven't found it. I suppose it could also be some unusual combination of intsize, longsize, ptrsize, alignbytes and friends that isn't common elsewhere.

Which made me try it on what I guessed was the most alignment-grumpy machine
that had access to (or at least which I thought I had access to), Power8 on
the GCC compiler farm, and yet I can't get it to fail (on Linux or AIX)

Sadly I don't have access to anything Sparc (and more) as that was always
pretty thorough on catching out any bad assumptions.

But no, still can't replicate.

I did wonder - if you run ./miniperl -Dst -e 'my $s; $s .= q||;'
(ie add -Ds) on both something *nix (where it works) and VMS, does the
output diverge in any way that gives a clue?

(And I got rid of the -Ilib - I think that that is about the same as your
if (0) patch above - it will fail to find buildcustomize.pl, which you've
demonstrated isn't needed to provoke the crash)

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented Nov 4, 2017

From @iabyn

On Sat, Nov 04, 2017 at 03​:53​:33PM -0500, Craig A. Berry wrote​:

which as far as I can see is exactly the same until it dies. It falls
down so hard that it doesn't even signal the crash unless I run it under
the debugger and explicitly ask it to break on exceptions.

I'd suggest expand out these macro calls in pp_multiconcat​:

  EXTEND(SP,1);
  SETTARG;

into theie constituent parts (on separate lines if necessary), then
single-step through them and see exactly what happens to PL_sp_stack and
sp on entry, during the EXTEND, and when SETTARG is doing (*sp = targ).

PS - the -f option to miniperl stops it loading buildcustomize.pl (or
whatever its called).

--
O Unicef Clearasil!
Gibberish and Drivel!
  -- "Bored of the Rings"

@p5pRT
Copy link
Author

p5pRT commented Nov 6, 2017

From @craigberry

On Nov 4, 2017, at 5​:04 PM, Dave Mitchell <davem@​iabyn.com> wrote​:

On Sat, Nov 04, 2017 at 03​:53​:33PM -0500, Craig A. Berry wrote​:

which as far as I can see is exactly the same until it dies. It falls
down so hard that it doesn't even signal the crash unless I run it under
the debugger and explicitly ask it to break on exceptions.

I'd suggest expand out these macro calls in pp_multiconcat​:

EXTEND(SP,1);
SETTARG;

into theie constituent parts (on separate lines if necessary), then
single-step through them and see exactly what happens to PL_sp_stack and
sp on entry, during the EXTEND, and when SETTARG is doing (*sp = targ).

I have done so. sp and PL_stack_sp are equal to each other and are unchanged by the EXTEND. The only effect of the EXTEND is that PL_curstackinfo->si_stack_hwm gets bumped from 0 to 1.

The only thing SETTARG does is​:

(*sp = targ);

which on Itanium is the single instruction​:

st4 [r51] = targ

and that's where it blows up, flagging the value of sp as an address that can't be written to. But it can be​: I can readily execute a deposit command in the debugger that does the same thing as that statement and it works just fine. So I think there's something else hosed up and sp (in this case register 51) is just the last thing it knew about at the time of the exception.

AND, I've now tried it on Alpha with the same configuration and there is no problem there. So only Itanium has the problem. At the moment it's looking like more of a C stack corruption, but who knows. And that's the end of my weekend debugging time.

PS - the -f option to miniperl stops it loading buildcustomize.pl (or
whatever its called).

Good to know.

@p5pRT
Copy link
Author

p5pRT commented Nov 10, 2017

From @craigberry

On Sun, Nov 5, 2017 at 6​:19 PM, Craig A. Berry <craigberry@​mac.com> wrote​:

On Nov 4, 2017, at 5​:04 PM, Dave Mitchell <davem@​iabyn.com> wrote​:

On Sat, Nov 04, 2017 at 03​:53​:33PM -0500, Craig A. Berry wrote​:

which as far as I can see is exactly the same until it dies. It falls
down so hard that it doesn't even signal the crash unless I run it under
the debugger and explicitly ask it to break on exceptions.

I'd suggest expand out these macro calls in pp_multiconcat​:

EXTEND(SP,1);
SETTARG;

into theie constituent parts (on separate lines if necessary), then
single-step through them and see exactly what happens to PL_sp_stack and
sp on entry, during the EXTEND, and when SETTARG is doing (*sp = targ).

I have done so. sp and PL_stack_sp are equal to each other and are unchanged by the EXTEND. The only effect of the EXTEND is that PL_curstackinfo->si_stack_hwm gets bumped from 0 to 1.

The only thing SETTARG does is​:

(*sp = targ);

which on Itanium is the single instruction​:

st4 [r51] = targ

and that's where it blows up, flagging the value of sp as an address that can't be written to. But it can be​: I can readily execute a deposit command in the debugger that does the same thing as that statement and it works just fine. So I think there's something else hosed up and sp (in this case register 51) is just the last thing it knew about at the time of the exception.

I've now finally gotten back to this and the fix is easy​:

Inline Patch
--- pp_hot.c;-0 2017-11-04 06:24:52 -0500
+++ pp_hot.c 2017-11-09 20:41:41 -0600
@@ -391,7 +391,7 @@ PP(pp_multiconcat)
     UNOP_AUX_item *aux;      /* PL_op->op_aux buffer */
     UNOP_AUX_item *const_lens; /* the segment length array part of aux */
     const char *const_pv;    /* the current segment of the const string buf */
-    UV nargs;                /* how many args were expected */
+    IV nargs;                /* how many args were expected; IV
enables negation */   UV stack\_adj; /\* how much to adjust SP on return \*/   STRLEN grow; /\* final size of destination string \(dsv\) \*/   UV targ\_count; /\* how many times targ has appeared on the RHS \*/

The rationale for that fix is a bit harder, but before I get into
that, is there any reason *not* to make nargs an IV? Or some other
signed type like SSize_t? It's not like we'll ever be able to handle
more than 2147483647 arguments!

Where things go wrong for the C compiler is at the line​:

  SP -= (nargs - 1);

when the unsigned nargs is zero.

The C++ compiler, which on OpenVMS I64 is a completely unrelated
compiler with a back end licensed from Intel, does ok with that line
but falls down on​:

  targ = SP[-nargs];

In both of these cases, the target of the assignment is a 32-bit
pointer sitting in the bottom half of a 64-bit register. The upper
half of that register should be all zeroes but a negative computation
involving the unsigned nargs causes data to be written to that upper
half. When SETTARG tries to reference the contents of that register
as an address, we get an access violation because the full 64-bit
value is not a valid address. I misread the error message I
originally posted​:

%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual
address=FFFFFFFF017DBAA4, PC=00000000004C5650, PS=0000001B

because 0x017DBAA4 is the correct address of sp and I ignored the
FFFFFFFF because I thought it didn't matter, sp being a 32-bit
pointer. But the SETTARG treats the contents of the entire register
as an address, and that its a bogus address.

I assume we are running aground on some interpretation of C99 6.2.5,
paragraph 9, which says, "A computation involving unsigned operands
can never overflow, because a result that cannot be represented by the
resulting unsigned integer type is reduced modulo the number that is
one greater than the largest value that can be represented by the
resulting unsigned integer type." A helpful Stack Overflow article
quotes K&R as saying, "The negative of an unsigned quantity is
computed by subtracting the promoted value from the largest value of
the promoted type and adding one."[1]

I'm having trouble wrapping my head around how you can represent a
value that is one larger than the largest value you can represent.
But "don't do negative computations with unsigned operands" is my pea
brain version of the above and I will make Perl_pp_multiconcat follow
that advice in a day or two if no one objects.

[1] https://stackoverflow.com/questions/8026694/c-unary-minus-operator-behavior-with-unsigned-operands

@p5pRT
Copy link
Author

p5pRT commented Nov 10, 2017

From @khwilliamson

On 11/10/2017 07​:59 AM, Craig A. Berry wrote​:

In both of these cases, the target of the assignment is a 32-bit
pointer sitting in the bottom half of a 64-bit register

ilmari found some bugs in multiconcat in regard to this, and is smoking
a fix at

https://perl5.git.perl.org/perl.git/shortlog/refs/heads/smoke-me/ilmari/multiconcat-fix

@p5pRT
Copy link
Author

p5pRT commented Nov 10, 2017

From @craigberry

On Nov 10, 2017, at 9​:08 AM, karl williamson via RT <perlbug-followup@​perl.org> wrote​:

On 11/10/2017 07​:59 AM, Craig A. Berry wrote​:

In both of these cases, the target of the assignment is a 32-bit
pointer sitting in the bottom half of a 64-bit register

ilmari found some bugs in multiconcat in regard to this, and is smoking
a fix at

https://perl5.git.perl.org/perl.git/shortlog/refs/heads/smoke-me/ilmari/multiconcat-fix

Thanks, but VMS is little endian and sizeof(UV) and sizeof(size_t) are both 4 bytes in the configurations relevant to this ticket, so I think that's actually an unrelated problem.

@p5pRT
Copy link
Author

p5pRT commented Nov 10, 2017

From @mauke

Am 10.11.2017 um 15​:59 schrieb Craig A. Berry​:

I assume we are running aground on some interpretation of C99 6.2.5,
paragraph 9, which says, "A computation involving unsigned operands
can never overflow, because a result that cannot be represented by the
resulting unsigned integer type is reduced modulo the number that is
one greater than the largest value that can be represented by the
resulting unsigned integer type." A helpful Stack Overflow article
quotes K&R as saying, "The negative of an unsigned quantity is
computed by subtracting the promoted value from the largest value of
the promoted type and adding one."[1]

I'm having trouble wrapping my head around how you can represent a
value that is one larger than the largest value you can represent.

[1] https://stackoverflow.com/questions/8026694/c-unary-minus-operator-behavior-with-unsigned-operands

That's easy​: We're talking about "the largest value that can be
represented *by the resulting unsigned integer type*". The number one
greater than that does not need to be represented by that type. In fact,
it does not need to be represented at all; it exists just for the
purposes of defining the semantics.

Example​: Assume unsigned int is 16 bits wide and therefore UINT_MAX =
0xffff (65535). Then arithmetic with unsigned ints is done modulo
UINT_MAX+1 = 0x10000 (65536). The computer never needs to represent this
value explicitly because every value modulo 65536 is less than 65536
(i.e. fits in the range 0 .. UINT_MAX). The spec says you do every
arithmetic operation with infinite precision and then reduce the result
modulo UINT_MAX+1, but in practice that just means carry bits past the
width of the type are ignored.

In other words, unsigned integer overflow is guaranteed to wrap around
from UINT_MAX to 0 (and vice versa).

(Signed integer overflow has undefined behavior. This does not mean you
get some number back, you just don't know what that number is. It means
literally anything can happen; it's semantically equivalent to
dereferencing an invalid pointer or calling a function whose declared
type doesn't match its definition.)

This has been today's episode of​: Fun with Standard C.

--
Lukas Mai <plokinom@​gmail.com>

@p5pRT
Copy link
Author

p5pRT commented Nov 13, 2017

From @iabyn

On Fri, Nov 10, 2017 at 08​:59​:56AM -0600, Craig A. Berry wrote​:

I've now finally gotten back to this and the fix is easy​:

--- pp_hot.c;-0 2017-11-04 06​:24​:52 -0500
+++ pp_hot.c 2017-11-09 20​:41​:41 -0600
@​@​ -391,7 +391,7 @​@​ PP(pp_multiconcat)
UNOP_AUX_item *aux; /* PL_op->op_aux buffer */
UNOP_AUX_item *const_lens; /* the segment length array part of aux */
const char *const_pv; /* the current segment of the const string buf */
- UV nargs; /* how many args were expected */
+ IV nargs; /* how many args were expected; IV

I've now pushed a more general fix​:

commit ca84e88
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Mon Nov 13 11​:50​:14 2017 +0000
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Mon Nov 13 12​:23​:24 2017 +0000

  change OP_MULTICONCAT nargs from UV to SSize_t
 
  Change it from unsigned to unsigned since it makes the SP-adjusting code
  in pp_multiconcat easier without hitting undefined behaviour (RT #132390);
  and change its size from UV to SSize_t since it represents the number
  of args on the stack.

--
Art is anything that has a label (especially if the label is "untitled 1")

@p5pRT
Copy link
Author

p5pRT commented Nov 13, 2017

From @craigberry

On Nov 13, 2017, at 06​:37 AM, Dave Mitchell <davem@​iabyn.com> wrote​:

On Fri, Nov 10, 2017 at 08​:59​:56AM -0600, Craig A. Berry wrote​:

I've now finally gotten back to this and the fix is easy​:

Inline Patch
--- pp_hot.c;-0 2017-11-04 06:24:52 -0500
+++ pp_hot.c 2017-11-09 20:41:41 -0600
@@ -391,7 +391,7 @@ PP(pp_multiconcat)
UNOP_AUX_item *aux; /* PL_op->op_aux buffer */ UNOP\_AUX\_item \*const\_lens; /\* the segment length array part of aux \*/ const char \*const\_pv; /\* the current segment of the const string buf \*/ \- UV nargs; /\* how many args were expected \*/ \+ IV nargs; /\* how many args were expected; IV

I've now pushed a more general fix​:

commit ca84e88
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Mon Nov 13 11​:50​:14 2017 +0000
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Mon Nov 13 12​:23​:24 2017 +0000

change OP_MULTICONCAT nargs from UV to SSize_t

Change it from unsigned to unsigned since it makes the SP-adjusting code
in pp_multiconcat easier without hitting undefined behaviour (RT #132390);
and change its size from UV to SSize_t since it represents the number
of args on the stack.

Thanks.  That got things building again and I think this ticket can be closed.

@p5pRT
Copy link
Author

p5pRT commented Nov 13, 2017

@iabyn - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant