Invalid handshake, static build, 5.21.6+, 32bit Linux, no threads, no debug #14624

Closed
p5pRT opened this issue Mar 28, 2015 · 9 comments

Comments

@p5pRT

p5pRT commented Mar 28, 2015

Migrated from rt.perl.org#124201 (status was 'rejected')

Searchable as RT124201$

@p5pRT
Author

p5pRT commented Mar 28, 2015

From zzgrim@gmail.com

With a perl built statically with the aid of Marc Lehmann's staticperl
( https://metacpan.org/pod/distribution/App-Staticperl/bin/staticperl ),
no debug, no threads, loading an XS module fails:

$ ./perl -MDigest::xxHash -E1
xxHash.c: loadable library and perl binaries are mismatched (got handshake key 0x7bc0000, needed 0x7cc0000)

This problem appeared between 5.21.5 and 5.21.6 and still exists in 5.21.10
(verified by building every release from 5.21.1 to 5.21.10).
I can only reproduce it on 32bit Linux, with different gccs (>= 4.7 - Debian wheezy) and clangs.
The problem does not manifest on 64bit.
I also could not reproduce it on 32bit FreeBSD 10.1 with either gcc or clang.

I made a repo for easier reproducible builds on a few different VMs.

https://github.com/zgrim/perlbug-handshake.git

e.g.:
git clone https://github.com/zgrim/perlbug-handshake.git pbtmp.git
cd pbtmp.git
make
./pvm5.21.10 -MDigest::xxHash -E1 && echo ok

Build flags are specified in .staticperlrc.
The perl version to be built can be specified in the environment variable PV,
e.g.: PV=5.21.5 make

HTH

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl 5.21.10:

Configured by zgrim at Tue Mar 24 14:54:20 EET 2015.

Summary of my perl5 (revision 5 version 21 subversion 10) configuration:

  Platform:
    osname=linux, osvers=3.19.0-1-arch, archname=x86_64-linux
    uname='linux salmoxis 3.19.0-1-arch #1 smp preempt mon feb 9 07:08:20 cet 2015 x86_64 gnulinux '
    config_args='-de -Dprefix=/home/zgrim/perl5/perlbrew/perls/perl-5.21.10 -Dusedevel -Aeval:scriptdir=/home/zgrim/perl5/perlbrew/perls/perl-5.21.10/bin'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2',
    optimize='-O2',
    cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion='', gccversion='4.9.2 20150304 (prerelease)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/include-fixed /usr/lib /lib/../lib /usr/lib/../lib /lib /lib64 /usr/lib64 /usr/local/lib64
    libs=-lpthread -lnsl -lnm -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lpthread -lnsl -lnm -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.21.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.21'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'



@INC for perl 5.21.10:
    /home/zgrim/perl5/perlbrew/perls/perl-5.21.10/lib/site_perl/5.21.10/x86_64-linux
    /home/zgrim/perl5/perlbrew/perls/perl-5.21.10/lib/site_perl/5.21.10
    /home/zgrim/perl5/perlbrew/perls/perl-5.21.10/lib/5.21.10/x86_64-linux
    /home/zgrim/perl5/perlbrew/perls/perl-5.21.10/lib/5.21.10
    .


Environment for perl 5.21.10:
    HOME=/home/zgrim
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LANG_ALL=en_US.UTF-8
    LC_ALL=en_US.UTF-8
    LD_LIBRARY_PATH=/home/zgrim/apps/svm/lib:/home/zgrim/apps/svm/lib:
    LOGDIR (unset)
    PATH=/home/zgrim/bin:/home/zgrim/perl5/perlbrew/perls/perl-5.21.10/bin:/home/zgrim/go/bin:/home/zgrim/apps/golang/bin:/usr/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/usr/lib/plan9/bin
    PERLBREW_BASHRC_VERSION=0.73
    PERLBREW_HOME=/home/zgrim/.perlbrew
    PERLBREW_MANPATH=/home/zgrim/perl5/perlbrew/perls/perl-5.21.10/man
    PERLBREW_PATH=/home/zgrim/perl5/perlbrew/bin:/home/zgrim/perl5/perlbrew/perls/perl-5.21.10/bin
    PERLBREW_PERL=perl-5.21.10
    PERLBREW_ROOT=/home/zgrim/perl5/perlbrew
    PERLBREW_VERSION=0.73
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Author

p5pRT commented Mar 29, 2015

From @doughera88

Could you please include the output of

  ./perl -V

in the directory where that test fails? (the perlbug output included
with the ticket was from a different perl.) Also, is it possible to
reproduce the problem without the staticperl tool? The fewer
intermediary tools there are to deal with, the easier it is to
isolate the underlying problem.

Thanks,

--
  Andy Dougherty doughera@​lafayette.edu

@p5pRT
Author

p5pRT commented Mar 29, 2015

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Author

p5pRT commented Mar 29, 2015

From @bulk88


"(got handshake key 0x7bc0000, needed 0x7cc0000)" means that in your XS module the definition of perl's master interpreter struct is smaller than the interpreter core expects (the XS module says it is 0x7bc bytes long, the interpreter core says it is 0x7cc bytes long). The layouts of the structs are not compatible. They must be the same, otherwise memory corruption occurs.

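To make the arithmetic concrete, here is a minimal Perl sketch that pulls the struct size out of the two keys in the error message. It assumes, as the reading above does, that the size occupies the upper 16 bits of the key; the helper name interp_size_from_key is made up for this example and is not part of perl's API.

```perl
use strict;
use warnings;

# Assumption: the interpreter struct size sits in the upper 16 bits of the
# 32-bit handshake key (hypothetical helper, for illustration only).
sub interp_size_from_key {
    my ($key) = @_;
    return ($key >> 16) & 0xFFFF;
}

my $got    = interp_size_from_key(0x7bc0000);   # size the XS module was compiled with
my $needed = interp_size_from_key(0x7cc0000);   # size the perl binary expects

printf "XS module: 0x%x bytes, perl core: 0x%x bytes, difference: %d bytes\n",
    $got, $needed, $needed - $got;
```

With the keys from this report, the two sizes differ by 16 bytes, which is enough for every field after the divergence point to sit at a different offset.
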
This is probably not a bug in Perl but in your non-p5p code. The error message means that a .so or .o is not being executed by the same perl (same config.h, same -Ds on the command line, same Config.pm, same perl 5.X._, where X must match) that it was compiled against. If it worked in the past, you were lucky it didn't SEGV. Unlike pure-Perl modules, XS module .so files cannot be copied between different perl builds; they have to be rebuilt from source (with the one small exception of perl maintenance releases, as long as they have identical input to Configure/config.h/Config.pm).

Briefly looking at your repo,

https://github.com/zgrim/perlbug-handshake/blob/master/staticperl

has


# perl build variables
MAKE=make
PERL_VERSION=5.12.4 # 5.8.9 is also a good choice
PERL_CC=cc
PERL_CONFIGURE="" # additional Configure arguments
PERL_CCFLAGS="-g -DPERL_DISABLE_PMC -DPERL_ARENA_SIZE=16376 -DNO_PERL_MALLOC_ENV -D_GNU_SOURCE -DNDEBUG"
PERL_OPTIMIZE="-Os" # -Os -ffunction-sections -fdata-sections -finline-limit=8 -ffast-math"
ARCH="$(uname -m)"


but

https://github.com/zgrim/perlbug-handshake/blob/master/.staticperlrc

which shows a different perl build


BUILD_DEFAULT="5.21.10"
PERL_VERSION="${PV:-$BUILD_DEFAULT}"
PERL_CONFIGURE="$PERL_CONFIGURE -Dusedevel -Duserelocatableinc -Dinc_version_list=none -Dman1dir=none -Dman3dir=none -Uversiononly"
PERL_MM_USE_DEFAULT=1
PERL_CCFLAGS="-DPERL_DISABLE_PMC -DPERL_ARENA_SIZE=16376 -DNO_PERL_MALLOC_ENV -D_GNU_SOURCE -DNDEBUG -D_REENTRANT -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DPERL_COPY_ON_WRITE -UDEBUGGING -DPL_OP_SLAB_ALLOC -DNO_MATHOMS -DSILENT_NO_TAINT_SUPPORT"
PERL_OPTIMIZE="-O2 -ffunction-sections -fdata-sections -fno-strict-aliasing -pipe -fomit-frame-pointer -finline-limit=8 -mpush-args -mno-inline-stringops-dynamically -mno-align-stringops"


Two different perls with different -Ds. I don't use Unix, the Unix shell language, or the staticperl script, and this ticket is beyond my knowledge, but I am the author of that error message. I got sick of getting SEGVs when I switch which version of the perl interpreter is in my PATH, forget to do "make clean/perl Makefile.PL/make test", and instead just do "make test", which instantly SEGVs in the XS module I am working on. Also, when I upgrade my blead perl, I don't delete everything, out of laziness. I would much rather get an error message saying that things are out of sync and the XS module needs rebuilding than a SEGV that is meaningless without a C debugger.

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT
Author

p5pRT commented Mar 29, 2015

From @bulk88

Your repo also has https://github.com/zgrim/perlbug-handshake/blob/master/ext/Digest-xxHash-1.02/Makefile.PL

Digest::xxHash on CPAN is Module::Build based: it has a Build.PL and no Makefile.PL ( https://metacpan.org/source/SANKO/Digest-xxHash-1.02 ). Inside this custom Makefile.PL I see


#! perl
use ExtUtils::MakeMaker;
WriteMakefile(
    NAME => 'Digest::xxHash',
    AUTHOR => 'Sanko Robinson',
    VERSION_FROM => 'lib/Digest/xxHash.pm',
    ABSTRACT => q/Digest::xxHash/,
    MIN_PERL_VERSION => 5.10.1,
    LINKTYPE => 'static',
    CCFLAGS => '-Isrc',
    OBJECT => q/$(O_FILES)/,
);


The "CCFLAGS => '-Isrc'," line wipes all the -Ds. Normally you have to append to CCFLAGS, not replace it and throw away all the -Ds that used to be in it.
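
As a rough illustration of what gets lost, the sketch below compares two flag strings and lists the -D/-U options that are present in perl's flags but missing from the module's. The helper missing_defines is hypothetical, and the perl flag string is only a subset of the PERL_CCFLAGS from the .staticperlrc quoted earlier in this thread.

```perl
use strict;
use warnings;

# Return the -D/-U options found in the first flag string but not in the second.
# (Hypothetical helper, for illustration only.)
sub missing_defines {
    my ($perl_flags, $module_flags) = @_;
    my %seen = map { $_ => 1 } grep { /^-[DU]/ } split ' ', $module_flags;
    return grep { /^-[DU]/ && !$seen{$_} } split ' ', $perl_flags;
}

# A few of the flags perl was built with, versus the module's CCFLAGS.
my $perl_flags   = '-DNDEBUG -D_REENTRANT -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64';
my $module_flags = '-Isrc';

print "dropped: $_\n" for missing_defines($perl_flags, $module_flags);
```

Any define reported here that changes a type size or a struct layout can make the XS module disagree with the interpreter it is linked into.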

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT
Author

p5pRT commented Mar 30, 2015

From zzgrim@gmail.com

On 2015-03-29 01​:19​:51 -0700, bulk88 via RT wrote​:

the "CCFLAGS => '-Isrc'," wipes all the -Ds. You have to normally append to CCFLAGS, not replace it and throw away all the -Ds that used to be in it.

Hello.
Thank you, you caught it right away: the problem really was a CFLAGS mismatch
between perl and the XS module. Changing Makefile.PL in the xxHash module to use

  CCFLAGS => $Config{ccflags},
  INC => '-Isrc',

makes the issue go away.
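
Put together, the corrected Makefile.PL might look like the sketch below. It is the custom Makefile.PL quoted earlier in the thread with the two changed lines, plus a `use Config;`, which the snippet needs for %Config to be available. The quoted MIN_PERL_VERSION spelling is an assumption (the numeric string form of 5.10.1), not something taken from the repo.

```perl
#! perl
use strict;
use warnings;
use Config;                 # exports %Config, including the ccflags perl was built with
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME             => 'Digest::xxHash',
    AUTHOR           => 'Sanko Robinson',
    VERSION_FROM     => 'lib/Digest/xxHash.pm',
    ABSTRACT         => q/Digest::xxHash/,
    MIN_PERL_VERSION => '5.010001',          # assumed spelling of 5.10.1
    LINKTYPE         => 'static',
    CCFLAGS          => $Config{ccflags},    # keep the -D flags perl itself was built with
    INC              => '-Isrc',             # include paths go here, not in CCFLAGS
    OBJECT           => q/$(O_FILES)/,
);
```

As far as I know, ExtUtils::MakeMaker falls back to $Config{ccflags} when CCFLAGS is not given at all, so dropping the CCFLAGS line entirely should have the same effect.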

The reason I had to do a custom Makefile.PL with ExtUtils::MakeMaker is that I was
unable to convince Module::Build to work with static perls and gave up at some
point.

I've been using this whole "static" setup for 4+ years, with all perls from
5.10.1 through 5.20, because it makes mass deployments much easier. Luckily it
never segfaulted, and wow, it has quite a few such cflags mismatches.

But I have to say, although raising the handshake exception seems like a good
strategy, it may be incomplete: it only seems to catch the mismatch on some
platforms (32bit Linux, among the environments I tested).

Another side effect is that lazily required XS code, statically built with
mismatching flags, now dies at runtime (in rare codepaths that used to work by
luck, as in my case).

Could the enforcement be toggled at compile-time ?

@p5pRT
Copy link
Author

p5pRT commented Aug 5, 2015

From @bulk88

You can't mix XS modules (.so/.dll/.o files) between different perl versions or build configs, except for maint releases of perl. What happens when the perl interpreter thinks a scalar's floating point number is 8 bytes long, and an XS module thinks it is 10 bytes long (long double build)? If it worked before, it was by chance: you didn't run any XS code where the meaning of bits in a bitfield changed enough to have observable side effects. Also, since Perl 5.13.4 (basically 5.14.0), perl version numbers have been checked to make sure they match between XS modules and the interpreter. The handshake code extends that compatibility check even further. If you are using the same .so or .o files between different interpreters, how did you deal with "Perl API version %s of %s does not match %s" failures from 5.14 to 5.20?

The handshake check is done once, only when you use/require an XS module. I don't think loading an XS module is a rare thing to happen. Grep your private code base for every use()/require() line and write a test that use()/require()s every XS module in your software stack. That way you find out whether you have handshake failures before a lazily loaded module is require()ed in a rare PP branch.
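
A minimal sketch of such a load test follows. The helper load_failures is hypothetical, and the module list is a placeholder made of core XS modules so the script runs anywhere; substitute the XS modules your own stack uses.

```perl
use strict;
use warnings;

# Try to load each module; return a hash of module => error for the failures.
# (Hypothetical helper, for illustration only.)
sub load_failures {
    my (@modules) = @_;
    my %failed;
    for my $module (@modules) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        # require boots the XS code, which is when a handshake mismatch is reported
        eval { require $file; 1 } or $failed{$module} = $@ || 'unknown error';
    }
    return %failed;
}

# Placeholder list (core XS modules); replace with the XS modules of your stack.
my @xs_modules = qw(List::Util POSIX Digest::MD5);

my %failed = load_failures(@xs_modules);
print "$_: $failed{$_}" for sort keys %failed;
my $count = keys %failed;
print $count ? "$count XS module(s) failed to load\n" : "all XS modules loaded\n";
```

Running this once per build, against the perl binary that will actually be deployed, surfaces a mismatch at build time rather than in a rarely taken code path.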

Could the enforcement be toggled at compile-time ?

You are choosing the quick and wrong way to fix the problem. Perl is open source software, so nobody can stop you from removing the handshake checks from the perl interpreter's C code in util.c, but if you do that, it is very wrong to later ask for support about why perl is SEGVing, causing "panic:" errors, or showing bugs that cannot be reproduced on other systems.

I think this bug is heading towards closed or rejected; there is nothing for p5p to change here.

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT
Author

p5pRT commented Feb 5, 2017

From @jkeenan

Since there has been no further discussion in this RT in a year-and-a-half, I will implement bulk88's suggestion and close this ticket.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT p5pRT closed this as completed Feb 5, 2017
@p5pRT
Author

p5pRT commented Feb 5, 2017

@jkeenan - Status changed from 'open' to 'rejected'
