Thread race in dist/IO/IO.xs #14816

p5pRT · 2015-07-24T23:11:32Z

Migrated from rt.perl.org#125685 (status was 'open')

Searchable as RT125685$

p5pRT · 2015-07-24T23:11:32Z

From william@25thandClement.com

Created by william@25thandClement.com

This is a bug report for perl from william@25thandClement.com,
generated with the help of perlbug 1.40 running under perl 5.20.1.

-----------------------------------------------------------------
The _create_getline_subs routine in dist/IO/IO.xs manipulates PL_check in a
thread-unsafe manner. It's called when loading IO::Handle, at line 437 of
dist/IO/lib/IO/Handle.pm.

In our case this manifested as Perl interpreters in other threads randomly
throwing warnings about undefined constants. This appears to be a regression
starting with 5.16 and still exists in 5.22 and 5.23.1. Apache mod_perl
seems to be immune because they clone their interpreters, and presumably
IO::Handle is always loaded into the original.

Our workaround was to preload IO::Handle when initializing interpreters
while holding a global writer lock, and use a reader lock when executing
Perl code.

I'm not sure what a proper fix is. Simply using wrap_op_checker, or
otherwise taking a mutex from _create_getline_subs, is insufficient because
the interposition is intended to be temporary. I don't understand what the
function is trying to achieve, either.

Perl Info


Flags:
    category=library
    severity=high
    module=IO::Handle

Site configuration information for perl 5.20.1:

Configured by root at Thu Jan  1  0:00:00 UTC 1970.

Summary of my perl5 (revision 5 version 20 subversion 1) configuration:
   
  Platform:
    osname=openbsd, osvers=5.7, archname=amd64-openbsd
    uname='openbsd'
    config_args='-dsE -Dopenbsd_distribution=defined -Dccflags=-DNO_LOCALE_NUMERIC -DNO_LOCALE_COLLATE -Dmksymlinks'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-DNO_LOCALE_NUMERIC -DNO_LOCALE_COLLATE -fno-strict-aliasing -fno-delete-null-pointer-checks -pipe -fstack-protector -I/usr/local/include',
    optimize='-O2',
    cppflags='-DNO_LOCALE_NUMERIC -DNO_LOCALE_COLLATE -fno-strict-aliasing -fno-delete-null-pointer-checks -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.2.1 20070719 ', gccosandvers='openbsd5.7'
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags ='-Wl,-E  -fstack-protector -L/usr/local/lib'
    libpth=/usr/lib /usr/lib
    libs=-lm -lutil -lc
    perllibs=-lm -lutil -lc
    libc=/usr/lib/libc.a, so=so, useshrplib=true, libperl=libperl.so.17.0
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-R/usr/libdata/perl5/amd64-openbsd/5.20.1/CORE'
    cccdlflags='-DPIC -fPIC ', lddlflags='-shared -fPIC  -fstack-protector -L/usr/local/lib'



@INC for perl 5.20.1:
    /usr/local/libdata/perl5/site_perl/amd64-openbsd
    /usr/libdata/perl5/site_perl/amd64-openbsd
    /usr/local/libdata/perl5/site_perl
    /usr/libdata/perl5/site_perl
    /usr/libdata/perl5/amd64-openbsd/5.20.1
    /usr/local/libdata/perl5/amd64-openbsd/5.20.1
    /usr/libdata/perl5
    /usr/local/libdata/perl5
    .


Environment for perl 5.20.1:
    HOME=/home/william
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/usr/games:.
    PERL_BADLANG (unset)
    SHELL=/bin/ksh

p5pRT · 2017-01-05T20:24:26Z

From @jkeenan

On Fri, 24 Jul 2015 23:11:32 GMT, william@25thandClement.com wrote:

This is a bug report for perl from william@25thandClement.com,
generated with the help of perlbug 1.40 running under perl 5.20.1.

-----------------------------------------------------------------
The _create_getline_subs routine in dist/IO/IO.xs manipulates PL_check
in a
thread-unsafe manner. It's called when loading IO::Handle, at line 437
of
dist/IO/lib/IO/Handle.pm.

In our case this manifested as Perl interpreters in other threads
randomly
throwing warnings about undefined constants. This appears to be a
regression
starting with 5.16 and still exists in 5.22 and 5.23.1. Apache
mod_perl
seems to be immune because they clone their interpreters, and
presumably
IO::Handle is always loaded into the original.

Our workaround was to preload IO::Handle when initializing
interpreters
while holding a global writer lock, and use a reader lock when
executing
Perl code.

I'm not sure what a proper fix is. Simply using wrap_op_checker, or
otherwise taking a mutex from _create_getline_subs, is insufficient
because
the interposition is intended to be temporary. I don't understand what
the
function is trying to achieve, either.

1. Would it be possible to supply a small program demonstrating this problem?

2. Given the line you're pointing to in the source code and the point at which the problem first emerged, I suspect the problem began with:

#####
commit 986a805
Author: Father Chrysostomos <sprout@cpan.org>
Date: Sat Sep 17 12:09:22 2011 -0700

Make IO::Handle::getline(s) respect the open pragma
...
#####

But we'd need a reproducible failure to investigate further.

3. Can you supply the output of 'perl -V' from the system where you observed this problem? (The one supplied with the original post came from a non-threaded perl, which I suspect was not where the problem occured.)

Thank you very much.

--
James E Keenan (jkeenan@cpan.org)

p5pRT · 2017-01-05T20:24:26Z

The RT System itself - Status changed from 'new' to 'open'

p5pRT · 2017-01-05T21:23:50Z

From @dur-randir

On Thu, 05 Jan 2017 12:24:26 -0800, jkeenan wrote:

1. Would it be possible to supply a small program demonstrating this
problem?
use threads;

while (1) {
push @foo, threads->create(sub {
require IO::Handle;
});

$_->detach for(splice @foo);
}

This should output nothing, but is really noisy instead.

p5pRT · 2017-01-05T22:22:30Z

From @jkeenan

On Thu, 05 Jan 2017 21:23:50 GMT, randir wrote:

On Thu, 05 Jan 2017 12:24:26 -0800, jkeenan wrote:

1. Would it be possible to supply a small program demonstrating this
problem?
use threads;

while (1) {
push @foo, threads->create(sub {
require IO::Handle;
});
$\_\->detach for$splice @&#8203;foo$;
}

This should output nothing, but is really noisy instead.

Thanks. I can confirm that 986a805 is the problematic commit.

#####
$ ./perl -v

This is perl 5, version 15, subversion 2 (v5.15.2-464-g7d167fe) built for x86_64-linux-thread-multi

$ ./perl -Ilib -c ~/learn/perl/p5p/125685-threads.pl
/home/jkeenan/learn/perl/p5p/125685-threads.pl syntax OK

$ ./perl -Ilib ~/learn/perl/p5p/125685-threads.pl
[# no output]
^C

#####
$ ./perl -v

This is perl 5, version 15, subversion 2 (v5.15.2-465-g986a805) built for x86_64-linux-thread-multi

$ ./perl -Ilib -c ~/learn/perl/p5p/125685-threads.pl
/home/jkeenan/learn/perl/p5p/125685-threads.pl syntax OK

$ ./perl -Ilib ~/learn/perl/p5p/125685-threads.pl
[# very noisy output]
^C
#####

Father C: any thoughts?

--
James E Keenan (jkeenan@cpan.org)

p5pRT · 2017-01-26T00:05:09Z

From @hvds

On Thu, 05 Jan 2017 14:22:30 -0800, jkeenan wrote:

I can confirm that 986a805
is the problematic commit.

I took a quick look through, and the manipulation of PL_check[] looked thread-unsafe; checking a bit further I find in perlvars.h:
For thread safety, modules should not write directly to this array.
Instead, use the function L</wrap_op_checker>.

I don't see mention of a function to remove the inserted checker again though, I guess that's rather harder to do safely.

Hugo

p5pRT · 2019-09-20T16:09:16Z

From @niner

Hi,
When embedding perl in a multi threaded program, loading IO::Handle
concurrently in multiple threads (each with their own interpreter) will cause
compilation errors in other threads and in unrelated modules. The errors all
look like missing symbols that should be there, like undefined subroutines or
methods or imported subs not there causing errors like this:
String found where operator expected at /usr/lib/perl5/5.28.1/SelectSaver.pm
line 42, near "croak 'usage: SelectSaver->new( [FILEHANDLE] )'" (Do you need
to predeclare croak?) syntax error at /usr/lib/perl5/5.28.1/SelectSaver.pm
line 42, near "croak 'usage: SelectSaver->new( [FILEHANDLE] )'"

The errors can be reproduced using the attached C program which I compiled
with: gcc -Wall testthreads.c -Wl,-E -Wl,-rpath,/usr/lib/perl5/5.28.1/x86_64-
linux-thread-multi/CORE -L/usr/local/lib64 -fstack-protector-strong -L/
usr/lib/perl5/5.28.1/x86_64-linux-thread-multi/CORE -lperl -lm -ldl -lcrypt -
lpthread -D_REENTRANT -D_GNU_SOURCE -DPERL_USE_SAFE_PUTENV -fwrapv -fno-
strict-aliasing -pipe -fstack-protector-strong -D_LARGEFILE_SOURCE -
D_FILE_OFFSET_BITS=64 -I/usr/lib/perl5/5.28.1/x86_64-linux-thread-multi/CORE
-o testthreads -fPIC -g testthreads.c
--
Stefan

p5pRT · 2019-09-20T16:09:16Z

From @niner

#include <EXTERN.h>
#include <perl.h>
#include <XSUB.h>
#include <pthread.h>
#include <unistd.h>

static void xs_init (pTHX);
EXTERN_C void boot_DynaLoader (pTHX_ CV* cv);
EXTERN_C void boot_Socket (pTHX_ CV* cv);

EXTERN_C void xs_init(pTHX) {
    char *file = __FILE__;
    newXS("DynaLoader::boot_DynaLoader", boot_DynaLoader, file);
}

PerlInterpreter *p5_init_perl(int argc, char *argv[]) {
    PerlInterpreter *my_perl = perl_alloc();
    PERL_SET_CONTEXT(my_perl);
    PL_perl_destruct_level = 1;
    perl_construct( my_perl );
    perl_parse(my_perl, xs_init, argc, argv, NULL);
    PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
    perl_run(my_perl);

    return my_perl;
}

void p5_destruct_perl(PerlInterpreter *my_perl) {
    PERL_SET_CONTEXT(my_perl);
    PL_perl_destruct_level = 1;

    POPSTACK_TO(PL_mainstack);
    dounwind(-1);
    LEAVE_SCOPE(0);

    perl_destruct(my_perl);
    perl_free(my_perl);
}

void p5_eval_pv(PerlInterpreter *my_perl, const char* p, I32 croak_on_error) {
    PERL_SET_CONTEXT(my_perl);
    {
        dSP;
        ENTER;
        SAVETMPS;
        PUSHMARK(SP);

        eval_pv(p, croak_on_error);

        SPAGAIN;
        PUTBACK;
        FREETMPS;
        LEAVE;
    }
}

typedef struct {
    int argc;
    char **argv;
} args;

void *thread_code(void *data) {
    PerlInterpreter *my_perl = p5_init_perl(((args*)data)->argc, ((args*)data)->argv);
    p5_eval_pv(my_perl, "use IO::Handle; 1", 1);
    p5_destruct_perl(my_perl);
    return NULL;
}

#define thread_count 60

int main(int argc, char *argv[]) {
    args a = {argc, argv};
    PERL_SYS_INIT(&argc, &argv);

    pthread_t thread_id[thread_count];

    for(int i = 0; i < thread_count; i++) {
        int err = pthread_create(&thread_id[i], NULL, thread_code, &a);
        if (err != 0) abort();
    }

    for(int i = 0; i < thread_count; i++) {
        void *result;
        int err = pthread_join(thread_id[i], &result);
        if (err != 0) abort();
    }

    PERL_SYS_TERM();

    return 0;
}

p5pRT · 2019-09-21T10:22:28Z

From @niner

Removing this code from IO/Handle.pm makes the issue go away:

# Special XS wrapper to make them inherit lexical hints from the caller.
_create_getline_subs( <<'END' ) or die $@;
sub getline {
@_ == 1 or croak 'usage: $io->getline()';
my $this = shift;
return scalar <$this>;
}

sub getlines {
@_ == 1 or croak 'usage: $io->getlines()';
wantarray or
croak 'Can\'t call $io->getlines in a scalar context, use $io->getline';
my $this = shift;
return <$this>;
}
1; # return true for error checking
END

*gets = \&getline; # deprecated

p5pRT · 2019-09-21T10:22:28Z

From [Unknown Contact. See original ticket]

Removing this code from IO/Handle.pm makes the issue go away:

# Special XS wrapper to make them inherit lexical hints from the caller.
_create_getline_subs( <<'END' ) or die $@;
sub getline {
@_ == 1 or croak 'usage: $io->getline()';
my $this = shift;
return scalar <$this>;
}

sub getlines {
@_ == 1 or croak 'usage: $io->getlines()';
wantarray or
croak 'Can\'t call $io->getlines in a scalar context, use $io->getline';
my $this = shift;
return <$this>;
}
1; # return true for error checking
END

*gets = \&getline; # deprecated

p5pRT · 2019-09-21T12:38:36Z

From @niner

The issue is actually a regression caused by https://perl5.git.perl.org/perl.git/commit/986a805c4b2

_create_getline_subs is not thread safe as it assigns to PL_check directly:
PL_check[OP_LINESEQ] = io_ck_lineseq;
https://perl5.git.perl.org/perl.git/blob/HEAD:/dist/IO/IO.xs#l565

According to perlapi(1) "PL_check" is global to an entire process and must not be written to directly. Instead "wrap_op_checker" should be used.

But still, since PL_check is global that would affect concurrent compilation and I don't think io_ck_lineseq is prepared for that. It looks like it's really tailored for compilation of getline and getlines as it does not check in any way if it should apply its modifications to the OP tree.

p5pRT · 2019-09-21T12:38:36Z

From [Unknown Contact. See original ticket]

The issue is actually a regression caused by https://perl5.git.perl.org/perl.git/commit/986a805c4b2

_create_getline_subs is not thread safe as it assigns to PL_check directly:
PL_check[OP_LINESEQ] = io_ck_lineseq;
https://perl5.git.perl.org/perl.git/blob/HEAD:/dist/IO/IO.xs#l565

According to perlapi(1) "PL_check" is global to an entire process and must not be written to directly. Instead "wrap_op_checker" should be used.

But still, since PL_check is global that would affect concurrent compilation and I don't think io_ck_lineseq is prepared for that. It looks like it's really tailored for compilation of getline and getlines as it does not check in any way if it should apply its modifications to the OP tree.

p5pRT · 2019-09-22T15:42:00Z

From @jkeenan

On Fri, 20 Sep 2019 16:09:16 GMT, ss@atikon.com wrote:

Hi,
When embedding perl in a multi threaded program, loading IO::Handle
concurrently in multiple threads (each with their own interpreter)
will cause
compilation errors in other threads and in unrelated modules. The
errors all
look like missing symbols that should be there, like undefined
subroutines or
methods or imported subs not there causing errors like this:
String found where operator expected at
/usr/lib/perl5/5.28.1/SelectSaver.pm
line 42, near "croak 'usage: SelectSaver->new( [FILEHANDLE] )'" (Do
you need
to predeclare croak?) syntax error at
/usr/lib/perl5/5.28.1/SelectSaver.pm
line 42, near "croak 'usage: SelectSaver->new( [FILEHANDLE] )'"

The errors can be reproduced using the attached C program which I
compiled
with: gcc -Wall testthreads.c -Wl,-E -Wl,-
rpath,/usr/lib/perl5/5.28.1/x86_64-
linux-thread-multi/CORE -L/usr/local/lib64 -fstack-protector-strong
-L/
usr/lib/perl5/5.28.1/x86_64-linux-thread-multi/CORE -lperl -lm -ldl
-lcrypt -
lpthread -D_REENTRANT -D_GNU_SOURCE -DPERL_USE_SAFE_PUTENV -fwrapv
-fno-
strict-aliasing -pipe -fstack-protector-strong -D_LARGEFILE_SOURCE -
D_FILE_OFFSET_BITS=64 -I/usr/lib/perl5/5.28.1/x86_64-linux-thread-
multi/CORE
-o testthreads -fPIC -g testthreads.c

I built a threaded perl at blead and then tried to compile your test program. The compilation failed.

#####
$ gcc -Wall 134444-testthreads.c -Wl,-E -Wl,-rpath,/home/jkeenan/testing/threaded_blead/lib/perl5/5.31.5/x86_64-linux-thread-multi/CORE \

-L/usr/local/lib64 -fstack-protector-strong \
-L/home/jkeenan/testing/threaded_blead/lib/perl5/5.31.5/x86_64-linux-thread-multi/CORE -lperl -lm -ldl \
-lcrypt -lpthread -D_REENTRANT -D_GNU_SOURCE -DPERL_USE_SAFE_PUTENV -fwrapv -fno-strict-aliasing \
-pipe -fstack-protector-strong -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 \
-I/home/jkeenan/testing/threaded_blead/lib/perl5/5.31.5/x86_64-linux-thread-multi/CORE \
-o testthreads -fPIC -g 134444-testthreads.c
/tmp/ccCfdXuT.o: In function `p5_init_perl':
/home/jkeenan/learn/perl/p5p/134444-testthreads.c:16: multiple definition of `p5_init_perl'
/tmp/ccQrNdU4.o:/home/jkeenan/learn/perl/p5p/134444-testthreads.c:16: first defined here
/tmp/ccCfdXuT.o: In function `p5_destruct_perl':
/home/jkeenan/learn/perl/p5p/134444-testthreads.c:28: multiple definition of `p5_destruct_perl'
/tmp/ccQrNdU4.o:/home/jkeenan/learn/perl/p5p/134444-testthreads.c:28: first defined here
/tmp/ccCfdXuT.o: In function `p5_eval_pv':
/home/jkeenan/learn/perl/p5p/134444-testthreads.c:40: multiple definition of `p5_eval_pv'
/tmp/ccQrNdU4.o:/home/jkeenan/learn/perl/p5p/134444-testthreads.c:40: first defined here
/tmp/ccCfdXuT.o: In function `thread_code':
/home/jkeenan/learn/perl/p5p/134444-testthreads.c:62: multiple definition of `thread_code'
/tmp/ccQrNdU4.o:/home/jkeenan/learn/perl/p5p/134444-testthreads.c:62: first defined here
/tmp/ccCfdXuT.o: In function `main':
/home/jkeenan/learn/perl/p5p/134444-testthreads.c:71: multiple definition of `main'
/tmp/ccQrNdU4.o:/home/jkeenan/learn/perl/p5p/134444-testthreads.c:71: first defined here
collect2: error: ld returned 1 exit status
#####

Can you spot the problem?

Thank you very much.
--
James E Keenan (jkeenan@cpan.org)

p5pRT · 2019-09-22T15:42:00Z

The RT System itself - Status changed from 'new' to 'open'

p5pRT · 2019-09-22T17:24:48Z

From @niner

On Sun, 22 Sep 2019 08:42:00 -0700, jkeenan wrote:

$ gcc -Wall 134444-testthreads.c -Wl,-E -Wl,-
...
-o testthreads -fPIC -g 134444-testthreads.c

Can you spot the problem?

Indeed! You gave it the name of the source file twice.

I have since written a patch that seems to fix my problem. It moves PL_check to the interpreter struct which should avoid any threading related issues altogether. It's attached.

p5pRT · 2019-09-22T17:24:48Z

From @niner

PL_check-to-interp.diff

commit ab49dea882469093cd1bce8aa4ddadd76f99f15d
Author: Stefan Seifert <nine@detonation.org>
Date:   Sun Sep 22 11:09:29 2019 +0200

    Move PL_check to the interp vars to fix threading issues

diff --git a/embedvar.h b/embedvar.h
index 35cf8f2191..33a3fbd72c 100644
--- a/embedvar.h
+++ b/embedvar.h
@@ -71,6 +71,7 @@
 #define PL_body_roots		(vTHX->Ibody_roots)
 #define PL_bodytarget		(vTHX->Ibodytarget)
 #define PL_breakable_sub_gen	(vTHX->Ibreakable_sub_gen)
+#define PL_check		(vTHX->Icheck)
 #define PL_checkav		(vTHX->Icheckav)
 #define PL_checkav_save		(vTHX->Icheckav_save)
 #define PL_chopset		(vTHX->Ichopset)
@@ -385,8 +386,6 @@
 #define PL_GXPosix_ptrs		(my_vars->GXPosix_ptrs)
 #define PL_appctx		(my_vars->Gappctx)
 #define PL_Gappctx		(my_vars->Gappctx)
-#define PL_check		(my_vars->Gcheck)
-#define PL_Gcheck		(my_vars->Gcheck)
 #define PL_check_mutex		(my_vars->Gcheck_mutex)
 #define PL_Gcheck_mutex		(my_vars->Gcheck_mutex)
 #define PL_csighandlerp		(my_vars->Gcsighandlerp)
diff --git a/globvar.sym b/globvar.sym
index dcc65f2e29..1642c88060 100644
--- a/globvar.sym
+++ b/globvar.sym
@@ -10,7 +10,6 @@ PL_bitcount
 PL_block_type
 PL_c9_utf8_dfa_tab
 PL_charclass
-PL_check
 PL_core_reg_engine
 PL_extended_utf8_dfa_tab
 PL_fold
diff --git a/intrpvar.h b/intrpvar.h
index 47383ae818..11790b7a34 100644
--- a/intrpvar.h
+++ b/intrpvar.h
@@ -496,6 +496,7 @@ PERLVAR(I, endav,	AV *)		/* names of END subroutines */
 PERLVAR(I, unitcheckav,	AV *)		/* names of UNITCHECK subroutines */
 PERLVAR(I, checkav,	AV *)		/* names of CHECK subroutines */
 PERLVAR(I, initav,	AV *)		/* names of INIT subroutines */
+PERLVAR(I, check,	Perl_check_t *) /* functions to call during CHECK phase */
 
 /* subprocess state */
 PERLVAR(I, fdpid,	AV *)		/* keep fd-to-pid mappings for my_popen */
diff --git a/opcode.h b/opcode.h
index ba3bd9e668..23ac225f5f 100644
--- a/opcode.h
+++ b/opcode.h
@@ -1371,15 +1371,8 @@ EXT Perl_ppaddr_t PL_ppaddr[] /* or perlvars.h */
 ;
 #endif
 
-#ifdef PERL_GLOBAL_STRUCT_INIT
-#  define PERL_CHECK_INITED
+#ifdef PERL_IN_PERL_C
 static const Perl_check_t Gcheck[]
-#elif !defined(PERL_GLOBAL_STRUCT)
-#  define PERL_CHECK_INITED
-EXT Perl_check_t PL_check[] /* or perlvars.h */
-#endif
-#if (defined(DOINIT) && !defined(PERL_GLOBAL_STRUCT)) || defined(PERL_GLOBAL_STRUCT_INIT)
-#  define PERL_CHECK_INITED
 = {
 	Perl_ck_null,		/* null */
 	Perl_ck_null,		/* stub */
@@ -1778,11 +1771,8 @@ EXT Perl_check_t PL_check[] /* or perlvars.h */
 	Perl_ck_null,		/* lvrefslice */
 	Perl_ck_null,		/* lvavref */
 	Perl_ck_null,		/* anonconst */
-}
+};
 #endif
-#ifdef PERL_CHECK_INITED
-;
-#endif /* #ifdef PERL_CHECK_INITED */
 
 #ifndef PERL_GLOBAL_STRUCT_INIT
 
diff --git a/perl.c b/perl.c
index 5176da0db1..38f6cea24b 100644
--- a/perl.c
+++ b/perl.c
@@ -452,6 +452,15 @@ perl_construct(pTHXx)
 #ifdef USE_POSIX_2008_LOCALE
     PL_C_locale_obj = newlocale(LC_ALL_MASK, "C", NULL);
 #endif
+    {
+        const IV ncheck  = C_ARRAY_LENGTH(Gcheck);
+        PL_check  =
+            (Perl_check_t*)
+            PerlMem_malloc(ncheck  * sizeof(Perl_check_t));
+        if (!PL_check)
+            exit(1);
+        Copy(Gcheck,  PL_check,  ncheck,  Perl_check_t);
+    }
 
     ENTER;
     init_i18nl10n(1);
diff --git a/perlapi.h b/perlapi.h
index 4cfbafecdf..aee6468896 100644
--- a/perlapi.h
+++ b/perlapi.h
@@ -137,8 +137,6 @@ END_EXTERN_C
 #define PL_XPosix_ptrs		(*Perl_GXPosix_ptrs_ptr(NULL))
 #undef  PL_appctx
 #define PL_appctx		(*Perl_Gappctx_ptr(NULL))
-#undef  PL_check
-#define PL_check		(*Perl_Gcheck_ptr(NULL))
 #undef  PL_check_mutex
 #define PL_check_mutex		(*Perl_Gcheck_mutex_ptr(NULL))
 #undef  PL_csighandlerp
diff --git a/perlvars.h b/perlvars.h
index df5b2f8c64..bc16f2967d 100644
--- a/perlvars.h
+++ b/perlvars.h
@@ -150,7 +150,6 @@ PERLVAR(G, check_mutex,	perl_mutex)	/* Mutex for PL_check */
 #endif
 #ifdef PERL_GLOBAL_STRUCT 
 PERLVAR(G, ppaddr,	Perl_ppaddr_t *) /* or opcode.h */
-PERLVAR(G, check,	Perl_check_t *) /* or opcode.h */
 PERLVARA(G, fold_locale, 256, unsigned char) /* or perl.h */
 #endif
 
diff --git a/regen/opcode.pl b/regen/opcode.pl
index 672f55c368..44541a742d 100755
--- a/regen/opcode.pl
+++ b/regen/opcode.pl
@@ -1061,15 +1061,8 @@ print $oc <<'END';
 ;
 #endif
 
-#ifdef PERL_GLOBAL_STRUCT_INIT
-#  define PERL_CHECK_INITED
+#ifdef PERL_IN_PERL_C
 static const Perl_check_t Gcheck[]
-#elif !defined(PERL_GLOBAL_STRUCT)
-#  define PERL_CHECK_INITED
-EXT Perl_check_t PL_check[] /* or perlvars.h */
-#endif
-#if (defined(DOINIT) && !defined(PERL_GLOBAL_STRUCT)) || defined(PERL_GLOBAL_STRUCT_INIT)
-#  define PERL_CHECK_INITED
 = {
 END
 
@@ -1078,11 +1071,8 @@ for (@ops) {
 }
 
 print $oc <<'END';
-}
+};
 #endif
-#ifdef PERL_CHECK_INITED
-;
-#endif /* #ifdef PERL_CHECK_INITED */
 
 #ifndef PERL_GLOBAL_STRUCT_INIT
 
diff --git a/sv.c b/sv.c
index e088e5c419..8fc24fd7d3 100644
--- a/sv.c
+++ b/sv.c
@@ -15580,6 +15580,14 @@ perl_clone_using(PerlInterpreter *proto_perl, UV flags,
     PL_globalstash	= hv_dup(proto_perl->Iglobalstash, param);
     PL_curstname	= sv_dup_inc(proto_perl->Icurstname, param);
 
+    /* Add PL_check here */
+    PL_check  =
+        (Perl_check_t*)
+        PerlMem_malloc(PL_maxo  * sizeof(Perl_check_t));
+    if (!PL_check)
+        exit(1);
+    Copy(proto_perl->Icheck,  PL_check,  PL_maxo,  Perl_check_t);
+
     PL_beginav		= av_dup_inc(proto_perl->Ibeginav, param);
     PL_beginav_save	= av_dup_inc(proto_perl->Ibeginav_save, param);
     PL_checkav_save	= av_dup_inc(proto_perl->Icheckav_save, param);
diff --git a/util.c b/util.c
index 376cc8ab0f..0f0edcee97 100644
--- a/util.c
+++ b/util.c
@@ -4630,7 +4630,6 @@ Perl_init_global_struct(pTHX)
     struct perl_vars *plvarsp = NULL;
 # ifdef PERL_GLOBAL_STRUCT
     const IV nppaddr = C_ARRAY_LENGTH(Gppaddr);
-    const IV ncheck  = C_ARRAY_LENGTH(Gcheck);
     PERL_UNUSED_CONTEXT;
 #  ifdef PERL_GLOBAL_STRUCT_PRIVATE
     /* PerlMem_malloc() because can't use even safesysmalloc() this early. */
@@ -4659,13 +4658,7 @@ Perl_init_global_struct(pTHX)
 	PerlMem_malloc(nppaddr * sizeof(Perl_ppaddr_t));
     if (!plvarsp->Gppaddr)
         exit(1);
-    plvarsp->Gcheck  =
-	(Perl_check_t*)
-	PerlMem_malloc(ncheck  * sizeof(Perl_check_t));
-    if (!plvarsp->Gcheck)
-        exit(1);
     Copy(Gppaddr, plvarsp->Gppaddr, nppaddr, Perl_ppaddr_t); 
-    Copy(Gcheck,  plvarsp->Gcheck,  ncheck,  Perl_check_t); 
 #  endif
 #  ifdef PERL_SET_VARS
     PERL_SET_VARS(plvarsp);

p5pRT · 2019-09-23T11:10:26Z

From @nwc10

On Sun, 22 Sep 2019 10:24:48 -0700, nine@detonation.org wrote:

I have since written a patch that seems to fix my problem. It moves
PL_check to the interpreter struct which should avoid any threading
related issues altogether. It's attached.

As best I can work out from staring at history (for a while), PL_check was created in 2000 around the time all the non-static variables were renamed to start PL_*, and it seems to have been accident/oversight that it ended up as "global", not "per-interpreter".

Logically, yes, I think that it does need to be per interpreter.

But, what's somewhat perturbing me about your patch are two implications of this:

Inline Patch

diff --git a/perl.c b/perl.c
index 5176da0db1..38f6cea24b 100644
--- a/perl.c
+++ b/perl.c
@@ -452,6 +452,15 @@ perl_construct(pTHXx)
 #ifdef USE_POSIX_2008_LOCALE
     PL_C_locale_obj = newlocale(LC_ALL_MASK, "C", NULL);
 #endif
+    {
+        const IV ncheck  = C_ARRAY_LENGTH(Gcheck);
+        PL_check  =
+            (Perl_check_t*)
+            PerlMem_malloc(ncheck  * sizeof(Perl_check_t));
+        if (!PL_check)
+            exit(1);
+        Copy( ,  PL_check,  ncheck,  Perl_check_t);
+    }
 
     ENTER;
     init_i18nl10n(1);

First - I think that this as implemented is effectively a leak - there needs to be a corresponding free in perl_destruct

Second - should we really need to unconditionally allocate a block of memory hanging off the interpreter struct which is always present? Surely it should simply *be* part of the interpreter struct

At which point it becomes obvious that we're about to add 397 pointers to the structure, which I think is 3K on a 64-bit system. I'm not *sure* if this is a problem. It only really hurts things *using* threads (er, MULTIPLICITY), on the assumption that the copy source (Gcheck, if I read it correctly) is part of the read-only constant section of the executable (on any sane OS), so it will cause 3K more read/write memory to be allocated, but once startup is done, the CPU cache will now simply be looking in a different place for the values, and the original Gcheck will go cold.

Nicholas Clark

p5pRT · 2019-09-24T13:08:04Z

From @dur-randir

Btw, this is a duplicate of https://rt-archive.perl.org/perl5/Ticket/Display.html?id=125685. I'm glad that it's got picked up.

p5pRT · 2019-09-24T13:17:18Z

From @jkeenan

Merging into existing ticket per analysis by Sergey.
--
James E Keenan (jkeenan@cpan.org)

Fixes issue Perl#14816

niner · 2019-10-30T15:50:54Z

Thanks for the review!

From @nwc10

Second - should we really need to unconditionally allocate a block of memory hanging off the interpreter struct which is always present? Surely it should simply be part of the interpreter struct

That's....so obvious now that you say it. This fixes the memory leak and also simplifies the code quite a bit.
I've created a PR with the updated patch:
#17240

From @nwc10

At which point it becomes obvious that we're about to add 397 pointers to the structure, which I think is 3K on a 64-bit system. I'm not sure if this is a problem. It only really hurts things using threads (er, MULTIPLICITY), on the assumption that the copy source (Gcheck, if I read it correctly) is part of the read-only constant section of the executable (on any sane OS), so it will cause 3K more read/write memory to be allocated, but once startup is done, the CPU cache will now simply be looking in a different place for the values, and the original Gcheck will go cold.

I dare say correctness trumps speed. The issue has been reported twice at least, so it's clear that it's hurting real users. 3K memory is not nothing but also not huge. If it turns out to be an actual performance issue, we can try to find a better solution.

jkeenan · 2019-10-30T19:11:49Z

Thanks for the review!

From @nwc10

Second - should we really need to unconditionally allocate a block of memory hanging off the interpreter struct which is always present? Surely it should simply be part of the interpreter struct

That's....so obvious now that you say it. This fixes the memory leak and also simplifies the code quite a bit.
I've created a PR with the updated patch:
#17240

Shouldn't the pull request contain some new tests to demonstrate that the problem raised by the original poster has been resolved?

From @nwc10

At which point it becomes obvious that we're about to add 397 pointers to the structure, which I think is 3K on a 64-bit system. I'm not sure if this is a problem. It only really hurts things using threads (er, MULTIPLICITY), on the assumption that the copy source (Gcheck, if I read it correctly) is part of the read-only constant section of the executable (on any sane OS), so it will cause 3K more read/write memory to be allocated, but once startup is done, the CPU cache will now simply be looking in a different place for the values, and the original Gcheck will go cold.

I dare say correctness trumps speed. The issue has been reported twice at least, so it's clear that it's hurting real users. 3K memory is not nothing but also not huge. If it turns out to be an actual performance issue, we can try to find a better solution.

Thank you very much.
Jim Keenan

Fixes issue Perl#14816

niner · 2019-10-30T19:37:56Z

On Mittwoch, 30. Oktober 2019 20:11:57 CET James E Keenan wrote: Shouldn't the pull request contain some new tests to demonstrate that the problem raised by the original poster has been resolved?

Good point! I updated the pull request to contain a test file based on the code example posted by the original reporter of this issue. Since the bug is a race condition, the test does not fail 100 % reliably on an unpatched perl, but it does fail often enough so a regression could not go unnoticed for long.

jkeenan · 2019-10-30T20:50:04Z

On Mittwoch, 30. Oktober 2019 20:11:57 CET James E Keenan wrote: Shouldn't the pull request contain some new tests to demonstrate that the problem raised by the original poster has been resolved?
Good point! I updated the pull request to contain a test file based on the code example posted by the original reporter of this issue. Since the bug is a race condition, the test does not fail 100 % reliably on an unpatched perl, but it does fail often enough so a regression could not go unnoticed for long.

I created a smoke-me branch from your patch:
smoke-me/jkeenan/niner/gh-14816/niner-blead

However, I am getting test failures in io/handle.t.
See: https://travis-ci.org/Perl/perl5/jobs/605164560#L3370.

Also, the MANIFEST will have to be updated.

Thank you very much.
Jim Keenan

jkeenan · 2019-10-30T21:34:31Z

On Mittwoch, 30. Oktober 2019 20:11:57 CET James E Keenan wrote: Shouldn't the pull request contain some new tests to demonstrate that the problem raised by the original poster has been resolved?
Good point! I updated the pull request to contain a test file based on the code example posted by the original reporter of this issue. Since the bug is a race condition, the test does not fail 100 % reliably on an unpatched perl, but it does fail often enough so a regression could not go unnoticed for long.

I created a smoke-me branch from your patch:
smoke-me/jkeenan/niner/gh-14816/niner-blead

However, I am getting test failures in io/handle.t.
See: https://travis-ci.org/Perl/perl5/jobs/605164560#L3370.

Also, the MANIFEST will have to be updated.

Thank you very much.
Jim Keenan

The test failures in t/io/handle.t were due to the absence of a skip_all for unthreaded builds. In the smoke-me/jkeenan/niner/gh-14816/niner-blead branch I have corrected that and updated MANIFEST.

Thank you very much.
Jim Keenan

Fixes issue #14816

…4816. Re-implement getline() and getlines() as XS code. The underlying problem that we're trying to solve here is making getline() and getlines() in IO::Handle respect the open pragma. That bug was first addressed in Sept 2011 by commit 986a805: Make IO::Handle::getline(s) respect the open pragma However, that fix introduced a more subtle bug, hence this reworking. Including the entirety of the rest of that commit message because it explains both the bug the previous approach: See <https://rt.cpan.org/Ticket/Display.html?id=66474>. Also, this came up in <https://rt.perl.org/rt3/Ticket/Display.html?id=92728>. The <> operator, when reading from the magic ARGV handle, automatic- ally opens the next file. Layers set by the lexical open pragma are applied, if they are in scope at the point where <> is used. This works almost all the time, because the common convention is: use open ":utf8"; while(<>) { ... } IO::Handle’s getline and getlines methods are Perl subroutines that call <> themselves. But that happens within the scope of IO/Handle.pm, so the caller’s I/O layer settings are ignored. That means that these two expressions are not equivalent within in a ‘use open’ scope: <> *ARGV->getline The latter will open the next file with no layers applied. This commit solves that by putting PL_check hooks in place in IO::Handle before compiling the getline and getlines subroutines. Those hooks cause every state op (nextstate, or dbstate under the debugger) to have a custom pp function that saves the previous value of PL_curcop, calls the default pp function, and then restores PL_curcop. That means that getline and getlines run with the caller’s compile- time hints. Another way to see it is that getline and getlines’s own lexical hints are never activated. (A state op carries all the lexical pragmata. Every statement has one. When any op executes, it’s ‘pp’ function is called. pp_nextstate and pp_dbstate both set PL_curcop to the op itself. Any code that checks hints looks at PL_curcop, which contains the current run-time hints.) The problem with this approach is that the (current) design and implementation of PL_check hooks is actually not threadsafe. There's one array (as a global), which is used by all interpreters in the process. But as the code added to IO.xs demonstrates, realistically it needs to be possible to change the hook just for this interpreter. GH #14816 has a fix for that bug for blead. However, it will be tricky (to impossible) to backport to earlier perl versions. Hence it's also worthwhile to change IO.xs to use a different approach to solve the original bug. As described above, the bug is fixed by having the readline OP (that implements getline() and getlines()) see the caller's lexical state, not their "own". Unlike Perl subroutines, XS subroutines don't have any lexical hints of their own. getline() and getlines() are very simple, mostly parameter checking, ending with a one line that maps to a single core OP, whose values are directly returned. Hence "all" we need to do re-implement the Perl code as XS. This might look easy, but turns out to be trickier than expected. There isn't any API to be called for the OP in question, pp_readline(). The body of the OP inspects interpreter state, it directly calls pp_rv2gv() which also inspects state, and then it tail calls Perl_do_readline(), which inspects state. The easiest approach seems to be to set up enough state, and then call pp_readline() directly. This leaves us very tightly coupled to the internals, but so do all other approaches to try to tackle this bug. The current implementation of PL_check (and possibly other arrays) still needs to be addressed.

…rl#14816. Re-implement getline() and getlines() as XS code. The underlying problem that we're trying to solve here is making getline() and getlines() in IO::Handle respect the open pragma. That bug was first addressed in Sept 2011 by commit 986a805: Make IO::Handle::getline(s) respect the open pragma However, that fix introduced a more subtle bug, hence this reworking. Including the entirety of the rest of that commit message because it explains both the bug the previous approach: See <https://rt.cpan.org/Ticket/Display.html?id=66474>. Also, this came up in <https://rt.perl.org/rt3/Ticket/Display.html?id=92728>. The <> operator, when reading from the magic ARGV handle, automatic- ally opens the next file. Layers set by the lexical open pragma are applied, if they are in scope at the point where <> is used. This works almost all the time, because the common convention is: use open ":utf8"; while(<>) { ... } IO::Handle’s getline and getlines methods are Perl subroutines that call <> themselves. But that happens within the scope of IO/Handle.pm, so the caller’s I/O layer settings are ignored. That means that these two expressions are not equivalent within in a ‘use open’ scope: <> *ARGV->getline The latter will open the next file with no layers applied. This commit solves that by putting PL_check hooks in place in IO::Handle before compiling the getline and getlines subroutines. Those hooks cause every state op (nextstate, or dbstate under the debugger) to have a custom pp function that saves the previous value of PL_curcop, calls the default pp function, and then restores PL_curcop. That means that getline and getlines run with the caller’s compile- time hints. Another way to see it is that getline and getlines’s own lexical hints are never activated. (A state op carries all the lexical pragmata. Every statement has one. When any op executes, it’s ‘pp’ function is called. pp_nextstate and pp_dbstate both set PL_curcop to the op itself. Any code that checks hints looks at PL_curcop, which contains the current run-time hints.) The problem with this approach is that the (current) design and implementation of PL_check hooks is actually not threadsafe. There's one array (as a global), which is used by all interpreters in the process. But as the code added to IO.xs demonstrates, realistically it needs to be possible to change the hook just for this interpreter. GH Perl#14816 has a fix for that bug for blead. However, it will be tricky (to impossible) to backport to earlier perl versions. Hence it's also worthwhile to change IO.xs to use a different approach to solve the original bug. As described above, the bug is fixed by having the readline OP (that implements getline() and getlines()) see the caller's lexical state, not their "own". Unlike Perl subroutines, XS subroutines don't have any lexical hints of their own. getline() and getlines() are very simple, mostly parameter checking, ending with a one line that maps to a single core OP, whose values are directly returned. Hence "all" we need to do re-implement the Perl code as XS. This might look easy, but turns out to be trickier than expected. There isn't any API to be called for the OP in question, pp_readline(). The body of the OP inspects interpreter state, it directly calls pp_rv2gv() which also inspects state, and then it tail calls Perl_do_readline(), which inspects state. The easiest approach seems to be to set up enough state, and then call pp_readline() directly. This leaves us very tightly coupled to the internals, but so do all other approaches to try to tackle this bug. The current implementation of PL_check (and possibly other arrays) still needs to be addressed.

Fixes issue Perl#14816

Fixes issue #14816

tonycoz · 2019-12-12T00:41:02Z

should be fixed by #17351

nwc10 · 2019-12-12T12:22:55Z

On Wed, Dec 11, 2019 at 04:41:03PM -0800, Tony Cook wrote: should be fixed by #17351

(OK let's see how well the "reply to this e-mail" thing works... Not sure *where* the discussion on "this" really should be. I'd not tried to write "this" message that follows, as I didn't yet have a clear view on what the right solution is... "So this patch is still needed to fix the thread-safety issue." agree. But what I'd also realised was that, in turn, fixing the whole design mistake of PL_check being "global" *then* breaks everything that uses wrap_op_checker(). To quote the full wrap_op_checker docs: =for apidoc wrap_op_checker Puts a C function into the chain of check functions for a specified op type. This is the preferred way to manipulate the L</PL_check> array. C<opcode> specifies which type of op is to be affected. C<new_checker> is a pointer to the C function that is to be added to that opcode's check chain, and C<old_checker_p> points to the storage location where a pointer to the next function in the chain will be stored. The value of C<new_checker> is written into the L</PL_check> array, while the value previously stored there is written to C<*old_checker_p>. L</PL_check> is global to an entire process, and a module wishing to hook op checking may find itself invoked more than once per process, typically in different threads. To handle that situation, this function is idempotent. The location C<*old_checker_p> must initially (once per process) contain a null pointer. A C variable of static duration (declared at file scope, typically also marked C<static> to give it internal linkage) will be implicitly initialised appropriately, if it does not have an explicit initialiser. This function will only actually modify the check chain if it finds C<*old_checker_p> to be null. This function is also thread safe on the small scale. It uses appropriate locking to avoid race conditions in accessing L</PL_check>. When this function is called, the function referenced by C<new_checker> must be ready to be called, except for C<*old_checker_p> being unfilled. In a threading situation, C<new_checker> may be called immediately, even before this function has returned. C<*old_checker_p> will always be appropriately set before C<new_checker> is called. If C<new_checker> decides not to do anything special with an op that it is given (which is the usual case for most uses of op check hooking), it must chain the check function referenced by C<*old_checker_p>. Taken all together, XS code to hook an op checker should typically look something like this: static Perl_check_t nxck_frob; static OP *myck_frob(pTHX_ OP *op) { ... op = nxck_frob(aTHX_ op); ... return op; } BOOT: wrap_op_checker(OP_FROB, myck_frob, &nxck_frob); If you want to influence compilation of calls to a specific subroutine, then use L</cv_set_call_checker_flags> rather than hooking checking of all C<entersub> ops. =cut So, as-was (for an actually process global PL_check) that design works, even if the XS module is loaded independently in two child threads (the shared object is loaded once, and the static storage exists once) 1) first (i)thread to call wrap_op_checker() changes PL_check and the static variable (via *old_checker_p) 2) second (i)thread to call wrap_op_checker() does nothing, because *old_checker_p is not-NULL and both ithreads see PL_check changed to the value that they "want". What was have now is 1) first (i)thread to call wrap_op_checker() changes *that* interpreter's PL_check and the static variable (via *old_checker_p) 2) second (i)thread to call wrap_op_checker() does nothing, because *old_checker_p is not-NULL and hence the second ithread *doesn't* see PL_check changed. And I can't see how to "fix" this API in any way that actually works with the existing XS code. 1) I *think* that it needs to die. 2) I'm not sure what should replace it (maybe no API is needed. There's no *thread* race here now...) So sadly it looks like we need to fix all of CPAN... http://grep.cpan.me/?q=wrap_op_checker+-file%3Apport.h 14 references, 2 which are false positives, leaving code from ZEFRAM, ETHER, VPIT, RANDIR, ATOOMIC. Hence a flag-day change is possible here. I'm just not sure what design we want to put on our flag. Nicholas Clark

tonycoz · 2019-12-12T22:22:33Z

On Wed, Dec 11, 2019 at 04:41:03PM -0800, Tony Cook wrote: should be fixed by #17351
What was have now is

first (i)thread to call wrap_op_checker() changes that interpreter's PL_check and the static variable (via *old_checker_p)

second (i)thread to call wrap_op_checker() does nothing, because *old_checker_p is not-NULL and hence the second ithread doesn't see PL_check changed. And I can't see how to "fix" this API in any way that actually works with the existing XS code.

I hadn't considered the API, I guess this issue is a consequence of having modules manage the hook chaining rather than having perl manage it itself.

If we implemented our own chaining maybe wrap_op_checker() could work like:

wrap_op_checker() scans the current list of functions for this op, and if it isn't present adds it at the end.
wrap_op_checker() always sets *old_checker_p to point to a stub that indicates the next item in the chain should be called

The chain handling code may need to distinguish between checkers added with wrap_op_checker() and otherwise, we might use an interpreter variable to distinguish between call-next and don't-call-next for wrap_op_checker() hooks, but a return value might be more natural and require less management of the interpreter variable for newer style hooks.

An advantage of maintaining the list of checkers in core is we could have an API to remove a checker too, so the code that produced this report could add the checker and remove it more safely.[1]

But that's probably all too complicated.

Given Christmas/NY and the contentious changes freeze in January it might be difficult to get this into core for 5.32.

So I can see us reverting the PL_check change and just including your IO changes (which aren't merged just yet.)

[1] it isn't an issue here, but what if that code compiled by _create_getline_subs() had installed another checker

…4816. Re-implement getline() and getlines() as XS code. The underlying problem that we're trying to solve here is making getline() and getlines() in IO::Handle respect the open pragma. That bug was first addressed in Sept 2011 by commit 986a805: Make IO::Handle::getline(s) respect the open pragma However, that fix introduced a more subtle bug, hence this reworking. Including the entirety of the rest of that commit message because it explains both the bug the previous approach: See <https://rt.cpan.org/Ticket/Display.html?id=66474>. Also, this came up in <https://rt.perl.org/rt3/Ticket/Display.html?id=92728>. The <> operator, when reading from the magic ARGV handle, automatic- ally opens the next file. Layers set by the lexical open pragma are applied, if they are in scope at the point where <> is used. This works almost all the time, because the common convention is: use open ":utf8"; while(<>) { ... } IO::Handle’s getline and getlines methods are Perl subroutines that call <> themselves. But that happens within the scope of IO/Handle.pm, so the caller’s I/O layer settings are ignored. That means that these two expressions are not equivalent within in a ‘use open’ scope: <> *ARGV->getline The latter will open the next file with no layers applied. This commit solves that by putting PL_check hooks in place in IO::Handle before compiling the getline and getlines subroutines. Those hooks cause every state op (nextstate, or dbstate under the debugger) to have a custom pp function that saves the previous value of PL_curcop, calls the default pp function, and then restores PL_curcop. That means that getline and getlines run with the caller’s compile- time hints. Another way to see it is that getline and getlines’s own lexical hints are never activated. (A state op carries all the lexical pragmata. Every statement has one. When any op executes, it’s ‘pp’ function is called. pp_nextstate and pp_dbstate both set PL_curcop to the op itself. Any code that checks hints looks at PL_curcop, which contains the current run-time hints.) The problem with this approach is that the (current) design and implementation of PL_check hooks is actually not threadsafe. There's one array (as a global), which is used by all interpreters in the process. But as the code added to IO.xs demonstrates, realistically it needs to be possible to change the hook just for this interpreter. GH #14816 has a fix for that bug for blead. However, it will be tricky (to impossible) to backport to earlier perl versions. Hence it's also worthwhile to change IO.xs to use a different approach to solve the original bug. As described above, the bug is fixed by having the readline OP (that implements getline() and getlines()) see the caller's lexical state, not their "own". Unlike Perl subroutines, XS subroutines don't have any lexical hints of their own. getline() and getlines() are very simple, mostly parameter checking, ending with a one line that maps to a single core OP, whose values are directly returned. Hence "all" we need to do re-implement the Perl code as XS. This might look easy, but turns out to be trickier than expected. There isn't any API to be called for the OP in question, pp_readline(). The body of the OP inspects interpreter state, it directly calls pp_rv2gv() which also inspects state, and then it tail calls Perl_do_readline(), which inspects state. The easiest approach seems to be to set up enough state, and then call pp_readline() directly. This leaves us very tightly coupled to the internals, but so do all other approaches to try to tackle this bug. The current implementation of PL_check (and possibly other arrays) still needs to be addressed.

tonycoz · 2020-05-07T04:18:28Z

I believe this was fixed by f172023.

jkeenan · 2020-05-07T12:12:47Z

I believe this was fixed by f172023.

So when I build a threaded perl at blead and re-run the program submitted back in 2017 by @dur-randir,

while (1) {
  push @foo, threads->create(sub {
  require IO::Handle;
  });

  $_->detach for(splice @foo);
}

I now get a silent infinite loop. Is that what I should be expecting?

If so, then the original complaint in this ticket has been addressed and the ticket can be closed.

Thank you very much.
Jim Keenan

dur-randir · 2020-05-07T22:47:08Z

The original issue here is module loading, and it looks good now. wrap_op_checker is a different concern that needs it's own ticket.

jkeenan · 2020-05-07T23:29:48Z

On 5/7/20 6:47 PM, Sergey Aleynikov wrote: The original issue here is module loading, and it looks good now. wrap_op_checker is a different concern that needs it's own ticket.

Would you be able to open such a new ticket? Thanks. jimk

dur-randir · 2020-05-08T00:03:59Z

Looks like i've skipped some bits here, and the merged solution doesn't involve changing PL_check (and hence avoid wrap_op_check problems). But nevertheless that commit says "The current implementation of PL_check (and possibly other arrays) still needs to be addressed.".

But i'm not sure what was implied - reinstantiate the original change early enough to have time to change api + modules? Different approach? It'd be better to ask Nicholas.

jkeenan · 2020-05-10T13:39:47Z

Looks like i've skipped some bits here, and the merged solution doesn't involve changing PL_check (and hence avoid wrap_op_check problems). But nevertheless that commit says "The current implementation of PL_check (and possibly other arrays) still needs to be addressed.".

But i'm not sure what was implied - reinstantiate the original change early enough to have time to change api + modules? Different approach? It'd be better to ask Nicholas.

@nwc10, can you comment on the above? Thanks.

tonycoz · 2020-05-26T05:59:05Z

Open #17806 to cover the remaining issue here.

p5pRT added Severity High distro-openbsd type-library labels Oct 19, 2019

niner added a commit to niner/perl5 that referenced this issue Oct 30, 2019

Move PL_check to the interp vars to fix threading issues

b5e8c40

Fixes issue Perl#14816

niner mentioned this issue Oct 30, 2019

Move PL_check to the interp vars to fix threading issues #17240

Closed

niner added a commit to niner/perl5 that referenced this issue Oct 30, 2019

Move PL_check to the interp vars to fix threading issues

e30e36f

Fixes issue Perl#14816

jkeenan pushed a commit that referenced this issue Nov 1, 2019

Move PL_check to the interp vars to fix threading issues

c51b451

Fixes issue #14816

tonycoz pushed a commit to tonycoz/perl5 that referenced this issue Dec 9, 2019

Move PL_check to the interp vars to fix threading issues

d7bf706

Fixes issue Perl#14816

tonycoz pushed a commit to tonycoz/perl5 that referenced this issue Dec 11, 2019

Move PL_check to the interp vars to fix threading issues

45755f1

Fixes issue Perl#14816

tonycoz pushed a commit that referenced this issue Dec 12, 2019

Move PL_check to the interp vars to fix threading issues

e6c7056

Fixes issue #14816

Leont mentioned this issue Dec 15, 2019

BBC: Blead Breaks indirect #17362

Closed

jkeenan added the dist-IO issues in the dual-life blead-first IO distribution label Feb 3, 2020

tonycoz added the Closable? We might be able to close this ticket, but we need to check with the reporter label May 7, 2020

tonycoz mentioned this issue May 26, 2020

wrap_op_checker()/PL_check #17806

Open

tonycoz closed this as completed May 26, 2020

Thread race in dist/IO/IO.xs #14816

Thread race in dist/IO/IO.xs #14816

Comments

p5pRT commented Jul 24, 2015

p5pRT commented Jul 24, 2015

From william@25thandClement.com

Created by william@25thandClement.com

p5pRT commented Jan 5, 2017

From @jkeenan

p5pRT commented Jan 5, 2017

p5pRT commented Jan 5, 2017

From @dur-randir

p5pRT commented Jan 5, 2017

From @jkeenan

p5pRT commented Jan 26, 2017

From @hvds

p5pRT commented Sep 20, 2019

From @niner

p5pRT commented Sep 20, 2019

From @niner

p5pRT commented Sep 21, 2019

From @niner

p5pRT commented Sep 21, 2019

From [Unknown Contact. See original ticket]

p5pRT commented Sep 21, 2019

From @niner

p5pRT commented Sep 21, 2019

From [Unknown Contact. See original ticket]

p5pRT commented Sep 22, 2019

From @jkeenan

p5pRT commented Sep 22, 2019

p5pRT commented Sep 22, 2019

From @niner

p5pRT commented Sep 22, 2019

From @niner

p5pRT commented Sep 23, 2019

From @nwc10

p5pRT commented Sep 24, 2019

From @dur-randir

p5pRT commented Sep 24, 2019

From @jkeenan

niner commented Oct 30, 2019

From @nwc10

From @nwc10

jkeenan commented Oct 30, 2019

From @nwc10

From @nwc10

niner commented Oct 30, 2019 via email

jkeenan commented Oct 30, 2019

jkeenan commented Oct 30, 2019

tonycoz commented Dec 12, 2019

nwc10 commented Dec 12, 2019 via email

tonycoz commented Dec 12, 2019

tonycoz commented May 7, 2020

jkeenan commented May 7, 2020

dur-randir commented May 7, 2020

jkeenan commented May 7, 2020 via email

dur-randir commented May 8, 2020

jkeenan commented May 10, 2020

tonycoz commented May 26, 2020