
Can buffered read() be smarter about chunk size? #14995

Open

p5pRT opened this issue Oct 19, 2015 · 33 comments

Comments


p5pRT commented Oct 19, 2015

Migrated from rt.perl.org#126403 (status was 'open')

Searchable as RT126403$


p5pRT commented Oct 19, 2015

From @FGasper

Created by @FGasper

Currently read() will always take in data in 8,192-byte chunks
(or whatever chunk size PerlIO is set to use). This makes sense
when reading line-by-line, but not for something like​:

read( $fh, $buffer, 30000 )

... because the buffered read will actually *cost* system calls
rather than saving them.

It would seem more sensible to read in some multiple of the normal chunk size
at once, e.g., a single read(2) system call of 32,768 bytes.

People can, of course, use sysread(), but then (I am told, have not
verified) we lose other niceties of PerlIO read() like trying again
after EINTR.

Is there anything gained from NOT doing a single read in the case of
wanting 30,000 bytes from a filehandle?
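
For illustration, a minimal way to observe the difference (a sketch, assuming
a Linux system with strace and a file of at least 30,000 bytes at the
hypothetical path /tmp/data; run the script under "strace -e trace=read"):

    # Buffered read(): PerlIO refills its 8,192-byte buffer repeatedly,
    # so 30,000 bytes typically costs four read(2) system calls.
    open my $fh, '<', '/tmp/data' or die $!;
    read( $fh, my $buffer, 30000 );

    # sysread() bypasses PerlIO buffering and issues a single read(2) call
    # (which may legitimately return fewer bytes than requested).
    open my $raw, '<', '/tmp/data' or die $!;
    sysread( $raw, my $raw_buf, 30000 );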

Perl Info

Flags:
    category=core
    severity=wishlist

This perlbug was built using Perl 5.10.1 in the Fedora build system.
It is being executed now by Perl 5.10.1 - Sun Nov  6 00:37:43 GMT 2011.

Site configuration information for perl 5.10.1:

Configured by Red Hat, Inc. at Sun Nov  6 00:37:43 GMT 2011.

Summary of my perl5 (revision 5 version 10 subversion 1) configuration:
   
  Platform:
    osname=linux, osvers=2.6.32-44.2.el6.x86_64, archname=x86_64-linux-thread-multi
    uname='linux c6b5.bsys.dev.centos.org 2.6.32-44.2.el6.x86_64 #1 smp wed jul 21 12:48:32 edt 2010 x86_64 x86_64 x86_64 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -DDEBUGGING=-g -Dversion=5.10.1 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl5 -Dsitearch=/usr/local/lib64/perl5 -Dprivlib=/usr/share/perl5 -Darchlib=/usr/lib64/perl5 -Dvendorlib=/usr/share/perl5/vendor_perl -Dvendorarch=/usr/lib64/perl5/vendor_perl -Dinc_version_list=5.10.0 -Darchname=x86_64-linux-thread-multi -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64 -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dscriptdir=/usr/bin'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.4.5 20110214 (Red Hat 4.4.5-6)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -fstack-protector'
    libpth=/usr/local/lib64 /lib64 /usr/lib64
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.12'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib64/perl5/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic'

Locally applied patches:
    


@INC for perl 5.10.1:
    /home/fgasper/perl5/lib/perl5
    /home/fgasper/perl5/lib/perl5/x86_64-linux-thread-multi
    /home/fgasper/perl/home/fgasper/perl5/lib/perl5/x86_64-linux-thread-multi
    /home/fgasper/perl/home/fgasper/perl5/lib/perl5
    /home/fgasper/perl/usr/local/lib64/perl5
    /home/fgasper/perl/usr/local/share/perl5
    /home/fgasper/perl/usr/lib64/perl5/vendor_perl
    /home/fgasper/perl/usr/share/perl5/vendor_perl
    /home/fgasper/perl/usr/lib64/perl5
    /home/fgasper/perl/usr/share/perl5
    /home/fgasper/perl5/lib/perl5/x86_64-linux-thread-multi
    /home/fgasper/perl5/lib/perl5
    /usr/local/lib64/perl5
    /usr/local/share/perl5
    /usr/lib64/perl5/vendor_perl
    /usr/share/perl5/vendor_perl
    /usr/lib64/perl5
    /usr/share/perl5
    .


Environment for perl 5.10.1:
    HOME=/home/fgasper
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/fgasper/perl5/bin:/usr/local/jdk/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11R6/bin:/opt/dell/srvadmin/bin:/usr/local/cpanel/3rdparty/bin:/home/fgasper/bin
    PERL5LIB=/home/fgasper/perl5/lib/perl5
    PERL_BADLANG (unset)
    PERL_LOCAL_LIB_ROOT=/home/fgasper/perl5
    PERL_MB_OPT=--install_base "/home/fgasper/perl5"
    PERL_MM_OPT=INSTALL_BASE=/home/fgasper/perl5
    SHELL=/bin/bash


p5pRT commented Oct 20, 2015

From chansen@cpan.org

On Mon Oct 19 15​:55​:54 2015, felipe@​felipegasper.com wrote​:

Currently read() will always take in data in 8,192-byte chunks
(or whatever chunk size PerlIO is set to use). This makes sense
when reading line-by-line, but not for something like​:

read( $fh, $buffer, 30000 )

You can bypass the PerlIO buffer layer (perlio) by specifying a lower
layer in the open MODE​:

  open my $fh, '<:unix', '/path/to/fh';
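
For instance (a sketch; the ':unix' layer reads straight from the file
descriptor, so each read() maps to a single read(2) call, at the cost of
losing PerlIO's buffering for small or line-oriented reads):

    open my $fh, '<:unix', '/path/to/fh' or die $!;
    read( $fh, my $buffer, 30000 );   # one read(2); may return fewer than 30,000 bytes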

--
chansen


p5pRT commented Oct 20, 2015

The RT System itself - Status changed from 'new' to 'open'


p5pRT commented Oct 20, 2015

From @FGasper

On 20 Oct 2015 11​:29 AM, Christian Hansen via RT wrote​:

On Mon Oct 19 15​:55​:54 2015, felipe@​felipegasper.com wrote​:

Currently read() will always take in data in 8,192-byte chunks
(or whatever chunk size PerlIO is set to use). This makes sense
when reading line-by-line, but not for something like​:

read( $fh, $buffer, 30000 )

You can bypass the PerlIO buffer layer (perlio) by specifying a lower
layer in the open MODE​:

 open my $fh, '<:unix', '/path/to/fh';

Would it work, though, to make PerlIO smarter? Then we can still get the
benefits of buffering, too.

-FG


p5pRT commented Oct 20, 2015

From chansen@cpan.org

On Tue Oct 20 10​:31​:29 2015, felipe@​felipegasper.com wrote​:

On 20 Oct 2015 11​:29 AM, Christian Hansen via RT wrote​:

On Mon Oct 19 15​:55​:54 2015, felipe@​felipegasper.com wrote​:

Currently read() will always take in data in 8,192-byte chunks
(or whatever chunk size PerlIO is set to use). This makes sense
when reading line-by-line, but not for something like​:

read( $fh, $buffer, 30000 )

You can bypass the PerlIO buffer layer (perlio) by specifying a lower
layer in the open MODE​:

 open my $fh, '<:unix', '/path/to/fh';

Would it work, though, to make PerlIO smarter? Then we can still get the
benefits of buffering, too.

When Nick Ing-Simmons designed PerlIO for inclusion in Perl 5.8.0 (2002),
it made sense to limit the buffer size to 8192 (2**13). The question is: what
would a reasonable size for the internal buffer be today? From 2**13
to 2**15 or 2**16? What impact would it have on obscure OSes/platforms?

--
chansen


p5pRT commented Oct 20, 2015

From @FGasper

On 20 Oct 2015 1​:13 PM, Christian Hansen via RT wrote​:

On Tue Oct 20 10​:31​:29 2015, felipe@​felipegasper.com wrote​:

On 20 Oct 2015 11​:29 AM, Christian Hansen via RT wrote​:

On Mon Oct 19 15​:55​:54 2015, felipe@​felipegasper.com wrote​:

Currently read() will always take in data in 8,192-byte chunks
(or whatever chunk size PerlIO is set to use). This makes sense
when reading line-by-line, but not for something like​:

read( $fh, $buffer, 30000 )

You can bypass the PerlIO buffer layer (perlio) by specifying a lower
layer in the open MODE​:

  open my $fh, '<:unix', '/path/to/fh';

Would it work, though, to make PerlIO smarter? Then we can still get the
benefits of buffering, too.

When Nick Ing-Simmons designed PerlIO for inclusion in Perl 5.8.0 (2002),
it made sense to limit the buffer size to 8192 (2**13), the question what
would a reasonable size for the internal buffer be today? From 2**13
to 2**15 or 2**16? What impact would it have on obscure OS/Platforms?

Does PerlIO only know how to fill up its buffer and then flush? How hard
would it be to do something like:

0) PerlIO has 500 bytes in $fh cache

1) read( $fh, $buffer, 30000 )

2) flush $fh cache into $buffer. int bytes_needed = 29,500;

3) Read (floor(bytes_needed / 8192) * 8192) bytes DIRECTLY to $buffer.

4) Now do a “normal” 8,192-byte PerlIO read(). Send (bytes_needed %
8192) to $buffer, and save the remainder in the $fh cache.

-FG


p5pRT commented Oct 20, 2015

From chansen@cpan.org

On Tue Oct 20 11​:13​:45 2015, chansen wrote​:

When Nick Ing-Simmons designed PerlIO for inclusion in Perl 5.8.0 (2002),
it made sense to limit the buffer size to 8192 (2**13), the question what
would a reasonable size for the internal buffer be today? From 2**13
to 2**15 or 2**16? What impact would it have on obscure OS/Platforms?

Perhaps we should use OS hints when Perl is compiled?
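
(One readily available hint, though at run time rather than at Configure
time, is the preferred I/O block size that stat() reports, where the
filesystem provides it; a sketch, index 11 being st_blksize:)

    my $blksize = (stat '/path/to/file')[11];
    print "preferred I/O block size: $blksize\n";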

--
chansen


p5pRT commented Oct 20, 2015

From chansen@cpan.org

On Tue Oct 20 11​:28​:03 2015, felipe@​felipegasper.com wrote​:

Does PerlIO only know how to fill up its buffer then flush? How hard
would it be to do something like​:

0) PerlIO has 500 bytes in $fh cache

1) read( $fh, $buffer, 30000 )

2) flush $fh cache into $buffer. int bytes_needed = 25,000;

3) Read (floor(bytes_needed) * 8192) bytes DIRECTLY to $buffer.

3) Now do a “normal” 8,192-byte PerlIO read(). Send (bytes_needed %
8192) to $buffer, and save the remainder in the $fh cache.

By design, the PerlIO implementation is completely decoupled from Perl.
You could think of PerlIO as a portable reimplementation of stdio.

--
chansen


p5pRT commented Oct 20, 2015

From @Leont

On Tue, Oct 20, 2015 at 12​:55 AM, felipe@​felipegasper.com <
perlbug-followup@​perl.org> wrote​:

Currently read() will always take in data in 8,192-byte chunks
(or whatever chunk size PerlIO is set to use). This makes sense
when reading line-by-line, but not for something like​:

read( $fh, $buffer, 30000 )

... because the buffered read will actually *cost* system calls
rather than saving them.

It would seem more sensible to read in some multiple of the normal chunk
size
at once, e.g., a single read(2) system call of 32,768 bytes.

People can, of course, use sysread(), but then (I am told, have not
verified) we lose other niceties of PerlIO read() like trying again
after EINTR.

Is there anything gained from NOT doing a single read in the case of
wanting 30,000 bytes from a filehandle?

Actually, I think I wrote a patch to that effect some months ago,
but I never got around to finishing it. It's a bit more complicated than it
should be, in particular because it also has to take into account that for
some other layers (such as crlf and encoding) the buffer-size restriction
is for now a necessity. I'll try to dig it up.

Leon


p5pRT commented Oct 20, 2015

From @craigberry

On Tue, Oct 20, 2015 at 1​:27 PM, Felipe Gasper <felipe@​felipegasper.com> wrote​:

On 20 Oct 2015 1​:13 PM, Christian Hansen via RT wrote​:

When Nick Ing-Simmons designed PerlIO for inclusion in Perl 5.8.0 (2002),
it made sense to limit the buffer size to 8192 (2**13),

Actually, he made it 4096. Some years later I made it the larger of
8192 and BUFSIZ​:

<http​://perl5.git.perl.org/perl.git/commit/b83080de5c42543809ce9004bcdbcd3162a00e70?f>

which produced a significant performance improvement. I did what was
simplest and easiest and I believe pointed out at the time that there
is no one buffer size that is ideal for every application. It's
certainly an area ripe for (further) improvement.

the question what
would a reasonable size for the internal buffer be today? From 2**13
to 2**15 or 2**16? What impact would it have on obscure OS/Platforms?

There are various dragons here. I believe one of the BSDs had a
problem when we switched from 4096 to 8192 because writes through
pipes became split writes, which made something happen differently in
signal handlers (I think it was eintr.t that failed). Just as an
example of things that can go wrong from a change that looks dead
simple. And then there are the CRLF layer and encoding issues Leon
mentioned.

Does PerlIO only know how to fill up its buffer then flush? How hard would
it be to do something like​:

0) PerlIO has 500 bytes in $fh cache

1) read( $fh, $buffer, 30000 )

2) flush $fh cache into $buffer. int bytes_needed = 25,000;

3) Read (floor(bytes_needed) * 8192) bytes DIRECTLY to $buffer.

3) Now do a “normal” 8,192-byte PerlIO read(). Send (bytes_needed % 8192) to
$buffer, and save the remainder in the $fh cache.

This is interesting but I would favor having a buffer size selectable
from Perl. Something like​:

  open my $f, '<​:perlio(bufsiz=32767)', 'file.txt';

(though I really shouldn't be trusted with syntax). This way we would
give control to the developer to know his or her data and optimize
different filehandles for different purposes.


p5pRT commented Oct 20, 2015

From @FGasper

On 20 Oct 2015 2​:19 PM, Craig A. Berry wrote​:

0) PerlIO has 500 bytes in $fh cache

1) read( $fh, $buffer, 30000 )

2) flush $fh cache into $buffer. int bytes_needed = 25,000;

3) Read (floor(bytes_needed) * 8192) bytes DIRECTLY to $buffer.

3) Now do a “normal” 8,192-byte PerlIO read(). Send (bytes_needed % 8192) to
$buffer, and save the remainder in the $fh cache.

This is interesting but I would favor having a buffer size selectable
from Perl. Something like​:

open my $f, '<​:perlio(bufsiz=32767)', 'file.txt';

(though I really shouldn't be trusted with syntax). This way we would
give control to the developer to know his or her data and optimize
different filehandles for different purposes.

This would also require rewriting existing code, and for Perl developers
to learn new syntax, in order to realize the speed gains.

If you optimize the read() built-in, though, then everything
automatically speeds up, and no one has to learn any new syntax.

-FG


p5pRT commented Oct 20, 2015

From @Leont

On Tue, Oct 20, 2015 at 9​:19 PM, Craig A. Berry <craig.a.berry@​gmail.com>
wrote​:

On Tue, Oct 20, 2015 at 1​:27 PM, Felipe Gasper <felipe@​felipegasper.com>
wrote​:

On 20 Oct 2015 1​:13 PM, Christian Hansen via RT wrote​:

When Nick Ing-Simmons designed PerlIO for inclusion in Perl 5.8.0
(2002),
it made sense to limit the buffer size to 8192 (2**13),

Actually, he made it 4096. Some years later I made it the larger of
8192 and BUFSIZ​:

<
http​://perl5.git.perl.org/perl.git/commit/b83080de5c42543809ce9004bcdbcd3162a00e70?f

which produced a significant performance improvement. I did what was
simplest and easiest and I believe pointed out at the time that there
is no one buffer size that is ideal for every application. It's
certainly an area ripe for (further) improvement.

the question what
would a reasonable size for the internal buffer be today? From 2**13
to 2**15 or 2**16? What impact would it have on obscure OS/Platforms?

There are various dragons here. I believe one of the BSDs had a
problem when we switched from 4096 to 8192 because writes through
pipes became split writes, which made something happen differently in
signal handlers (I think it was eintr.t that failed). Just as an
example of things that can go wrong from a change that looks dead
simple. And then there are the CRLF layer and encoding issues Leon
mentioned.

Does PerlIO only know how to fill up its buffer then flush? How hard
would
it be to do something like​:

0) PerlIO has 500 bytes in $fh cache

1) read( $fh, $buffer, 30000 )

2) flush $fh cache into $buffer. int bytes_needed = 25,000;

3) Read (floor(bytes_needed) * 8192) bytes DIRECTLY to $buffer.

3) Now do a “normal” 8,192-byte PerlIO read(). Send (bytes_needed %
8192) to
$buffer, and save the remainder in the $fh cache.

This is interesting but I would favor having a buffer size selectable
from Perl. Something like​:

open my $f, '<​:perlio(bufsiz=32767)', 'file.txt';

(though I really shouldn't be trusted with syntax). This way we would
give control to the developer to know his or her data and optimize
different filehandles for different purposes.

See also​: https://metacpan.org/release/PerlIO-buffersize. It's not very
good syntax, but at least it works today (and on yesterday's perl).
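
(A sketch of that module's use, going by its documentation: the desired
buffer size is passed as a layer argument when the handle is opened.)

    use PerlIO::buffersize;
    open my $fh, '<:buffersize(65536)', 'file.txt' or die $!;
    read( $fh, my $buffer, 30000 );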

Leon


p5pRT commented Oct 20, 2015

From @FGasper

On 20 Oct 2015 2​:20 PM, Craig Berry via RT wrote​:

Does PerlIO only know how to fill up its buffer then flush? How hard would
it be to do something like​:

0) PerlIO has 500 bytes in $fh cache

1) read( $fh, $buffer, 30000 )

2) flush $fh cache into $buffer. int bytes_needed = 25,000;

3) Read (floor(bytes_needed) * 8192) bytes DIRECTLY to $buffer.

3) Now do a “normal” 8,192-byte PerlIO read(). Send (bytes_needed % 8192) to
$buffer, and save the remainder in the $fh cache.

FWIW, between the ====’s is pseudocode for what I have in mind. I regret
that I don’t have the familiarity with Perl guts to code up a working
example. :(

================
SSize_t
PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
{
    STDCHAR *buf = (STDCHAR *) vbuf;
    if (f) {
        if (!(PerlIOBase(f)->flags & PERLIO_F_CANREAD)) {
            PerlIOBase(f)->flags |= PERLIO_F_ERROR;
            SETERRNO(EBADF, SS_IVCHAN);
            return 0;
        }
        while (count > 0) {
         get_cnt:
          {
            SSize_t avail = PerlIO_get_cnt(f);
            SSize_t take = 0;
            if (avail > 0)
                take = (((SSize_t) count >= 0) && ((SSize_t)count < avail)) ?
                    (SSize_t)count : avail;
            if (take > 0) {
                STDCHAR *ptr = PerlIO_get_ptr(f);
                Copy(ptr, buf, take, STDCHAR);
                PerlIO_set_ptrcnt(f, ptr + take, (avail -= take));
                count -= take;
                buf += take;
                if (avail == 0)     /* set_ptrcnt could have reset avail */
                    goto get_cnt;
            }

            //==================================================
            //if (count > bufsiz) {
            //    read( f, vbuf, bufsiz * (count / bufsiz) )
            //    ==> error check
            //}
            //==================================================

            if (count > 0 && avail <= 0) {
                if (PerlIO_fill(f) != 0)
                    break;
            }
          }
        }
        return (buf - (STDCHAR *) vbuf);
    }
    return 0;
}
================

The above could be a bit more efficient, but it would at least reduce
the number of system calls for (I suspect) most applications that call
Perl’s read() built-in.

-FG


p5pRT commented Oct 20, 2015

From @tonycoz

On Mon Oct 19 15​:55​:54 2015, felipe@​felipegasper.com wrote​:

Currently read() will always take in data in 8,192-byte chunks
(or whatever chunk size PerlIO is set to use). This makes sense
when reading line-by-line, but not for something like​:

read( $fh, $buffer, 30000 )

... because the buffered read will actually *cost* system calls
rather than saving them.

It would seem more sensible to read in some multiple of the normal
chunk size
at once, e.g., a single read(2) system call of 32,768 bytes.

People can, of course, use sysread(), but then (I am told, have not
verified) we lose other niceties of PerlIO read() like trying again
after EINTR.

Is there anything gained from NOT doing a single read in the case of
wanting 30,000 bytes from a filehandle?

I would have expected something like​:

Inline Patch
diff --git a/perlio.c b/perlio.c
index 8ab47e4..7636595 100644
--- a/perlio.c
+++ b/perlio.c
@@ -2032,6 +2032,7 @@ SSize_t
 PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
 {
     STDCHAR *buf = (STDCHAR *) vbuf;
+    PerlIOBuf * const b = PerlIOSelf(f, PerlIOBuf);
     if (f) {
         if (!(PerlIOBase(f)->flags & PERLIO_F_CANREAD)) {
            PerlIOBase(f)->flags |= PERLIO_F_ERROR;
@@ -2055,6 +2056,18 @@ PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
                if (avail == 0)         /* set_ptrcnt could have reset avail */
                    goto get_cnt;
            }
+            /* if the caller wants more than we can fit in the buffer,
+               avoid looping around filling the buffer and copying it
+               out */
+            if (count > b->bufsiz) {
+                PerlIO *n = PerlIONext(f);
+
+                SSize_t got = PerlIO_read(n, buf, count);
+                if (count <= 0)
+                    break;
+                count -= got;
+                buf += got;
+            }
            if (count > 0 && avail <= 0) {
                if (PerlIO_fill(f) != 0)
                    break;

to work, but it causes one test failure in ../ext/PerlIO-via/t/via.t

It does reduce the number of read(2) calls to 1 though​:

$ strace -e trace=read ./perl -e 'read(STDIN, $buf, 30000)' </dev/zero 2>&1 | grep 'read(0'
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 30000) = 30000
$

Tony


p5pRT commented Oct 21, 2015

From @Leont

On Wed, Oct 21, 2015 at 12​:59 AM, Tony Cook via RT <
perlbug-followup@​perl.org> wrote​:

diff --git a/perlio.c b/perlio.c
index 8ab47e4..7636595 100644
--- a/perlio.c
+++ b/perlio.c
@​@​ -2032,6 +2032,7 @​@​ SSize_t
PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
{
STDCHAR *buf = (STDCHAR *) vbuf;
+ PerlIOBuf * const b = PerlIOSelf(f, PerlIOBuf);
if (f) {
if (!(PerlIOBase(f)->flags & PERLIO_F_CANREAD)) {
PerlIOBase(f)->flags |= PERLIO_F_ERROR;
@​@​ -2055,6 +2056,18 @​@​ PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t
count)
if (avail == 0) /* set_ptrcnt could have reset
avail */
goto get_cnt;
}
+ /* if the caller wants more than we can fit in the buffer,
+ avoid looping around filling the buffer and copying it
+ out */
+ if (count > b->bufsiz) {
+ PerlIO *n = PerlIONext(f);
+
+ SSize_t got = PerlIO_read(n, buf, count);
+ if (count <= 0)
+ break;
+ count -= got;
+ buf += got;
+ }
if (count > 0 && avail <= 0) {
if (PerlIO_fill(f) != 0)
break;

to work, but it causes one test failure in ../ext/PerlIO-via/t/via.t

I'm surprised that this fails only one test, and would consider that a
major hole in our testing. It should break any buffering layer that
modifies the bytestream, including :crlf, :encoding and :via;
PerlIOBase_read/PerlIOBuf_read
is used by all such layers. The buffer is supposed to contain processed
data, while reading directly from the underlying layer doesn't do that
processing.

Leon


p5pRT commented Oct 21, 2015

From @FGasper

On 20 Oct 2015 7​:00 PM, Leon Timmermans via RT wrote​:

On Wed, Oct 21, 2015 at 12​:59 AM, Tony Cook via RT <
perlbug-followup@​perl.org> wrote​:

diff --git a/perlio.c b/perlio.c
index 8ab47e4..7636595 100644
--- a/perlio.c
+++ b/perlio.c
@​@​ -2032,6 +2032,7 @​@​ SSize_t
PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
{
STDCHAR *buf = (STDCHAR *) vbuf;
+ PerlIOBuf * const b = PerlIOSelf(f, PerlIOBuf);
if (f) {
if (!(PerlIOBase(f)->flags & PERLIO_F_CANREAD)) {
PerlIOBase(f)->flags |= PERLIO_F_ERROR;
@​@​ -2055,6 +2056,18 @​@​ PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t
count)
if (avail == 0) /* set_ptrcnt could have reset
avail */
goto get_cnt;
}
+ /* if the caller wants more than we can fit in the buffer,
+ avoid looping around filling the buffer and copying it
+ out */
+ if (count > b->bufsiz) {
+ PerlIO *n = PerlIONext(f);
+
+ SSize_t got = PerlIO_read(n, buf, count);
+ if (count <= 0)
+ break;
+ count -= got;
+ buf += got;
+ }
if (count > 0 && avail <= 0) {
if (PerlIO_fill(f) != 0)
break;

to work, but it causes one test failure in ../ext/PerlIO-via/t/via.t

I'm surprised that only fails only one test, and would consider that a
major hole in our testing. It should break any buffering layer that
modifies the bytestream, including :crlf, :encoding and :via;
PerlIOBase_read/PerlIOBuf_read
is used by all such layers. The buffer is supposed to contain processed
data, while reading directly from the underlying layer doesn't do that
processing.

Thank you both for looking at this! :)

Leon/Tony, maybe this could be active for :raw--or whatever other cases
exist where it causes no functional difference in what the caller gets
back in Perl?

-FG


p5pRT commented Oct 21, 2015

From @craigberry

On Tue, Oct 20, 2015 at 6​:59 PM, Leon Timmermans <fawaka@​gmail.com> wrote​:

On Wed, Oct 21, 2015 at 12​:59 AM, Tony Cook via RT
<perlbug-followup@​perl.org> wrote​:

diff --git a/perlio.c b/perlio.c
index 8ab47e4..7636595 100644
--- a/perlio.c
+++ b/perlio.c
@​@​ -2032,6 +2032,7 @​@​ SSize_t
PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
{
STDCHAR *buf = (STDCHAR *) vbuf;
+ PerlIOBuf * const b = PerlIOSelf(f, PerlIOBuf);
if (f) {
if (!(PerlIOBase(f)->flags & PERLIO_F_CANREAD)) {
PerlIOBase(f)->flags |= PERLIO_F_ERROR;
@​@​ -2055,6 +2056,18 @​@​ PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t
count)
if (avail == 0) /* set_ptrcnt could have reset
avail */
goto get_cnt;
}
+ /* if the caller wants more than we can fit in the buffer,
+ avoid looping around filling the buffer and copying it
+ out */
+ if (count > b->bufsiz) {
+ PerlIO *n = PerlIONext(f);
+
+ SSize_t got = PerlIO_read(n, buf, count);
+ if (count <= 0)
+ break;
+ count -= got;
+ buf += got;
+ }
if (count > 0 && avail <= 0) {
if (PerlIO_fill(f) != 0)
break;

to work, but it causes one test failure in ../ext/PerlIO-via/t/via.t

I'm surprised that only fails only one test, and would consider that a major
hole in our testing. It should break any buffering layer that modifies the
bytestream, including :crlf, :encoding and :via;
PerlIOBase_read/PerlIOBuf_read is used by all such layers. The buffer is
supposed to contain processed data, while reading directly from the
underlying layer doesn't do that processing.

The only alternative I've thought of would be to make PerlIO_fill (and
all the layer-specific Fill routines it calls) take a second "size
requested" argument. Then if and only if the Fill routine is doing an
actual read (as opposed to a buffered pseudo-read from a lower layer),
it could check the requested size against the existing buffer size and
realloc the buffer if the former is larger (perhaps rounding up to the
nearest page or block boundary). In the example above, instead of
Tony's new code, we would just say PerlIO_fill(f, count) in place of
PerlIO_fill(f).

We'd probably have to use va_arg and friends to make the new argument
to Fill optional, but even so it might still break the code of anyone who
has a custom layer. There are probably a lot of other considerations,
like whether we'd only do it for regular files, which might need an
extra stat somewhere. Perhaps there should be a maximum limit on the
size of buffer that will be allocated. The CRLF layer plays games
with the buffer size when the last character in the buffer is CR, so
we'd have to make sure that still works. And then there is the
potential confusion arising from the fact that Perl's C<read> operator
specifies the number of *characters* to read whereas these buffers are
obviously counted in bytes.
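
(To illustrate that last point with a sketch, assuming a UTF-8 encoded file
at the hypothetical path utf8.txt: once an :encoding layer is pushed, the
LENGTH argument counts characters, not bytes on disk.)

    open my $fh, '<:encoding(UTF-8)', 'utf8.txt' or die $!;
    read( $fh, my $chars, 100 );   # 100 characters, possibly more than 100 bytes from disk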

It might be possible to make this work, but nothing with PerlIO is as
easy as it initially looks.


p5pRT commented Oct 26, 2015

From @tonycoz

On Tue Oct 20 17​:00​:24 2015, LeonT wrote​:

On Wed, Oct 21, 2015 at 12​:59 AM, Tony Cook via RT <
perlbug-followup@​perl.org> wrote​:

diff --git a/perlio.c b/perlio.c
index 8ab47e4..7636595 100644
--- a/perlio.c
+++ b/perlio.c
@​@​ -2032,6 +2032,7 @​@​ SSize_t
PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
{

I'm surprised that only fails only one test, and would consider that a
major hole in our testing. It should break any buffering layer that
modifies the bytestream, including :crlf, :encoding and :via;
PerlIOBase_read/PerlIOBuf_read
is used by all such layers. The buffer is supposed to contain processed
data, while reading directly from the underlying layer doesn't do that
processing.

The attached avoids modifying PerlIOBase_read or PerlIOBuf_read.

I wondered if I could use PERLIO_K_RAW to indicate that the direct read was safe, but that's set for the crlf layer, where it isn't safe.

Tony


p5pRT commented Oct 26, 2015

From @tonycoz

0001-perl-126403-WIP-bypass-the-buffer-for-large-reads-th.patch
From de3a8ff5e2b972f17f475c27471b8a7cb6f46bf1 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Mon, 26 Oct 2015 12:06:53 +1100
Subject: [perl #126403] WIP bypass the buffer for large reads through only
 :perlio

---
 perlio.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 61 insertions(+), 1 deletion(-)

diff --git a/perlio.c b/perlio.c
index 8ab47e4..e64be2b 100644
--- a/perlio.c
+++ b/perlio.c
@@ -4287,7 +4287,67 @@ PerlIOBuf_dup(pTHX_ PerlIO *f, PerlIO *o, CLONE_PARAMS *param, int flags)
  return PerlIOBase_dup(aTHX_ f, o, param, flags);
 }
 
+static SSize_t
+PerlIOPerlIO_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
+{
+    STDCHAR *buf = (STDCHAR *) vbuf;
+    if (f) {
+        PerlIOBuf * const b = PerlIOSelf(f, PerlIOBuf);
+        if (!(PerlIOBase(f)->flags & PERLIO_F_CANREAD)) {
+	    PerlIOBase(f)->flags |= PERLIO_F_ERROR;
+	    SETERRNO(EBADF, SS_IVCHAN);
+	    PerlIO_save_errno(f);
+	    return 0;
+	}
+        PerlIO_debug("flags %x\n", PerlIOBase(f)->flags);
+	while (count > 0) {
+	 get_cnt:
+	  {
+	    SSize_t avail = PerlIO_get_cnt(f);
+	    SSize_t take = 0;
+	    if (avail > 0)
+		take = (((SSize_t) count >= 0) && ((SSize_t)count < avail)) ? (SSize_t)count : avail;
+	    if (take > 0) {
+		STDCHAR *ptr = PerlIO_get_ptr(f);
+		Copy(ptr, buf, take, STDCHAR);
+		PerlIO_set_ptrcnt(f, ptr + take, (avail -= take));
+		count -= take;
+		buf += take;
+		if (avail == 0)		/* set_ptrcnt could have reset avail */
+		    goto get_cnt;
+	    }
+            /* if the caller wants more than we can fit in the buffer,
+               avoid looping around filling the buffer and copying it
+               out */
+            if (count > b->bufsiz) {
+                PerlIO *n = PerlIONext(f);
+
+                SSize_t got = PerlIO_read(n, buf, count);
+                if (got <= 0) {
+                    if (got == 0) {
+                        PerlIOBase(f)->flags |= PERLIO_F_EOF;
+                        break;
+                    }
+                    else if (got < 0) {
+                        PerlIOBase(f)->flags |= PERLIO_F_ERROR;
+                        PerlIO_save_errno(f);
+                        break;
+                    }
+                }
+                count -= got;
+                buf += got;
+            }
 
+	    if (count > 0 && avail <= 0) {
+		if (PerlIO_fill(f) != 0)
+		    break;
+	    }
+	  }
+	}
+	return (buf - (STDCHAR *) vbuf);
+    }
+    return 0;
+}
 
 PERLIO_FUNCS_DECL(PerlIO_perlio) = {
     sizeof(PerlIO_funcs),
@@ -4301,7 +4361,7 @@ PERLIO_FUNCS_DECL(PerlIO_perlio) = {
     NULL,
     PerlIOBase_fileno,
     PerlIOBuf_dup,
-    PerlIOBuf_read,
+    PerlIOPerlIO_read,
     PerlIOBuf_unread,
     PerlIOBuf_write,
     PerlIOBuf_seek,
-- 
2.1.4


p5pRT commented Oct 27, 2015

From @craigberry

On Sun, Oct 25, 2015 at 9​:57 PM, Tony Cook via RT
<perlbug-followup@​perl.org> wrote​:

On Tue Oct 20 17​:00​:24 2015, LeonT wrote​:

On Wed, Oct 21, 2015 at 12​:59 AM, Tony Cook via RT <
perlbug-followup@​perl.org> wrote​:

diff --git a/perlio.c b/perlio.c
index 8ab47e4..7636595 100644
--- a/perlio.c
+++ b/perlio.c
@​@​ -2032,6 +2032,7 @​@​ SSize_t
PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
{

I'm surprised that only fails only one test, and would consider that a
major hole in our testing. It should break any buffering layer that
modifies the bytestream, including :crlf, :encoding and :via;
PerlIOBase_read/PerlIOBuf_read
is used by all such layers. The buffer is supposed to contain processed
data, while reading directly from the underlying layer doesn't do that
processing.

The attached avoids modifying PerlIOBase_read or PerlIOBuf_read.

+ /* if the caller wants more than we can fit in the buffer,
+ avoid looping around filling the buffer and copying it
+ out */
+ if (count > b->bufsiz) {
+ PerlIO *n = PerlIONext(f);
+
+ SSize_t got = PerlIO_read(n, buf, count);

So after determining that the buffer isn't large enough to hold count
bytes, we now try to read count bytes into a buffer we know can't hold
it? What am I missing?


p5pRT commented Oct 27, 2015

From @tonycoz

On Tue, Oct 27, 2015 at 12​:41​:13PM -0500, Craig A. Berry wrote​:

On Sun, Oct 25, 2015 at 9​:57 PM, Tony Cook via RT
<perlbug-followup@​perl.org> wrote​:

On Tue Oct 20 17​:00​:24 2015, LeonT wrote​:

On Wed, Oct 21, 2015 at 12​:59 AM, Tony Cook via RT <
perlbug-followup@​perl.org> wrote​:

diff --git a/perlio.c b/perlio.c
index 8ab47e4..7636595 100644
--- a/perlio.c
+++ b/perlio.c
@​@​ -2032,6 +2032,7 @​@​ SSize_t
PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
{

I'm surprised that only fails only one test, and would consider that a
major hole in our testing. It should break any buffering layer that
modifies the bytestream, including :crlf, :encoding and :via;
PerlIOBase_read/PerlIOBuf_read
is used by all such layers. The buffer is supposed to contain processed
data, while reading directly from the underlying layer doesn't do that
processing.

The attached avoids modifying PerlIOBase_read or PerlIOBuf_read.

+ /* if the caller wants more than we can fit in the buffer,
+ avoid looping around filling the buffer and copying it
+ out */
+ if (count > b->bufsiz) {
+ PerlIO *n = PerlIONext(f);
+
+ SSize_t got = PerlIO_read(n, buf, count);

So after determining that the buffer isn't large enough to hold count
bytes, we now try to read count bytes into a buffer we know can't hold
it? What am I missing?

It won't fit into the layer buffer, but buf and count refer to the
caller's buffer and the number of bytes the caller wants read into it.

If it won't fit there then the caller has messed up, since
PerlIOBuf_read() will also attempt to read count bytes into it, just
with more calls to the lower layer and more copying.

Tony


p5pRT commented Oct 28, 2015

From @craigberry

On Tue, Oct 27, 2015 at 5​:39 PM, Tony Cook <tony@​develop-help.com> wrote​:

On Tue, Oct 27, 2015 at 12​:41​:13PM -0500, Craig A. Berry wrote​:

On Sun, Oct 25, 2015 at 9​:57 PM, Tony Cook via RT

The attached avoids modifying PerlIOBase_read or PerlIOBuf_read.

+ /* if the caller wants more than we can fit in the buffer,
+ avoid looping around filling the buffer and copying it
+ out */
+ if (count > b->bufsiz) {
+ PerlIO *n = PerlIONext(f);
+
+ SSize_t got = PerlIO_read(n, buf, count);

So after determining that the buffer isn't large enough to hold count
bytes, we now try to read count bytes into a buffer we know can't hold
it? What am I missing?

It won't fit into the layer buffer, but buf and count refer to the
caller's buffer and the number of bytes the caller wants read into it.

If it won't fit there then the caller has messed up, since
PerlIOBuf_read() will also attempt to read count bytes into it, just
with more calls to the lower layer and more copying.

Ah, yes. I saw "buf" but was thinking "b->buf". Have you tried to do
any benchmarking yet to see if it really speeds up reads?


p5pRT commented Oct 29, 2015

From @tonycoz

On Wed Oct 28 05​:25​:58 2015, craig.a.berry@​gmail.com wrote​:

Ah, yes. I saw "buf" but was thinking "b->buf". Have you tried to do
any benchmarking yet to see if it really speeds up reads?

I hadn't.

On Linux it speeds up read() to around the same speed as sysread().

On Win32 it makes no difference, since the perlio layer isn't used, unless you explicitly add it​:

J:\dev\perl\git\perl\win32>..\perl -le "open my $fh, '<', 'zeros' or die; binmode $fh, ':raw'; print for PerlIO::get_layers($fh)"
unix
crlf

J:\dev\perl\git\perl\win32>..\perl -le "open my $fh, '<:raw:perlio', 'zeros' or die; binmode $fh, ':raw'; print for PerlIO::get_layers($fh)"
unix
crlf
perlio

J:\dev\perl\git\perl\win32>..\perl -le "open my $fh, '<:pop:perlio', 'zeros' or die; binmode $fh, ':raw'; print for PerlIO::get_layers($fh)"
unix
perlio

For Linux​:

blead​:

Size of zeros is 10000000
Benchmark​: timing 2000 iterations of read, seekread, seekslurp, seeksysread, slurp, sysread...
  read​: 13.1377 wallclock secs ( 6.15 usr + 6.96 sys = 13.11 CPU) @​ 152.56/s (n=2000)
  seekread​: 13.0651 wallclock secs ( 6.20 usr + 6.85 sys = 13.05 CPU) @​ 153.26/s (n=2000)
seekslurp​: 13.1436 wallclock secs ( 6.16 usr + 6.96 sys = 13.12 CPU) @​ 152.44/s (n=2000)
seeksysread​: 12.1413 wallclock secs ( 0.04 usr + 12.09 sys = 12.13 CPU) @​ 164.88/s (n=2000)
  slurp​: 13.5237 wallclock secs ( 6.62 usr + 6.88 sys = 13.50 CPU) @​ 148.15/s (n=2000)
  sysread​: 12.2393 wallclock secs ( 0.06 usr + 12.16 sys = 12.22 CPU) @​ 163.67/s (n=2000)

with the patch​:

Size of zeros is 10000000
Benchmark​: timing 2000 iterations of read, seekread, seekslurp, seeksysread, slurp, sysread...
  read​: 12.1104 wallclock secs ( 0.08 usr + 12.00 sys = 12.08 CPU) @​ 165.56/s (n=2000)
  seekread​: 12.0167 wallclock secs ( 0.03 usr + 11.97 sys = 12.00 CPU) @​ 166.67/s (n=2000)
seekslurp​: 13.2835 wallclock secs ( 6.21 usr + 7.05 sys = 13.26 CPU) @​ 150.83/s (n=2000)
seeksysread​: 12.0765 wallclock secs ( 0.04 usr + 12.01 sys = 12.05 CPU) @​ 165.98/s (n=2000)
  slurp​: 13.3699 wallclock secs ( 6.38 usr + 6.96 sys = 13.34 CPU) @​ 149.93/s (n=2000)
  sysread​: 12.1688 wallclock secs ( 0.08 usr + 12.06 sys = 12.14 CPU) @​ 164.74/s (n=2000)

I'd hoped it would be more useful on Win32, since read() calls are fairly slow per call there, but I guess it would require more work. Win32 has at least one other perlio issue: the :crlf layer is O(n**2) (n = size of the buffer), as pointed out by bulk88 a while back.

Tony


p5pRT commented Oct 29, 2015

From @tonycoz

read-speed.pl


p5pRT commented Oct 29, 2015

From Stromeko@nexgo.de

Am 29.10.2015 um 01​:51 schrieb Tony Cook via RT​:

I'd hoped it would be more useful on Win32, since read() calls are
fairly slow per call on Win32, but I guess it would require more
work. Win32 has at least one other perlio issue, the :crlf layer is
O(n**2) (n = size of the buffer) as pointed out by bulk88 a while
back.

On windows the read buffer size should be 64kiB if it's used to read
from files.

--
Achim.

(on the road :-)


p5pRT commented Oct 29, 2015

From @craigberry

On Thu, Oct 29, 2015 at 2​:51 AM, Achim Gratz <Stromeko@​nexgo.de> wrote​:

Am 29.10.2015 um 01​:51 schrieb Tony Cook via RT​:

I'd hoped it would be more useful on Win32, since read() calls are
fairly slow per call on Win32, but I guess it would require more
work. Win32 has at least one other perlio issue, the :crlf layer is
O(n**2) (n = size of the buffer) as pointed out by bulk88 a while
back.

On windows the read buffer size should be 64kiB if it's used to read from
files.

Folks building their own Perl who want to experiment with larger
buffer sizes can configure like so​:

./Configure -Accflags=-DPERLIOBUF_DEFAULT_BUFSIZ=65535

or whatever size you want. Or add that to the defines in the Windows
configuration files.

I've been trying an alternate approach to Tony's that is more in line
with the OP's suggestion of making the buffer size correspond to the
specified size of the read request. It fails two Tie​::File tests
which I have not investigated yet but I suspect changing the buffer
size mid-stream has horked seek and tell somehow. Hopefully that can
be fixed. All other tests pass, though I haven't tried it on Windows.

What's here (inline below and attached) does illustrate, however, the
degree to which a bigger, self-adjusting read buffer improves read
speeds. Reads up to 8K are unchanged as expected. Reads requesting
32K are 10% faster, 64K reads are 11% faster, 128K reads are 17%
faster, and 512K reads are 12% faster on my recentish Mac OS X laptop
with flash storage. I've attached the benchmark script I used.

Not sure when or if I'll finish getting the kinks worked out, but this
is an approach that's minimally invasive, doesn't break any APIs, and
just might work.

From 2a6c01c0637b801a90aaa5b8536efbd17c91d6a5 Mon Sep 17 00​:00​:00 2001
From​: "Craig A. Berry" <craigberry@​mac.com>
Date​: Thu, 29 Oct 2015 09​:48​:27 -0500
Subject​: [PATCH] Grow perlio buffer to match requested read size.

WIP​: There should be an upper bound on how big we will grow it,
but I don't know what's reasonable. 512K? 512MB?

In PerlIOBase_read, when we have exhausted the buffer and are
about to refill, expand its size first to match the size of the
requested read if it is larger than the current buffer size.

This way we should satisfy the request on the next fill and
not end up back where we are, continually refilling a buffer
that is smaller than what we want to read into it.

Addresses [perl #126403].


perlio.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

Inline Patch
diff --git a/perlio.c b/perlio.c
index 8ab47e4..1f83f01 100644
--- a/perlio.c
+++ b/perlio.c
@@ -2056,6 +2056,19 @@ PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
 		    goto get_cnt;
 	    }
 	    if (count > 0 && avail <= 0) {
+		if (PerlIOBase(f)->flags & PERLIO_F_RDBUF) {
+		    PerlIOBuf * const b = PerlIOSelf(f, PerlIOBuf);
+		    if (count > b->bufsiz) {
+			Size_t requested = count - (count % 512) + 512;
+			b->bufsiz = requested;
+			Renew(b->buf, b->bufsiz, STDCHAR);
+			if (!b->buf) {
+			    b->buf = (STDCHAR *) & b->oneword;
+			    b->bufsiz = sizeof(b->oneword);
+			}
+		    }
+		}
+
 		if (PerlIO_fill(f) != 0)
 		    break;
 	    }
--
2.2.1


p5pRT commented Oct 29, 2015

From @craigberry

0001-Grow-perlio-buffer-to-match-requested-read-size.patch
From 2a6c01c0637b801a90aaa5b8536efbd17c91d6a5 Mon Sep 17 00:00:00 2001
From: "Craig A. Berry" <craigberry@mac.com>
Date: Thu, 29 Oct 2015 09:48:27 -0500
Subject: [PATCH] Grow perlio buffer to match requested read size.

WIP: There should be an upper bound on how big we will grow it,
but I don't know what's reasonable.  512K? 512MB?

In PerlIOBase_read, when we have exhausted the buffer and are
about to refill, expand its size first to match the size of the
requested read if it is larger than the current buffer size.

This way we should satisfy the request on the next fill and
not end up back where we are, continually refilling a buffer
that is smaller than what we want to read into it.

Addresses [perl #126403].
---
 perlio.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/perlio.c b/perlio.c
index 8ab47e4..1f83f01 100644
--- a/perlio.c
+++ b/perlio.c
@@ -2056,6 +2056,19 @@ PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
 		    goto get_cnt;
 	    }
 	    if (count > 0 && avail <= 0) {
+		if (PerlIOBase(f)->flags & PERLIO_F_RDBUF) {
+		    PerlIOBuf * const b = PerlIOSelf(f, PerlIOBuf);
+		    if (count > b->bufsiz) {
+			Size_t requested = count - (count % 512) + 512;
+			b->bufsiz = requested;
+			Renew(b->buf, b->bufsiz, STDCHAR);
+			if (!b->buf) {
+			    b->buf = (STDCHAR *) & b->oneword;
+			    b->bufsiz = sizeof(b->oneword);
+			}
+		    }
+		}
+
 		if (PerlIO_fill(f) != 0)
 		    break;
 	    }
-- 
2.2.1


p5pRT commented Oct 29, 2015

From @craigberry

perlio_bufsiz_bench.pl


p5pRT commented Nov 2, 2015

From @craigberry

On Thu, Oct 29, 2015 at 12​:11 PM, Craig A. Berry
<craig.a.berry@​gmail.com> wrote​:

I've been trying an alternate approach to Tony's that is more in line
with the OP's suggestion of making the buffer size correspond to the
specified size of the read request. It fails two Tie​::File tests

Those test failures were easily fixed by doing a flush to reset buffer
state before expanding the buffer size. The flush had no measurable
impact on performance. The revised patch attached here introduces the
flush and also allows the buffer size to grow no larger than 1 MiB. I
don't really know if that's the best value.

In any case, I think this is now a fairly complete solution that
introduces no test failures in a default configuration on Mac OS X,
Windows, and VMS.

What's here does illustrate the
degree to which a bigger, self-adjusting read buffer improves read
speeds. Reads up to 8K are unchanged as expected. Reads requesting
32K are 10% faster, 64K reads are 11% faster, 128K reads are 17%
faster, and 512K reads are 12% faster on my recentish Mac OS X laptop
with flash storage. I've attached the benchmark script I used.

There is also noticeable improvement on Windows. On a Perl built with
Visual Studio 2012 on a two-year-old core i7 laptop running Windows 7
and having spinning rust storage, read speed improved between 13% and
70% depending on read size. The higher end of that seems too good to
be true and I assume there's something I'm not accounting for in my
benchmark. Or maybe this minimizes how many times the crlf layer
scans for newlines and it really is saving a massive amount of
processing? More testing welcome, but an improvement of at least 13%
isn't bad.

N.B. This approach only addresses the case where a specific length is
specified as the third argument to read. Improving what is probably
the more common idiom for slurping in a file ($/ = undef) would
require changing the fixed-size 8K buffer in Perl_sv_gets.
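
(For reference, the slurp idiom being referred to:)

    my $contents = do {
        local $/;                                 # undefine the record separator
        open my $fh, '<', 'file.txt' or die $!;
        <$fh>;                                    # returns the whole file at once
    };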


p5pRT commented Nov 2, 2015

From @craigberry

0001-Grow-perlio-buffer-to-match-requested-read-size.patch
From 3c48082e12795723f769efa1747f13757ca60d9d Mon Sep 17 00:00:00 2001
From: "Craig A. Berry" <craigberry@mac.com>
Date: Thu, 29 Oct 2015 09:48:27 -0500
Subject: [PATCH] Grow perlio buffer to match requested read size.

In PerlIOBase_read, when we have exhausted the buffer and are
about to refill, expand its size first to match the size of the
requested read if it is larger than the current buffer size.

This way we should satisfy the request on the next fill and
not end up back where we are, continually refilling a buffer
that is smaller than what we want to read into it.

The buffer is not allowed to grow larger than the value to which
the new macro PERLIOBUF_MAX_BUFSIZ is set, currently 1 MiB.

Addresses [perl #126403].
---
 perlio.c | 18 ++++++++++++++++++
 perlio.h |  6 ++++++
 2 files changed, 24 insertions(+)

diff --git a/perlio.c b/perlio.c
index 8ab47e4..405c776 100644
--- a/perlio.c
+++ b/perlio.c
@@ -2056,6 +2056,24 @@ PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
 		    goto get_cnt;
 	    }
 	    if (count > 0 && avail <= 0) {
+		if (PerlIOBase(f)->flags & PERLIO_F_RDBUF) {
+		    PerlIOBuf * const b = PerlIOSelf(f, PerlIOBuf);
+		    if (count > b->bufsiz && PerlIO_flush(f) == 0) {
+			Size_t requested = count - (count % 512) + 512;
+			if (requested > PERLIOBUF_MAX_BUFSIZ)
+			    requested = PERLIOBUF_MAX_BUFSIZ;
+			if (requested > b->bufsiz) {
+			    b->bufsiz = requested;
+			    Renew(b->buf, b->bufsiz, STDCHAR);
+			    if (!b->buf) {
+				b->buf = (STDCHAR *) & b->oneword;
+				b->bufsiz = sizeof(b->oneword);
+				return 0;
+			    }
+			}
+		    }
+		}
+
 		if (PerlIO_fill(f) != 0)
 		    break;
 	    }
diff --git a/perlio.h b/perlio.h
index 1a3d480..093fda9 100644
--- a/perlio.h
+++ b/perlio.h
@@ -149,6 +149,12 @@ PERL_CALLCONV void PerlIO_clone(pTHX_ PerlInterpreter *proto,
 #define PERLIOBUF_DEFAULT_BUFSIZ (BUFSIZ > 8192 ? BUFSIZ : 8192)
 #endif
 
+/* The maximum buffer size for the perlio buffering layer */
+/* One MiB is a bit arbitrary, but what should it be? */
+#ifndef PERLIOBUF_MAX_BUFSIZ
+#define PERLIOBUF_MAX_BUFSIZ 1024 * 1024
+#endif
+
 #ifndef SEEK_SET
 #define SEEK_SET 0
 #endif
-- 
2.2.1


p5pRT commented Nov 8, 2015

From @tonycoz

On Sun Nov 01 18​:23​:17 2015, craig.a.berry@​gmail.com wrote​:

On Thu, Oct 29, 2015 at 12​:11 PM, Craig A. Berry
<craig.a.berry@​gmail.com> wrote​:

I've been trying an alternate approach to Tony's that is more in line
with the OP's suggestion of making the buffer size correspond to the
specified size of the read request. It fails two Tie​::File tests

Those test failures were easily fixed by doing a flush to reset buffer
state before expanding the buffer size. The flush had no measurable
impact on performance. The revised patch attached here introduces the
flush and also allows the buffer size to grow no larger than 1 MiB. I
don't really know if that's the best value.

In any case, I think this is now a fairly complete solution that
introduces no test failures in a default configuration on Mac OS X,
Windows, and VMS.

What's here does illustrate the
degree to which a bigger, self-adjusting read buffer improves read
speeds. Reads up to 8K are unchanged as expected. Reads requesting
32K are 10% faster, 64K reads are 11% faster, 128K reads are 17%
faster, and 512K reads are 12% faster on my recentish Mac OS X laptop
with flash storage. I've attached the benchmark script I used.

There is also noticeable improvement on Windows. On a Perl built with
Visual Studio 2012 on a two-year-old core i7 laptop running Windows 7
and having spinning rust storage, read speed improved between 13% and
70% depending on read size. The higher end of that seems too good to
be true and I assume there's something I'm not accounting for in my
benchmark. Or maybe this minimizes how many times the crlf layer
scans for newlines and it really is saving a massive amount of
processing? More testing welcome, but an improvement of at least 13%
isn't bad.

N.B. This approach only addresses the case where a specific length is
specified as the third argument to read. Improving what is probably
the more common idiom for slurping in a file ($/ = undef) would
require changing the fixed-size 8K buffer in Perl_sv_gets.

This check after Renew​:

+ Renew(b->buf, b->bufsiz, STDCHAR);
+ if (!b->buf) {
+ b->buf = (STDCHAR *) & b->oneword;
+ b->bufsiz = sizeof(b->oneword);
+ return 0;
+ }

isn't needed. Unless you set PL_nomemok = TRUE, which is only done briefly by hv.c, memory allocation failures cause perl to exit.

My main issue with pushing the buffer sizes up is, at least for reads with only :unix and :perlio, we're still copying data into the buffer just to copy it again into the caller's buffer.

Tony


p5pRT commented Nov 9, 2015

From @craigberry

On Sun, Nov 8, 2015 at 5​:04 PM, Tony Cook via RT
<perlbug-followup@​perl.org> wrote​:

On Sun Nov 01 18​:23​:17 2015, craig.a.berry@​gmail.com wrote​:

N.B. This approach only addresses the case where a specific length is
specified as the third argument to read. Improving what is probably
the more common idiom for slurping in a file ($/ = undef) would
require changing the fixed-size 8K buffer in Perl_sv_gets.

Since I wrote that, I've realized it's not actually how it works if
we're using perlio. sv_gets is doing "buffer snooping," i.e., reading
one character at a time but then peeking in the read-ahead buffer.
For some scenarios it copies one byte at a time out of that buffer,
for others it does a Copy of a range. That whole scheme was developed
when stdio was the only thing going and it assumes that the buffer is
fixed size and not under our control. I'm not sure we want a gigantic
buffer in all cases, but for the $/ = undef case at least, I think it
would help to pre-extend the perlio buffer. We already pre-extend the
user buffer to the size of the entire file.

This check after Renew​:

+ Renew(b->buf, b->bufsiz, STDCHAR);
+ if (!b->buf) {
+ b->buf = (STDCHAR *) & b->oneword;
+ b->bufsiz = sizeof(b->oneword);
+ return 0;
+ }

isn't needed. Unless you set PL_nomemok = TRUE, which is only done briefly by hv.c, memory allocation failures cause perl to exit.

Good to know, thanks. I'm afraid that was copy and paste from
PerlIOBuf_get_base, which likely predates the way New and friends work
now.

My main issue with pushing the buffer sizes up is, at least for reads with only :unix and :perlio, we're still copying data into the buffer just to copy it again into the caller's buffer.

That's true, but short of abandoning the whole layer concept, is there
any other way? It sure is tempting to try to read() directly into
some range of the user's buffer, but it looks to me like that would be
pretty tricky. For one thing, we currently scan for record separators
and do utf8 upgrades on one of the intermediate buffers before copying
to the user's buffer. Switching those operations to after the fact
without breaking something sounds like more fun than I want to have.


p5pRT commented Nov 9, 2015

From @tonycoz

On Sun, Nov 08, 2015 at 10​:00​:57PM -0600, Craig A. Berry wrote​:

On Sun, Nov 8, 2015 at 5​:04 PM, Tony Cook via RT

My main issue with pushing the buffer sizes up is, at least for reads with only :unix and :perlio, we're still copying data into the buffer just to copy it again into the caller's buffer.

That's true, but short of abandoning the whole layer concept, is there
any other way? It sure is tempting to try to read() directly into
some range of the user's buffer, but it looks to me like that would be
pretty tricky. For one thing, we currently scan for record separators
and do utf8 upgrades on one of the intermediate buffers before copying
to the user's buffer. Switching those operations to after the fact
without breaking something sounds like more fun than I want to have.

That's for intermediate buffers, which should be filled by calling
XX_fill() (and then peeking at its buffer) or XX_read() on the
lower-level buffer.

A layer which fiddles with the stream's contents, like :encoding or
:crlf, shouldn't be doing this sort of bypass.

If a layer wants the lower layer to fill its buffer, it can call that
layer's Fill handler and then peek at the buffer via "fast gets" access.

Consider that the :unix layer already has this direct read()
behaviour.
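
(For completeness, a sketch of checking which layers a handle actually has,
using PerlIO::get_layers as shown earlier in the thread:)

    open my $fh, '<:unix', '/path/to/file' or die $!;
    print "$_\n" for PerlIO::get_layers($fh);   # just "unix": reads go straight to read(2)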

Tony
