Perl 6 text file line read is much slower than Perl 5 #5765

Open
p6rt opened this issue Oct 22, 2016 · 6 comments

@p6rt

p6rt commented Oct 22, 2016

Migrated from rt.perl.org#129941 (status was 'open')

Searchable as RT129941$

@p6rt
Author

p6rt commented Oct 22, 2016

From @tbrowder

See <https://github.com/tbrowder/perl6-read-write-tests> for a suite
of tests that show the differences.

For example (from the link above):

Results of recent file read tests

Date | Rakudo Version | File Size (lines) | Perl 5 RT | Perl 6 RT | P6/P5
2016-10-18 | 2016.10-16-geb6907e | 1_000_000 | 1.39s | 12.61s | 25.2
2016-10-18 | 2016.10-16-geb6907e | 6_000_000_000 | 75.47s | 737.63s | 18.2
2016-10-18 | 2016.10-16-geb6907e | 10_000_000_000 | 121.33s | 1233.29s | 25.1

Notes:

RT - run time

@p6rt
Author

p6rt commented Oct 22, 2016

From @lizmat

Would you believe it used to be a lot slower still?

Anyways, what does P6/P5 mean?? If it’s the runtimes divided, I get values between 9 and 10 or so. Which would be less surprising to me.


@p6rt
Author

p6rt commented Oct 22, 2016

The RT System itself - Status changed from 'new' to 'open'

@p6rt
Author

p6rt commented Oct 24, 2016

From @tbrowder

On Sat Oct 22 04:24:15 2016, tbrowder wrote:

See <https://github.com/tbrowder/perl6-read-write-tests> for a suite
of tests that show the differences.

Suite has been updated considerably.

@p6rt
Author

p6rt commented Sep 12, 2017

From @jnthn

On Mon, 24 Oct 2016 03:27:55 -0700, tbrowder wrote:

On Sat Oct 22 04:24:15 2016, tbrowder wrote:

See <https://github.com/tbrowder/perl6-read-write-tests> for a suite
of tests that show the differences.

Suite has been updated considerably.

In a benchmark on my local machine, after many improvements, I now see Perl 6 coming out slightly ahead of Perl 5 when the UTF-8 encoding is being used:

$ time perl6 -e 'my $fh = open "longfile"; my $chars = 0; for $fh.lines { $chars = $chars + .chars }; $fh.close; say $chars'
60000000

real 0m1.081s
user 0m1.168s
sys 0m0.032s

$ time perl -e 'open my $fh, "<:encoding(UTF-8)", "longfile"; my $chars = 0; while ($_ = <$fh>) { chomp; $chars = $chars + length($_) }; close $fh; print "$chars\n"'
60000000

real 0m1.110s
user 0m1.088s
sys 0m0.020s

The situation with ASCII/latin-1 is still not quite so rosy:

$ time perl -e 'open my $fh, "<", "longfile"; my $chars = 0; while ($_ = <$fh>) { chomp; $chars = $chars + length($_) }; close $fh; print "$chars\n"'
60000000

real 0m0.277s
user 0m0.260s
sys 0m0.016s

$ time ./perl6-m -e 'my $fh = open "longfile", :enc<ascii>; my $chars = 0; for $fh.lines { $chars = $chars + .chars }; $fh.close; say $chars'
60000000

real 0m0.988s
user 0m1.028s
sys 0m0.068s

Though that's now down to a factor of 3.5x, which is hugely better than the factor of 9 or 10 before.

What are the conditions for resolving this issue? Clearly the UTF-8 case is good enough because Perl 6 is winning there, but "much slower" is a bit subjective, so hard to know when we're there (unless we somehow manage to win in the ASCII case too...) :-)

/jnthn

@p6rt
Author

p6rt commented Sep 12, 2017

From @tbrowder

On Tue, Sep 12, 2017 at 09:23 jnthn@jnthn.net via RT <perl6-bugs-followup@perl.org> wrote:

On Mon, 24 Oct 2016 03:27:55 -0700, tbrowder wrote:

On Sat Oct 22 04:24:15 2016, tbrowder wrote:

See <https://github.com/tbrowder/perl6-read-write-tests> for a suite
of tests that show the differences.

Suite has been updated considerably.

In a benchmark on my local machine, after many improvements, I now see
Perl 6 coming out slightly ahead of Perl 5 when the UTF-8 encoding is being
used:

...

The situation with ASCII/latin-1 is still not quite so rosy:

...

Though that's now down to a factor of 3.5x, which is hugely better than
the factor of 9 or 10 before.

What are the conditions for resolving this issue? Clearly the UTF-8 case
is good enough because Perl 6 is winning there, but "much slower" is a bit
subjective, so hard to know when we're there (unless we somehow manage to
win in the ASCII case too...) :-)

Jonathan, thanks for your continued work in this area. I agree that "much
slower" is not very specific. Unless you are aware of some code remaining
to work on someday, I guess we are at the "good enough" point, especially
given that p6 is beating p5 in UTF-8!

Let me run my tests again to see how it "feels" in my world.

Best,

-Tom
