lib/unicore/mktables takes too long #16401
Comments
From @dur-randir (created by @dur-randir)

High-parallel builds of perl are currently stranded by the mktables step:

    git clean -dfx && ./Configure -de -Dusedevel && time make -j20 test_prep
    ./miniperl -Ilib lib/unicore/mktables -C lib/unicore -P pod -maketest

If I include the Configure step in the total time measurement, mktables is still ~27% of it:

    bash -c './Configure -de -Dusedevel && make -j20 test_prep'    185.76s

This happened only during this development cycle; before it, this step has been much faster.

Perl Info
|
From @jkeenan
On Thu, 01 Feb 2018 21:21:59 GMT, randir wrote:

I don't find compelling evidence of this trend. I built perl at v5.24.0, v5.26.0 and HEAD twice in each of three environments. See the attachment for results. I focus on 'real' time because with bisection the clock time measures the time a human is waiting for results. The data is obviously not statistically significant, but my impressions are that:

* there is more variation in timings between different 'make' runs for the same Perl version than there is between different Perl versions;
* if one machine is inherently faster than another (typically, more cores), then the percentage of total clock time taken up by ./Configure is greater on that machine than on the other ('make' flies like a rocket on dromedary, where 'nproc --all' returns 24);
* if you're running Porting/bisect.pl with the '--module' switch, the total time spent in building and testing prerequisites swamps that taken during 'make'.
-- |

From @jkeenan
$ git clean -dfx && regular_configure && /usr/bin/time -f "\t%e real,\t%U user,\t%S sys" make -j${TEST_JOBS} test_prep

# zareason
# v5.26.0 (95388f2)
# HEAD (1654584)
#####
# Linode
# v5.26.0 (95388f2)
# HEAD (1654584)
#####
# dromedary
# v5.26.0 (95388f2)
# HEAD (1654584)
|
The RT System itself - Status changed from 'new' to 'open' |
From @khwilliamson
On 02/01/2018 06:20 PM, James E Keenan via RT wrote:

I also haven't anecdotally noticed any marked decrease in speed. Note that my goal is to make mktables not run very frequently. And I have a question for you git people. I was looking at the blame/history at https://perl5.git.perl.org/perl.git/history/HEAD:/lib/unicore/mktables It includes https://perl5.git.perl.org/perl.git/commit/da4e040f42421764ef069371d77c008e6b801f45 in the list of recent changes. But I don't see that that commit touches mktables at all. |
From @demerphq
On 2 February 2018 at 06:36, Karl Williamson <public@khwilliamson.com> wrote:

Personally I have never understood why we treat the output of mktables as a build product. IMO we should just check the output files into the repo, and only run mktables when the underlying data changes.

Yves
-- |
From @dur-randir
On Thu, 01 Feb 2018 17:20:00 -0800, jkeenan wrote:

Yes, this depends on machine load. My measurements were taken during low-load times, so they were repeatable. What you've measured is only the total build time, not what it consists of, but I'm focusing on the latter here.

For me, as a human, only real-world waiting time matters, as it determines how fast I can move on to doing different things.

But why should we make building perl artificially slower? As of now, the mktables step already takes half as much time as the Configure step; I've shown this in my timings.

That's not about --module; that's, for example, about bisecting fuzzer findings. And, again, about not making the perl build artificially slower.

On 2018-02-02 8:36 GMT+03:00, Karl Williamson <public@khwilliamson.com> wrote:

I don't want to be in a situation where, after some years and Unicode 20, mktables takes ages to complete because the standard has grown. So I've raised this question sooner rather than later. |
From @dur-randir
On Thu, 01 Feb 2018 21:37:27 -0800, public@khwilliamson.com wrote:

Each merge commit in git (and mercurial) has at least two parents. So for da4e040 we can look at separate diffs between the merge commit and each of its parents:

    git diff b2cd5cb da4e040

And the second diff does touch mktables. |
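The parent inspection described above can be reproduced with git itself. A minimal sketch, built against a throwaway repository so it is self-contained (the branch names and commit messages are illustrative, not taken from the perl repo):

```shell
# Create a tiny repo containing one merge commit, then list its parents.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m base
git checkout -q -b side
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m side-work
git checkout -q -
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m mainline-work
git -c user.name=t -c user.email=t@example.com merge -q --no-ff -m merged side

# %P expands to all parent hashes of a commit, space separated.
parents=$(git log -1 --format=%P HEAD)
echo "parent count: $(echo "$parents" | wc -w)"

# Diff the merge against each parent in turn, as done for da4e040 above.
for p in $parents; do
  git diff --stat "$p" HEAD >/dev/null
done
```

With a non-fast-forward merge like this, `git log -1 --format=%P` reports two parents; the same command run against perl's da4e040 would list the second parent whose diff touches mktables.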
From @khwilliamson
I ran across this thread, which happened in a previous incarnation of this issue. It may be useful. |
From @craigberry
On Thu, Feb 1, 2018 at 11:36 PM, Karl Williamson wrote:

I believe it's because that merge was not a fast-forward merge, so it appears in the history of files it did not itself change. This IMO is reason enough not to do merges that way, and there was some discussion of that at the time. |
From @tonycoz
On Thu, 01 Feb 2018 21:37:27 -0800, public@khwilliamson.com wrote:
I tried running it under NYTProf. I managed to find a roughly 10% speed up by re-implementing get_invalid_code_point() to avoid inverting the range list each time. You can see the profile results after that change at: http://files.develop-help.com/nytprof/index.html Tony |
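The NYTProf run described above can be sketched as shell commands. Devel::NYTProf is a CPAN module, not part of core perl, and the mktables invocation needs a configured build tree, so this sketch guards for both and skips gracefully when either is absent:

```shell
# Profile mktables under Devel::NYTProf and render the HTML report.
if [ -f lib/unicore/mktables ] && perl -MDevel::NYTProf -e1 2>/dev/null; then
  # -d:NYTProf loads the profiler; results land in ./nytprof.out
  perl -d:NYTProf lib/unicore/mktables -C lib/unicore -P pod -maketest
  nytprofhtml        # turns nytprof.out into browsable HTML under ./nytprof/
  status=profiled
else
  echo "skipping: Devel::NYTProf or a perl build tree is not available"
  status=skipped
fi
```

In a real build tree one would likely drive this with the freshly built ./miniperl -Ilib, as in the invocation at the top of the thread; using the installed perl here is an assumption for the sketch.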
From @tonycoz
Attachment: 0001-perl-132800-slightly-speed-up-mktables.patch

From 1eac41dd9489898aadbc753de0bb63b4c6621ac7 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Thu, 8 Mar 2018 10:42:59 +1100
Subject: [PATCH] (perl #132800) slightly speed up mktables
get_invalid_code_point() inverted the supplied range list and then
called get_valid_code_point() on the result.
Inverting is cheap for simple range lists, but it's more expensive
for more complex lists, and in most cases only the first few ranges (or
only the first) need to be checked for a code point.
---
lib/unicore/mktables | 27 +++++++++++++++++++++++++--
1 file changed, 25 insertions(+), 2 deletions(-)
diff --git a/lib/unicore/mktables b/lib/unicore/mktables
index 88d9c036a3..81914f0600 100644
--- a/lib/unicore/mktables
+++ b/lib/unicore/mktables
@@ -5297,8 +5297,31 @@ sub trace { return main::trace(@_); }
my $self = shift;
Carp::carp_extra_args(\@_) if main::DEBUG && @_;
- # Just find a valid code point of the inverse, if any.
- return Range_List->new(Initialize => ~ $self)->get_valid_code_point;
+ # On first pass, don't choose less desirable code points; if no good
+ # one is found, repeat, allowing a less desirable one to be selected.
+ for my $try_hard (0, 1) {
+ my $end = $try_hard ? $MAX_WORKING_CODEPOINT : $MAX_UNICODE_CODEPOINT;
+
+ # Look through all the gaps between ranges for a usable code point.
+ for my $set (reverse $self->ranges) {
+ # Try the edge cases first, starting with the end point of the
+ # gap.
+ my $start = $set->end+1;
+ return $end if $end >= $start && is_code_point_usable($end, $try_hard);
+
+ # End point didn't work. Start at the beginning and try
+ # every one until we find one that does work.
+ for my $trial ($start .. $end - 1) {
+ return $trial if is_code_point_usable($trial, $try_hard);
+ }
+ $end = $set->start-1;
+ }
+ my $start = 0;
+ for my $trial ($start .. $end) {
+ return $trial if is_code_point_usable($trial, $try_hard);
+ }
+ }
+ return (); # If none found, give up.
}
} # end closure for Range_List
--
2.11.0
|
It appears that this past November @khwilliamson applied the patch originally submitted by @tonycoz in 2018. What are the criteria which have to be met in order to close this ticket? Thank you very much. |
On 2/2/20 7:26 PM, James E Keenan wrote:
It appears that this past November @khwilliamson
<https://github.com/khwilliamson> applied the patch originally submitted
by @tonycoz <https://github.com/tonycoz> in 2018.
What are the criteria which have to be met in order to close this ticket?
It takes too long. The real solution is not to run it unless the
underlying data changes. That is something that can be done during code
freeze for the next release.
…
Thank you very much.
Jim Keenan
|
On Mon, 3 Feb 2020 at 06:25, Karl Williamson <notifications@github.com>
wrote:
On 2/2/20 7:26 PM, James E Keenan wrote:
> It appears that this past November @khwilliamson
> <https://github.com/khwilliamson> applied the patch originally
submitted
> by @tonycoz <https://github.com/tonycoz> in 2018.
>
> What are the criteria which have to be met in order to close this ticket?
It takes too long. The real solution is not to run it unless the
underlying data changes. That is something that can be done during code
freeze for the next release.
Yay! I've wanted us to do this for YEARS. But I'm not clear why we need
to wait for the code freeze.
Isn't it just a matter of checking in the generated files?
Yves
|
This step now takes 28 seconds to complete on that machine, which is slower than in the original report. So while that optimization might have improved things, it's still regressing as the Unicode standard grows. |
On 2/2/20 10:33 PM, Yves Orton wrote:
On Mon, 3 Feb 2020 at 06:25, Karl Williamson ***@***.***>
wrote:
> On 2/2/20 7:26 PM, James E Keenan wrote:
> > It appears that this past November @khwilliamson
> > <https://github.com/khwilliamson> applied the patch originally
> submitted
> > by @tonycoz <https://github.com/tonycoz> in 2018.
> >
> > What are the criteria which have to be met in order to close this
ticket?
>
> It takes too long. The real solution is not to run it unless the
> underlying data changes. That is something that can be done during code
> freeze for the next release.
>
> Yay! I've wanted us to do this for YEARS. But I'm not clear why we need
> to wait for the code freeze.
> Isn't it just a matter of checking in the generated files?
If it were that easy, I would have done it a long time ago. It's
somewhat of a big deal to do, and since I am working on user-visible changes
as our deadlines approach, I can't spend time on it now. It would take
a lot longer for someone else to get up to speed on it.
|
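Karl's "don't run it unless the underlying data changes" is, mechanically, a staleness check of the kind make performs with file timestamps. A minimal sketch under hypothetical file names (UnicodeData.txt and uni_keywords.pl merely stand in for the real inputs and outputs; this is not the actual perl build machinery):

```shell
# Rebuild-only-if-stale: compare data and generated-output timestamps.
set -e
dir=$(mktemp -d)
cd "$dir"

touch uni_keywords.pl        # pretend the generated tables already exist
sleep 1
touch UnicodeData.txt        # now the Unicode data file is newer

# [ a -nt b ] is true when file a has a newer modification time than b.
if [ UnicodeData.txt -nt uni_keywords.pl ]; then
  echo "data changed: would run mktables"
else
  echo "tables up to date: would skip mktables"
fi
```

Checking the generated files into the repo, as Yves suggests, would let this decision fall out of ordinary make dependencies instead of an unconditional build step.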
Migrated from rt.perl.org#132800 (status was 'open')
Searchable as RT132800$