Report information
Id: 72784
Status: open
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: nicholas <nick [at] ccl4.org>
Cc: chip <rev.chip [at] gmail.com>
AdminCc:

Operating System: (no value)
PatchStatus: (no value)
Severity: low
Type: meta
Perl Version: (no value)
Fixed In: (no value)

Attachments
0001-add-a-directory-of-tests-to-run-with-large-available.patch
0002-rt-100514-regression-test-for-read-with-a-2Gib-offse.patch



Subject: misuse of I32
Date: Sat, 13 Feb 2010 15:50:17 +0000
To: perlbug [...] perl.org
From: Nicholas Clark <nick [...] ccl4.org>
There's a lot of inappropriate use of the I32 type throughout the core. Likely most should be something else, one of U32, STRLEN, SSize_t, IV or UV. The misuse of I32 causes lots of bugs (panics, SEGVs, silent data errors) if strings go over 2GB. This is a meta-ticket for collating tickets relating to these sorts of bugs.
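To make the failure mode concrete, here is a minimal sketch of what happens when a length above 2GB is stored in a 32-bit signed field. It assumes I32 is a 32-bit two's-complement integer, as on common platforms; `fits_in_i32` and `as_wrapped_i32` are hypothetical helpers written for this illustration, not functions from the Perl core.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a byte length can be stored in a
   32-bit signed field without loss. */
bool fits_in_i32(uint64_t len)
{
    return len <= (uint64_t)INT32_MAX;
}

/* Hypothetical helper: the value a 32-bit signed field would typically
   hold after two's-complement wraparound. The result is computed with
   explicit arithmetic so it does not rely on implementation-defined
   narrowing conversions. */
int64_t as_wrapped_i32(uint64_t len)
{
    uint32_t low = (uint32_t)(len & 0xFFFFFFFFu);   /* keep the low 32 bits */
    return low > (uint32_t)INT32_MAX
        ? (int64_t)low - 4294967296LL               /* top bit set: negative */
        : (int64_t)low;
}
```

Under these assumptions a 3GB length reads back as a negative number and a 4GB length reads back as zero, which is consistent with the panics and silent data errors described above.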
This is the same as 72784.
RT-Send-CC: perl5-porters [...] perl.org
As a general issue for testing this type of bug - we need to allocate significant amounts of memory (2Gb or much more in some cases) to regression test this type of bug, which would be unfriendly in the common build-and-test cases.

Should we define (say) an environment variable to indicate that a large amount of memory can be used, and roughly how much?

eg.
PERL_TEST_MEMORY=2 # we can use up to 2Gb of memory for large memory tests

It might be desirable to skip these tests when performing a parallel test.

Tony
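The gating idea can be sketched as follows. This is only an illustration in C of the proposed semantics (a number of gigabytes the tests may use); the real test drivers are Perl scripts, and both `test_memory_gb` and `may_run_bigmem_test` are hypothetical names, not part of any existing harness.

```c
#include <stdlib.h>

/* Hypothetical helper: interpret a PERL_TEST_MEMORY-style value as a
   whole number of gigabytes. Unset, empty, negative, or malformed
   values all mean "no large-memory tests" (0). */
long test_memory_gb(const char *value)
{
    char *end;
    long gb;

    if (value == NULL || *value == '\0')
        return 0;
    gb = strtol(value, &end, 10);
    if (*end != '\0' || gb < 0)
        return 0;
    return gb;
}

/* Hypothetical helper: nonzero if a test needing `needed_gb` gigabytes
   is allowed to run under the given setting. */
int may_run_bigmem_test(const char *value, long needed_gb)
{
    return test_memory_gb(value) >= needed_gb;
}
```

A test needing 2GB would call `may_run_bigmem_test(getenv("PERL_TEST_MEMORY"), 2)` and skip itself when the result is zero.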
CC: perl5-porters [...] perl.org
Subject: Re: [perl #72784] [META] misuse of I32
Date: Wed, 14 Mar 2012 08:05:48 +0100
To: perlbug-followup [...] perl.org
From: Steffen Mueller <smueller [...] cpan.org>
On 03/14/2012 06:46 AM, Tony Cook via RT wrote:
> As a general issue for testing this type of bug - we need to allocate
> significant amounts of memory (2Gb or much more in some cases) to
> regression test this type of bug, which would be unfriendly in the
> common build-and-test cases.
>
> Should we define (say) an environment variable to indicate that a large
> amount of memory can be used, and roughly how much?
>
> eg.
> PERL_TEST_MEMORY=2 # we can use up to 2Gb of memory for large memory tests
>
> It might be desirable to skip these tests when performing a parallel test.

Or at least run that particular subset of tests in sequence.

--Steffen
CC: perlbug-followup [...] perl.org, perl5-porters [...] perl.org
Subject: Re: [perl #72784] [META] misuse of I32
Date: Wed, 14 Mar 2012 19:36:27 +1100
To: Steffen Mueller <smueller [...] cpan.org>
From: Tony Cook <tony [...] develop-help.com>
On Wed, Mar 14, 2012 at 08:05:48AM +0100, Steffen Mueller wrote:
> On 03/14/2012 06:46 AM, Tony Cook via RT wrote:
> >As a general issue for testing this type of bug - we need to allocate
> >significant amounts of memory (2Gb or much more in some cases) to
> >regression test this type of bug, which would be unfriendly in the
> >common build-and-test cases.
> >
> >Should we define (say) an environment variable to indicate that a large
> >amount of memory can be used, and roughly how much?
> >
> >eg.
> > PERL_TEST_MEMORY=2 # we can use up to 2Gb of memory for large memory tests
> >
> >It might be desirable to skip these tests when performing a parallel test.
>
> Or at least run that particular subset of tests in sequence.

Make a t/bigmem/ directory for large memory tests, perhaps.

If I understand t/harness it should be easy to ensure those are run sequentially.

Tony
CC: perlbug-followup [...] perl.org, perl5-porters [...] perl.org
Subject: Re: [perl #72784] [META] misuse of I32
Date: Sat, 24 Mar 2012 13:56:29 +1100
To: Steffen Mueller <smueller [...] cpan.org>
From: Tony Cook <tony [...] develop-help.com>
On Wed, Mar 14, 2012 at 07:36:27PM +1100, Tony Cook wrote:
> On Wed, Mar 14, 2012 at 08:05:48AM +0100, Steffen Mueller wrote:
> > On 03/14/2012 06:46 AM, Tony Cook via RT wrote:
> > >As a general issue for testing this type of bug - we need to allocate
> > >significant amounts of memory (2Gb or much more in some cases) to
> > >regression test this type of bug, which would be unfriendly in the
> > >common build-and-test cases.
> > >
> > >Should we define (say) an environment variable to indicate that a large
> > >amount of memory can be used, and roughly how much?
> > >
> > >eg.
> > > PERL_TEST_MEMORY=2 # we can use up to 2Gb of memory for large memory tests
> > >
> > >It might be desirable to skip these tests when performing a parallel test.
> >
> > Or at least run that particular subset of tests in sequence.
>
> Make a t/bigmem/ directory for large memory tests, perhaps.
>
> If I understand t/harness it should be easy to ensure those are run
> sequentially.

Attached, two changes:

0001-add-a-directory-of-tests-to-run-with-large-available.patch

modify t/TEST and t/harness to run t/bigmem/*.t if $ENV{PERL_TEST_MEMORY} is true.

0002-rt-100514-regression-test-for-read-with-a-2Gib-offse.patch

regression test for RT #100514

Tony
CC: perlbug-followup [...] perl.org, perl5-porters [...] perl.org
Subject: Re: [perl #72784] [META] misuse of I32
Date: Sat, 24 Mar 2012 13:58:38 +1100
To: Steffen Mueller <smueller [...] cpan.org>
From: Tony Cook <tony [...] develop-help.com>
On Sat, Mar 24, 2012 at 01:56:29PM +1100, Tony Cook wrote:
> Attached, two changes:
>
> 0001-add-a-directory-of-tests-to-run-with-large-available.patch
>
> modify t/TEST and t/harness to run t/bigmem/*.t if
> $ENV{PERL_TEST_MEMORY} is true.
>
> 0002-rt-100514-regression-test-for-read-with-a-2Gib-offse.patch
>
> regression test for RT #100514

Actually attaching them might help.

Tony


RT-Send-CC: perl5-porters [...] perl.org
On Fri Mar 23 19:59:21 2012, tonyc wrote:
> On Sat, Mar 24, 2012 at 01:56:29PM +1100, Tony Cook wrote:
> > Attached, two changes:
> >
> > 0001-add-a-directory-of-tests-to-run-with-large-available.patch
> >
> > modify t/TEST and t/harness to run t/bigmem/*.t if
> > $ENV{PERL_TEST_MEMORY} is true.
> >
> > 0002-rt-100514-regression-test-for-read-with-a-2Gib-offse.patch
> >
> > regression test for RT #100514
>
> Actually attaching them might help.

Thank you. Applied as ff5db609afa5 and 9fda09b68d4.

-- Father Chrysostomos
RT-Send-CC: perl5-porters [...] perl.org
On Sat Feb 13 07:51:05 2010, nicholas wrote:
> There's a lot of inappropriate use of the I32 type throughout the core.
> Likely most should be something else, one of U32, STRLEN, SSize_t, IV or UV.
>
> The misuse of I32 causes lots of bugs (panics, SEGVs, silent data errors) if
> strings go over 2GB.
>
> This is a meta-ticket for collating tickets relating to these sorts of bugs.

What should the maximum string and array lengths be?

In various places in the perl source, we have overflow checks that use different sizes; in other places things just overflow, but at different thresholds.

Take sv_setpvn for example. If the length is greater than IV_MAX it croaks. safesysmalloc croaks if the length is greater than the maximum SSize_t can hold, but only on debugging builds. Malloc takes a size_t as its argument.

So that means 2**63-1 is the longest string supported on 64-bit platforms via sv_setpvn. If you use sv_grow and write to the string buffer directly, you can go beyond that, if the malloc implementation lets you, except under debugging builds, where the limit is still 2**63-1.

On 32-bit platforms, if -Duse64bitint is not used, the situation is the same, but with 2**31-1 for the maximum.

On 32-bit platforms when -Duse64bitint *is* used, sv_setpvn allows values all the way up to 2**32-1, but sv_grow will croak on anything above 2**31-1 under debugging builds. On non-debugging builds both methods (sv_setpvn and sv_grow+direct write to PVX) allow strings up to 2**32-1.

So nothing is consistent.

For all practical purposes, one is limited to 31-bit string lengths on 32-bit platforms, because, unless you really know perl’s internals well, your 3GB string *will* be copied, and you will immediately run out of memory.

I suggest we make 2**(PTRSIZE-1)-1 (*) the maximum string length and make everything consistent with that. That is already the maximum array length.

Regular expressions are currently limited to I32_MAX. Changing that would break the regular expression plugin interface. I don’t think it’s unreasonable to keep it at its current limit. Note I am talking about regular expressions themselves, not the string they match against.

In 6174b39a8 I allowed pos() to record positions up to 2**PTRSIZE-2. I think that was a mistake, but harmless, as pos() values are always truncated to the length of the string.

* I.e., SSize_t_MAX, but we have no such macro currently.

-- Father Chrysostomos
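The arithmetic behind the proposed limit can be sketched as below. It reads PTRSIZE in the formula 2**(PTRSIZE-1)-1 as the pointer width in bits, which appears to be the intent (this reading is an assumption, since Perl's configuration value of that name counts bytes). `max_signed_length` is a hypothetical helper for illustration only.

```c
#include <stdint.h>

/* Hypothetical helper: the largest length representable in a signed
   integer that is `bits` wide, i.e. 2**(bits-1) - 1.
   Valid for 1 <= bits <= 64. */
uint64_t max_signed_length(unsigned bits)
{
    /* Build an all-ones value of the requested width, then drop the
       bit that a signed type would use for the sign. */
    uint64_t all_ones = (bits >= 64) ? UINT64_MAX
                                     : ((UINT64_C(1) << bits) - 1);
    return all_ones >> 1;
}
```

For 32-bit pointers this gives 2**31-1 and for 64-bit pointers 2**63-1, matching the thresholds quoted in the message.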
CC: perl5-porters [...] perl.org
Subject: Re: [perl #72784] [META] misuse of I32
Date: Sun, 28 Jul 2013 13:08:00 +1000
To: Father Chrysostomos via RT <perlbug-followup [...] perl.org>
From: Tony Cook <tony [...] develop-help.com>
On Sat, Jul 27, 2013 at 07:33:28PM -0700, Father Chrysostomos via RT wrote:
> On Sat Feb 13 07:51:05 2010, nicholas wrote:
> > There's a lot of inappropriate use of the I32 type throughout the core.
> > Likely most should be something else, one of U32, STRLEN, SSize_t, IV or UV.
> >
> > The misuse of I32 causes lots of bugs (panics, SEGVs, silent data errors) if
> > strings go over 2GB.
> >
> > This is a meta-ticket for collating tickets relating to these sorts of bugs.
>
> What should the maximum string and array lengths be?

I think it should be Ssize_t_max.

Or more C standardly - ptrdiff_t-max.

It might be possible in theory to support longer strings on a 32-bit platform, but with all the bits and pieces that get allocated from the address space* I don't think it's likely that there's an over 2G hole to allocate such an object in.

Supporting larger objects on 32-bit platforms could complicate the code for very little benefit.

Tony

* stack, kernel space, mapped libraries, etc
CC: nick [...] ccl4.org
Subject: Re: [perl #72784] [META] misuse of I32
Date: Mon, 29 Jul 2013 22:18:47 -0700
To: perlbug-followup [...] perl.org
From: Chip Salzenberg <chip [...] pobox.com>
On 7/27/2013 7:33 PM, Father Chrysostomos via RT wrote:
> What should the maximum string and array lengths be?
Why would the answer ever be anything other than size_t?
CC: Perl5 Porteros <perl5-porters [...] perl.org>
Subject: Re: [perl #72784] [META] misuse of I32
Date: Tue, 30 Jul 2013 13:26:41 +0200
To: reneeb via RT <perlbug-followup [...] perl.org>
From: demerphq <demerphq [...] gmail.com>
On 28 July 2013 04:33, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote:
> Regular expressions are currently limited to I32_MAX. Changing that
> would break the regular expression plugin interface.

IMO this is not a sound argument for rejecting a change. There are very few consumers of that interface, and it has never really been "blessed" as stable. This has come up a few times and I am firmly of the opinion that if changes for the better of perl are needed then the regex engine plug in interface should not be an obstacle.

> I don’t think it’s
> unreasonable to keep it at its current limit.

Maybe, maybe not. But the plug in interface shouldn't be part of the justification for your position at all.

Cheers, and thanks for all your work on this stuff!

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"
RT-Send-CC: perl5-porters [...] perl.org
On Mon Jul 29 22:19:43 2013, chip wrote:
> On 7/27/2013 7:33 PM, Father Chrysostomos via RT wrote:
> > What should the maximum string and array lengths be?
>
> Why would the answer ever be anything other than size_t?

Because AvFILLp needs to be set to -1 when the array is empty. We *could* use size_t but reserve ~0 as a special marker, but that complicates the code for little gain.

-- Father Chrysostomos
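The two options being weighed here can be compared in a small sketch. It models only the idea of a "highest valid index" field that is -1 (signed) or all-ones (unsigned) when the array is empty; the function names and the `FILL_EMPTY` macro are hypothetical and do not correspond to the real AV implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Signed variant: `fill` is the highest valid index, -1 when empty.
   A plain range comparison handles the empty case for free. */
bool in_range_signed(ptrdiff_t fill, ptrdiff_t i)
{
    return i >= 0 && i <= fill;
}

/* Unsigned variant: the all-ones value is reserved to mean "empty",
   so every range check needs an extra comparison against the marker. */
#define FILL_EMPTY (~(size_t)0)

bool in_range_unsigned(size_t fill, size_t i)
{
    return fill != FILL_EMPTY && i <= fill;
}

/* Element counts: the unsigned form relies on wraparound,
   since FILL_EMPTY + 1 == 0 in size_t arithmetic. */
size_t count_signed(ptrdiff_t fill)  { return (size_t)(fill + 1); }
size_t count_unsigned(size_t fill)   { return fill + 1; }
```

The extra marker test in `in_range_unsigned` is the kind of complication the message refers to; in exchange the unsigned form could address roughly twice as many elements.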
RT-Send-CC: perl5-porters [...] perl.org
On Tue Jul 30 06:33:28 2013, sprout wrote:
> On Mon Jul 29 22:19:43 2013, chip wrote:
> > On 7/27/2013 7:33 PM, Father Chrysostomos via RT wrote:
> > > What should the maximum string and array lengths be?
> >
> > Why would the answer ever be anything other than size_t?
>
> Because AvFILLp needs to be set to -1 when the array is empty. We
> *could* use size_t but reserve ~0 as a special marker, but that
> complicates the code for little gain.

Also, the HV APIs use negative lengths for utf8.

-- Father Chrysostomos
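For readers unfamiliar with that convention: in the hv_fetch family the key length is a signed 32-bit value whose sign doubles as a UTF-8 flag and whose absolute value is the byte length. A minimal sketch of decoding such a parameter follows; `struct key_len` and `decode_klen` are hypothetical names for illustration, not part of the Perl API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of a signed key-length parameter. */
struct key_len {
    uint32_t len;     /* byte length of the key */
    bool     is_utf8; /* true if the caller passed a negative length */
};

struct key_len decode_klen(int32_t klen)
{
    struct key_len out;

    out.is_utf8 = klen < 0;
    /* Negate in 64-bit arithmetic so INT32_MIN does not overflow. */
    out.len = (uint32_t)(klen < 0 ? -(int64_t)klen : (int64_t)klen);
    return out;
}
```

Because the sign bit carries the flag, such a parameter cannot simply be widened to an unsigned type without changing how the flag is passed.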
CC: Perl5 Porteros <perl5-porters [...] perl.org>
Subject: Re: [perl #72784] [META] misuse of I32
Date: Wed, 31 Jul 2013 02:34:45 +0200
To: reneeb via RT <perlbug-followup [...] perl.org>
From: demerphq <demerphq [...] gmail.com>
On 30 July 2013 17:20, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote:
> On Tue Jul 30 06:33:28 2013, sprout wrote:
>> On Mon Jul 29 22:19:43 2013, chip wrote:
>> > On 7/27/2013 7:33 PM, Father Chrysostomos via RT wrote:
>> > > What should the maximum string and array lengths be?
>> >
>> > Why would the answer ever be anything other than size_t?
>>
>> Because AvFILLp needs to be set to -1 when the array is empty. We
>> *could* use size_t but reserve ~0 as a special marker, but that
>> complicates the code for little gain.
>
> Also, the HV APIs use negative lengths for utf8.

Yes, but that API is insane. Utterly insane.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"
RT-Send-CC: perl5-porters [...] perl.org
On Sat Jul 27 20:08:37 2013, tonyc wrote:
> I think it should be Ssize_t_max.

Is this a correct way to define it, or am I doing something non-portable here?

#define SSize_t_MAX (SSize_t)(~(size_t)0 >> 1)

-- Father Chrysostomos
CC: nick [...] ccl4.org
Subject: Re: [perl #72784] [META] misuse of I32
Date: Wed, 31 Jul 2013 16:39:29 -0700
To: perlbug-followup [...] perl.org
From: Chip Salzenberg <chip [...] pobox.com>
On 7/30/2013 6:33 AM, Father Chrysostomos via RT wrote:
> On Mon Jul 29 22:19:43 2013, chip wrote:
>> On 7/27/2013 7:33 PM, Father Chrysostomos via RT wrote:
>>> What should the maximum string and array lengths be?
>> Why would the answer ever be anything other than size_t?
> Because AvFILLp needs to be set to -1 when the array is empty. We
> *could* use size_t but reserve ~0 as a special marker, but that
> complicates the code for little gain.

Your emphasis is incorrect. Rather, size_t adds very little complication for significant systemic gain: Correctness and consistency.
CC: nick [...] ccl4.org
Subject: Re: [perl #72784] [META] misuse of I32
Date: Wed, 31 Jul 2013 16:40:57 -0700
To: perlbug-followup [...] perl.org
From: Chip Salzenberg <chip [...] pobox.com>
On 7/30/2013 8:20 AM, Father Chrysostomos via RT wrote:
> On Tue Jul 30 06:33:28 2013, sprout wrote:
>> On Mon Jul 29 22:19:43 2013, chip wrote:
>>> On 7/27/2013 7:33 PM, Father Chrysostomos via RT wrote:
>>>> What should the maximum string and array lengths be?
>>> Why would the answer ever be anything other than size_t?
>> Because AvFILLp needs to be set to -1 when the array is empty. We
>> *could* use size_t but reserve ~0 as a special marker, but that
>> complicates the code for little gain.
> Also, the HV APIs use negative lengths for utf8.

Those specific APIs can continue to use signed integers if you prefer; but the signedness need not escape those specific parameter lists if so.

