fstat st_size overflow #810

Closed

p5pRT opened this issue Nov 3, 1999 · 90 comments

p5pRT commented Nov 3, 1999

Migrated from rt.perl.org#1738 (status was 'resolved')

Searchable as RT1738$

p5pRT commented Nov 3, 1999

From chris@gouda.netmonger.net

$ ls -l log.dat
-rw-rw-r-- 1 chris web 2200944678 Nov 2 18​:16 log.dat
$ perl -le 'print -s "log.dat"'
-2094022618

I'm afraid I don't have enough background to know what the correct
solution is. On FreeBSD, st_size is an off_t, which is a "long long",
64-bit signed int. I don't know what a negative file size could be,
but I guess it's probably just an unintended consequence of using
the same type as lseek.
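
For reference, the negative value is simply the 64-bit size wrapped into
a signed 32-bit range; the arithmetic can be sketched with a couple of
one-liners (illustrative only, not part of the original report):

  # the reported size, wrapped into a signed 32-bit range
  $ perl -le 'print 2200944678 - 2**32'
  -2094022618
  # ...and back again
  $ perl -le 'print -2094022618 + 2**32'
  2200944678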

In any case, I have some code that needs to know the right size in
order to do a look-style binary search. Being somewhat unfamiliar
with Perl guts, I just did this​:

--- pp_sys.c 1999/05/02 14:18:18 1.1.1.2
+++ pp_sys.c 1999/11/03 20:53:50
@@ -2220,7 +2220,7 @@
#else
  PUSHs(sv_2mortal(newSVpv("", 0)));
#endif
- PUSHs(sv_2mortal(newSViv((I32)PL_statcache.st_size)));
+ PUSHs(sv_2mortal(newSVnv((U32)PL_statcache.st_size)));
#ifdef BIG_TIME
  PUSHs(sv_2mortal(newSVnv((U32)PL_statcache.st_atime)));
  PUSHs(sv_2mortal(newSVnv((U32)PL_statcache.st_mtime)));
@@ -2349,7 +2349,7 @@
  djSP; dTARGET;
  if (result < 0)
  RETPUSHUNDEF;
- PUSHi(PL_statcache.st_size);
+ PUSHn(PL_statcache.st_size);
  RETURN;
}

I'm not sure if that's right even as a bandaid, and it will obviously
fail again after another 2GB. Further, I suspect this issue is
already well-known, but in the hopes that I won't have to maintain
local modifications to Perl in order to keep this program working, I
am sending this report.

Perl Info


Site configuration information for perl 5.00503:

Configured by markm at $Date$.

Summary of my perl5 (5.0 patchlevel 5 subversion 3) configuration:
  Platform:
    osname=freebsd, osvers=4.0-current, archname=i386-freebsd
    uname='freebsd freefall.freebsd.org 4.0-current freebsd 4.0-current #0: $Date$'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef useperlio=undef d_sfio=undef
  Compiler:
    cc='cc', optimize='undef', gccversion=egcs-2.91.66 19990314 (egcs-1.1.2 release)
    cppflags=''
    ccflags =''
    stdchar='char', d_stdstdio=undef, usevfork=true
    intsize=4, longsize=4, ptrsize=4, doublesize=8
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags ='-Wl,-E'
    libpth=/usr/lib
    libs=-lm -lc -lcrypt
    libc=/usr/lib/libc.so, so=so, useshrplib=true, libperl=libperl.so.3
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-DPIC -fpic', lddlflags='-shared'

Locally applied patches:
    


@INC for perl 5.00503:
    /usr/libdata/perl/5.00503/mach
    /usr/libdata/perl/5.00503
    /usr/local/lib/perl5/site_perl/5.005/i386-freebsd
    /usr/local/lib/perl5/site_perl/5.005
    .


Environment for perl 5.00503:
    HOME=/home/chris
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/usr/local/bin:/home/chris/bin
    PERL_BADLANG (unset)
    SHELL=/usr/local/bin/zsh

p5pRT commented Nov 3, 1999

From @jhi

Christopher Masto writes​:

I'm afraid I don't have enough background to know what the correct
solution is. On FreeBSD, st_size is an off_t, which is a "long long",
64-bit signed int. I don't know what a negative file size could be,
but I guess it's probably just an unintended consequence of using
the same type as lseek.

The next major release of Perl, called 5.6, will be able to handle
large files (>2GB) (and >4GB). Currently it seems that one must
Configure and build perl separately for that, though--largefile
awareness won't be on by default because of backward compatibility.

If you want and you have the time you can try the latest developer release​:

  http​://www.cpan.org/src/5.0/perl5.005_62.tar.gz

Do "./Configure -Duselargefiles -de;make all test" BUT DO NOT INSTALL into
production use because this *is* a developer release. You can try
your test with the resulting 'perl' executable, though.
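
A quick sanity check afterwards is to re-run the failing test from the
report against the freshly built binary (a sketch; the Config variable
name is assumed to match the -Duselargefiles switch above):

  $ ./perl -V:uselargefiles          # expect: uselargefiles='define';
  $ ./perl -le 'print -s "log.dat"'  # expect: 2200944678, not a negative number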

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 4, 1999

From [Unknown Contact. See original ticket]

On Thu, Nov 04, 1999 at 02​:42​:20AM +0200, Jarkko Hietaniemi wrote​:

If you want and you have the time you can try the latest developer release​:

http://www.cpan.org/src/5.0/perl5.005_62.tar.gz

Do "./Configure -Duselargefiles -de;make all test" BUT DO NOT INSTALL into
production use because this *is* a developer release. You can try
your test with the resulting 'perl' executable, though.

Just for the record, 5.005_62 does seem to work properly with
-Duselargefiles on FreeBSD-current. Yay.
--
Christopher Masto Senior Network Monkey NetMonger Communications
chris@​netmonger.net info@​netmonger.net http​://www.netmonger.net

Free yourself, free your machine, free the daemon -- http​://www.freebsd.org/

p5pRT commented Nov 6, 1999

From @jhi

Christopher Masto writes​:

Just for the record, 5.005_62 does seem to work properly with
-Duselargefiles on FreeBSD-current. Yay.

Yay, indeed. So I have made some progress. Thanks for testing.

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

Jarkko Hietaniemi <jhi@​iki.fi> wrote

The next major release of Perl, called 5.6, will be able to handle
large files (>2GB) (and >4GB). Currently it seems that one must
Configure and build perl separately for that, though--largefile
awareness won't be on by default because of backward compatibility.

Can you expand on this? What's the compatibility problem?
Isn't large file support a Good Thing, for those platforms that
support it?

Or is it just a question of untested / experimental code?
In which case, why not turn it on by default in _6x, so it gets tested?
And take the decision on stability just before the 5.6 release.

Mike Guy

p5pRT commented Nov 9, 1999

From @jhi

M.J.T. Guy writes​:

Jarkko Hietaniemi <jhi@​iki.fi> wrote

The next major release of Perl, called 5.6, will be able to handle
large files (>2GB) (and >4GB). Currently it seems that one must
Configure and build perl separately for that, though--largefile
awareness won't be on by default because of backward compatibility.

Can you expand on this? What's the compatibility problem?

Wanting large files means that you also want 64 bits. Quadness can
mean nasty incompatibility problems​: there are pure 32 bit
applications, 32/64 applications, and 64 bit applications, which may
or may not be binary compatible. Depending on which libraries
(which libc using which bitness) you linked with you may have
limitations on which other programs/libraries you can use.

Isn't large file support a Good Thing, for those platforms that support it?

Yes, it is.

Or is it just a question of untested / experimental code?
In which case, why not turn it on by default in _6x, so it gets tested?

That's a possibility.

And take the decision on stability just before the 5.6 release.

Mike Guy

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 9, 1999

From @AlanBurlison

Jarkko Hietaniemi wrote​:

Wanting large files means that you also want 64 bits. Quadness can
mean nasty incompatibility problems​: there are pure 32 bit
applications, 32/64 applications, and 64 bit applications, which may
or may not be binary compatible. Depending on which libraries
(which libc using which bitness) you linked with you may have
limitations on which other programs/libraries you can use.

I'm not sure that is entirely correct. On Solaris it's perfectly
acceptable to have a 32 bit application that supports large files (64
bit file offsets) - the two things are not the same. In fact all but a
handful of commands on 64 bit Solaris are 32 bit applications. There's
a good reason for this - 64 bit apps take a performance hit, and unless
you really need a 64 bit address space 32 bit apps are a better choice.
I think it will be important in the next version of perl to
distinguish between 64 bit integer (largefile) support and 64 bit
address space (pointer) support. For my particular environment I'd want
64 bit integers (large files) but can't really see that I'd ever want a
64-bit address space.
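
One way to see that distinction from the Perl level is to compare the size
Configure chose for IVs with the size it chose for file offsets; on a
32-bit binary built with largefile support the two differ (a sketch,
assuming a perl new enough to record both probes in Config):

  use Config;
  # e.g. "ivsize=4 lseeksize=8" on a 32-bit build with 64-bit file offsets
  print "ivsize=$Config{ivsize} lseeksize=$Config{lseeksize}\n";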

Alan Burlison

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

I'm not sure that is entirely correct. On Solaris it's perfectly
acceptable to have a 32 bit application that supports large files (64
bit file offsets) - the two things are not the same.

What is a "32 bit application"? Do you mean a program
compiled where all these are true​:

  sizeof(int) == 4
  sizeof(int *) == 4
  sizeof(off_t) == 4

And what's a "64 bit application"? One in which all of those are 8?

Or is there some compiler or O/S flag that makes them different? Is it
a program linked with a different library, or an a.out with a different
magic number?

As you essentially pointed out, one should be able to have a 4-byte int
and a larger off_t without conflict.

It's so easy to make

  sizeof(int) ==
  sizeof(long) ==
  sizeof(int *) ==
  sizeof(char *) ==
  sizeof(off_t) == 4

But "All the World's a VAX" really has to go away -- some day. :-)

--tom

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

For my particular environment I'd want
64 bit integers (large files) but can't really see that I'd ever want a
64-bit address space.

Me too. In fact, in PowerMAX OS, that is the only combo you can get. We
support "long long" for 64 bit integers, and you can use them in 64 bit file
operations, but there is no 64 bit address space and no 64 bit pointers
(not right now, anyway).
--
Tom.Horsley@​mail.ccur.com \\\\ Will no one rid me of
Concurrent Computers, Ft. Lauderdale, FL \\\\ this troublesome
Me​: http​://home.att.net/~Tom.Horsley/ \\\\ autoconf?
Project Vote Smart​: http​://www.vote-smart.org \\\\ !!!!!

p5pRT commented Nov 9, 1999

From @jhi

Alan Burlison writes​:

Jarkko Hietaniemi wrote​:

Wanting large files means that you also want 64 bits. Quadness can
mean nasty incompatibility problems​: there are pure 32 bit
applications, 32/64 applications, and 64 bit applications, which may
or may not be binary compatible. Depending on which libraries
(which libc using which bitness) you linked with you may have
limitations on which other programs/libraries you can use.

I'm not sure that is entirely correct. On Solaris it's perfectly
acceptable to have a 32 bit application that supports large files (64
bit file offsets) - the two things are not the same. In fact all but a
handful of commands on 64 bit Solaris are 32 bit applications. There's
a good reason for this - 64 bit apps take a performance hit, and unless
you really need a 64 bit address space 32 bit apps are a better choice.
I think it will be important in the next version of perl to
distinguish between 64 bit integer (largefile) support and 64 bit
address space (pointer) support. For my particular environment I'd want
64 bit integers (large files) but can't really see that I'd ever want a
64-bit address space.

What -Duselargefiles now does is that it implicitly switches on -Duse64bits,
too, which in turn means that IVs and UVs become 64 bits wide.
If that turns on also 64 bit pointers, well, that's not *my* fault :-)

Alan Burlison

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

At 12​:14 PM 11/9/99 -0700, Tom Christiansen wrote​:

I'm not sure that is entirely correct. On Solaris it's perfectly
acceptable to have a 32 bit application that supports large files (64
bit file offsets) - the two things are not the same.

What is a "32 bit application"? Do you mean a program
compiled where all these are true​:

sizeof(int) == 4
sizeof(int *) == 4
sizeof(off_t) == 4

Probably.

And what's a "64 bit application"? One in which all of those are 8?

Depends. If any one of the above is 8, the marketroids call it 64-bit. In
our case it's sizeof(int), since you can get 64-bit integers on 32-bit
machines if gcc supports it.

As Alan's pointed out, 64-bit ints are more useful, generally speaking,
than 64-bit pointers, though at some point someone'll probably complain
about perl breaking when it tries to allocate more than 4G of memory...

Or is there some compiler or O/S flag that makes them different? Is it
a program linked with a different library, or an a.out with a different
magic number?

Depends on the platform. Some, like the alphas, go full-blown 64-bit, both
pointers and integers. Intel machines, OTOH, get 64-bit integers but 32-bit
pointers and gcc does some magic.

The big problem from the perl level is programs with use integer in effect
that count on particular overflow effects, or ones that assume left shifts
fall off the end of the world at bit 31. (Arguably broken, but that
argument probably won't buy us much with the folks whose code breaks)
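
A small sketch of the sort of code that notices the change (assuming a
build where IVs grow from 32 to 64 bits):

  use integer;
  # With 32-bit IVs the high bit is the sign bit, so this prints -2147483648;
  # with 64-bit IVs the same expression prints 2147483648.
  print 1 << 31;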

But "All the World's a VAX" really has to go away -- some day. :-)

And it has. Well, at least for us VMS folks... :-)

  Dan

----------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
dan@​sidhe.org have teddy bears and even
  teddy bears get drunk

p5pRT commented Nov 9, 1999

From @jhi

As you essentially pointed out, one should be able to have a 4-byte int
and a larger off_t without conflict.

In C, yes. In Perl, we have IVs and UVs, tucked into SVs, and unless
we do some serious magic, those are the datatypes that must accommodate
both the 4-byte int and the 32+ bit off_t.

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 9, 1999

From @jhi

The big problem from the perl level is programs with use integer in effect
that count on particular overflow effects, or ones that assume left shifts
fall off the end of the world at bit 31. (Arguably broken, but that
argument probably won't buy us much with the folks whose code breaks)

And just don't get me started on the incestuous NV <=> IV stuffing :-)
(a.k.a. op/misc #4)

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 9, 1999

From @jhi

I'm not sure that is entirely correct. On Solaris it's perfectly
acceptable to have a 32 bit application that supports large files (64
bit file offsets) - the two things are not the same. In fact all but a
handful of commands on 64 bit Solaris are 32 bit applications. There's

The 64-bit box I am mainly worried about is IRIX; they have all three
combinations I described. I'll ping Scott Henry about how serious
the incompatibilities would be, if any.

For example DEC^WDigital^WTru64 has no problems because everything is
64 bits (well, IVs and off_ts, has always been), pointers too, nominally
(physically they are 43 bits, IIRC).

How about the places where gcc emulates long longs with software, like ix86?
Will turning on 64 bitness be a performance hit?

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

Jarkko Hietaniemi <jhi@​iki.fi> writes​:

As you essentially pointed out, one should be able to have a 4-byte int
and a larger off_t without conflict.

In C, yes. In Perl, we have IVs and UVs, tucked into SVs, and unless
we do some serious magic,
  IV,
  UV,
+
  OV => off_t

There would be some merit in having 'long long' "available" but not the
default - at least on the large number of existing 32-bit machines
having IV = long long would be a performance hit, but moving to 64-bit
when required would be useful. Probably more pain than it merits though ...

those are the datatypes that must accommodate
both the 4-byte int and the 32+ bit off_t.
--
Nick Ing-Simmons

p5pRT commented Nov 9, 1999

From @jhi

Nick Ing-Simmons writes​:

Jarkko Hietaniemi <jhi@​iki.fi> writes​:

As you essentially pointed out, one should be able to have a 4-byte int
and a larger off_t without conflict.

In C, yes. In Perl, we have IVs and UVs, tucked into SVs, and unless
we do some serious magic,
IV,
UV,
+
OV => off_t

There would be some merit in having 'long long' "available" but not the
default -

You mean like, say, "use quad;"? Yuckety. I'm not saying that would
necessarily be the interface, mind. I thought Perl would be above the
mess C's non-standardized integer sizes have driven us into (witness the
"long long" itself...)

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

At 10​:06 PM 11/9/99 +0200, Jarkko Hietaniemi wrote​:

For example DEC^WDigital^WTru64 has no problems because everything is
64 bits (well, IVs and off_ts, has always been), pointers too, nominally
(physically they are 43 bits, IIRC).

Well, you get 43 pins out of most Alpha CPUs, but the pointers are a full
64 bits. A quick calc puts that at around 170 cubic feet of RAM, so it's
not *too* likely that anyone'd hit that limit for a while...

How about the places where gcc emulates long longs with software, like ix86?
Will turning on 64 bitness be a performance hit?

It'd pretty much have to, I expect. The math'd all be done in software
instead of in hardware. That'll probably hurt a bunch.

  Dan

----------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
dan@​sidhe.org have teddy bears and even
  teddy bears get drunk

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

At 10​:31 PM 11/9/99 +0200, Jarkko Hietaniemi wrote​:

I thought Perl would be above the
mess C's non-standardized integer sizes have driven us into (witness the
"long long" itself...)

Now, now, they are standardized. An int's guaranteed to be at least as big
as a char, and no bigger than a long. The standard's not necessarily
*useful*, mind, but that's a separate problem...

  Dan

----------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
dan@​sidhe.org have teddy bears and even
  teddy bears get drunk

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

From​: Jarkko Hietaniemi <jhi@​iki.fi>
Nick Ing-Simmons writes​:

Jarkko Hietaniemi <jhi@​iki.fi> writes​:

As you essentially pointed out, one should be able to have a 4-byte
int
and a larger off_t without conflict.

In C, yes. In Perl, we have IVs and UVs, tucked into SVs, and unless
we do some serious magic,
IV,
UV,
+
OV => off_t

There would be some merit in having 'long long' "available" but not the
default -

You mean like, say, "use quad;"? Yuckety. I'm not saying that would
necessarily be the interface, mind. I thought Perl would be above the
mess C's non-standardized integer sizes have driven us into (witness the
"long long" itself...)

While 64-bit on demand (or perhaps even autoresizing (SV_tBIGINT, anyone?)
needed to prevent overflow) would be way cool, there are a few problems.
Like, for instance, how do you print a gcc-ish "long long" on a system with
no native 64-bit support (ie, no %q or %Ld or whatever)?

-- BKS

______________________________________________________
Get Your Private, Free Email at http​://www.hotmail.com

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

"Benjamin Stuhl" <sho_pi@​hotmail.com> writes​:

While 64-bit on demand (or perhaps even autoresizing (SV_tBIGINT, anyone?)
needed to prevent overflow) would be way cool, there are a few problems.
Like, for instance, how do you print a gcc-ish "long long" on a system with
no native 64-bit support (ie, no %q or %Ld or whatever)?

The autopromote could promote to arbitrary precision arithmetic when
necessary. On capable systems, it could use longlong for bits greater
than 32 and less than 65, then arbitrary precision higher. Or it
could jump straight to arbitrary precision at 33+ bits for systems
that are 32bit only.

Math​::BigInt isn't up to the task since it's written in Perl only.
We'd need a fast set of C routines. I did a quick port of GNU's GMP
library as Math​::GMP but I suspect the Artistic License and LGPL might
conflict if we had to distribute GMP with Perl (not to mention size
issues and other messiness).

Perhaps we could implement our own bigints in C and allow auto-
promotion? There would be performance hits but perhaps it's arguable
that end users not worrying about integer size is worth the
performance hit?

Chip

--
Chip Turner chip@​ZFx.com
  Programmer, ZFx, Inc. www.zfx.com
  PGP key available at wwwkeys.us.pgp.net

p5pRT commented Nov 9, 1999

From @jhi

Just to clarify my position​: I've nothing against large files.
I've nothing against 64 bits. In fact, I *looooooove* 64 bits.
Why do you think I would've been hacking for 64 bit support for
the last months if I absolutely hated 64 bits? :-) I'm just being
conservative. What goes BANG! if we turn on 64-bitness? What doesn't
go bang but gets slower? If the collective we think that the benefits
outweigh the costs, we can turn on largefileness and 64-bitness,
great, finally!

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 9, 1999

From @jhi

Math​::BigInt isn't up to the task since it's written in Perl only.
We'd need a fast set of C routines. I did a quick port of GNU's GMP
library as Math​::GMP but I suspect the Artistic License and LGPL might
conflict if we had to distribute GMP with Perl (not to mention

Correct.

size issues and other messiness).

That, too.

Perhaps we could implement our own bigints in C and allow auto-
promotion? There would be performance hits but perhaps it's arguable

Something like that is in the (very) long term plan, but I think
restructuring of that magnitude for 5.6 isn't realistic.

that end users not worrying about integer size is worth the
performance hit?

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

At 03​:48 PM 11/9/99 -0500, Chip Turner wrote​:

Perhaps we could implement our own bigints in C and allow auto-
promotion? There would be performance hits but perhaps it's arguable
that end users not worrying about integer size is worth the
performance hit?

Autopromotion does hurt, though. You need to check every math operation to
see if you overflowed (or will overflow) and need to promote. Not
necessarily a bad thing in general, but it does cost.
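
A rough sketch of what such a per-operation check might look like if it
were written at the Perl level (hypothetical helper; Math::BigInt stands
in for whatever fast bigint code would actually be used):

  use Math::BigInt;

  # 2**53 is the last point at which an NV still holds every integer exactly,
  # assuming the usual 64-bit IEEE doubles.
  my $MAX_EXACT = 2**53;

  sub add_promoting {
      my ($x, $y) = @_;
      my $sum = $x + $y;
      return $sum if abs($sum) < $MAX_EXACT;   # the common, cheap case
      # Promote once plain addition can no longer be trusted; having to make
      # this decision on every operation is the cost being discussed.
      return Math::BigInt->new($x)->badd($y);
  }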

  Dan

----------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
dan@​sidhe.org have teddy bears and even
  teddy bears get drunk

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

Autopromotion does hurt, though. You need to check every math operation to
see if you overflowed (or will overflow) and need to promote. Not
necessarily a bad thing in general, but it does cost.

We already have lexical "use integer". Perhaps we could
have a lexical "use bigint", assuming that we could get an
implementation that were fast enough.

--tom

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

At 02​:04 PM 11/9/99 -0700, Tom Christiansen wrote​:

Autopromotion does hurt, though. You need to check every math operation to
see if you overflowed (or will overflow) and need to promote. Not
necessarily a bad thing in general, but it does cost.

We already have lexical "use integer". Perhaps we could
have a lexical "use bigint", assuming that we could get an
implementation that were fast enough.

I was thinking of that. We could, I suppose, have a second set of 'bigint'
math opcodes so the normal ones wouldn't be slowed down any, and add
another flag to the SV structure. (What the heck, we don't have enough in
there anyway... :)

  Dan

----------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
dan@​sidhe.org have teddy bears and even
  teddy bears get drunk

p5pRT commented Nov 9, 1999

From @gsar

On Tue, 09 Nov 1999 21​:58​:51 +0200, Jarkko Hietaniemi wrote​:

As you essentially pointed out, one should be able to have a 4-byte int
and a larger off_t without conflict.

In C, yes. In Perl, we have IVs and UVs, tucked into SVs, and unless
we do some serious magic, those are the datatypes that must accommodate
both the 4-byte int and the 32+ bit off_t.

Umm, according to my records, the following change (in 5.005_57) added
large file support for Solaris under the 32-bit universe​:

  [ 3311] By​: gsar on 1999/05/06 05​:37​:55
  Log​: From​: Damon Atkins <n107844@​sysmgtdev.nabaus.com.au>
  Date​: Tue, 30 Mar 1999 11​:26​:11 +1000 (EST)
  Message-Id​: <199903300126.LAA20870@​sysmgtdev.nabaus.com.au>
  Subject​: Largefiles for Solaris
  Branch​: perl
  ! hints/solaris_2.sh

I think it makes tons of sense to keep that support if people are not
explicitly asking for 64-bit everything.

Sarathy
gsar@​ActiveState.com

p5pRT commented Nov 9, 1999

From @gsar

On Tue, 09 Nov 1999 22​:45​:45 +0200, Jarkko Hietaniemi wrote​:

Just to clarify my position​: I've nothing against large files.
I've nothing against 64 bits. In fact, I *looooooove* 64 bits.
Why do you think I would've been hacking for 64 bit support for
the last months if I absolutely hated 64 bits? :-) I'm just being
conservative. What goes BANG! if we turn on 64-bitness? What doesn't
go bang but gets slower? If the collective we think that the benefits
outweigh the costs, we can turn on largefileness and 64-bitness,
great, finally!

We really can't decide until we have some numbers on how good/bad it gets,
but my gut feeling is that we would enable largefiles-without-64-bits by
default where that's possible (as was the case on Solaris in 5.005_57).

Sarathy
gsar@​ActiveState.com

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

Dan Sugalski writes​:

How about the places where gcc emulates long longs with software, like ix86?
Will turning on 64 bitness be a performance hit?

It'd pretty much have to, I expect. The math'd all be done in software
instead of in hardware. That'll probably hurt a bunch.

Anyone ready for a bench?

Ilya

p5pRT commented Nov 9, 1999

From @jhi

Gurusamy Sarathy writes​:

On Tue, 09 Nov 1999 21​:58​:51 +0200, Jarkko Hietaniemi wrote​:

As you essentially pointed out, one should be able to have a 4-byte int
and a larger off_t without conflict.

In C, yes. In Perl, we have IVs and UVs, tucked into SVs, and unless
we do some serious magic, those are the datatypes that must accommodate
both the 4-byte int and the 32+ bit off_t.

Umm, according to my records, the following change (in 5.005_57) added
large file support for Solaris under the 32-bit universe​:

  [ 3311] By: gsar on 1999/05/06 05:37:55
  Log: From: Damon Atkins <n107844@sysmgtdev.nabaus.com.au>
  Date: Tue, 30 Mar 1999 11:26:11 +1000 (EST)
  Message-Id: <199903300126.LAA20870@sysmgtdev.nabaus.com.au>
  Subject: Largefiles for Solaris
  Branch: perl
  ! hints/solaris_2.sh

I think it makes tons of sense to keep that support if people are not
explicitly asking for 64-bit everything.

According to my records, the most important of which is the current
Solaris hints file, that change was later removed as I rewrote large
parts of the 64-bit support, including the hints. I guess I'm just
too paranoid​: I'm afraid the doubles won't preserve all the bits of
file offsets.

That patch *always* turned on largefileness if available. If that's
what we want, okay, let's.

Sarathy
gsar@​ActiveState.com

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

At 04​:48 PM 11/9/99 -0500, Ilya Zakharevich wrote​:

Dan Sugalski writes​:

How about the places where gcc emulates long longs with software, like
ix86?
Will turning on 64 bitness be a performance hit?

It'd pretty much have to, I expect. The math'd all be done in software
instead of in hardware. That'll probably hurt a bunch.

Anyone ready for a bench?

The results would certainly be interesting. Anyone know what version of gcc
for Intel Linux brought out the 64-bit integer stuff? (I did one for _61 or
_60, IIRC. Whichever had the thread-speedup patch. But it was for VMS on
Alphas, and we have native 64-bit integers, so the numbers aren't really
applicable here)

  Dan

----------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
dan@​sidhe.org have teddy bears and even
  teddy bears get drunk

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

Jarkko Hietaniemi writes​:

For a start the size of the mantissa and exponent parts of a
double differ between platforms, so the precision can't be generally
stated with any certainty.

So what? Same happens for numerics, and nobody complained yet.

Losing precision is the nature of floating point numbers.
Losing precision is the death of file offsets.

Losing precision will not happen for many years to come, and if done
correctly, will lead to a fatal error.

Ilya

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

On Tue, Nov 09, 1999 at 04​:56​:49PM -0700, Tom Christiansen wrote​:

Well, "you cannot be foolproof, since they produce better and better
fools each year".

Only a fool would expect that the size of a file would be a floating
point number. Are we supposed to expect this?

Size of a file is a *number*. And you are supposed to expect this.
And you are supposed to know that

  printf '%d', $number

produces "wrong" results right and left.

Ilya

p5pRT commented Nov 9, 1999

From @gsar

On Tue, 09 Nov 1999 23​:44​:07 GMT, Alan Burlison wrote​:

Gurusamy Sarathy wrote​:

I don't see how that matters, since the off_t values are never held in an
IV internally. And NVs are "big enough" for practical sizes.

I disagree (strongly). Putting 64 bit ints inside doubles is a bad, bad
idea. For a start the size of the mantissa and exponent parts of a
double differ between platforms, so the precision can't be generally
stated with any certainty.

But this hasn't stopped us from using NVs for all sorts of things
currently.

I've had to do this sort of 64 bit int -> double bodge in my
Solaris​::Kstat module because the timestamps on the system statistics
are expressed as nanoseconds since boot, stored in a 64 bit integer.
I've worked around the problem by expressing the time as fractional
seconds, i.e. by dividing by 10**9, but it's still a grotesque hack.
Doing what you are suggesting is far, far worse, IMHO. It will also
lead to unpredictable results for folks who do file offset arithmetic
(yes I know they shouldn't, but they are bound to).

Please don't do this. I'd rather have to put up with 32 bit file
offsets than this abomination. Heck, I'd even rather write an XSUB to
do it properly for me...

By all means, enable 64-bits if you are able. (You obviously seem to
prefer that route. :)

But let's not *prevent* people with 32-bit systems from accessing their
largefile API just for some vague puritanical reasons. Let's add a
warning to alert those two users who will be hitting the NV precision
problem this side of the millennium.

I would agree with you if the "largefiles"-via-NVs was the only option.
It most certainly isn't.

Sarathy
gsar@​ActiveState.com

p5pRT commented Nov 9, 1999

From @gsar

On Tue, 09 Nov 1999 16​:35​:22 MST, Tom Christiansen wrote​:

I don't see how that matters, since the off_t values are never held in an
IV internally. And NVs are "big enough" for practical sizes.

Really? Something seems wrong then​:

DB<1> $x = 2 ** 40

DB<2> print $x
1099511627776
DB<3> printf "%d", $x
-1
DB<4> printf "%.0f", $x
1099511627776

People are going to want to store the size in a perl number, and
print it out as an integer. I doubt they'll think to zero-float it.

Sure, the above has been the case for over 10 years and people on
32-bit platforms will continue to run into such indignities.

The question here is actually something different​: on 32-bit platforms,
is it harmful in practice to return an NV instead of an IV for file
sizes/offsets, where practical?

Sarathy
gsar@​ActiveState.com

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

Tom Christiansen writes​:

Really? Something seems wrong then​:

DB<1> $x = 2 ** 40

DB<2> print $x
1099511627776
DB<3> printf "%d", $x
-1
DB<4> printf "%.0f", $x
1099511627776

I want to investigate this from a different POV​: *why* should not '%d'
be equivalent to '%.0f' if argument is an out-of-range NV?

Ilya

p5pRT commented Nov 9, 1999

From @gsar

On Tue, 09 Nov 1999 19​:20​:12 EST, Ilya Zakharevich wrote​:

Tom Christiansen writes​:

Really? Something seems wrong then​:

DB<1> $x = 2 ** 40

DB<2> print $x
1099511627776
DB<3> printf "%d", $x
-1
DB<4> printf "%.0f", $x
1099511627776

I want to investigate this from a different POV​: *why* should not '%d'
be equivalent to '%.0f' if argument is an out-of-range NV?

Backwardness aside, I don't see any good reason why not.

Sarathy
gsar@​ActiveState.com

p5pRT commented Nov 9, 1999

From [Unknown Contact. See original ticket]

On Tue, Nov 09, 1999 at 04​:31​:37PM -0800, Gurusamy Sarathy wrote​:

DB<1> $x = 2 ** 40

DB<2> print $x
1099511627776
DB<3> printf "%d", $x
-1
DB<4> printf "%.0f", $x
1099511627776

I want to investigate this from a different POV​: *why* should not '%d'
be equivalent to '%.0f' if argument is an out-of-range NV?

Backwardness aside, I don't see any good reason why not.

There is backward-compatibility on out-of-range NV->IV conversion
anyway. But

H​:\get\perl\perl5.005_62.my>perl -wle "printf '%.0f', (2**31)+0.6"
2147483649
H​:\get\perl\perl5.005_62.my>perl -wle "printf '%.0f', (2**31)+0.4"
2147483648

So one should trunc() first.

BTW, with my "cache NV->PV conversion" patch there is less need for
quick-as-arrow NV->PV conversion. Thus I would like to raise again a
question on inclusion of lossless NV <==> PV converters (which were
rejected several years ago). I do not think that

monk​:~->perl -wle '$_ = 2**50 - 1; print'
1.12589990684262e+15

is acceptable any more.

Who can find this thread in archives? Did we have archives that time?

Ilya

p5pRT commented Nov 9, 1999

From @doughera88

On Tue, 9 Nov 1999, Gurusamy Sarathy wrote​:

On Tue, 09 Nov 1999 17​:16​:09 EST, Andy Dougherty wrote​:

In short, I suspect that enabling a "LARGEFILES" option that causes system
functions to return 64-bit off_t's will occasionally work on actual large
files, but will also often fail, unless perl itself is prepared to deal
with 64-bit integers. That is, I suspect Jarkko's right​: turning on
largefiles without -Duse64bits may well be turning on something that is
often going to fail. That's probably not a good idea.

FWIW, I don't see this as being a problem in practice. Counter examples
welcome.

Since *BSD platforms are already doing this in 5.005_0x, perhaps a check
in the archives may reveal some relevant bug reports.
I won't be well-enough connected until Thursday to look so perhaps someone
can beat me to it.

[I don't dispute that in C on some platforms this _can_ all work
smoothly. But perl, with its one integral type, the IV, is not C.]

I don't see how that matters, since the off_t values are never held in an
IV internally. And NVs are "big enough" for practical sizes.

I hesitate to ask you of all people, but are you sure about this? I don't
think that used to be the case, but it may have been cleaned up since I
last looked (a long time ago). Aren't they at least sometimes stored in
Off_t's, which later get converted to something (IV if the value is small
and NV if it's not)?

There are some side issues about the IV -> NV conversion that also might
tend to show up here more, such as that the default printing with the %.15g
format doesn't preserve the full precision of some 16-digit numbers, and
that % sometimes does funny things. These are, perhaps correctly,
construed as bugs or gotchas worthy of warnings, not show-stoppers.

  Andy Dougherty doughera@​lafayette.edu

p5pRT commented Nov 10, 1999

From @jhi

BTW, with my "cache NV->PV conversion" patch there is less need for
quick-as-arrow NV->PV conversion. Thus I would like to raise again a
question on inclusion of lossless NV <==> PV converters (which were
rejected several years ago). I do not think that

monk​:~->perl -wle '$_ = 2**50 - 1; print'
1.12589990684262e+15

is acceptable any more.

No, it isn't. I actually tried to fix the + - * / % to keep integers
as integers as long as possible while I was doing the 64 bit hacking,
but was scared away by Chip :-)

Who can find this thread in archives? Did we have archives that time?

Ilya

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 10, 1999

From @AlanBurlison

Ilya Zakharevich wrote​:

I disagree (strongly). Putting 64 bit ints inside doubles is a bad, bad
idea.

I doubt that you will get any support without harder facts.

If you have read the rest of the thread, you will see I'm not alone in
my opinion.

For a start the size of the mantissa and exponent parts of a
double differ between platforms, so the precision can't be generally
stated with any certainty.

So what? Same happens for numerics, and nobody complained yet.

So what? I'm complaining about file offsets, not existing numeric
behaviour.

Please don't do this. I'd rather have to put up with 32 bit file
offsets than this abomination. Heck, I'd even rather write an XSUB to
do it properly for me...

It is nice to have it switchable-off indeed, so that Alan has a chance
to appreciate "Gosh, why did I switch it off?!!!" argument. ;-)

I've also just appreciated (again) why I stopped posting to p5p.
Sheesh.

Alan Burlison

p5pRT commented Nov 10, 1999

From [Unknown Contact. See original ticket]

Benjamin Stuhl <sho_pi@​hotmail.com> writes​:

While 64-bit on demand (or perhaps even autoresizing (SV_tBIGINT, anyone?)
needed to prevent overflow) would be way cool, there are a few problems.
Like, for instance, how do you print a gcc-ish "long long" on a system with
no native 64-bit support (ie, no %q or %Ld or whatever)?

We already do all the printf-y % stuff ourselves. So all we need is
the "long long to string" code. This is just a case of doing the usual
thing but with long long as the type.

The related idea of making %d render an NV as integer is much harder
as you cannot precisely divide an NV by 10 (at least if NV is radix-2).
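
A sketch of that "usual thing", written in Perl here just for illustration:
render a 64-bit value held as two unsigned 32-bit words as a decimal
string, using only steps that stay exact in a double (hypothetical helper,
not existing perl source):

  sub u64_to_decimal {
      my ($hi, $lo) = @_;          # unsigned 32-bit halves of the value
      return "0" if $hi == 0 && $lo == 0;
      my $digits = "";
      while ($hi || $lo) {
          # Long division of the two-word value by 10; every intermediate
          # value stays below 2**36, well inside a double's exact range.
          my $q_hi = int($hi / 10);
          my $r    = $hi - 10 * $q_hi;
          my $t    = $r * 4294967296 + $lo;
          my $q_lo = int($t / 10);
          $digits  = ($t - 10 * $q_lo) . $digits;
          ($hi, $lo) = ($q_hi, $q_lo);
      }
      return $digits;
  }

  print u64_to_decimal(1, 5), "\n";    # 2**32 + 5, i.e. 4294967301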

--
Nick Ing-Simmons <nik@​tiuk.ti.com>
Via, but not speaking for​: Texas Instruments Ltd.

p5pRT commented Nov 10, 1999

From [Unknown Contact. See original ticket]

Jarkko Hietaniemi <jhi@​iki.fi> wrote

Losing precision is the nature of floating point numbers.
Losing precision is the death of file offsets.

I'd hope that the implementation, when converting from off64_t
(or whatever) to SV, would

a) Choose whichever of NV or IV gave the greater range
b) Produce EOVERFLOW if the conversion didn't fit.

That way, I can't see that we're ever worse off than the current state
of affairs.

Mike Guy

p5pRT commented Nov 10, 1999

From @jhi

M.J.T. Guy writes​:

Jarkko Hietaniemi <jhi@​iki.fi> wrote

Losing precision is the nature of floating point numbers.
Losing precision is the death of file offsets.

I'd hope that the implementation, when converting from off64_t
(or whatever) to SV, would

a) Choose whichever of NV or IV gave the greater range
b) Produce EOVERFLOW if the conversion didn't fit.

Converting is relatively easy, it's the maddening arith ops that
mangle IVs into NVs (+1, and you've just converted your IV into an NV).
An sv flag saying "don't cache the NV, ever"?

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 10, 1999

From @jhi

An sv flag saying "don't cache the NV, ever"?

Hmmm, a flag that would have to be contagious... ($new_offset = $old_offset + 42)
"integer taint" :-)

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

p5pRT commented Nov 10, 1999

From [Unknown Contact. See original ticket]

Jarkko Hietaniemi <jhi@​iki.fi> wrote

Converting is relatively easy, it's the maddening arith ops that
mangle IVs into NVs (+1, and you've just converted your IV into an NV).
An sv flag saying "don't cache the NV, ever"?

Let's not confuse the simple issue of conversion of off64_t with
the more general issues of what to do when IVs are more accurate than NVs.

These more general issues ought to be tackled, but there's no reason for
them to obstruct doing large files right, particularly on 32 bit
platforms.

Mike Guy

p5pRT commented Nov 10, 1999

From @gsar

On Tue, 09 Nov 1999 18​:50​:46 EST, Ilya Zakharevich wrote​:

Gurusamy Sarathy writes​:

Most people don't have 525 terabyte disks yet. :-)

Where do people get this ridiculous number from? It is almost a
centi-ExaByte, or 8 petabytes to be precise.

perl -wle "$i = 2**53 - 16; $j = $i+1; print $j - $i"
1

In theory, yes, but in practice, DBL_DIG is usually 15, which
means Perl will use scientific notation to display numbers that
are larger, unless you use printf(). 2**49 or 512 TB (not 525 :)
is what you can fit in 15 digits.
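
Both limits are easy to demonstrate from the command line (a sketch,
assuming a perl without 64-bit IVs, as in the builds under discussion):

  $ perl -le 'print 2**53 == 2**53 + 1 ? "indistinguishable" : "distinct"'
  indistinguishable
  $ perl -le 'print 999999999999999'     # 15 digits still print as an integer
  999999999999999
  $ perl -le 'print 1000000000000000'    # 16 digits fall back to scientific notation
  1e+15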

Sarathy
gsar@​ActiveState.com

p5pRT commented Nov 10, 1999

From @gsar

On Tue, 09 Nov 1999 21​:25​:46 EST, Andy Dougherty wrote​:

[I don't dispute that in C on some platforms this _can_ all work
smoothly. But perl, with its one integral type, the IV, is not C.]

I don't see how that matters, since the off_t values are never held in an
IV internally. And NVs are "big enough" for practical sizes.

I hesitate to ask you of all people, but are you sure about this? I don't
think that used to be the case, but it may have been cleaned up since I
last looked (a long time ago). Aren't they at least sometimes stored in
Off_t's, which later get converted to something (IV if the value is small
and NV if it's not)?

My somewhat limited empirical tests around 5.005_57 didn't find any
problems. That is not to say that no issues exist; I didn't audit
the sources enough to know.

Sarathy
gsar@​ActiveState.com

p5pRT commented Nov 10, 1999

From @doughera88

On Wed, 10 Nov 1999, M.J.T. Guy wrote​:

Jarkko Hietaniemi <jhi@​iki.fi> wrote

Converting is relatively easy, it's the maddening arith ops that
mangle IVs into NVs (+1, and you've just converted your IV into an NV).
An sv flag saying "don't cache the NV, ever"?

Let's not confuse the simple issue of conversion of off64_t with
the more general issues of what to do when IVs are more accurate than NVs.

These more general issues ought to be tackled, but there's no reason for
them to obstruct doing large files right, particularly on 32 bit
platforms.

But "doing large files right" probably means allowing the user to do
"something" with those large off_t values, which may include adding or
subtracting them or printing them out or passing them to other functions,
etc., etc. Many of these probably work now for many cases. Others
probably don't. The corners near the IV boundary are a little sharp, and
you have to be careful.

I know some of these things have come up before. A careful trip through
the archives (when time permits, which isn't today) may reveal something.

--
  Andy Dougherty doughera@​lafayette.edu
  Dept. of Physics
  Lafayette College, Easton PA 18042

p5pRT commented Nov 10, 1999

From [Unknown Contact. See original ticket]

On Wed, Nov 10, 1999 at 10​:41​:20AM +0200, Jarkko Hietaniemi wrote​:

BTW, with my "cache NV->PV conversion" patch there less need for
quick-as-arrow NV->PV conversion. Thus I would like to rise again a
question on inclusion of lossless NV <==> PV converters (which were
rejected several years ago). I do not think that

monk​:~->perl -wle '$_ = 2**50 - 1; print'
1.12589990684262e+15

is acceptable any more.

No, it isn't. I actually tried to fix the + - * / % to keep integers
as integers as long as possible while I was doing the 64 bit hacking,
but was scared away by Chip :-)

I do not think it is reasonable, and it will not solve the above
problem anyway. The problem above is with NV<->PV conversions.

Ilya

p5pRT commented Nov 10, 1999

From [Unknown Contact. See original ticket]

On Wed, Nov 10, 1999 at 08​:45​:47AM -0800, Gurusamy Sarathy wrote​:

perl -wle "$i = 2**53 - 16; $j = $i+1; print $j - $i"
1

In theory, yes, but in practice, DBL_DIG is usually 15, which
means Perl will use scientific notation to display numbers that
are larger, unless you use printf(). 2**49 or 512 TB (not 525 :)
is what you can fit in 15 digits.

a) This should have been fixed a long time ago;
b) Thanks, but *I* can put numbers up to 10**15-1 into 15 digits. ;-)

Ilya

p5pRT commented Nov 10, 1999

From [Unknown Contact. See original ticket]

Alan Burlison writes​:

Please don't do this. I'd rather have to put up with 32 bit file
offsets than this abomination. Heck, I'd even rather write an XSUB to
do it properly for me...

It is nice to have it switchable-off indeed, so that Alan has a chance
to appreciate "Gosh, why did I switch it off?!!!" argument. ;-)

I've also just appreciated (again) why I stopped posting to p5p.

Oops, sorry if it offends you! This was not the intent...

Ilya

p5pRT commented Nov 10, 1999

From [Unknown Contact. See original ticket]

Andy Dougherty writes​:

But "doing large files right" probably means allowing the user to do
"something" with those large off_t values, which may include adding or
subtracting them or printing them out or passing them to other functions,
etc., etc.

"Doing them right" is not the target. One can manually tune holy
files to break any simple implementation. On the opposite side,
"making things useful" with files which appear in practice is a
handable target.

Many of these probably work now for many cases. Others
probably don't. The corners near the IV boundary are a little sharp, and
you have to be careful.

Fortunately, behaviour at the corners should not enter the "practical"
picture for several more years, and by then we will have more experience
to tackle these questions.

As I said before: show me an application which uses files greater than
8 PB, and I will show you how to create bigints from C. ;-)

Ilya

p5pRT commented Nov 11, 1999

From @doughera88

On Tue, 9 Nov 1999, Andy Dougherty wrote​:

On Tue, 9 Nov 1999, Gurusamy Sarathy wrote​:

FWIW, I don't see this as being a problem in practice. Counter examples
welcome.

Since *BSD platforms are already doing this in 5.005_0x, perhaps a check
in the archives may reveal some relevant bug reports.
I won't be well-enough connected until Thursday to look so perhaps someone
can beat me to it.

Ok, well a non-exhaustive (but still exhausting!) crawl through the
archives revealed no *BSD problems in practice (though those folks are
traditionally underrepresented on p5p). I did find a few notes worth
passing on.

Overall, I'm beginning to agree with Sarathy that perhaps we ought to
enable large file stuff independently of -Duse64bits, at least to the
extent that such separation of Configure options is feasible. I think
Jarkko's done a fine job eliminating various I32-isms that used to be
prevalent in pp_sys.c and elsewhere (and which were the main cause for my
concern). The failure of -e on Solaris/x86 cited below is certainly
something we'd like to avoid.

I still maintain that the IV -> NV rollover can be a little tricky (sign
issues, '%' behavior, printing, etc) and we have to be careful, but I
am a little more optimistic now.


1999-03/msg00556​: dubious pp_stat behavior

Tom C. pointed out slews of (I32) casts which are now mostly gone, but
we still have things like the following in pp_sys.c​:

  PUSHs(sv_2mortal(newSViv(PL_statcache.st_size)));


1999-08/msg00423​: [ID 19990810.001] Possible bug using stat w/large files
  Digital UNIX Perl 5.005_03

This was just attempting to use the default print function. The user got
a negative number.


1999-07/msg00413​: [ID 19990709.003] open and -e fail w/largefiles
(>2G) on solaris 2.6 x86

this was for 5.005_02, and may be fixed by -DLARGEFILES.

In fact, this is probably a good argument for enabling large files
even without -Duse64bits, since otherwise simple tests like -e
apparently failed.


1999-04/msg00604​: Re​: Why (IV) U_V(d) ?

Regarding storing file sizes in an NV, Nick observed that

  While that would suffice for large files up to 2**53 bytes,
  it is not sufficient for Win32 stat. The inode and dev equivalents
  are 64 bit integers. An NV (double) cannot represent 64 bit int
  exactly (and I suspect least significant bits are important).

I don't know anything about this. It sounds potentially quite serious,
but may not be, in fact.

--
  Andy Dougherty doughera@​lafayette.edu
  Dept. of Physics
  Lafayette College, Easton PA 18042

p5pRT commented Nov 11, 1999

From @jhi

-----------------------------------------------------------
1999-03/msg00556​: dubious pp_stat behavior

Tom C. pointed out slews of (I32) casts which are now mostly gone, but
we still have things like the following in pp_sys.c​:

  PUSHs(sv_2mortal(newSViv(PL_statcache.st_size)));

Thanks for finding this. I'll double check the use of st_(size|uid|gid).

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen
