Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perl_my_atof does not "correctly round" #9639

Open
p5pRT opened this issue Jan 25, 2009 · 12 comments
Open

Perl_my_atof does not "correctly round" #9639

p5pRT opened this issue Jan 25, 2009 · 12 comments

Comments

@p5pRT
Copy link

p5pRT commented Jan 25, 2009

Migrated from rt.perl.org#62746 (status was 'open')

Searchable as RT62746$

@p5pRT
Copy link
Author

p5pRT commented Jan 25, 2009

From chris.hall@highwayman.com

This is a bug report for perl from chris.hall@​highwayman.com,
generated with the help of perlbug 1.36 running under perl 5.10.0.

Decimal to binary conversion is hard.

The requirements of IEEE-754 are difficult to achieve, requiring
some extended precision arithmetic.

One of the requirements is that when working to at least 17
significant decimal digits, the conversion​: binary to decimal and
then decimal back to binary, must return the original binary value.

The standard also requires the conversion to be "correctly rounded".

The following demonstrates that the conversion Perl_my_atof2() (in
numeric.c) fails​:

  my @​p = (0x3FEF, 0xFFFF, 0xFFE0, 0x0000) ;
  my $f = unpack('d>', pack('n4', @​p)) ;
  my $s = sprintf('%.17f', $f) ;
  my $t = $s+0 ;
  my @​q = unpack('n4', pack('d>', $t)) ;

  printf "0x%04X_%04X_%04X_%04X -> %s -> 0x%04X_%04X_%04X_%04X\n",
  @​p, $s, @​q ;

In this case the problem is that the decimal digits 99999999976716936
(0x0163_4578_5C26_BA88) are floated to an ordinary IEEE double,
rounding to even, and then divided by 10**17. The float step is the
problem, because the rounding means that what is actually being
converted is 99999999976716928.

For completeness, the values around the test value are​:

  0x3FEF_FFFF_FFDF_FFFF == 0.999999999767169245323827908578 (approx)
  0x3FEF_FFFF_FFE0_0000 == 0.999999999767169356346130371093 (approx)
  decimal value == 0.99999999976716936
  0x3FEF_FFFF_FFE0_0001 == 0.999999999767169467368432833609 (approx)

NB​: the above is just one example of the failure of Perl_my_atof2().

NB​: The existing code will do OK with numbers with 15 or fewer
  significant decimal digits, provided there is no significant
  size exponent part. But under any pressure it will introduce
  multiple roundings and multiple rounding errors -- failing the
  "correct rounding" requirement.

It appears that Perl_my_atof2() exists in order to​:

  (a) cope with character set and locale issues,

  (b) provide a consistent, system independent, syntax for numbers,

  (c) avoid some overflow issues on certain machines

My recommendation would be to ditch all the attempt to do the decimal
to binary conversion, and replace it by something that does (a) and
(b) above, and generates a string that is then passed to atof()
-- leaving it up to the local library to use locally available extended
precision. Most of the time the original string will be acceptable to
atof(), so the wrapper could take a positive view and start by simply
checking for exceptions.

It appears that Perl's syntax for decimal numbers accepts '.' as well
as whatever the current locale allows. One approach to this would be
to translate decimal point characters to whatever locale is in force
(as far as atof() is concerned) or remove the decimal point character
and adjust the exponent (or introduce one). I see that Perl_my_atof()
goes to some effort, converting the decimal string *twice* changing
the locale setting in between, in order to work around issues with
locale... I suggest this might be better handled by processing the
string, and dealing with decimal points before converting.

I note that Kernighan & Richie ('The C Programming Language', 2nd Ed)
and ISO 9899​:1999 say that atof() will return +/-HUGE_VAL if the
number overflows. So, unless the target 'C' is broken, there seems no
reason to worry about overflow, so (c) is moot if atof() is used.


Flags​:
  category=core
  severity=high


This perlbug was built using Perl 5.10.0 in the Fedora build system.
It is being executed now by Perl 5.10.0 - Fri Nov 28 19​:14​:03 EST 2008.

Site configuration information for perl 5.10.0​:

Configured by Red Hat, Inc. at Fri Nov 28 19​:14​:03 EST 2008.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration​:
  Platform​:
  osname=linux, osvers=2.6.18-92.1.10.el5,
archname=x86_64-linux-thread-multi
  uname='linux x86-3.fedora.phx.redhat.com 2.6.18-92.1.10.el5 #1 smp
wed jul 23 03​:56​:11 edt 2008 x86_64 x86_64 x86_64 gnulinux '
  config_args='-des -Doptimize=-O2 -g -pipe -Wall
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
--param=ssp-buffer-size=4 -m64 -mtune=generic -DPERL_USE_SAFE_PUTENV
-Dversion=5.10.0 -Dmyhostname=localhost -Dperladmin=root@​localhost
-Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr
-Dprivlib=/usr/lib/perl5/5.10.0
-Dsitelib=/usr/local/lib/perl5/site_perl/5.10.0
-Dvendorlib=/usr/lib/perl5/vendor_perl/5.10.0
-Darchlib=/usr/lib64/perl5/5.10.0/x86_64-linux-thread-multi
-Dsitearch=/usr/local/lib64/perl5/site_perl/5.10.0/x86_64-linux-thread-multi
-Dvendorarch=/usr/lib64/perl5/vendor_perl/5.10.0/x86_64-linux-thread-multi
-Darchname=x86_64-linux-thread-multi -Dlibpth=/usr/local/lib64 /lib64
/usr/lib64
-Dotherlibdirs=/usr/local/lib/perl5/site_perl​:/usr/lib/perl5/site_perl
-Dvendorprefix=/usr -Dsiteprefix=/usr/local -Duseshrplib -Dusethreads
-Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db
-Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio
-Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly
-Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto
-Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto
-Ud_endservent_r_proto -Ud_setservent_r_proto -Dscriptdir=/usr/bin'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=define, usemultiplicity=define
  useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
  use64bitint=define, use64bitall=define, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING
-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
  optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic
-DPERL_USE_SAFE_PUTENV',
  cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING
-fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
  ccversion='', gccversion='4.3.0 20080428 (Red Hat 4.3.0-8)',
gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries​:
  ld='gcc', ldflags =''
  libpth=/usr/local/lib64 /lib64 /usr/lib64
  libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread
-lc
  perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
  libc=, so=so, useshrplib=true, libperl=libperl.so
  gnulibc_version='2.8'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E
-Wl,-rpath,/usr/lib64/perl5/5.10.0/x86_64-linux-thread-multi/CORE'
  cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
--param=ssp-buffer-size=4 -m64 -mtune=generic -DPERL_USE_SAFE_PUTENV'

Locally applied patches​:


@​INC for perl 5.10.0​:
  /usr/lib64/perl5/5.10.0/x86_64-linux-thread-multi
  /usr/lib/perl5/5.10.0
  /usr/local/lib64/perl5/site_perl/5.10.0/x86_64-linux-thread-multi
  /usr/local/lib/perl5/site_perl/5.10.0
  /usr/lib64/perl5/vendor_perl/5.10.0/x86_64-linux-thread-multi
  /usr/lib/perl5/vendor_perl/5.10.0
  /usr/lib/perl5/vendor_perl
  /usr/local/lib/perl5/site_perl/5.10.0
  /usr/local/lib/perl5/site_perl
  /usr/lib/perl5/site_perl
  .


Environment for perl 5.10.0​:
  HOME=/home/GMCH
  LANG=en_GB.UTF-8
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)
  PATH=/usr/lib64/qt-3.3/bin​:/usr/kerberos/sbin​:/usr/kerberos/bin​:/usr/
local/sbin​:/usr/local/bin​:/sbin​:/bin​:/usr/sbin​:/usr/bin​:/root/bin
  PERL_BADLANG (unset)
  SHELL=/bin/bash
--
Chris Hall highwayman.com

@p5pRT
Copy link
Author

p5pRT commented Jan 25, 2009

From @ysth

On Sun, January 25, 2009 11​:26 am, Chris Hall wrote​:

It appears that Perl_my_atof2() exists in order to​:

(a) cope with character set and locale issues,

(b) provide a consistent, system independent, syntax for numbers,

(c) avoid some overflow issues on certain machines

While the above is true, I think the original drive to replace the
libc atof/strtod was the change in C99 that would have made
"0x1" == 1 instead of Perl's traditional "0x1" == 0. But that
could be dealt with in a wrapper.

I also thought there was (d) avoid rounding and other bugs in some
platforms' local libraries.

@p5pRT
Copy link
Author

p5pRT commented Jan 25, 2009

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Jan 25, 2009

From @nwc10

On Sun, Jan 25, 2009 at 12​:52​:08PM -0800, Yitzchak Scott-Thoennes wrote​:

On Sun, January 25, 2009 11​:26 am, Chris Hall wrote​:

It appears that Perl_my_atof2() exists in order to​:

(a) cope with character set and locale issues,

(b) provide a consistent, system independent, syntax for numbers,

(c) avoid some overflow issues on certain machines

While the above is true, I think the original drive to replace the
libc atof/strtod was the change in C99 that would have made
"0x1" == 1 instead of Perl's traditional "0x1" == 0. But that
could be dealt with in a wrapper.

I also thought there was (d) avoid rounding and other bugs in some
platforms' local libraries.

I don't remember any issues with rounding bugs. The only bug was that the
C99 committee made it impossible to have an atof() function conformant with
both C89 and C99, by defining it in the later standard in a way that
conflicts with the earlier standard, without changing its name.

Personally I always felt that a wrapper that sanitised input to avoid
interpretation as hexadecimal floating point constants would have been
better, but the alternative solution that we have now was presented first.

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented Jan 31, 2009

From module@renee-baecker.de

Personally I always felt that a wrapper that sanitised input to avoid
interpretation as hexadecimal floating point constants would have been
better, but the alternative solution that we have now was presented first.

Nicholas Clark

So this ticket can be closed?

@p5pRT
Copy link
Author

p5pRT commented Jan 31, 2009

From chris.hall@highwayman.com

On Sat Jan 31 07​:07​:54 2009, reneeb wrote​:

On Sun Jan 25 13​:15​:50 2009 Nicholas Clark wrote​:

Personally I always felt that a wrapper that sanitised input to
avoid
interpretation as hexadecimal floating point constants would have
been
better, but the alternative solution that we have now was presented
first.

So this ticket can be closed?

I don't think so.

The problem is that the "alternative solution" (the current
implementation) referred to is broken.

I didn't find any earlier bug report, so you may take a view on the
urgency.

Nevertheless, the current implementation is not fit for purpose.

Someone who fully understands the expected semantics, especially wrt
LOCALE, should throw away Perl_my_atof2() and S_mulexp10(), and rewrite
Perl_my_atof as a wrapper around atof().

When I grep Perl_my_atof2() and S_mulexp10() (in 5.10.0 source) I get​:

  $ grep Perl_my_atof2 *
  embed.h​:#define my_atof2 Perl_my_atof2
  embed.h​:#define my_atof2(a,b) Perl_my_atof2(aTHX_ a,b)
  global.sym​:Perl_my_atof2
  numeric.c​:Perl_my_atof2(pTHX_ const char* orig, NV* value)
  perl.h​:# define Perl_atof2(s, n) Perl_my_atof2(aTHX_ (s), &(n))
  proto.h​:PERL_CALLCONV char* Perl_my_atof2(pTHX_ const char *s,
NV* value)

  $ grep S_mulexp10 *
  Changes5.8​: Log​: Subject​: Re​: [PATCH #2] Re​: [PATCH]
numeric.c​:S_mulexp10 -- quit when you can
  Changes5.8​: Log​: Subject​: [PATCH #2] Re​: [PATCH]
numeric.c​:S_mulexp10 -- quit when you can
  Changes5.8​: Log​: Subject​: [PATCH] numeric.c​:S_mulexp10 -- quit
when you can
  Changes5.8​: Log​: Subject​: [PATCH​: perl@​10996] avoid overflow in
numeric.c​:S_mulexp10() on VAX
  embed.h​:#define mulexp10 S_mulexp10
  embed.h​:#define mulexp10 S_mulexp10
  numeric.c​:S_mulexp10(NV value, I32 exponent)
  numeric.c​: result[seen_dp] = S_mulexp10(result
[seen_dp],
  numeric.c​: result[0] = S_mulexp10(result[0], exp_acc[0]) + (NV)
accumulator[0];
  numeric.c​: result[1] = S_mulexp10(result[1], exp_acc[1]) + (NV)
accumulator[1];
  numeric.c​: result[2] = S_mulexp10(result[0],exponent+exp_adjust
[0])
  numeric.c​: + S_mulexp10(result[1],exponent-exp_adjust
[1]);
  numeric.c​: result[2] = S_mulexp10(result[0],exponent+exp_adjust
[0]);
  proto.h​:STATIC NV S_mulexp10(NV value, I32 exponent);

I was not expecting to see these used outside numeric.c -- and I regret
I have no idea what the above implies.

--
Chris Hall

@p5pRT
Copy link
Author

p5pRT commented Jan 31, 2009

From @nwc10

On Sat, Jan 31, 2009 at 07​:07​:55AM -0800, reneeb via RT wrote​:

Personally I always felt that a wrapper that sanitised input to avoid
interpretation as hexadecimal floating point constants would have been
better, but the alternative solution that we have now was presented first.

Nicholas Clark

So this ticket can be closed?

I didn't say that :-)

I didn't actually say what we should do next. I don't know.

(I was just adding to the ticket what I remembered about the decisions that
got us to where we are now, without thinking about anything else)

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented Jun 8, 2016

From @dcollinsn

Perl 5.10.0​:
0x3FEF_FFFF_FFE0_0000 -> 0.99999999976716936 -> 0x3FEF_FFFF_FFDF_FFFF

Perl 5.20.0 and blead​:
0x3FEF_FFFF_FFE0_0000 -> 0.99999999976716936 -> 0x3FEF_FFFF_FFE0_0000

I do believe this ticket can be closed.

@p5pRT
Copy link
Author

p5pRT commented Jun 8, 2016

From [Unknown Contact. See original ticket]

Perl 5.10.0​:
0x3FEF_FFFF_FFE0_0000 -> 0.99999999976716936 -> 0x3FEF_FFFF_FFDF_FFFF

Perl 5.20.0 and blead​:
0x3FEF_FFFF_FFE0_0000 -> 0.99999999976716936 -> 0x3FEF_FFFF_FFE0_0000

I do believe this ticket can be closed.

@p5pRT
Copy link
Author

p5pRT commented Jun 8, 2016

From @cpansprout

On Tue Jun 07 17​:22​:45 2016, dcollinsn@​gmail.com wrote​:

Perl 5.10.0​:
0x3FEF_FFFF_FFE0_0000 -> 0.99999999976716936 -> 0x3FEF_FFFF_FFDF_FFFF

Perl 5.20.0 and blead​:
0x3FEF_FFFF_FFE0_0000 -> 0.99999999976716936 -> 0x3FEF_FFFF_FFE0_0000

I do believe this ticket can be closed.

Are you sure the difference is not just due to different default configuration values on the different versions of perl? Could you do a bisect?

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jun 8, 2016

From @cpansprout

On Tue Jun 07 17​:51​:47 2016, sprout wrote​:

On Tue Jun 07 17​:22​:45 2016, dcollinsn@​gmail.com wrote​:

Perl 5.10.0​:
0x3FEF_FFFF_FFE0_0000 -> 0.99999999976716936 -> 0x3FEF_FFFF_FFDF_FFFF

Perl 5.20.0 and blead​:
0x3FEF_FFFF_FFE0_0000 -> 0.99999999976716936 -> 0x3FEF_FFFF_FFE0_0000

I do believe this ticket can be closed.

Are you sure the difference is not just due to different default
configuration values on the different versions of perl? Could you do
a bisect?

Never mind. I see your follow-up note in ticket #43811.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jun 8, 2016

From @cpansprout

On Tue Jun 07 17​:52​:42 2016, sprout wrote​:

Never mind. I see your follow-up note in ticket #43811.

Er, I meant #53784.

--

Father Chrysostomos

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants