netlib dtoa.c #14019

Open
p5pRT opened this issue Aug 6, 2014 · 14 comments

Comments

p5pRT commented Aug 6, 2014

Migrated from rt.perl.org#122482 (status was 'new')

Searchable as RT122482$

p5pRT commented Aug 6, 2014

From @jhi

As discussed elsewhere [1], there is a well-known solution for ASCII-to-double (aka strtod) and double-to-ASCII (aka printf, or Gconvert) conversions: the netlib dtoa.c [2]. This code is apparently used by Python, PHP, Java, Firefox, Chrome, and Safari.

I suggest looking into integrating this into Perl. I volunteer to do some of the work.

Pros:
- well tested and widely used fp/a conversion
- consistent handling of fp/a conversions across platforms
- consistent handling of inf/nan especially
- hexadecimal floats (it's a C99 feature, and even then seemingly inconsistently implemented)
- Python, PHP, and Java compatibility (har har)

Cons:
- new code to maintain: the netlib code (disregarding the copyright at the top) is still actively maintained, so updates do happen
- new code to include: ~4400 source lines, object code on Darwin ~37K
- does memory management of its own (since it's hard to know exactly how long a string to allocate for dtoa)
- has locale code in it ("1.23" -> double, duh); this may require rather complete ripping out / replacing, given our rather extensive locale-handling code

Unknowns/musings as of now:
- is the license compatible for us (given the wide range of users, I would be rather surprised if there are problems)
- does dtoa.c work with long doubles
- while dtoa is *for implementing* printf, how exactly does that work (we have a string... now how do we do %10.3f ?)
- some platforms might have special quirks (I'm especially thinking of nan/inf handling) that mean dtoa.c cannot be used and the native strtod/printf facilities still need to be used (though we could try backporting the work to dtoa.c and make the world a better place)
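On the %10.3f question, a minimal sketch of one possible approach. The netlib `dtoa(d, mode, ndigits, &decpt, &sign, &rve)` returns a digit string with trailing zeros suppressed, plus the decimal point position in `decpt`; in mode 3 with ndigits=3, a value like 3.14159 should come back as the digits "3142" with decpt=1. The remaining work is only layout. The helper below does not call dtoa itself (its inputs are passed in directly), the name `layout_fixed` is made up for illustration, and printf flags such as `-`, `+`, and `0` are not handled.

```c
#include <stdio.h>
#include <string.h>

/* Lay out a dtoa-style result in fixed notation, like "%<width>.<prec>f".
 *   digits: decimal digits, trailing zeros suppressed (may be empty)
 *   decpt:  position of the decimal point relative to digits[0]
 *   sign:   nonzero if the value is negative
 * Returns the number of characters written (excluding the NUL),
 * or -1 if a buffer is too small. */
int layout_fixed(const char *digits, int decpt, int sign,
                 int prec, int width, char *out, size_t outsz)
{
    char body[512];
    size_t n = 0;
    size_t ndig = strlen(digits);
    size_t need;
    int i, len;

    if (prec < 0)
        prec = 0;

    /* Worst case: sign + integer digits + '.' + prec digits + NUL. */
    need = 1 + (size_t)(decpt > 0 ? decpt : 1) + 1 + (size_t)prec + 1;
    if (need > sizeof body)
        return -1;

    if (sign)
        body[n++] = '-';

    /* Integer part: digits before the decimal point, zero-filled. */
    if (decpt <= 0) {
        body[n++] = '0';
    } else {
        for (i = 0; i < decpt; i++)
            body[n++] = ((size_t)i < ndig) ? digits[i] : '0';
    }

    /* Fractional part: exactly prec digits, zero-filled on both sides. */
    if (prec > 0) {
        body[n++] = '.';
        for (i = 0; i < prec; i++) {
            int idx = decpt + i; /* index of this digit in digits[] */
            body[n++] = (idx >= 0 && (size_t)idx < ndig) ? digits[idx] : '0';
        }
    }
    body[n] = '\0';

    /* Right-justify in the requested field width. */
    len = snprintf(out, outsz, "%*s", width, body);
    return (len < 0 || (size_t)len >= outsz) ? -1 : len;
}
```

With digits "3142", decpt=1, prec=3, and width=10, this produces the number right-justified in a ten-character field. In a real integration the digit string would come from dtoa and be released with freedtoa afterwards.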

[1] https://rt-archive.perl.org/perl5/Ticket/Display.html?id=122219 ("support hexadecimal floats")
[2] http://www.netlib.org/fp/dtoa.c

p5pRT commented Aug 7, 2014

From @jhi

Initial investigation comments:

- turning off the private memory management
- using plain malloc/free for now (which is a different issue from the private memory management)
- turning off the locale support for now (it happens to use the same USE_LOCALE define as Perl...); in Perl code, the relevant thing is GROK_NUMERIC_RADIX
- there's also code for multiple threads (locking certain things)

More importantly, looks like integrating this will be even more ... intense than expected: the Perl_my_atof and Perl_my_atof2 (and the helper, S_mulexp10) make for interesting reading, especially with the VAX (and Cray) specifics.

p5pRT commented Aug 7, 2014

From @jhi

> More importantly, looks like integrating this will be even more ...
> intense than expected: the Perl_my_atof and Perl_my_atof2 (and the
> helper, S_mulexp10) make for interesting reading, especially with the
> VAX (and Cray) specifics.

Actually, to be more explicit: it looks like we are not currently even using strtod() as such!
(Except for NaN/Inf conversion.) The above triad is what takes care of the conversion.
I should have remembered, since this all started during my tenure... maybe too long ago.

The whole area of code is a quilt of numeric overflow ballet, with strange cases in various places.

Summary: I'm starting to doubt the benefit of bringing in the netlib strtod. However well tested and widely used it is, the current code has also been well tested on the platforms Perl runs on.

And since we are not really depending on the system strtod implementations anyway (except for nan/inf), it looks like for the hexadecimal fp "strtod-ing" it would be better just to implement our own. This would not, however, solve the hexadecimal fp output.
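To give a feel for how small a home-grown hexadecimal fp input routine could be, here is a rough sketch. It is not Perl code and not taken from dtoa.c; the names `hexval` and `hexfp_to_double` are made up for illustration. It accepts `[+-]0x<hex>[.<hex>][p[+-]<dec>]`, accumulates the mantissa in a double, and scales with `ldexp`. It assumes the mantissa fits in 53 bits: longer digit strings would be rounded step by step (or overflow), and nan/inf spellings are not handled.

```c
#include <ctype.h>
#include <math.h>
#include <stddef.h>

/* Value of a hexadecimal digit, or -1 if c is not one. */
static int hexval(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Parse a C99-style hexadecimal floating-point literal.
 * On success returns the value and stores the end of the parsed text
 * in *end (if end is non-NULL). If nothing matches, returns 0.0 and
 * stores s in *end. */
double hexfp_to_double(const char *s, const char **end)
{
    const char *p = s;
    int neg = 0, ndigits = 0, v;
    double mant = 0.0;
    long exp2 = 0; /* binary exponent: -4 per fraction digit, plus pNNN */

    if (end)
        *end = s;
    if (*p == '+' || *p == '-')
        neg = (*p++ == '-');
    if (p[0] != '0' || (p[1] != 'x' && p[1] != 'X'))
        return 0.0;
    p += 2;

    /* Digits before the point. */
    while ((v = hexval((unsigned char)*p)) >= 0) {
        mant = mant * 16.0 + v;
        ndigits++;
        p++;
    }
    /* Digits after the point: each one shifts the exponent by 4 bits. */
    if (*p == '.') {
        p++;
        while ((v = hexval((unsigned char)*p)) >= 0) {
            mant = mant * 16.0 + v;
            exp2 -= 4;
            ndigits++;
            p++;
        }
    }
    if (ndigits == 0)
        return 0.0;

    /* Optional binary exponent, written in decimal. */
    if (*p == 'p' || *p == 'P') {
        const char *q = p + 1;
        int eneg = 0;
        long e = 0;
        if (*q == '+' || *q == '-')
            eneg = (*q++ == '-');
        if (isdigit((unsigned char)*q)) {
            while (isdigit((unsigned char)*q)) {
                if (e < 100000) /* clamp to avoid overflowing e */
                    e = e * 10 + (*q - '0');
                q++;
            }
            exp2 += eneg ? -e : e;
            p = q;
        }
    }

    if (end)
        *end = p;
    mant = ldexp(mant, (int)exp2);
    return neg ? -mant : mant;
}
```

For example, `hexfp_to_double("0x1.8p+1", NULL)` accumulates the mantissa 0x18 = 24 with a binary exponent of -4 + 1 = -3, giving 24 / 8 = 3.0. A production version would also need correct rounding for long mantissas and the long double variants.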

p5pRT commented Aug 20, 2014

From @jhi

Then again, dtoa.c is not without its share of security problems, see e.g. https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2009-0689
(with many links to follow)

I also got a pointer to another implementation:

http://git.musl-libc.org/cgit/musl/tree/src/internal/floatscan.c

Much more minimal (good), and seemingly handles 80-bit long doubles (good).
But while musl in general has good quality and reputation (good), how well has this been tested for the edge cases? There are also more IEEE (or IEEE-ish) long double types than the 80-bit one.

p5pRT commented Aug 22, 2014

From @jhi

After more inspection: it seems that gdtoa.tgz (also from netlib) is the "portable" version that handles more platforms, including long doubles.

p5pRT commented Feb 17, 2016

From @jhi

Some more commentary.

Some users of dtoa.c:

* Python: https://hg.python.org/cpython/file/tip/Python/dtoa.c -- possibly relevant Python ticket: http://bugs.python.org/issue9009
* Ruby: https://github.com/ruby/ruby/blob/trunk/util.c (dtoa.c relevant bits merged in here)
* PHP: https://github.com/php/php-src/blob/master/Zend/zend_strtod.c
* Multiple Mozilla projects: https://dxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/misc/dtoa.c

I also tried test building gdtoa in OS X with clang, more than a year ago now, and found some nits fixed in the attached patch. I sent the patch to David Gay but haven't heard back from him, in a year or more.

Note that it is important to use the latest gdtoa (20131209), since the older ones have known issues, e.g.

http://www.exploringbinary.com/how-strtod-works-and-sometimes-doesnt/ (see end)

or are tricky to compile (though Perl is saved from this, since we explicitly avoid strict aliasing, for our own good):

http://patrakov.blogspot.com/2009/03/dont-use-old-dtoac.html

And no, the official gdtoa is not in any kind of version control system. You get a date-stamped tar.

Rick Regan's blog is required reading on these matters, see e.g.

http://www.exploringbinary.com/inconsistent-rounding-of-printed-floating-point-numbers/
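As a small illustration of the kind of behavior at stake, here is a sketch of two checks one could run against a platform's native conversions. The function names `roundtrips` and `format_fixed` are made up for this example. The first relies on the fact that 17 significant decimal digits are enough to identify any IEEE 754 double, so a correctly rounding printf/strtod pair must round-trip. The second formats a value with the platform printf; for exactly representable halfway cases such as 0.125 at two decimals, C libraries have been known to disagree, so no particular output is claimed here.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if d survives double -> "%.17g" text -> strtod unchanged.
 * Assumes the "C" locale (radix character '.') and a finite d. */
int roundtrips(double d)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%.17g", d);
    return strtod(buf, NULL) == d;
}

/* Formats d with prec digits after the point using the platform printf.
 * The result for exact halfway values is implementation-dependent. */
void format_fixed(double d, int prec, char *out, size_t outsz)
{
    snprintf(out, outsz, "%.*f", prec, d);
}
```

Running `format_fixed(0.125, 2, ...)` on several platforms and comparing the strings is a quick way to see whether they agree; a bundled conversion routine would make the answer the same everywhere.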

p5pRT commented Feb 17, 2016

From @jhi

gdtoa-20131209-patches.tgz

p5pRT commented Feb 17, 2016

From @jhi

> I also tried test building gdtoa in OS X with clang, more than a year
> ago now, and found some nits fixed in the attached patch. I sent the
> patch to David Gay but haven't heard back from him, in a year or more.

My apologies to Mr Gay, it hasn't been more than a year, just a few months. Don't know where I pulled the one year from.

p5pRT commented Feb 18, 2016

From @jhi

Adding link to https://rt-archive.perl.org/perl5/Ticket/Display.html?id=127182 since relevant discussion also there.

p5pRT commented Feb 20, 2016

From @jhi

gdtoa 20160219 update from Mr Gay:

http://www.ampl.com/netlib/fp/gdtoa.tgz
http://www.ampl.com/netlib/fp/changes
