printf uses wrong cached number #15273

p5pRT · 2016-04-13T09:12:27Z

Migrated from rt.perl.org#127887 (status was 'open')

Searchable as RT127887$

p5pRT · 2016-04-13T09:12:27Z

From rehsack@gmail.com

Created by rehsack@gmail.com

I have been playing around with number representations to develop and verify a reasonable test for the minmax function of L::MU (List::MoreUtils).

One of my tests was

$ perl -le 'my $x = ~0; printf("%u\n", ++$x)'
18446744073709551615
$ perl -le 'my $x = ~0; print(++$x)'
1.84467440737096e+19

It seems that printf uses the value stored in STRUCT_SV.sv_u.svu_uv without checking whether the scalar actually holds an IV/UV (or the check is broken). Otherwise, the documentation of sprintf is wrong.

Perl Info


Flags:
    category=core
    severity=high

Site configuration information for perl 5.23.9:

Configured by sno at Fri Apr  1 09:20:54 CEST 2016.

Summary of my perl5 (revision 5 version 23 subversion 9) configuration:

  Platform:
    osname=darwin, osvers=15.4.0, archname=darwin-2level
    uname='darwin walter.muppets.liwing.de 15.4.0 darwin kernel version 15.4.0: fri feb 26 22:08:05 pst 2016; root:xnu-3248.40.184~3release_x86_64 x86_64 '
    config_args='-de -Dprefix=/Users/sno/perl5/perlbrew/perls/perl-5.23.9 -Dusedevel -Aeval:scriptdir=/Users/sno/perl5/perlbrew/perls/perl-5.23.9/bin'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector-strong',
    optimize='-O3',
    cppflags='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector-strong'
    ccversion='', gccversion='4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.29)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -fstack-protector-strong'
    libpth=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.3.0/lib/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/usr/lib /usr/lib
    libs=-lpthread -ldbm -ldl -lm -lutil -lc
    perllibs=-lpthread -ldl -lm -lutil -lc
    libc=, so=dylib, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -fstack-protector-strong'



@INC for perl 5.23.9:
    /Users/sno/perl5/perlbrew/perls/perl-5.23.9/lib/site_perl/5.23.9/darwin-2level
    /Users/sno/perl5/perlbrew/perls/perl-5.23.9/lib/site_perl/5.23.9
    /Users/sno/perl5/perlbrew/perls/perl-5.23.9/lib/5.23.9/darwin-2level
    /Users/sno/perl5/perlbrew/perls/perl-5.23.9/lib/5.23.9
    .


Environment for perl 5.23.9:
    DYLD_LIBRARY_PATH (unset)
    HOME=/Users/sno
    LANG=de_DE.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/Users/sno/perl5/perlbrew/bin:/Users/sno/perl5/perlbrew/perls/perl-5.23.9/bin:/Users/sno/bin:/opt/pkg/bin:/opt/pkg/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin:/usr/local/MacGPG2/bin
    PERLBREW_BASHRC_VERSION=0.54
    PERLBREW_HOME=/Users/sno/.perlbrew
    PERLBREW_MANPATH=/Users/sno/perl5/perlbrew/perls/perl-5.23.9/man
    PERLBREW_PATH=/Users/sno/perl5/perlbrew/bin:/Users/sno/perl5/perlbrew/perls/perl-5.23.9/bin
    PERLBREW_PERL=perl-5.23.9
    PERLBREW_ROOT=/Users/sno/perl5/perlbrew
    PERLBREW_VERSION=0.73
    PERL_BADLANG (unset)
    SHELL=/bin/bash



--
Jens Rehsack - rehsack@gmail.com

p5pRT · 2016-04-13T17:11:32Z

From zefram@fysh.org

Jens Rehsack wrote:

$ perl -le 'my $x = ~0; printf("%u\n", ++$x)'
18446744073709551615
$ perl -le 'my $x = ~0; print(++$x)'
1.84467440737096e+19

This looks fine to me. Your ~0 yields a UV of value 2**64-1, the ++
overflows UV range so switches to an NV of value 2**64, then the %u
requests conversion of this value to UV, due to range limit yielding
the maximum UV value 2**64-1. What results were you expecting?
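
The sequence described above can be reproduced step by step. This is only a minimal sketch: the variable names are illustrative, and the exact digits depend on the integer size your perl was built with (ivsize=8 in the report above).

```perl
use strict;
use warnings;

my $x = ~0;                       # all bits set: the largest UV
my $max = sprintf '%u', $x;       # formatted while $x still holds a UV

$x++;                             # UV range exceeded: $x now holds an NV
my $after = sprintf '%u', $x;     # %u converts that NV back to a UV

print "before ++: $max\n";
print "after  ++: $after\n";
print "print shows the NV itself: $x\n";
print "%u output is unchanged\n" if $max eq $after;
```

The last line fires because the conversion requested by %u cannot go above the largest UV, which is the behaviour reported in this ticket.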

-zefram

p5pRT · 2016-04-13T17:11:32Z

The RT System itself - Status changed from 'new' to 'open'

p5pRT · 2016-04-13T17:35:57Z

From rehsack@gmail.com

On 13.04.2016 at 19:11, Zefram via RT <perlbug-followup@perl.org> wrote:

Jens Rehsack wrote:

$ perl -le 'my $x = ~0; printf("%u\n", ++$x)'
18446744073709551615
$ perl -le 'my $x = ~0; print(++$x)'
1.84467440737096e+19

This looks fine to me. Your ~0 yields a UV of value 2**64-1, the ++
overflows UV range so switches to an NV of value 2**64, then the %u
requests conversion of this value to UV, due to range limit yielding
the maximum UV value 2**64-1. What results were you expecting?

I would expect either 0 or NaN or whatever - but not a wrong result.
I precisely understand what happens - but delivering the wrong output
is not an option.

Cheers
--
Jens Rehsack - rehsack@gmail.com

p5pRT · 2016-04-13T17:41:54Z

From zefram@fysh.org

Jens Rehsack wrote:

I would expect either 0 or NaN or whatever - but not a wrong result.

0 would be an equally wrong result. NaN is impossible to get, because
it's not representable in UV. It's necessary to produce a `wrong'
result, because the conversion must yield a UV, and the input value is
not representable as a UV. The only alternative would be an exception
for inexact conversion, and that's not going to happen because perl has
always done coercions permissively.

-zefram

p5pRT · 2016-04-13T17:59:51Z

From @Leont

On Wed, Apr 13, 2016 at 7:35 PM, Jens Rehsack <rehsack@gmail.com> wrote:

I would expect either 0 or NaN or whatever - but not a wrong result.
I precisely understand what happens - but delivering the wrong output
is not an option.

There really isn't any correct answer there. I would support warning in
such a case, but there isn't much more we can do at that point.

Leon

p5pRT · 2016-04-13T18:03:06Z

From rehsack@gmail.com

On 13.04.2016 at 19:41, Zefram via RT <perlbug-followup@perl.org> wrote:

Jens Rehsack wrote:

I would expect either 0 or NaN or whatever - but not a wrong result.

0 would be an equally wrong result. NaN is impossible to get, because
it's not representable in UV.

But a simple, stand alone '-' or '.' would be.

It's necessary to produce a `wrong'
result, because the conversion must yield a UV, and the input value is
not representable as a UV.

It is never necessary to produce a wrong result. Maybe you meant something
different - in that case, please rephrase.

The only alternative would be an exception
for inexact conversion, and that's not going to happen because perl has
always done coercions permissively.

That's not true. The use of printf can be advised against, the documentation can be updated
to point out the potentially wrong result... There are a lot of options.

Cheers
--
Jens Rehsack - rehsack@gmail.com

p5pRT · 2016-04-13T18:25:50Z

From zefram@fysh.org

Jens Rehsack wrote:

On 13.04.2016 at 19:41, Zefram via RT <perlbug-followup@perl.org> wrote:
0 would be an equally wrong result. NaN is impossible to get, because
it's not representable in UV.

But a simple, stand alone '-' or '.' would be.

"-" and "." are equally unrepresentable as UV. You've got 64 bits to
play with, and all 2**64 states are defined to represent numerical values.
There is no state available to represent an error.

You may of course argue that this rather low-level data type (copied
straight from C) makes a poor abstraction for a high-level language.
That would be correct, but it's a question of language design that was
settled for Perl many years ago. It is not a bug, and we are bound to
maintain the existing behaviour.

It is never necessary to produce a wrong result. Maybe you meant something
different - in that case, please rephrase.

Given the input you supplied, there is no `right' result available.
The only options are to yield a `wrong' result or to throw an exception.
As a matter of language design again, it was not necessary for Perl to
be defined to perform inexact coercions, but you are decades too late
to argue that issue. Established behaviour of Perl dictates what the
right coercion results are for out-of-range and non-integer inputs.

The use of printf can be advised against, the documentation can be updated
to point out the potentially wrong result... There are a lot of options.

perlnumber(1) describes the behaviour for conversions where the input
is not exactly representable.

-zefram

p5pRT · 2016-04-13T18:39:28Z

From rehsack@gmail.com

On 13.04.2016 at 20:25, Zefram via RT <perlbug-followup@perl.org> wrote:

Jens Rehsack wrote:

On 13.04.2016 at 19:41, Zefram via RT <perlbug-followup@perl.org> wrote:
0 would be an equally wrong result. NaN is impossible to get, because
it's not representable in UV.

But a simple, stand alone '-' or '.' would be.

"-" and "." are equally unrepresentable as UV. You've got 64 bits to
play with, and all 2**64 states are defined to represent numerical values.
There is no state available to represent an error.

Sure - I simply tried to say that it would be possible to represent an
out-of-range value. What is allowed and not used is "-0".

You may of course argue that this rather low-level data type (copied
straight from C) makes a poor abstraction for a high-level language.
That would be correct, but it's a question of language design that was
settled for Perl many years ago. It is not a bug, and we are bound to
maintain the existing behaviour.

It is never necessary to produce a wrong result. Maybe you meant something
different - in that case, please rephrase.

Given the input you supplied, there is no `right' result available.
The only options are to yield a `wrong' result or to throw an exception.
As a matter of language design again, it was not necessary for Perl to
be defined to perform inexact coercions, but you are decades too late
to argue that issue. Established behaviour of Perl dictates what the
right coercion results are for out-of-range and non-integer inputs.

The use of printf can be advised against, the documentation can be updated
to point out the potentially wrong result... There are a lot of options.

perlnumber(1) describes the behaviour for conversions where the input
is not exactly representable.

Especially such a hint in printf documentation would be very helpful.
I initially said - the behavior should be probably better documented.

Cheers
--
Jens Rehsack - rehsack@gmail.com

p5pRT · 2016-04-13T18:54:15Z

From @kentfredric

On 14 April 2016 at 06:02, Jens Rehsack <rehsack@gmail.com> wrote:

Jens Rehsack wrote:

I would expect either 0 or NaN or whatever - but not a wrong result.

I should point out that you can get a different behaviour if you don't
mind stepping away from Perl's native types.

perl -Mbigint -lE ' my $x = ~0; print(++$x) '
0
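
Note that the 0 here appears to come from bigint evaluating ~0 as the big integer -1, so that ++ yields 0 rather than 2**64. If the goal is to see the value one past the native maximum, a sketch with an explicit Math::BigInt object (a core module; the variable names are illustrative) may be closer to what is wanted:

```perl
use strict;
use warnings;
use Math::BigInt;

# Stringify ~0 first so Math::BigInt receives the exact decimal digits.
my $max  = Math::BigInt->new("" . ~0);
my $next = $max + 1;              # arbitrary precision, no switch to an NV

print "native maximum: $max\n";
print "one past it:    $next\n";

# Print big integers as strings; passing $next to "%u" would convert it
# back to a native number first and run into the same limit again.
```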

--
Kent

KENTNL - https://metacpan.org/author/KENTNL

p5pRT · 2016-04-13T19:04:56Z

From rehsack@gmail.com

On 13.04.2016 at 20:53, Kent Fredric <kentfredric@gmail.com> wrote:

On 14 April 2016 at 06:02, Jens Rehsack <rehsack@gmail.com> wrote:

Jens Rehsack wrote:

I would expect either 0 or NaN or whatever - but not a wrong result.

I should point out that you can get a different behaviour if you don't
mind stepping away from Perl's native types.

perl -Mbigint -lE ' my $x = ~0; print(++$x) '
0

I should try to explain my motivation for filing a ticket:

I tried something very common for me and got a very surprising result.
Following Zefram's hints in perlnumber(1), the behavior is intended for the
language.

But I didn't find an explanation for the surprise within the typical 20
minutes - and I'm not as bad at Perl as others might be.

I think it is important to explain such surprises. So a warning, as Leon
suggested, would be great, and a big notice in sprintf(1) is mandatory.

Cheers
--
Jens Rehsack - rehsack@gmail.com

p5pRT · 2016-05-08T18:03:17Z

From @arc

Jens Rehsack <rehsack@gmail.com> wrote:

I think, it is important to explain such surprises. So a warn as Leon
suggested would be great, a big notice in sprintf(1) mandatory.

I am fairly strongly unconvinced that a warning is appropriate in
these circumstances, on the grounds of historical practice. That is:
Perl has never warned on inexact conversions in sprintf, and I suspect
that starting to emit such warnings now would therefore be too
surprising.

I'm less sure about whether it's a good idea to extend the sprintf
documentation (already two thousand words of text) to explicitly note
this issue. I've attached a patch with the following proposed wording:

Note that you may get surprising results if the value to be
formatted by one of the integer conversions is outside the range of
the relevant C-level type. For example, C<sprintf '%u', 1 + ~0> and
C<sprintf '%u', ~0> will produce the same output, because the result
of C<1 + ~0> will not fit into the underlying C-level type.

However, I don't plan to apply that patch unless this idea receives
positive feedback.

--
Aaron Crane ** http://aaroncrane.co.uk/

p5pRT · 2016-05-08T18:03:17Z

From @arc

0001-perl-127887-sprintf-doc-note-for-out-of-range-intege.patch

From 632c43423fed670e4d9f83daf5f691f5e185dab7 Mon Sep 17 00:00:00 2001
From: Aaron Crane <arc@cpan.org>
Date: Sun, 8 May 2016 18:48:28 +0100
Subject: [PATCH] [perl #127887] sprintf doc note for out-of-range integers

If you ask for an integer conversion, sprintf does the best it can to honour
that; if that's impossible, the results may be surprising.
---
 pod/perlfunc.pod | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod
index e9c7038..b0d2685 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -7850,6 +7850,12 @@ You can find out whether your Perl supports quads via L<Config>:
         print "Nice quads!\n";
     }
 
+Note that you may get surprising results if the value to be formatted by one
+of the integer conversions is outside the range of the relevant C-level
+type. For example, C<sprintf '%u', 1 + ~0> and C<sprintf '%u', ~0> will
+produce the same output, because the result of C<1 + ~0> will not fit into
+the underlying C-level type.
+
 For floating-point conversions (C<e f g E F G>), numbers are usually assumed
 to be the default floating-point size on your platform (double or long double),
 but you can force "long double" with C<q>, C<L>, or C<ll> if your
-- 
2.7.4

p5pRT · 2016-05-08T20:26:10Z

From rehsack@gmail.com

On 08.05.2016 at 20:02, Aaron Crane <arc@cpan.org> wrote:

Jens Rehsack <rehsack@gmail.com> wrote:

I think, it is important to explain such surprises. So a warn as Leon
suggested would be great, a big notice in sprintf(1) mandatory.

I am fairly strongly unconvinced that a warning is appropriate in
these circumstances, on the grounds of historical practice. That is:
Perl has never warned on inexact conversions in sprintf, and I suspect
that starting to emit such warnings now would therefore be too
surprising.

And because it's still 1985, what is must stay :P

Every other language has improved over time - and even K&R-compatible
C compilers warn on loss of precision:

$ cat precision.c
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main()
{
    double d = INT_MAX;
    ++d;
    printf("%u\n", d);
}
$ cc -W -o precision precision.c
precision.c:9:20: warning: format specifies type 'unsigned int' but the argument has type 'double' [-Wformat]
    printf("%u\n", d);
            ~~     ^
            %f
1 warning generated.

I just want to clarify: "It never has, so it never will" isn't an argument
for a living language. Behavior which might cause errors leads to warnings,
and people who know what they are doing and want to suppress them do so by
disabling the specific ones.

I'm less sure about whether it's a good idea to extend the sprintf
documentation (already two thousand words of text) to explicitly note
this issue. I've attached a patch with the following proposed wording:

Note that you may get surprising results if the value to be
formatted by one of the integer conversions is outside the range of
the relevant C-level type. For example, C<sprintf '%u', 1 + ~0> and
C<sprintf '%u', ~0> will produce the same output, because the result
of C<1 + ~0> will not fit into the underlying C-level type.

However, I don't plan to apply that patch unless this idea receives
positive feedback.

--
Aaron Crane ** http://aaroncrane.co.uk/
<0001-perl-127887-sprintf-doc-note-for-out-of-range-intege.patch>

Cheers
--
Jens Rehsack - rehsack@gmail.com

p5pRT · 2016-05-09T00:30:35Z

From @sisyphus

-----Original Message-----
From: Jens Rehsack
Sent: Monday, May 09, 2016 6:25 AM
To: Aaron Crane
Cc: PerlBug Followup via RT
Subject: Re: [perl #127887] printf uses wrong cached number

$ cat precision.c
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main()
{
    double d = INT_MAX;
    ++d;
    printf("%u\n", d);
}
$ cc -W -o precision precision.c
precision.c:9:20: warning: format specifies type 'unsigned int' but the
argument has type 'double' [-Wformat]
    printf("%u\n", d);
            ~~     ^
            %f
1 warning generated.

But that's a compile time warning that will be generated irrespective of the
value that d will hold.
I get the same C compiler warning with:

printf("%u\n", (double)0);

For perl, AFAICS, the closest we could get to that is to have a runtime
warning emitted based on changes to an NV's flags (and not warning whenever
IV/UV value is equivalent to the NV value).
Is that what you're requesting ?
Seems reasonable to me. After all, it could well signify unintended or
erroneous usage.

In the meantime, I'm personally not too distressed about perl just silently
giving me what I asked for.
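
For anyone who does want a check in the meantime, one can be approximated in user code. The following is a rough sketch only: fits_in_uv and checked_uint are hypothetical helper names, not part of Perl, and the range test goes through the core Math::BigInt module to avoid comparing native numbers close to the UV limit. It does not try to handle Inf or NaN.

```perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper: true when $value lies within 0 .. ~0, the range
# that "%u" can represent.
sub fits_in_uv {
    my ($value) = @_;
    # Integers stringify exactly; anything else (an NV) is expanded with
    # "%.0f" so that Math::BigInt sees plain digits, not 1.8e+19.
    my $digits = "$value" =~ /^-?\d+$/ ? "$value" : sprintf('%.0f', $value);
    my $big = Math::BigInt->new($digits);
    return $big >= 0 && $big <= Math::BigInt->new("" . ~0);
}

# Hypothetical wrapper: formats like "%u", but warns first when the
# value is outside the representable range.
sub checked_uint {
    my ($value) = @_;
    warn "value $value is outside 0..", ~0, "; %u cannot print it exactly\n"
        unless fits_in_uv($value);
    return sprintf '%u', $value;
}

my $over = ~0;
$over++;                              # one past the largest UV, now an NV
print checked_uint(42), "\n";         # in range, no warning
print checked_uint($over), "\n";      # warns, then prints what %u produces
```

This only covers the range question; a non-integer such as 3.7 is in range and passes silently, so an inexactness check would need a separate test.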

Cheers,
Rob

p5pRT · 2016-05-09T07:29:25Z

From rehsack@gmail.com

On 09.05.2016 at 02:29, <sisyphus1@optusnet.com.au> wrote:

-----Original Message----- From: Jens Rehsack
Sent: Monday, May 09, 2016 6:25 AM
To: Aaron Crane
Cc: PerlBug Followup via RT
Subject: Re: [perl #127887] printf uses wrong cached number

$ cat precision.c
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main()
{
    double d = INT_MAX;
    ++d;
    printf("%u\n", d);
}
$ cc -W -o precision precision.c
precision.c:9:20: warning: format specifies type 'unsigned int' but the argument has type 'double' [-Wformat]
    printf("%u\n", d);
            ~~     ^
            %f
1 warning generated.

But that's a compile time warning that will be generated irrespective of the value that d will hold.
I get the same C compiler warning with:

printf("%u\n", (double)0);

Precisely, because it has strict typing. Perl has dynamic typing, which causes the above described behavior ...

For perl, AFAICS, the closest we could get to that is to have a runtime warning emitted based on changes to an NV's flags (and not warning whenever IV/UV value is equivalent to the NV value).
Is that what you're requesting ?

Yes

Seems reasonable to me. After all, it could well signify unintended or erroneous usage.

In the meantime, I'm personally not too distressed about perl just silently giving me what I asked for.

As maintainer of several modules I receive requests reporting those silent automatisms are surprising to a lot of people.
And the "once" warning is a very good example where an explicit request to do something leads to a warning.

Cheers
--
Jens Rehsack - rehsack@gmail.com

p5pRT · 2017-12-16T23:33:04Z

From zefram@fysh.org

The analogy with the C warning is bogus: C isn't warning about
inexact conversion, but about the printf() argument having the wrong
representation for the format, such that the argument will be completely
misunderstood and printf() may fail to locate any subsequent arguments.

I'm unenthusiastic about the proposed doc patch. The sprintf()
doc already refers many times to C and to the concept of conversion.
The existing documentation about numeric conversion seems adequate.

This ticket should be closed.

-zefram

p5pRT · 2017-12-18T08:24:37Z

From rehsack@gmail.com

On 17.12.2017 at 00:33, Zefram via RT <perlbug-followup@perl.org> wrote:

The analogy with the C warning is bogus: C isn't warning about
inexact conversion, but about the printf() argument having the wrong
representation for the format, such that the argument will be completely
misunderstood and printf() may fail to locate any subsequent arguments.

I'm unenthusiastic about the proposed doc patch. The sprintf()
doc already refers many times to C and to the concept of conversion.
The existing documentation about numeric conversion seems adequate.

I scanned the sprintf doc with respect to integer formats (scanned, not
studied in the sense of reading every sentence and trying to interpret
what could be meant). Nothing in the explanation of the integer formats
says: if you use this, we will round whatever you gave us to the nearest
value fitting the appropriate range.

This ticket should be closed.

I don't think so.

People complain and complain about the bad standing of perl at OSS
conferences - this is one of the issues: for a language which is very
close to system people, it behaves completely unexpectedly, and this is
neither mentioned loudly enough nor is a warning about an unexpected
conversion given.

I do not (really) ask for the behavior of Perl to be modified; I ask for
people who do not hack the Perl core day by day to be warned about
unexpected conversions.

And - if this is not imaginable - document it very loudly. Mind that
the people who will be surprised don't study the design principles
before writing a 5-liner that counts occurrences of particular error
messages in log files.

Cheers
--
Jens Rehsack - rehsack@gmail.com

p5pRT added Severity High distro-darwin type-core labels Oct 19, 2019

xenu removed affects-5.23 labels Nov 19, 2021
