Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'Data::Dumper' reports double value as string. #9610

Open
p5pRT opened this issue Dec 27, 2008 · 18 comments
Open

'Data::Dumper' reports double value as string. #9610

p5pRT opened this issue Dec 27, 2008 · 18 comments
Labels
dist-Data-Dumper issues in the dual-life blead-first Data-Dumper distribution type-library

Comments

@p5pRT
Copy link

p5pRT commented Dec 27, 2008

Migrated from rt.perl.org#61754 (status was 'open')

Searchable as RT61754$

@p5pRT
Copy link
Author

p5pRT commented Dec 27, 2008

From 0body0@rambler.ru

Created by 0body0@rambler.ru

$ perl -MData​::Dumper -e'print Dumper(1.0)'
$VAR1 = '1';
must be​:
$VAR1 = 1;

===Note single quotes===

Best regard, grian

Perl Info

Flags:
    category=library
    severity=low

Site configuration information for perl 5.10.0:

Configured by gtoly at Sat Nov  1 16:44:39 MSK 2008.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.6.24.4, archname=x86_64-linux-thread-multi
    uname='linux toh-dev 2.6.24.4 #2 smp mon oct 6 15:36:34 msd 2008 x86_64 amd opteron(tm) processor 852 authenticamd gnulinux '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O3',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='4.1.2 (Gentoo 4.1.2 p1.0.2)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64 /usr/local/lib64
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.6.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.6.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O3 -L/usr/local/lib'

Locally applied patches:
    


@INC for perl 5.10.0:
    /home/gtoly/lib
    /usr/local/lib/perl5/5.10.0/x86_64-linux-thread-multi
    /usr/local/lib/perl5/5.10.0
    /usr/local/lib/perl5/site_perl/5.10.0/x86_64-linux-thread-multi
    /usr/local/lib/perl5/site_perl/5.10.0
    .


Environment for perl 5.10.0:
    HOME=/home/gtoly
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/opt/bin:~/bin:/home/toh/toh.ru/bin
    PERL5LIB=/home/gtoly/lib
    PERL5OPT= -M5.010 -MData::Dumper
    PERL_BADLANG (unset)
    SHELL=/bin/bash


@p5pRT
Copy link
Author

p5pRT commented Dec 27, 2008

From @eserte

Anatoly Grishaev (via RT) <perlbug-followup@​perl.org> writes​:

# <URL​: http​://rt.perl.org/rt3/Ticket/Display.html?id=61754 >

$ perl -MData​::Dumper -e'print Dumper(1.0)'
$VAR1 = '1';
must be​:
$VAR1 = 1;

===Note single quotes===

Note that the result depends whether the XS or the pure perl
implementation is used​:

$ perl -MData​::Dumper -e'$Data​::Dumper​::Useperl=0; print Dumper(1.0)'
$VAR1 = '1';
$ perl -MData​::Dumper -e'$Data​::Dumper​::Useperl=1; print Dumper(1.0)'
$VAR1 = 1;

Regards,
  Slaven

--
Slaven Rezic - slaven <at> rezic <dot> de
  BBBike - route planner for cyclists in Berlin
  WWW version​: http​://www.bbbike.de
  Perl/Tk version for Unix and Windows​: http​://bbbike.sourceforge.net

@p5pRT
Copy link
Author

p5pRT commented Dec 27, 2008

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Dec 29, 2008

From module@renee-baecker.de

On Sa. 27. Dez. 2008, 02​:40​:54, grian wrote​:

[Please enter your report here]

$ perl -MData​::Dumper -e'print Dumper(1.0)'
$VAR1 = '1';
must be​:
$VAR1 = 1;

===Note single quotes===

Is this really a bug?

I know that the behaviour differs with Dumper.xs and Dumper.pm.

This is current behaviour​:

C​:\>perl -MData​::Dumper -e "print Dumper 1.0"
$VAR1 = '1';

C​:\>perl -MData​::Dumper -e "print Dumper 1.01"
$VAR1 = '1.01';

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper 1.01"
$VAR1 = '1.01';

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper 1.0"
$VAR1 = 1;

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper
123456789.0"
$VAR1 = 123456789;

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper
1234567890.0"
$VAR1 = '1234567890';

You can see that even with the Perl implementation of Dumper it uses
single quotes if the number has more than 9 digits. This is due to this
line​:

  elsif ($val =~ /^(?​:0|-?[1-9]\d{0,8})\z/) { # safe decimal number
  $out .= $val;
  }

If you want to get rid of the single quotes for decimals that ends with
".0" you have to remove the differentiation of Perl < 5.6 and Perl > 5.6
for the "DD_is_integer" definition in Dumper.xs.

@p5pRT
Copy link
Author

p5pRT commented Dec 31, 2008

From 0body0@rambler.ru

And what is a bug?

My only motivation to see numerical scalars without single quotes if this
don't cause lost precision.
And every number is a string. But not every string is a number.

A thus examle of is ok then big numbers dump with quotes

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper
1234567890.0"
$VAR1 = '1234567890';

Desired behaviour

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper
1234567890.0"
$VAR1 = '1234567890';
C​:\>perl -MData​::Dumper -e "print Dumper 1234567890.0"
$VAR1 = 1234567890;

and

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper 1.0"
$VAR1 = 1;
C​:\>perl -MData​::Dumper -e "print Dumper 1.0"
$VAR1 = 1;
and for string

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper '1.0'"
$VAR1 = '1.0';
C​:\>perl -MData​::Dumper -e "print Dumper '1.0'"
$VAR1 = '1.0';

and may be

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper ('1.5',
1.5)"
$VAR1 = 1.5;
$VAR2 = 1.5;
C​:\>perl -MData​::Dumper -e "print Dumper '1.0'"
$VAR1 = '1.5';
$VAR2 = 1.5;

and may be

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper
('1.5123456', 1.5123456)"
$VAR1 = '1.5123456';
$VAR2 = '1.5123456';
C​:\>perl -MData​::Dumper -e "print Dumper '1.0'"
$VAR1 = '1.5123456';
$VAR2 = 1.5123456;

In last case xs implementation will be more correct

maybe change this string
elsif ($val =~ /^(?​:0|-?[1-9]\d{0,8})\z/) { # safe decimal number
  $out .= $val;
  }
on
elsif ($val =~ /^(?​:0|-?[1-9]\d{0,8})(?​:\.\d*[1-9]\z/) { # safe decimal
number

and appropriate change in xs, and tests

I can provide a patch for this change if anybody tell me how do it( I use
svn, and may be can use git)

I think differences with perl and xs implementation will be less, and
results will be more accurate and nothing will be broken. IMHO

Happy New Year.

----- Original Message -----
From​: "reneeb via RT" <perlbug-followup@​perl.org>
To​: <0body0@​rambler.ru>
Sent​: Monday, December 29, 2008 5​:27 PM
Subject​: [perl #61754] 'Data​::Dumper' reports double value as string.

On Sa. 27. Dez. 2008, 02​:40​:54, grian wrote​:

[Please enter your report here]

$ perl -MData​::Dumper -e'print Dumper(1.0)'
$VAR1 = '1';
must be​:
$VAR1 = 1;

===Note single quotes===

Is this really a bug?

You can see that even with the Perl implementation of Dumper it uses
single quotes if the number has more than 9 digits. This is due to this
line​:

elsif ($val =~ /^(?​:0|-?[1-9]\d{0,8})\z/) { # safe decimal number
$out .= $val;
}

If you want to get rid of the single quotes for decimals that ends with
".0" you have to remove the differentiation of Perl < 5.6 and Perl > 5.6
for the "DD_is_integer" definition in Dumper.xs.

I know that the behaviour differs with Dumper.xs and Dumper.pm.

This is current behaviour​:

C​:\>perl -MData​::Dumper -e "print Dumper 1.0"
$VAR1 = '1';

C​:\>perl -MData​::Dumper -e "print Dumper 1.01"
$VAR1 = '1.01';

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper 1.01"
$VAR1 = '1.01';

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper 1.0"
$VAR1 = 1;

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper
123456789.0"
$VAR1 = 123456789;

C​:\>perl -MData​::Dumper -e "$Data​::Dumper​::Useperl=1; print Dumper
1234567890.0"
$VAR1 = '1234567890';

@p5pRT
Copy link
Author

p5pRT commented Jan 12, 2009

From module@renee-baecker.de

On Mi. 31. Dez. 2008, 12​:47​:08, grian wrote​:

In last case xs implementation will be more correct

maybe change this string
elsif ($val =~ /^(?​:0|-?[1-9]\d{0,8})\z/) { # safe decimal number
$out .= $val;
}
on
elsif ($val =~ /^(?​:0|-?[1-9]\d{0,8})(?​:\.\d*[1-9]\z/) { # safe
decimal
number

and appropriate change in xs, and tests

I can provide a patch for this change if anybody tell me how do it( I
use
svn, and may be can use git)

please provide a unified diff (diff -u <original_version>
<new_version>). The paths to the files should have the same number of
slashes (e.g. diff -u file.orig file.new or diff -u dir/file.orig
dir/file.new).

Then you can send the diff to this ticket system.

I think differences with perl and xs implementation will be less, and
results will be more accurate and nothing will be broken. IMHO

@p5pRT
Copy link
Author

p5pRT commented Jan 12, 2009

From 0body0@rambler.ru

Patch for Data​::Dumper (Fix decimal printing)

<grian@​cpan.org>

----- Original Message -----
From​: "reneeb via RT" <perlbug-followup@​perl.org>
To​: <0body0@​rambler.ru>
Sent​: Monday, January 12, 2009 12​:26 PM
Subject​: [perl #61754] 'Data​::Dumper' reports double value as string.

On Mi. 31. Dez. 2008, 12​:47​:08, grian wrote​:

In last case xs implementation will be more correct

maybe change this string
elsif ($val =~ /^(?​:0|-?[1-9]\d{0,8})\z/) { # safe decimal number
$out .= $val;
}
on
elsif ($val =~ /^(?​:0|-?[1-9]\d{0,8})(?​:\.\d*[1-9]\z/) { # safe
decimal
number

and appropriate change in xs, and tests

I can provide a patch for this change if anybody tell me how do it( I
use
svn, and may be can use git)

please provide a unified diff (diff -u <original_version>
<new_version>). The paths to the files should have the same number of
slashes (e.g. diff -u file.orig file.new or diff -u dir/file.orig
dir/file.new).

Then you can send the diff to this ticket system.

I think differences with perl and xs implementation will be less, and
results will be more accurate and nothing will be broken. IMHO

@p5pRT
Copy link
Author

p5pRT commented Jan 12, 2009

@p5pRT
Copy link
Author

p5pRT commented Jan 24, 2009

From 0body0@rambler.ru

Created by 0body0@rambler.ru

$ perl -MData​::Dumper -e'print Dumper(1.0)'
$VAR1 = '1';
must be​:
$VAR1 = 1;

===Note single quotes===

Best regard, grian

Perl Info

Flags:
    category=library
    severity=low

Site configuration information for perl 5.10.0:

Configured by gtoly at Sat Nov  1 16:44:39 MSK 2008.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.6.24.4, archname=x86_64-linux-thread-multi
    uname='linux toh-dev 2.6.24.4 #2 smp mon oct 6 15:36:34 msd 2008 x86_64 amd opteron(tm) processor 852 authenticamd gnulinux '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O3',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='4.1.2 (Gentoo 4.1.2 p1.0.2)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64 /usr/local/lib64
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.6.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.6.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O3 -L/usr/local/lib'

Locally applied patches:
    


@INC for perl 5.10.0:
    /home/gtoly/lib
    /usr/local/lib/perl5/5.10.0/x86_64-linux-thread-multi
    /usr/local/lib/perl5/5.10.0
    /usr/local/lib/perl5/site_perl/5.10.0/x86_64-linux-thread-multi
    /usr/local/lib/perl5/site_perl/5.10.0
    .


Environment for perl 5.10.0:
    HOME=/home/gtoly
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/opt/bin:~/bin:/home/toh/toh.ru/bin
    PERL5LIB=/home/gtoly/lib
    PERL5OPT= -M5.010 -MData::Dumper
    PERL_BADLANG (unset)
    SHELL=/bin/bash


@p5pRT
Copy link
Author

p5pRT commented Feb 4, 2012

From @cpansprout

On Mon Jan 12 04​:21​:34 2009, grian wrote​:

Patch for Data​::Dumper (Fix decimal printing)

<grian@​cpan.org>

The patch attached to this ticket has been here for three years. I�ve
only just noticed it. It looks fine to me at first. Does anyone else
have an opinion on it?

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Feb 4, 2012

From @cpansprout

On Fri Feb 03 23​:14​:39 2012, sprout wrote​:

On Mon Jan 12 04​:21​:34 2009, grian wrote​:

Patch for Data​::Dumper (Fix decimal printing)

<grian@​cpan.org>

The patch attached to this ticket has been here for three years. I�ve
only just noticed it. It looks fine to me at first. Does anyone else
have an opinion on it?

And if anyone else wants to apply it, please don�t assume I will, but go
ahead. I will be away from the computer for about a week.

(And I need to answer Nicholas Clark�s mail about continue, but that
will have to wait, too.)

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Feb 4, 2012

From @nwc10

On Fri, Feb 03, 2012 at 11​:14​:40PM -0800, Father Chrysostomos via RT wrote​:

On Mon Jan 12 04​:21​:34 2009, grian wrote​:

Patch for Data​::Dumper (Fix decimal printing)

<grian@​cpan.org>

The patch attached to this ticket has been here for three years. I've
only just noticed it. It looks fine to me at first. Does anyone else
have an opinion on it?

Sorry, I missed this one too.

If I understand the original bug report, this patch is going a lot further.
The original report was that these two differed​:

$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1.0)'
$VAR1 = '1';
$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1)'
$VAR1 = 1;

and then the observation was made that it's the XS code. The pure perl
version is the same​:

$ ./perl -Ilib -MData​::Dumper -e '$Data​::Dumper​::Useperl = 1; print Dumper(1.0)'
$VAR1 = 1;
$ ./perl -Ilib -MData​::Dumper -e '$Data​::Dumper​::Useperl = 1; print Dumper(1)'
$VAR1 = 1;

Extracting the patches from the gzip'd tarfile, and tweaking it to apply​:

Inline Patch
diff --git a/dist/Data-Dumper/Dumper.pm b/dist/Data-Dumper/Dumper.pm
index fcf06ad..52d1f6c 100644
--- a/dist/Data-Dumper/Dumper.pm
+++ b/dist/Data-Dumper/Dumper.pm
@@ -542,7 +542,7 @@ sub _dump {
        and ref $ref eq 'VSTRING' || eval{Scalar::Util::isvstring($val)}) {
       $out .= sprintf "%vd", $val;
     }
-    elsif ($val =~ /^(?:0|-?[1-9]\d{0,8})\z/) { # safe decimal number
+    elsif ($val =~ /^(?:0|-?[1-9]\d{0,8})(?:\.\d{0,5}[1-9])?\z/) { # safe decimal number
       $out .= $val;
     }
     else {				 # string
diff --git a/dist/Data-Dumper/Dumper.xs b/dist/Data-Dumper/Dumper.xs
index 2ad53a1..8fe8e70 100644
--- a/dist/Data-Dumper/Dumper.xs
+++ b/dist/Data-Dumper/Dumper.xs
@@ -61,6 +61,8 @@ Perl_utf8_to_uvchr(pTHX_ U8 *s, STRLEN *retlen)
 #define DD_is_integer(sv) SvIOK(sv)
 #endif
 
+#define DD_is_double(sv) SvNOK(sv)
+
 /* does a string need to be protected? */
 static I32
 needs_quote(register const char *s, STRLEN len)
@@ -891,10 +893,12 @@ DD_dump(pTHX_ SV *val, const char *name, STRLEN namelen, SV *retval, HV *seenhv,
 	    }
 	}
 
-        if (DD_is_integer(val)) {
+        if (DD_is_integer(val) || DD_is_double(val)) {
             STRLEN len;
 	    if (SvIsUV(val))
 	      len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"UVuf, SvUV(val));
+	    else if (DD_is_double(val)) 
+	      len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"NVgf, SvNV(val));
 	    else
 	      len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"IVdf, SvIV(val));
             if (SvPOK(val)) {



The patch as-is goes too far.

New result is

$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1.1)'
$VAR1 = 1.1;
$ ./perl -Ilib -MData​::Dumper -e '$Data​::Dumper​::Useperl = 1; print Dumper(1.1)'
$VAR1 = 1.1;

Whereas the existing code is
$ perl -MData​::Dumper -e 'print Dumper(1.1)'
$VAR1 = '1.1';
$ perl -MData​::Dumper -e '$Data​::Dumper​::Useperl = 1; print Dumper(1.1)'
$VAR1 = '1.1';

ie '' used on floating point values for decimals.

I don't think that we should change behaviour to use floating point literals.
If we did this, we'd implicitly add a round trip conversion in the reading
code (parser converts decimals to binary floating point, sv_2pv() converts
binary floating point back to decimal), which may introduce errors.

I think that a correct fix would only be a change to Dumper.xs, and is
*something* like this​:

Inline Patch
diff --git a/dist/Data-Dumper/Dumper.xs b/dist/Data-Dumper/Dumper.xs
index 2ad53a1..1746aa6 100644
--- a/dist/Data-Dumper/Dumper.xs
+++ b/dist/Data-Dumper/Dumper.xs
@@ -891,12 +891,18 @@ DD_dump(pTHX_ SV *val, const char *name, STRLEN namelen, SV *retval, HV *seenhv,
 	    }
 	}
 
-        if (DD_is_integer(val)) {
+        if (DD_is_integer(val)
+	    || (SvNOK(val) && -1e10 < SvNVX(sv) && SvNVX(sv) < 1e10
+		&& SvNVX(sv) == Perl_floor(SvNVX(sv)))) {
             STRLEN len;
-	    if (SvIsUV(val))
-	      len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"UVuf, SvUV(val));
-	    else
-	      len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"IVdf, SvIV(val));
+	    if (DD_is_integer(val)) {
+		if (SvIsUV(val))
+		    len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"UVuf, SvUV(val));
+		else
+		    len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"IVdf, SvIV(val));
+	    } else {
+		len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"NVgf, SvNVX(val));
+	    }
             if (SvPOK(val)) {
               /* Need to check to see if this is a string such as " 0".
                  I'm assuming from sprintf isn't going to clash with utf8.


which gives:

$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1.0)'
$VAR1 = 1;
$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1.1)'
$VAR1 = '1.1';
$ ./perl -Ilib -MData​::Dumper -e '$Data​::Dumper​::Useperl = 1; print Dumper(1.0)'
$VAR1 = 1;
$ ./perl -Ilib -MData​::Dumper -e '$Data​::Dumper​::Useperl = 1; print Dumper(1.1)'
$VAR1 = '1.1';

ie only use unquoted literals for integers with 9 or fewer digits, and
keep using the existing code for anything else. The reason for the range
checks on ±1e10 are because that integer code path has this a bit further​:

  if (len > 10) {
  /* Looks like we're on a 64 bit system. Make it a string so that
  if a 32 bit system reads the number it will cope better. */
  sv_catpvf(retval, "'%s'", tmpbuf);
  } else
  sv_catpvn(retval, tmpbuf, len);

and checking that the floating point value is less than ±2**53 before doing
anything avoids all sorts of possible errors when at the limit of mantissa.

However that "%g"NVgf is not correct for larger values, and I'm not sure
what is, as I've just broken this​:

$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(12345678901.1)'
$VAR1 = '12345678901.1';
$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1234567890.1)'
$VAR1 = '1234567890.1';
$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(12345678901.0)'
$VAR1 = '12345678901';
$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1234567890.0)'
$VAR1 = '1.23457e+09';

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented Feb 7, 2012

From 0body0@rambler.ru

04.02.2012 12​:10, Nicholas Clark via RT пиÑ�еÑ�​:

On Fri, Feb 03, 2012 at 11​:14​:40PM -0800, Father Chrysostomos via RT wrote​:

On Mon Jan 12 04​:21​:34 2009, grian wrote​:

Patch for Data​::Dumper (Fix decimal printing)

<grian@​cpan.org>
The patch attached to this ticket has been here for three years. I've
only just noticed it. It looks fine to me at first. Does anyone else
have an opinion on it?
Sorry, I missed this one too.

If I understand the original bug report, this patch is going a lot further.
The original report was that these two differed​:

$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1.0)'
$VAR1 = '1';
$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1)'
$VAR1 = 1;

and then the observation was made that it's the XS code. The pure perl
version is the same​:

$ ./perl -Ilib -MData​::Dumper -e '$Data​::Dumper​::Useperl = 1; print Dumper(1.0)'
$VAR1 = 1;
$ ./perl -Ilib -MData​::Dumper -e '$Data​::Dumper​::Useperl = 1; print Dumper(1)'
$VAR1 = 1;

Extracting the patches from the gzip'd tarfile, and tweaking it to apply​:

diff --git a/dist/Data-Dumper/Dumper.pm b/dist/Data-Dumper/Dumper.pm
index fcf06ad..52d1f6c 100644
--- a/dist/Data-Dumper/Dumper.pm
+++ b/dist/Data-Dumper/Dumper.pm
@​@​ -542,7 +542,7 @​@​ sub _dump {
and ref $ref eq 'VSTRING' || eval{Scalar​::Util​::isvstring($val)}) {
$out .= sprintf "%vd", $val;
}
- elsif ($val =~ /^(?​:0|-?[1-9]\d{0,8})\z/) { # safe decimal number
+ elsif ($val =~ /^(?​:0|-?[1-9]\d{0,8})(?​:\.\d{0,5}[1-9])?\z/) { # safe decimal number
$out .= $val;
}
else { # string
diff --git a/dist/Data-Dumper/Dumper.xs b/dist/Data-Dumper/Dumper.xs
index 2ad53a1..8fe8e70 100644
--- a/dist/Data-Dumper/Dumper.xs
+++ b/dist/Data-Dumper/Dumper.xs
@​@​ -61,6 +61,8 @​@​ Perl_utf8_to_uvchr(pTHX_ U8 *s, STRLEN *retlen)
#define DD_is_integer(sv) SvIOK(sv)
#endif

+#define DD_is_double(sv) SvNOK(sv)
+
/* does a string need to be protected? */
static I32
needs_quote(register const char *s, STRLEN len)
@​@​ -891,10 +893,12 @​@​ DD_dump(pTHX_ SV *val, const char *name, STRLEN namelen, SV *retval, HV *seenhv,
}
}

- if (DD_is_integer(val)) {
+ if (DD_is_integer(val) || DD_is_double(val)) {
STRLEN len;
if (SvIsUV(val))
len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"UVuf, SvUV(val));
+ else if (DD_is_double(val))
+ len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"NVgf, SvNV(val));
else
len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"IVdf, SvIV(val));
if (SvPOK(val)) {

The patch as-is goes too far.

Maybe change perl version of Data​::Dumper was not good idea ...
This part may be missed because it is not practical( Everyone use xs
version of Data​::Dumper).
I have done some test and see short numbers ( matching perl regexp ) are
converted from string to number and back without visible effects.

New result is

$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1.1)'
$VAR1 = 1.1;
$ ./perl -Ilib -MData​::Dumper -e '$Data​::Dumper​::Useperl = 1; print Dumper(1.1)'
$VAR1 = 1.1;

Whereas the existing code is
$ perl -MData​::Dumper -e 'print Dumper(1.1)'
$VAR1 = '1.1';
$ perl -MData​::Dumper -e '$Data​::Dumper​::Useperl = 1; print Dumper(1.1)'
$VAR1 = '1.1';

ie '' used on floating point values for decimals.

I don't think that we should change behaviour to use floating point literals.
If we did this, we'd implicitly add a round trip conversion in the reading
code (parser converts decimals to binary floating point, sv_2pv() converts
binary floating point back to decimal), which may introduce errors.

I think that a correct fix would only be a change to Dumper.xs, and is
*something* like this​:

diff --git a/dist/Data-Dumper/Dumper.xs b/dist/Data-Dumper/Dumper.xs
index 2ad53a1..1746aa6 100644
--- a/dist/Data-Dumper/Dumper.xs
+++ b/dist/Data-Dumper/Dumper.xs
@​@​ -891,12 +891,18 @​@​ DD_dump(pTHX_ SV *val, const char *name, STRLEN namelen, SV *retval, HV *seenhv,
}
}

- if (DD_is_integer(val)) {
+ if (DD_is_integer(val)
+ || (SvNOK(val)&& -1e10< SvNVX(sv)&& SvNVX(sv)< 1e10
+ && SvNVX(sv) == Perl_floor(SvNVX(sv)))) {
STRLEN len;
- if (SvIsUV(val))
- len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"UVuf, SvUV(val));
- else
- len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"IVdf, SvIV(val));
+ if (DD_is_integer(val)) {
+ if (SvIsUV(val))
+ len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"UVuf, SvUV(val));
+ else
+ len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"IVdf, SvIV(val));
+ } else {
+ len = my_snprintf(tmpbuf, sizeof(tmpbuf), "%"NVgf, SvNVX(val));
+ }
if (SvPOK(val)) {
/* Need to check to see if this is a string such as " 0".
I'm assuming from sprintf isn't going to clash with utf8.

I looks that in the expression​: missed fabs() function call

(SvNOK(val)&& -1e10< SvNVX(sv)&& SvNVX(sv)< 1e10
  && SvNVX(sv) == Perl_floor(SvNVX(sv))))

which gives​:

$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1.0)'
$VAR1 = 1;
$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1.1)'
$VAR1 = '1.1';
$ ./perl -Ilib -MData​::Dumper -e '$Data​::Dumper​::Useperl = 1; print Dumper(1.0)'
$VAR1 = 1;
$ ./perl -Ilib -MData​::Dumper -e '$Data​::Dumper​::Useperl = 1; print Dumper(1.1)'
$VAR1 = '1.1';

ie only use unquoted literals for integers with 9 or fewer digits, and
keep using the existing code for anything else. The reason for the range
checks on ±1e10 are because that integer code path has this a bit further​:

         if \(len>  10\) \{
           /\* Looks like we're on a 64 bit system\.  Make it a string so that
              if a 32 bit system reads the number it will cope better\.  \*/
           sv\_catpvf\(retval\, "'%s'"\, tmpbuf\);
         \} else
           sv\_catpvn\(retval\, tmpbuf\, len\);

and checking that the floating point value is less than ±2**53 before doing
anything avoids all sorts of possible errors when at the limit of mantissa.

However that "%g"NVgf is not correct for larger values, and I'm not sure
what is, as I've just broken this​:

$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(12345678901.1)'
$VAR1 = '12345678901.1';
$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1234567890.1)'
$VAR1 = '1234567890.1';
$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(12345678901.0)'
$VAR1 = '12345678901';
$ ./perl -Ilib -MData​::Dumper -e 'print Dumper(1234567890.0)'
$VAR1 = '1.23457e+09';
This is ok. It is other original bug and it is much more difficult to
fix than this patch. It is better to document this behavior ( when
numbers are longer 10 symbols )
than to find solution for all machines and FPU. More over it seems
impractical to me( It is rare cases ).

For most common cases it better apply this patch and left note about
bound cases in documentation. ( I am not native English speaker I can't
make this one)

Nicholas Clark

--
С �важением �на�олий.

@p5pRT
Copy link
Author

p5pRT commented Feb 7, 2012

From @nwc10

On Tue Feb 07 06​:54​:25 2012, grian wrote​:

04.02.2012 12​:10, Nicholas Clark via RT пиÑ�еÑ�​:

diff --git a/dist/Data-Dumper/Dumper.xs b/dist/Data-Dumper/Dumper.xs
index 2ad53a1..1746aa6 100644
--- a/dist/Data-Dumper/Dumper.xs
+++ b/dist/Data-Dumper/Dumper.xs
@​@​ -891,12 +891,18 @​@​ DD_dump(pTHX_ SV *val, const char *name,
STRLEN namelen, SV *retval, HV *seenhv,
}
}

- if (DD_is_integer(val)) {
+ if (DD_is_integer(val)
+ || (SvNOK(val)&& -1e10< SvNVX(sv)&& SvNVX(sv)< 1e10
+ && SvNVX(sv) == Perl_floor(SvNVX(sv)))) {

I looks that in the expression​: missed fabs() function call

(SvNOK(val)&& -1e10< SvNVX(sv)&& SvNVX(sv)< 1e10
&& SvNVX(sv) == Perl_floor(SvNVX(sv))))

I didn't think so. My intent was that

  -1e10 < SvNVX(sv) && SvNVX(sv) < 1e10

should be equivalent to

  fabs(SvNVX(sv)) < 1.e0

except that

a) it doesn't assume that SvNVX(sv) is of type double
  (it might be long double, depending on the configuration)
b) it's clearer than Perl_fabs() [which will be long double on
  such a configuration], but one needs to know what Perl_fabs() is
c) it will not be true if SvNVX(sv) is a NaN.
  I'd guess that fabs() return NaN for a Nan, but I would have had
  to look that up, *and* I wouldn't trust it as much not to be
  buggy. (Then again, Intel's icc compiler's default optimiser
  setting is to be buggy by ignoring IEEE NaNs)

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented Feb 7, 2012

From @nwc10

[and to the list this time]

On Tue Feb 07 06​:54​:25 2012, grian wrote​:

04.02.2012 12​:10, Nicholas Clark via RT пиÑ�еÑ�​:

- if (DD_is_integer(val)) {
+ if (DD_is_integer(val)
+ || (SvNOK(val)&& -1e10< SvNVX(sv)&& SvNVX(sv)< 1e10
+ && SvNVX(sv) == Perl_floor(SvNVX(sv)))) {

I looks that in the expression​: missed fabs() function call

(SvNOK(val)&& -1e10< SvNVX(sv)&& SvNVX(sv)< 1e10
&& SvNVX(sv) == Perl_floor(SvNVX(sv))))

I didn't think so. My intent was that

-1e10 < SvNVX(sv) && SvNVX(sv) < 1e10

should be equivalent to

fabs(SvNVX(sv)) < 1.e0

except that

a) it doesn't assume that SvNVX(sv) is of type double
  (it might be long double, depending on the configuration)
b) it's clearer than Perl_fabs() [which will be long double on
  such a configuration], but one needs to know what Perl_fabs() is
c) it will not be true if SvNVX(sv) is a NaN.

I'd guess that fabs() return NaN for a Nan, but I would have had
to look that up, *and* I wouldn't trust it as much not to be
buggy. (Then again, Intel's icc compiler's default optimiser
setting is to be buggy by ignoring IEEE NaNs)

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented Feb 7, 2012

From [Unknown Contact. See original ticket]

[and to the list this time]

On Tue Feb 07 06​:54​:25 2012, grian wrote​:

04.02.2012 12​:10, Nicholas Clark via RT пиÑ�еÑ�​:

- if (DD_is_integer(val)) {
+ if (DD_is_integer(val)
+ || (SvNOK(val)&& -1e10< SvNVX(sv)&& SvNVX(sv)< 1e10
+ && SvNVX(sv) == Perl_floor(SvNVX(sv)))) {

I looks that in the expression​: missed fabs() function call

(SvNOK(val)&& -1e10< SvNVX(sv)&& SvNVX(sv)< 1e10
&& SvNVX(sv) == Perl_floor(SvNVX(sv))))

I didn't think so. My intent was that

-1e10 < SvNVX(sv) && SvNVX(sv) < 1e10

should be equivalent to

fabs(SvNVX(sv)) < 1.e0

except that

a) it doesn't assume that SvNVX(sv) is of type double
  (it might be long double, depending on the configuration)
b) it's clearer than Perl_fabs() [which will be long double on
  such a configuration], but one needs to know what Perl_fabs() is
c) it will not be true if SvNVX(sv) is a NaN.

I'd guess that fabs() return NaN for a Nan, but I would have had
to look that up, *and* I wouldn't trust it as much not to be
buggy. (Then again, Intel's icc compiler's default optimiser
setting is to be buggy by ignoring IEEE NaNs)

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented Feb 3, 2013

From @jkeenan

On Tue Feb 07 07​:16​:52 2012, nicholas wrote​:

[and to the list this time]

On Tue Feb 07 06​:54​:25 2012, grian wrote​:

04.02.2012 12​:10, Nicholas Clark via RT пиÑ�еÑ�​:

- if (DD_is_integer(val)) {
+ if (DD_is_integer(val)
+ || (SvNOK(val)&& -1e10< SvNVX(sv)&& SvNVX(sv)< 1e10
+ && SvNVX(sv) == Perl_floor(SvNVX(sv)))) {

I looks that in the expression​: missed fabs() function call

(SvNOK(val)&& -1e10< SvNVX(sv)&& SvNVX(sv)< 1e10
&& SvNVX(sv) == Perl_floor(SvNVX(sv))))

I didn't think so. My intent was that

-1e10 < SvNVX(sv) && SvNVX(sv) < 1e10

should be equivalent to

fabs(SvNVX(sv)) < 1.e0

except that

a) it doesn't assume that SvNVX(sv) is of type double
(it might be long double, depending on the configuration)
b) it's clearer than Perl_fabs() [which will be long double on
such a configuration], but one needs to know what Perl_fabs() is
c) it will not be true if SvNVX(sv) is a NaN.

I'd guess that fabs() return NaN for a Nan, but I would have had
to look that up, *and* I wouldn't trust it as much not to be
buggy. (Then again, Intel's icc compiler's default optimiser
setting is to be buggy by ignoring IEEE NaNs)

Nicholas Clark

Discussion in this RT petered out about twelve months ago.

Are there any changes to code or documentation which we should still be
considering?

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Jun 26, 2013

From @tonycoz

On Sun Feb 03 10​:34​:34 2013, jkeenan wrote​:

Discussion in this RT petered out about twelve months ago.

Are there any changes to code or documentation which we should still be
considering?

I've updated the status to mark the patch as rejected based on the
discussion.

Nicholas complained that​:

I don't think that we should change behaviour to use floating point
literals.
If we did this, we'd implicitly add a round trip conversion in the reading
code (parser converts decimals to binary floating point, sv_2pv() converts
binary floating point back to decimal), which may introduce errors.

but the current mechanism​:

a) already does a round trip, if the user supplied the data as a float,
presumably they expect to use it as a float in the code that uses the
value, so it will need to be converted.

b) already introduces errors​:

tony@​mars​:.../git/perl$ ./perl -Ilib -MData​::Dumper -e 'print
Dumper(1.0000000000000001)'
$VAR1 = '1';
tony@​mars​:.../git/perl$ perl -le 'print 1.0000000000001-1'
9.99200722162641e-14

Fixing that is little more complex than the supplied though - possibly
keep adding digits until values match.

Tony

@jkeenan jkeenan added dist-Data-Dumper issues in the dual-life blead-first Data-Dumper distribution and removed Severity Low distro-Linux labels Jul 5, 2021
@xenu xenu removed the affects-5.10 label Nov 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dist-Data-Dumper issues in the dual-life blead-first Data-Dumper distribution type-library
Projects
None yet
Development

No branches or pull requests

3 participants