Data::Dumper gets string lengths wrong when the utf8 flag is set #15416

Closed
p5pRT opened this issue Jul 2, 2016 · 11 comments
Labels
dist-Data-Dumper issues in the dual-life blead-first Data-Dumper distribution

Comments

@p5pRT

p5pRT commented Jul 2, 2016

Migrated from rt.perl.org#128524 (status was 'resolved')

Searchable as RT128524$

@p5pRT

p5pRT commented Jul 2, 2016

From @karenetheridge

Seen in both Data::Dumper 2.154 and 2.160, perl 5.24.0 and perl 5.25.0.

Here is a self-contained reproduction case showing that string lengths are being counted wrong:

use strict;
use warnings;

my $runtime = 'runtime';
my $requires = 'requires';

utf8::upgrade($runtime);
utf8::upgrade($requires);

my $data = {
  $runtime => {
    $requires => { 'foo' => 'bar' }
  }
};

use Data::Dumper;

print Data::Dumper->new( [$data], ['x'] )->Purity(1)->Sortkeys(1)->Terse(0)->Dump() . "\n";

__END__
got:
$x = {
       'runtime' => {
                    'requires' => {
                                  'foo' => 'bar'
                                }
                  }
     };

but the output should instead be:
$x = {
       'runtime' => {
                      'requires' => {
                                      'foo' => 'bar'
                                    }
                    }
     };
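
For context, `utf8::upgrade` only changes how a string is stored internally (it turns on the UTF8 flag); the characters and the value reported by `length` are unchanged. A minimal sketch using only core functions, with illustrative variable names that are not part of the report:

```perl
use strict;
use warnings;

my $plain    = 'runtime';
my $upgraded = 'runtime';
utf8::upgrade($upgraded);    # turns on the internal UTF8 flag only

# The flag differs between the two copies...
printf "plain flag: %d, upgraded flag: %d\n",
    utf8::is_utf8($plain)    ? 1 : 0,
    utf8::is_utf8($upgraded) ? 1 : 0;

# ...but the strings still compare equal and have the same length.
printf "equal: %s, lengths: %d and %d\n",
    ( $plain eq $upgraded ? 'yes' : 'no' ),
    length($plain), length($upgraded);
```

Since the two strings are equal in every way visible from Perl code, any difference in Data::Dumper's layout between them points at how the dumper treats the flag, not at the data.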

@p5pRT

p5pRT commented Jul 3, 2016

From @jkeenan

The difference created by executing 'utf8::upgrade' on the strings can be demonstrated in even simpler versions of Data::Dumper calls. See attached file 'saydumper.pl'.

Thank you very much.

--
James E Keenan (jkeenan@cpan.org)
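
The attachment itself is not reproduced in this thread. As a hedged sketch of what a simpler demonstration could look like (this is not the content of saydumper.pl), the following dumps the same small structure with a plain key and with an upgraded key, and compares the XS output with the pure-Perl output. `dump_with` is a hypothetical helper name. On a Data::Dumper affected by this report, the XS output for the upgraded key would be expected to differ in indentation.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: dump $data with the pure-Perl implementation
# ($useperl true) or the default implementation ($useperl false).
sub dump_with {
    my ( $data, $useperl ) = @_;
    local $Data::Dumper::Useperl  = $useperl;
    local $Data::Dumper::Sortkeys = 1;
    return Dumper($data);
}

my $plain    = 'runtime';
my $upgraded = 'runtime';
utf8::upgrade($upgraded);

for my $key ( $plain, $upgraded ) {
    my $data = { $key => { foo => 'bar' } };
    my $flag = utf8::is_utf8($key) ? 'utf8 flag on' : 'utf8 flag off';
    my $same = dump_with( $data, 1 ) eq dump_with( $data, 0 ) ? 'same' : 'DIFFERENT';
    print "$flag: XS vs pure-Perl output is $same\n";
    print dump_with( $data, 0 );
}
```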

@p5pRT

p5pRT commented Jul 3, 2016

From @jkeenan

saydumper.pl

@p5pRT

p5pRT commented Jul 3, 2016

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

p5pRT commented Jul 3, 2016

From @jkeenan

On Sun Jul 03 06:39:25 2016, jkeenan wrote:

On Sat Jul 02 15:00:44 2016, ether wrote:
[snip]

The difference created by executing 'utf8::upgrade' on the strings can
be demonstrated in even simpler versions of Data::Dumper calls. See
attached file 'saydumper.pl'.

Also, the above was observed as far back as perl-5.10.1.

--
James E Keenan (jkeenan@cpan.org)

@p5pRT

p5pRT commented Jul 13, 2016

From @tonycoz

On Sat Jul 02 15:00:44 2016, ether wrote:

seen in both Data::Dumper 2.154 and 2.160, perl 5.24.0 and perl
5.25.0.

Here is a self-contained reproduction case showing that string lengths
are being counted wrong:

The problem was more that the key length was always being used rather than the
quoted key length.

The attached fixes it for me.

Tony
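
To illustrate the arithmetic: in the expected output, a nested value is shifted right by the width of the key as printed (including its quotes) plus the four characters of ` => `. A small sketch of what happens when the unquoted length is used instead. `pad_width` is a hypothetical helper for illustration, not a function in Dumper.xs, and the "+ 4" is read off the expected output rather than quoted from the source.

```perl
use strict;
use warnings;

# Hypothetical helper (not from Dumper.xs): number of columns a nested
# value is shifted right for a key of the given printed length, i.e. the
# key text plus the four characters of " => ".
sub pad_width {
    my ($printed_key_length) = @_;
    return $printed_key_length + 4;
}

my $key    = 'runtime';
my $quoted = "'$key'";    # what actually appears in the output

printf "raw key length:    %d -> pad %d\n", length($key),    pad_width( length $key );
printf "quoted key length: %d -> pad %d\n", length($quoted), pad_width( length $quoted );
printf "shortfall per nesting level: %d columns\n",
    pad_width( length $quoted ) - pad_width( length $key );
```

The two-column shortfall accumulates with each nesting level whose key takes the UTF-8 code path, which matches the title of the patch ("correct indentation for utf-8 key hash elements").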

@p5pRT

p5pRT commented Jul 13, 2016

From @tonycoz

0001-perl-128524-correct-indentation-for-utf-8-key-hash-e.patch
From ace8b39f6f00f1de15be195bf2d36a02dcc93dc6 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 13 Jul 2016 15:48:52 +1000
Subject: (perl #128524) correct indentation for utf-8 key hash elements

---
 dist/Data-Dumper/Dumper.xs |  4 ++--
 dist/Data-Dumper/t/bugs.t  | 32 +++++++++++++++++++++++++++++++-
 2 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/dist/Data-Dumper/Dumper.xs b/dist/Data-Dumper/Dumper.xs
index 8220241..0dc7699 100644
--- a/dist/Data-Dumper/Dumper.xs
+++ b/dist/Data-Dumper/Dumper.xs
@@ -886,7 +886,6 @@ DD_dump(pTHX_ SV *val, const char *name, STRLEN namelen, SV *retval, HV *seenhv,
 	    SV *sname;
 	    HE *entry = NULL;
 	    char *key;
-	    STRLEN klen;
 	    SV *hval;
 	    AV *keys = NULL;
 	
@@ -976,6 +975,7 @@ DD_dump(pTHX_ SV *val, const char *name, STRLEN namelen, SV *retval, HV *seenhv,
                 char *nkey_buffer = NULL;
                 STRLEN nticks = 0;
 		SV* keysv;
+                STRLEN klen;
 		STRLEN keylen;
                 STRLEN nlen;
 		bool do_utf8 = FALSE;
@@ -1029,7 +1029,7 @@ DD_dump(pTHX_ SV *val, const char *name, STRLEN namelen, SV *retval, HV *seenhv,
                 if (style->quotekeys || key_needs_quote(key,keylen)) {
                     if (do_utf8 || style->useqq) {
                         STRLEN ocur = SvCUR(retval);
-                        nlen = esc_q_utf8(aTHX_ retval, key, klen, do_utf8, style->useqq);
+                        klen = nlen = esc_q_utf8(aTHX_ retval, key, klen, do_utf8, style->useqq);
                         nkey = SvPVX(retval) + ocur;
                     }
                     else {
diff --git a/dist/Data-Dumper/t/bugs.t b/dist/Data-Dumper/t/bugs.t
index a440b0a..b3c1a17 100644
--- a/dist/Data-Dumper/t/bugs.t
+++ b/dist/Data-Dumper/t/bugs.t
@@ -12,7 +12,7 @@ BEGIN {
 }
 
 use strict;
-use Test::More tests => 15;
+use Test::More tests => 23;
 use Data::Dumper;
 
 {
@@ -144,4 +144,34 @@ SKIP: {
   &$tests;
 }
 
+{ # https://rt.perl.org/Ticket/Display.html?id=128524
+    my $want = <<'EOW';
+$VAR1 = {
+          'runtime' => {
+                         'requires' => {
+                                         'foo' => 'bar'
+                                       }
+                       }
+        };
+EOW
+    my @want = split /\n/, $want;
+    my $runtime = "runtime";
+    my $requires = "requires";
+    utf8::upgrade(my $uruntime = $runtime);
+    utf8::upgrade(my $urequires = $requires);
+    for my $run ($runtime, $uruntime) {
+        for my $req ($requires, $urequires) {
+            my $data = { $run => { $req => { foo => "bar" } } };
+            local $Data::Dumper::Useperl = 1;
+            is(Dumper( $data ), $want, "utf-8 indents");
+            {
+                defined &Data::Dumper::Dumpxs
+                  or skip "No XS available", 1;
+                local $Data::Dumper::Useperl = 0;
+                is(Dumper( $data ), $want, "utf8-indents");
+            }
+        }
+    }
+}
+
 # EOF
-- 
2.1.4

@p5pRT

p5pRT commented Jul 20, 2016

From @tonycoz

On Tue Jul 12 22:50:11 2016, tonyc wrote:

On Sat Jul 02 15:00:44 2016, ether wrote:
[snip]

The problem was more that the key length was always being used rather
than the quoted key length.

The attached fixes it for me.

I've adjusted the test to skip properly and I think made the test less fragile.

Applied as 3a3625f.

Tony
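
For readers unfamiliar with the idiom: Test::More's `skip` works by leaving the enclosing block labeled `SKIP`, so the label is needed for a conditional skip to behave. The sketch below shows that general pattern only; it is not the test as committed in 3a3625f, and it uses the pure-Perl output as the reference value, which is an assumption made here for brevity.

```perl
use strict;
use warnings;
use Test::More;
use Data::Dumper;

my $plain = 'runtime';
utf8::upgrade( my $upgraded = $plain );

for my $key ( $plain, $upgraded ) {
    my $data = { $key => { foo => 'bar' } };

    # Reference value: output of the pure-Perl implementation.
    my $want = do { local $Data::Dumper::Useperl = 1; Dumper($data) };

  SKIP: {
        # skip() jumps out of the block labeled SKIP, so the label is required.
        skip "No XS available", 1 unless defined &Data::Dumper::Dumpxs;
        local $Data::Dumper::Useperl = 0;
        is( Dumper($data), $want, "XS output matches pure-Perl output" );
    }
}

done_testing();
```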

@p5pRT

p5pRT commented Jul 20, 2016

@tonycoz - Status changed from 'open' to 'pending release'

@p5pRT

p5pRT commented May 30, 2017

From @khwilliamson

Thank you for filing this report. You have helped make Perl better.

With the release today of Perl 5.26.0, this and 210 other issues have been
resolved.

Perl 5.26.0 may be downloaded via:
https://metacpan.org/release/XSAWYERX/perl-5.26.0

If you find that the problem persists, feel free to reopen this ticket.

@p5pRT p5pRT closed this as completed May 30, 2017
@p5pRT

p5pRT commented May 30, 2017

@khwilliamson - Status changed from 'pending release' to 'resolved'

@jkeenan jkeenan added the dist-Data-Dumper issues in the dual-life blead-first Data-Dumper distribution label Jul 5, 2021