Data::Dumper Feature Request: Tighter packing options #15071

Open
p5pRT opened this issue Dec 5, 2015 · 8 comments
Labels
Closable?: We might be able to close this ticket, but we need to check with the reporter
dist-Data-Dumper: issues in the dual-life blead-first Data-Dumper distribution
Wishlist

Comments

@p5pRT

p5pRT commented Dec 5, 2015

Migrated from rt.perl.org#126814 (status was 'open')

Searchable as RT126814$

@p5pRT
Author

p5pRT commented Dec 5, 2015

From @kentfredric

Data::Dumper has long annoyed me: even its terse forms are excessively terse, and terse+1 is longer than is useful.

Take for instance, the output dumped from this structure:

  [
    map { [ map { int( rand() * 10 ) } 0 .. 4 ] } 0 .. 4
  ]

Data::Dumper gives you 2 choices: Indent=1 gives you 37 lines, and Indent=0 gives you one line. Neither is very good for readability.
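
The two line counts can be checked with a short script. This is only a sketch: the fixed values stand in for the rand() data so that the run is reproducible, and `dump_line_count` is a helper name made up for this example.

```perl
use strict;
use warnings;
use Data::Dumper;

# Fixed values instead of rand(), so the result is reproducible.
my $data = [
    [ 2, 9, 7, 9, 9 ],
    [ 3, 4, 1, 6, 6 ],
    [ 7, 0, 3, 2, 4 ],
    [ 1, 5, 0, 0, 5 ],
    [ 4, 6, 7, 2, 6 ],
];

# Dump $data at the given Indent level and count the lines of output.
# (dump_line_count is a hypothetical helper, not part of Data::Dumper.)
sub dump_line_count {
    my ($indent) = @_;
    local $Data::Dumper::Indent = $indent;
    my @lines = split /\n/, Dumper($data);
    return scalar @lines;
}

printf "Indent=%d: %d line(s)\n", $_, dump_line_count($_) for 0, 1;
```

For a 5x5 structure this reports 1 line for Indent=0 and 37 lines for Indent=1, matching the counts above.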

It could possibly be improved with an option to pack it like this instead:

[
  [2, 9, 7, 9, 9],
  [3, 4, 1, 6, 6],
  [7, 0, 3, 2, 4],
  [1, 5, 0, 0, 5],
  [4, 6, 7, 2, 6],
]

Which is how Data::Dump formats it.

It seems to have a threshold at which it switches a structure from a single line to one element per line. For instance, these 2 structures give 2 slightly different results:

  {
    alpha => [ 2, 3, 1, 4, 5, 9, 1, 10 ],
    beta => [ 1, 3, 2, 4, 5, 9, 1, 10 ],
  },
  {
    alpha => [ 2, 3, 1, 4, 5, 9, 1, ],
    beta => [ 1, 3, 2, 4, 5, 9, 1, ],
  },

The first becomes:

{
  alpha => [2, 3, 1, 4, 5, 9, 1, 10],
  beta => [1, 3, 2, 4, 5, 9, 1, 10],
}

The second becomes:

{ alpha => [2, 3, 1, 4, 5, 9, 1], beta => [1, 3, 2, 4, 5, 9, 1] }

And that's a behaviour I think is worth emulating.
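
The threshold behaviour can be sketched in a few lines of Perl. This is an illustration of the idea only, not Data::Dump's actual algorithm: `render` is a made-up name, the width of 70 is an arbitrary value chosen so that the two structures above fall on either side of it, and scalars are printed without any quoting.

```perl
use strict;
use warnings;

# Render nested array refs, hash refs and plain scalars.
# A container stays on one line when that form fits within $width columns
# (an assumed threshold); otherwise each element goes on its own line.
sub render {
    my ( $data, $width, $indent ) = @_;
    $indent = '' unless defined $indent;
    my $type = ref $data;

    # Plain scalars are emitted as-is; a real dumper would quote and escape.
    unless ($type) {
        return defined($data) ? "$data" : 'undef';
    }
    die "unsupported reference type: $type\n"
        unless $type eq 'ARRAY' || $type eq 'HASH';

    my $inner = "$indent  ";
    my @parts;
    if ( $type eq 'ARRAY' ) {
        @parts = map { render( $_, $width, $inner ) } @$data;
    }
    else {
        for my $key ( sort keys %$data ) {
            push @parts, "$key => " . render( $data->{$key}, $width, $inner );
        }
    }

    # Try the single-line form first.
    my $one_line = $type eq 'ARRAY'
        ? '[' . join( ', ', @parts ) . ']'
        : '{ ' . join( ', ', @parts ) . ' }';
    return $one_line
        if $one_line !~ /\n/ && length($indent) + length($one_line) <= $width;

    # Too wide (or a child already spans lines): one element per line.
    my ( $open, $close ) = $type eq 'ARRAY' ? ( '[', ']' ) : ( '{', '}' );
    return join "\n", $open, ( map { $inner . $_ . ',' } @parts ), "$indent$close";
}

my $long  = { alpha => [ 2, 3, 1, 4, 5, 9, 1, 10 ], beta => [ 1, 3, 2, 4, 5, 9, 1, 10 ] };
my $short = { alpha => [ 2, 3, 1, 4, 5, 9, 1 ],     beta => [ 1, 3, 2, 4, 5, 9, 1 ] };
for my $structure ( $long, $short ) {
    my $text = render( $structure, 70 );
    print "$text\n";
}
```

With the width set to 70, the first structure is split across lines and the second stays on one line, as in the two outputs shown above.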

However, both Data::Dump and Data::Dumper break down and produce large screens of data with large arrays, and I think we can do better.

  [ sort { rand() <=> rand() } 1 .. 40 ],

This could possibly be represented in terms of

[
  20 items, , , , ,
  20 items, , , , ,
]

Similar opportunities exist for hash structures:

With few enough items:

{ a => 1, b => 1, c => 1, d => 1, e => 1, f => 1, g => 1, h => 1 }

With lots of items:

{
  a => 1, b => 1, c => 1, d => 1, e => 1, f => 1, g => 1, h => 1,
  i => 1, j => 1, k => 1, l => 1, m => 1, n => 1, o => 1, p => 1,
  q => 1, r => 1, s => 1, t => 1, u => 1, v => 1, w => 1, x => 1,
  y => 1, z => 1,
}
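
The row packing for long arrays and hashes could work roughly as follows. This is a sketch under assumptions: `pack_rows` is a hypothetical helper, the items are expected to be formatted already, and the width of 63 is picked only because it reproduces the eight-pairs-per-row layout above.

```perl
use strict;
use warnings;

# Greedily pack already-formatted items into rows no wider than $width.
# Every item gets a trailing comma; items on a row are separated by a space.
# (pack_rows is a made-up name for this sketch.)
sub pack_rows {
    my ( $width, @items ) = @_;
    my @rows;
    my $row = '';
    for my $item (@items) {
        my $candidate = length($row) ? "$row $item," : "$item,";
        if ( length($row) && length($candidate) > $width ) {
            push @rows, $row;      # current row is full, start a new one
            $row = "$item,";
        }
        else {
            $row = $candidate;
        }
    }
    push @rows, $row if length($row);
    return @rows;
}

my @pairs = map { sprintf '%s => 1', $_ } 'a' .. 'z';
print "{\n";
print "  $_\n" for pack_rows( 63, @pairs );
print "}\n";
```

The same helper applies to a long array, for example `pack_rows( 63, 1 .. 40 )`, which is one reason a single width option might be enough instead of separate controls for arrays and hashes.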

It may be better to have 2 control options to separate condensing
arrays and hashes, like this, idk.

Though of my last three requests, this is the one I imagine
has the most complexity and is the least likely to be possible in DD.


Flags​:
  category=library
  severity=wishlist
  module=Data​::Dumper


Site configuration information for perl 5.22.0​:

Configured by kent at Fri Jun 19 08​:03​:55 NZST 2015.

Summary of my perl5 (revision 5 version 22 subversion 0) configuration​:
 
  Platform​:
  osname=linux, osvers=4.0.0-gentoo, archname=x86_64-linux
  uname='linux katipo2 4.0.0-gentoo #23 smp preempt sat apr 25 06​:58​:21 nzst 2015 x86_64 intel(r) core(tm) i5-2410m cpu @​ 2.30ghz genuineintel gnulinux '
  config_args='-de -Dprefix=/home/kent/perl5/perlbrew/perls/5.22.0 -Dusecbacktrace -Doptimize= -fno-stack-protector -O3 -march=native -mtune=native -Dman1dir=none -Dman3dir=none -Accflags= -fno-stack-protector -DPERL_HASH_FUNC_SDBM -DUSE_C_BACKTRACE_ON_ERROR -Aldflags= -fno-stack-protector -lbfd -Aeval​:scriptdir=/home/kent/perl5/perlbrew/perls/5.22.0/bin'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=undef, usemultiplicity=undef
  use64bitint=define, use64bitall=define, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='cc', ccflags ='-fno-stack-protector -DPERL_HASH_FUNC_SDBM -DUSE_C_BACKTRACE_ON_ERROR -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -DUSE_C_BACKTRACE -g -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize=' -fno-stack-protector -O3 -march=native -mtune=native',
  cppflags='-fno-stack-protector -DPERL_HASH_FUNC_SDBM -DUSE_C_BACKTRACE_ON_ERROR -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong'
  ccversion='', gccversion='4.9.2', gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3
  ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries​:
  ld='cc', ldflags =' -fno-stack-protector -lbfd -fstack-protector-strong -L/usr/local/lib'
  libpth=/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/include-fixed /usr/lib /usr/local/lib /lib/../lib64 /usr/lib/../lib64 /lib /lib64 /usr/lib64 /usr/local/lib64
  libs=-lpthread -lnsl -lnm -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
  perllibs=-lpthread -lnsl -lnm -ldl -lm -lcrypt -lutil -lc
  libc=libc-2.20.so, so=so, useshrplib=false, libperl=libperl.a
  gnulibc_version='2.20'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
  cccdlflags='-fPIC', lddlflags='-shared -fno-stack-protector -O3 -march=native -mtune=native -L/usr/local/lib -fstack-protector-strong'


@​INC for perl 5.22.0​:
  /home/kent/perl5/perlbrew/perls/5.22.0/lib/site_perl/5.22.0/x86_64-linux
  /home/kent/perl5/perlbrew/perls/5.22.0/lib/site_perl/5.22.0
  /home/kent/perl5/perlbrew/perls/5.22.0/lib/5.22.0/x86_64-linux
  /home/kent/perl5/perlbrew/perls/5.22.0/lib/5.22.0
  .


Environment for perl 5.22.0​:
  HOME=/home/kent
  LANG (unset)
  LANGUAGE (unset)
  LC_CTYPE=en_NZ.UTF8
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)
  PATH=/home/kent/perl5/perlbrew/bin​:/home/kent/perl5/perlbrew/perls/5.22.0/bin​:/home/kent/.perl6/2013.04/bin​:/home/kent/.gem/ruby/1.8/bin/​:/home/kent/.rvm/gems/ruby-2.1.2/bin​:/home/kent/.rvm/gems/ruby-2.1.2@​global/bin​:/home/kent/.rvm/rubies/ruby-2.1.2/bin​:/usr/local/bin​:/usr/bin​:/bin​:/opt/bin​:/usr/x86_64-pc-linux-gnu/gcc-bin/5.2.0​:/opt/android-sdk-update-manager/tools​:/opt/android-sdk-update-manager/platform-tools​:/usr/games/bin​:/home/kent/.rvm/bin​:/home/kent/.rvm/bin
  PERLBREW_BASHRC_VERSION=0.72
  PERLBREW_HOME=/home/kent/.perlbrew
  PERLBREW_MANPATH=/home/kent/perl5/perlbrew/perls/5.22.0/man
  PERLBREW_PATH=/home/kent/perl5/perlbrew/bin​:/home/kent/perl5/perlbrew/perls/5.22.0/bin
  PERLBREW_PERL=5.22.0
  PERLBREW_ROOT=/home/kent/perl5/perlbrew
  PERLBREW_VERSION=0.72
  PERL_BADLANG (unset)
  SHELL=/bin/bash

@p5pRT
Author

p5pRT commented Sep 3, 2017

From @jkeenan

On Sat, 05 Dec 2015 02​:33​:42 GMT, kentfredric wrote​:

[original report quoted in full; snipped]

This ticket has received no comments or replies since it was first posted in December 2015. Lack of comment on a new feature request usually means that people have (silently) made the judgment: "The benefit of this feature does not outweigh the cost of creating and maintaining it."

I suspect that's the case here. Data::Dumper poses maintenance problems because it has both "pure perl" and XS versions. New features would have to be implemented both ways. The people who can grok the XS code only have the time to keep up with critical bugs. I doubt they have time to implement additional features -- and then maintain them.

In this situation you're more likely to get the new functionality if you do it first as a new CPAN distribution. If that CPAN distribution is well received, then perhaps it could be brought into core in future years.

For that reason I recommend that we close this ticket.

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Author

p5pRT commented Sep 3, 2017

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Author

p5pRT commented Sep 3, 2017

From @kentfredric

On 3 September 2017 at 12:16, James E Keenan via RT
<perlbug-followup@perl.org> wrote:

> In this situation you're more likely to get the new functionality if you do it first as a new CPAN distribution. If that CPAN distribution is well received, then perhaps it could be brought into core in future years.

The given behaviour I describe already exists in a competing CPAN
distribution, one I named explicitly in the bug report. The only
reason I filed the bug report is because the functionality has, over
time, proven its worth to me, and I sought to reduce the competition
by bringing this proven feature into the fold.

--
Kent

KENTNL - https://metacpan.org/author/KENTNL

@p5pRT
Author

p5pRT commented Sep 4, 2017

From @iabyn

On Sat, Sep 02, 2017 at 05:16:00PM -0700, James E Keenan via RT wrote:

> This ticket has received no comments or replies since it was first
> posted in December 2015. Lack of comment on a new feature request
> usually means that people have (silently) made the judgment: "The
> benefit of this feature does not outweigh the cost of creating and
> maintaining it."

Or equally likely: the ticket got forgotten; or someone thinks the idea is
good but hasn't had time to do it yet; or people think the idea is good
but it's not their area of expertise.

Etc.

I think wishlist tickets should be kept open unless we have explicitly
rejected a suggestion.

--
No matter how many dust sheets you use, you will get paint on the carpet.

@p5pRT
Author

p5pRT commented Sep 4, 2017

From @demerphq

This feature request is relatively easy to implement as a regex post
processor. Dumper output is sufficiently well defined and stable that this
does not constitute parsing perl with regexes.
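
A minimal sketch of what such a regex post-processor might look like follows. It is not the code referred to in this comment. `pack_simple_arrays` is a hypothetical name, and the sketch only collapses innermost arrays whose elements are numbers, `undef`, or single-quoted strings without commas, backslashes or quotes. A string value that itself looks like such an array would have its whitespace rewritten too, so this is an illustration rather than a robust tool.

```perl
use strict;
use warnings;
use Data::Dumper;

# Collapse innermost arrays of simple scalars in Dumper output onto one line.
# (pack_simple_arrays is a made-up name; the approach is an assumption.)
sub pack_simple_arrays {
    my ($dumped) = @_;

    # A "simple" element: a number, undef, or a plain single-quoted string.
    my $scalar = qr/(?:-?\d+(?:\.\d+)?|'[^'\\,\n]*'|undef)/;

    $dumped =~ s{
        \[ \s*                                  # opening bracket
        ( $scalar (?: \s* , \s* $scalar )* )    # comma-separated simple elements
        \s* \]                                  # closing bracket
    }{
        my $items = $1;
        $items =~ s/\s*,\s*/, /g;               # normalise the separators
        "[$items]";
    }gex;

    return $dumped;
}

local $Data::Dumper::Indent = 1;
my $data = [ map { [ map { int( rand() * 10 ) } 0 .. 4 ] } 0 .. 4 ];
my $packed = pack_simple_arrays( Dumper($data) );
print $packed;
```

Run against the 5x5 structure from the original report, this turns the 37-line Indent=1 dump into seven lines, one per inner array plus the outer brackets.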

At work in fact we already have done something very similar. I will
investigate releasing said code in about a week.

Yves

On 4 Sep 2017 11​:32, "Dave Mitchell" <davem@​iabyn.com> wrote​:

[quoted message snipped]

@jkeenan added the dist-Data-Dumper and Wishlist labels and removed the Severity Low label on Jul 5, 2021
@jkeenan
Contributor

jkeenan commented Jul 5, 2021

> From @demerphq
>
> This feature request is relatively easy to implement as a regex post
> processor. Dumper output is sufficiently well defined and stable that this
> does not constitute parsing perl with regexes.
>
> At work in fact we already have done something very similar. I will
> investigate releasing said code in about a week.
>
> Yves

@demerphq, did that code ever get released?

If not, I think this ticket is closable.


@jkeenan added the Closable? label on Jul 5, 2021
@demerphq
Collaborator

demerphq commented Jul 6, 2021 via email
