Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

please correct perlfunc to tell when "sprintf @..." is useful #16107

Open
p5pRT opened this issue Aug 10, 2017 · 9 comments
Open

please correct perlfunc to tell when "sprintf @..." is useful #16107

p5pRT opened this issue Aug 10, 2017 · 9 comments

Comments

@p5pRT
Copy link

p5pRT commented Aug 10, 2017

Migrated from rt.perl.org#131875 (status was 'open')

Searchable as RT131875$

@p5pRT
Copy link
Author

p5pRT commented Aug 10, 2017

From perl-diddler@tlinx.org

Created by perl-diddler@tlinx.org

In perlfunc.pod, it says​:

  Unlike "printf", "sprintf" does not do what you probably mean when
  you pass it an array as your first argument. The array is given
  scalar context, and instead of using the 0th element of the array
  as the format, Perl will use the count of elements in the array as
  the format, which is almost never useful.

For a language that is describe as a DWIM language, and where the
sources uses "DWIM" to describe implemented behaviors, easily over 50
times, it seems odd or out of place to highlight a special case where
this is not true while giving no examples of how it *is* useful.

It seems it would be more useful, to know how the current behavior of
using an array with "sprintf" can be useful. As it is, it *seems* to
be documenting a known deficiency in the language without giving
a good reason, why, in a language designed to be DWIM, where possible,
the "sprintf" command isn't corrected to Do What "You probably Mean
when you pass it an array as [the] first argument".

Alternatively, if there are no examples where the current behavior is
_useful_, perhaps this behavior could altered as the text suggests to
do the behavior that the programer probably means? I believe grepping
through CPAN was used in the 5.18 timeframe to check on the impact of
some change that went in.

I tried searching through a 2.8G CPAN-author tree using zgrep and
'sprintf[\x20\x09]+@​' (using an actual space & tab instead of the hex
values shown) and found no usages. If it isn't possible to
document how the current behavior can be useful, perhaps this
behavior can be changed to "be useful" along the lines of perl's
other DWIM behaviors?

Perl Info

Flags:
    category=docs
    severity=medium

Summary of my perl5 (revision 5 version 24 subversion 0) configuration:
   
  Platform:
    osname=linux, osvers=4.10.1-isht-van, archname=x86_64-linux-multi-ld
    uname='linux ishtar 4.10.1-isht-van #1 smp preempt tue feb 28 18:57:48 pst 2017 x86_64 gnulinux '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=define
    use64bitint=define, use64bitall=define, uselongdouble=define
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2',
    optimize='-O2',
    cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong'
    ccversion='', gccversion='4.9.0', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3
    ivtype='long', ivsize=8, nvtype='long double', nvsize=16, Off_t='off_t', lseeksize=8
    alignbytes=16, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/lib64/gcc/x86_64-suse-linux/4.9/include-fixed /usr/lib /usr/local/lib /lib/../lib64 /usr/lib/../lib64 /lib /lib64 /usr/lib64
    libs=-lpthread -lnsl -lgdbm -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.19.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.19'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/home/perl/perl-5.24.0/usr/lib/x86_64-linux-multi-ld/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'


Characteristics of this binary (from libperl): 
  Compile-time options: HAS_TIMES MULTIPLICITY PERLIO_LAYERS
                        PERL_COPY_ON_WRITE PERL_DONT_CREATE_GVSV
                        PERL_HASH_FUNC_ONE_AT_A_TIME_HARD
                        PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP
                        PERL_PRESERVE_IVUV USE_64_BIT_ALL USE_64_BIT_INT
                        USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE
                        USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_LOCALE_TIME
                        USE_LONG_DOUBLE USE_PERLIO USE_PERL_ATOF
  Built under linux
  Compiled at Apr  2 2017 17:08:11
  %ENV:
    PERL5LIB="/home/perl/perl-5.24/lib"
  @INC:
    /home/perl/perl-5.24/lib
    /home/perl/perl-5.24.0/usr/lib/site/x86_64-linux-multi-ld
    /home/perl/perl-5.24.0/usr/lib/site
    /home/perl/perl-5.24.0/usr/lib/vendor/x86_64-linux-multi-ld
    /home/perl/perl-5.24.0/usr/lib/vendor
    /home/perl/perl-5.24.0/usr/lib/x86_64-linux-multi-ld
    /home/perl/perl-5.24.0/usr/lib
    .

Environment for perl 5.24.0
    HOME=/home/perl
    LANG (unset)
    LANGUAGE (unset)
    LC_COLLATE=C
    LC_CTYPE=en_US.UTF-8
    LC_MESSAGES=C
    LC_MONETARY=C
    LC_NUMERIC=C
    LC_TIME=C
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
		PATH=/home/perl/perl-5.24:/home/perl/perl-5.24/usr/bin:/sbin:/usr/local/bin:/usr/bin:/bin:/opt/kde3/bin:/usr/sbin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Aug 14, 2017

From @iabyn

On Wed, Aug 09, 2017 at 10​:53​:47PM -0700, Linda Walsh wrote​:

In perlfunc.pod, it says​:

 Unlike "printf"\, "sprintf" does not do what you probably mean when
 you pass it an array as your first argument\. The array is given
 scalar context\, and instead of using the 0th element of the array
 as the format\, Perl will use the count of elements in the array as
 the format\, which is almost never useful\.

For a language that is describe as a DWIM language, and where the
sources uses "DWIM" to describe implemented behaviors, easily over 50
times, it seems odd or out of place to highlight a special case where
this is not true while giving no examples of how it *is* useful.

It seems it would be more useful, to know how the current behavior of
using an array with "sprintf" can be useful. As it is, it *seems* to
be documenting a known deficiency in the language without giving
a good reason, why, in a language designed to be DWIM, where possible,
the "sprintf" command isn't corrected to Do What "You probably Mean
when you pass it an array as [the] first argument".

The prototype for sprintf is '$@​'. This was possibly a bad choice at the
time, but that's what we've inherited.

Our two main choices are​:

1) leave it as-is.

In this case, 'sprintf @​a' gets compiled as 'sprintf scalar(@​a), ()'
which likely doesn't DWIM. Hence the warning text in the pod. We can't
add anything to the pod explaining why it would be useful, as there are
likely no circumstances where it would be useful.

Note that a couple of other ops have the '$@​' prototype too; in
particular, pack and join. So 'join @​a' doesn't DWIM either.

Arguably we could add a warning if the first arg of sprintf/pack/join is
an array.

2) Change sprintf's prototype to '@​' so that it becomes a normal list
operator. The problem with this is that it changes the first argument
from being in scalar context to list context, so for example in

  $x = sprintf my_fmt(), @​args

my_fmt() is now called in list context, and may start to return something
different.

I tried searching through a 2.8G CPAN-author tree using zgrep and
'sprintf[\x20\x09]+@​' (using an actual space & tab instead of the hex
values shown) and found no usages. If it isn't possible to
document how the current behavior can be useful, perhaps this
behavior can be changed to "be useful" along the lines of perl's
other DWIM behaviors?

These two searches

  http​://grep.cpan.me/?q=sprintf%5Cs*%40
  http​://grep.cpan.me/?q=sprintf%5Cs*%5C%28%5Cs*%40

show about 50 CPAN distributions which are (probably incorrectly) doing

  sprintf @​array ...
  sprintf(@​array ...

While a search for

  sprintf(foo(...

shows about 100 distributions. In some of those 'foo' is 'qq', but plenty
are calling real functions, and so could break if that function behaves
differently in list context.

So I don't think changing the prototype is a viable option.

--
I thought I was wrong once, but I was mistaken.

@p5pRT
Copy link
Author

p5pRT commented Aug 14, 2017

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Aug 14, 2017

From perl-diddler@tlinx.org

Dave Mitchell via RT wrote​:

The prototype for sprintf is '$@​'. This was possibly a bad choice at the
time, but that's what we've inherited.

Our two main choices are​:

1) leave it as-is.

In this case, 'sprintf @​a' gets compiled as 'sprintf scalar(@​a), ()'
which likely doesn't DWIM. Hence the warning text in the pod. We can't
add anything to the pod explaining why it would be useful, as there are
likely no circumstances where it would be useful.

Note that a couple of other ops have the '$@​' prototype too; in
particular, pack and join. So 'join @​a' doesn't DWIM either.


FWIW, sprintf stick out more, for me, as it doesn't follow fprintf
and printf. While pack and join also suffer from not being DWIM, they
haven't stuck out as much to me as they don't have parallel functions
that operate differentlyl.

Arguably we could add a warning if the first arg of sprintf/pack/join is
an array.


  While I'd prefer the functionality, the warning would be a reasonable
way of preventing

2) Change sprintf's prototype to '@​' so that it becomes a normal list
operator. The problem with this is that it changes the first argument
from being in scalar context to list context, so for example in

$x = sprintf my\_fmt\(\)\, @​args

my_fmt() is now called in list context, and may start to return something
different.

I tried searching through a 2.8G CPAN-author tree using zgrep and
'sprintf[\x20\x09]+@​' (using an actual space & tab instead of the hex
values shown) and found no usages. If it isn't possible to
document how the current behavior can be useful, perhaps this
behavior can be changed to "be useful" along the lines of perl's
other DWIM behaviors?

These two searches

http​://grep\.cpan\.me/?q=sprintf%5Cs\*%40
http​://grep\.cpan\.me/?q=sprintf%5Cs\*%5C%28%5Cs\*%40

  Never seen that "tool"/site before. Looks useful....

show about 50 CPAN distributions which are (probably incorrectly) doing

sprintf @​array \.\.\.
sprintf\(@​array \.\.\.

  I'd think it to be a good thing if an automated-type message could be
sent to the authors of those uses to examine what might be a potential
problem?

While a search for

sprintf\(foo\(\.\.\.

shows about 100 distributions. In some of those 'foo' is 'qq', but plenty
are calling real functions, and so could break if that function behaves
differently in list context.

So I don't think changing the prototype is a viable option.


  On the surface, I'd tend to agree, however, one question is --
is how sprintf operates now, how we would *like* it to operate? I.e.
Is this something a "use feature/use VERSION" might be called for in order
to eliminate the wart in the future?

  The second issue obviates the need of considering the 1st case, and that
is what could a function return in that context that would
be valid for sprintf?

  I.e. sprintf requires a format statement -- it will die at that point
if given a reference. The only other option scalar func()... could return
would be the number of elements in an array, OR the recently changed output
of SCALAR hash-ref (i.e. what was "4/14" would now be seen as "4" if I
understand recent change notes(?)).

  The only way 'foo(...)' would return something different is if it
were returning a reference for the SCALAR case OR expecting to return
the the result of a SCALAR ARRAY or SCALAR HASH.

  If it was expecting to return a SCALAR HASH -- then recent changes
for what is returned in that case would cause breakage already, no?

  In the other two cases, SCALAR ARRAY, or a reference, it's still
the case that sprintf would generate an error.

  I.e. can you think of what a function might return that would be
affected by context ($ or @​), that would be valid input for sprintf?

  If it is the case that no function can return valid input to sprintf
if the function's output is 'mutable' (dependent on scalar or list
context), then we still have a case where changing sprintf's prototype
would not affect any existing, *working*, code. No?

 

@p5pRT
Copy link
Author

p5pRT commented Aug 14, 2017

From @iabyn

On Mon, Aug 14, 2017 at 11​:57​:31AM -0700, L A Walsh wrote​:

I.e. can you think of what a function might return that would be
affected by context ($ or @​), that would be valid input for sprintf?

As a trivial (if slightly contrived) example​:

my @​formats = qw(%s%s %s-%s a=%sb=%s);
sub f {
  wantarray ? @​formats : $formats[0];
}

print scalar f(), "\n";
print f(), "\n";

which outputs

  %s%s
  %s%s%s-%sa=%sb=%s

--
You never really learn to swear until you learn to drive.

@p5pRT
Copy link
Author

p5pRT commented Aug 14, 2017

From perl-diddler@tlinx.org

Dave Mitchell via RT wrote​:

On Mon, Aug 14, 2017 at 11​:57​:31AM -0700, L A Walsh wrote​:

I.e. can you think of what a function might return that would be
affected by context ($ or @​), that would be valid input for sprintf?

As a trivial (if slightly contrived) example​:

my @​formats = qw(%s%s %s-%s a=%sb=%s);
sub f {
wantarray ? @​formats : $formats[0];
}


  Considering that we are talking about using
'f's output as a format to sprintf, and not to any, arbitrary function,
if you are going to contrive an example, you might, at least use
sprintf.

  I.e. the non-scalar case as passed to the 1st argument of
sprintf would never be called. I.e. it is the case that you
would already have "faulty" code.

If you felt such faulty code was a possibility, might not an
equally contrived case show that the recent change of
SCALAR $hash from its prior form of "NN/MM" to "NN" also could
generate an error in some form?

Admittedly, also a bit of a contrived example, but many years ago
Intel released a chip that had some simple math err in the chip.
If someone had written code with the error in mind, it would no longer
work when it was fixed. Do you think Intel gave any thought to
how people's programs, relying on a quirk in the chip's language,
would develop errors if Intel fixed it?

However, if you really believe there is a reasonable chance that
someone would use such a feature -- then, at least,
such usage could be used as an example in the perldoc as how the
current behavior could be useful in perl code.

If such usage is considered a "non-pattern", poor, or a 'bug',
then it seems deprecating the old behavior (as it encourages
obscure or broken code) and fixing the behavior in the
future would be the most reasonable step.

However, if it is desired to protect such code, then fixing the behavior
only in future versions (i.e use 'feature' or use 'PERLVERSION') would
be the other option.

Either way, it seems the current behavior is only lending it self to
either buggy code (as your greps indicate in several cases) or obscure code
that would be "more than a little" hard to maintain.

@p5pRT
Copy link
Author

p5pRT commented Aug 15, 2017

From @iabyn

On Mon, Aug 14, 2017 at 02​:28​:33PM -0700, L A Walsh wrote​:

Dave Mitchell via RT wrote​:

On Mon, Aug 14, 2017 at 11​:57​:31AM -0700, L A Walsh wrote​:

I.e. can you think of what a function might return that would be
affected by context ($ or @​), that would be valid input for sprintf?

As a trivial (if slightly contrived) example​:

my @​formats = qw(%s%s %s-%s a=%sb=%s);
sub f { wantarray ? @​formats : $formats[0];
}
----
Considering that we are talking about using
'f's output as a format to sprintf, and not to any, arbitrary function,
if you are going to contrive an example, you might, at least use
sprintf.

I.e. the non-scalar case as passed to the 1st argument of
sprintf would never be called. I.e. it is the case that you
would already have "faulty" code.

I used a non-sprintf example purely to demonstrate that a function can
can meaningfully return a valid format in scalar context, while also
capable of being used in list context.

Here's an example with sprintf​:

  my @​formats = qw(%s%s %s-%s a=%sb=%s);
  sub formats {
  wantarray ? @​formats : $formats[0];
  }

  sub simple_output {
  sprintf formats(), @​_;
  }

  sub complex_output {
  my $selector = shift;
  my ($format) = grep /\Q$selector/, formats();
  sprintf $format, @​_;
  }

  $x = simple_output("a", "b");
  $y = complex_output("=", "a", "b");

If you felt such faulty code was a possibility,

I don't think my code examples are faulty.

might not an
equally contrived case show that the recent change of
SCALAR $hash from its prior form of "NN/MM" to "NN" also could
generate an error in some form?

Any non-backwardly compatible change we make requires a value judgement
of the benefits versus the risks. It's possible to reach different
decisions for different circumstances.

Admittedly, also a bit of a contrived example, but many years ago
Intel released a chip that had some simple math err in the chip.
If someone had written code with the error in mind, it would no longer
work when it was fixed. Do you think Intel gave any thought to
how people's programs, relying on a quirk in the chip's language,
would develop errors if Intel fixed it?

The two are not alike. The perl case is documented behaviour, which
like many things in perl, can produce surprising effects if the coder is
not careful. The pentium issue was a bug. No one in their right mind would
have written code that needed the bug to be present in order to function.
At best, code may have probed for the bug being present, and if so,
worked around it.

However, if you really believe there is a reasonable chance that
someone would use such a feature -- then, at least,
such usage could be used as an example in the perldoc as how the
current behavior could be useful in perl code.

I'm not saying that the current behaviour is particularly useful, merely
that it's not a bug, and that therefore we should consider carefully before
changing its behaviour. As such there is no need to add an example to
perldoc.

If such usage is considered a "non-pattern", poor, or a 'bug',
then it seems deprecating the old behavior (as it encourages
obscure or broken code) and fixing the behavior in the
future would be the most reasonable step.

However, if it is desired to protect such code, then fixing the behavior
only in future versions (i.e use 'feature' or use 'PERLVERSION') would
be the other option.

Either way, it seems the current behavior is only lending it self to
either buggy code (as your greps indicate in several cases) or obscure code
that would be "more than a little" hard to maintain.

There is a third option​: to recognise that the first argument being in
scalar context is documented behaviour and that some code may rely on it,
and leave it as is. But to recognise specifically that having an array
as the first argument indicates a coder error, and start to emit a
warning in that case.

I am in favour of adding a warning.
I am not in favour of changing the prototype of sprintf, with or without
the protection of a version or 'use feature' guard.

--
I thought I was wrong once, but I was mistaken.

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2017

From zefram@fysh.org

There is nothing to change here. sprintf with an array as first argument
is not useful, but scalar context is useful and, more importantly,
likely to have code relying on it. There's nothing more to say in the
documentation, and we can't change the behaviour for backcompat reasons.
This ticket should be closed.

-zefram

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2017

From @iabyn

On Fri, Dec 15, 2017 at 07​:19​:54AM +0000, Zefram wrote​:

There is nothing to change here. sprintf with an array as first argument
is not useful, but scalar context is useful and, more importantly,
likely to have code relying on it. There's nothing more to say in the
documentation, and we can't change the behaviour for backcompat reasons.
This ticket should be closed.

I still intend at some point to add a warning when the first arg of
sprintf is an array, as in

  sprintf @​foo ...

since that almost certainly represents a coding error.

--
You're only as old as you look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants