Skip Menu |
Report information
Id: 131875
Status: open
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: LAWalsh <perl-diddler [at] tlinx.org>
Cc:
AdminCc:

Operating System: Linux
PatchStatus: (no value)
Severity: medium
Type: docs
Perl Version: (no value)
Fixed In: (no value)



To: perlbug [...] perl.org
Date: Wed, 9 Aug 2017 22:25:41 -0700
Subject: please correct perlfunc to tell when "sprintf @..." is useful
From: perl-diddler [...] tlinx.org
Download (untitled) / with headers
text/plain 5.1k
This is a bug report for perl from perl-diddler@tlinx.org, generated with the help of perlbug 1.39 running under perl 5.16.3. ----------------------------------------------------------------- [Please describe your issue here] In perlfunc.pod, it says: Unlike "printf", "sprintf" does not do what you probably mean when you pass it an array as your first argument. The array is given scalar context, and instead of using the 0th element of the array as the format, Perl will use the count of elements in the array as the format, which is almost never useful. For a language that is describe as a DWIM language, and where the sources uses "DWIM" to describe implemented behaviors, easily over 50 times, it seems odd or out of place to highlight a special case where this is not true while giving no examples of how it *is* useful. It seems it would be more useful, to know how the current behavior of using an array with "sprintf" can be useful. As it is, it *seems* to be documenting a known deficiency in the language without giving a good reason, why, in a language designed to be DWIM, where possible, the "sprintf" command isn't corrected to Do What "You probably Mean when you pass it an array as [the] first argument". Alternatively, if there are no examples where the current behavior is _useful_, perhaps this behavior could altered as the text suggests to do the behavior that the programer probably means? I believe grepping through CPAN was used in the 5.18 timeframe to check on the impact of some change that went in. I tried searching through a 2.8G CPAN-author tree using zgrep and 'sprintf[\x20\x09]+@' (using an actual space & tab instead of the hex values shown) and found no usages. If it isn't possible to document how the current behavior can be useful, perhaps this behavior can be changed to "be useful" along the lines of perl's other DWIM behaviors? [Please do not change anything below this line] ----------------------------------------------------------------- --- Flags: category=docs severity=medium --- Summary of my perl5 (revision 5 version 24 subversion 0) configuration: Platform: osname=linux, osvers=4.10.1-isht-van, archname=x86_64-linux-multi-ld uname='linux ishtar 4.10.1-isht-van #1 smp preempt tue feb 28 18:57:48 pst 2017 x86_64 gnulinux ' config_args='' hint=recommended, useposix=true, d_sigaction=define useithreads=undef, usemultiplicity=define use64bitint=define, use64bitall=define, uselongdouble=define usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2', optimize='-O2', cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong' ccversion='', gccversion='4.9.0', gccosandvers='' intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3 ivtype='long', ivsize=8, nvtype='long double', nvsize=16, Off_t='off_t', lseeksize=8 alignbytes=16, prototype=define Linker and Libraries: ld='gcc', ldflags =' -fstack-protector-strong -L/usr/local/lib' libpth=/usr/lib64/gcc/x86_64-suse-linux/4.9/include-fixed /usr/lib /usr/local/lib /lib/../lib64 /usr/lib/../lib64 /lib /lib64 /usr/lib64 libs=-lpthread -lnsl -lgdbm -ldl -lm -lcrypt -lutil -lc -lgdbm_compat perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc libc=libc-2.19.so, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='2.19' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/home/perl/perl-5.24.0/usr/lib/x86_64-linux-multi-ld/CORE' cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong' Characteristics of this binary (from libperl): Compile-time options: HAS_TIMES MULTIPLICITY PERLIO_LAYERS PERL_COPY_ON_WRITE PERL_DONT_CREATE_GVSV PERL_HASH_FUNC_ONE_AT_A_TIME_HARD PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP PERL_PRESERVE_IVUV USE_64_BIT_ALL USE_64_BIT_INT USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_LOCALE_TIME USE_LONG_DOUBLE USE_PERLIO USE_PERL_ATOF Built under linux Compiled at Apr 2 2017 17:08:11 %ENV: PERL5LIB="/home/perl/perl-5.24/lib" @INC: /home/perl/perl-5.24/lib /home/perl/perl-5.24.0/usr/lib/site/x86_64-linux-multi-ld /home/perl/perl-5.24.0/usr/lib/site /home/perl/perl-5.24.0/usr/lib/vendor/x86_64-linux-multi-ld /home/perl/perl-5.24.0/usr/lib/vendor /home/perl/perl-5.24.0/usr/lib/x86_64-linux-multi-ld /home/perl/perl-5.24.0/usr/lib . --- Environment for perl 5.24.0 HOME=/home/perl LANG (unset) LANGUAGE (unset) LC_COLLATE=C LC_CTYPE=en_US.UTF-8 LC_MESSAGES=C LC_MONETARY=C LC_NUMERIC=C LC_TIME=C LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/home/perl/perl-5.24:/home/perl/perl-5.24/usr/bin:/sbin:/usr/local/bin:/usr/bin:/bin:/opt/kde3/bin:/usr/sbin PERL_BADLANG (unset) SHELL=/bin/bash
Subject: Re: [perl #131875] please correct perlfunc to tell when "sprintf @..." is useful
From: Dave Mitchell <davem [...] iabyn.com>
To: perl5-porters [...] perl.org
Date: Mon, 14 Aug 2017 09:28:14 +0100
Download (untitled) / with headers
text/plain 2.9k
On Wed, Aug 09, 2017 at 10:53:47PM -0700, Linda Walsh wrote: Show quoted text
> In perlfunc.pod, it says: > > Unlike "printf", "sprintf" does not do what you probably mean when > you pass it an array as your first argument. The array is given > scalar context, and instead of using the 0th element of the array > as the format, Perl will use the count of elements in the array as > the format, which is almost never useful. > > For a language that is describe as a DWIM language, and where the > sources uses "DWIM" to describe implemented behaviors, easily over 50 > times, it seems odd or out of place to highlight a special case where > this is not true while giving no examples of how it *is* useful. > > It seems it would be more useful, to know how the current behavior of > using an array with "sprintf" can be useful. As it is, it *seems* to > be documenting a known deficiency in the language without giving > a good reason, why, in a language designed to be DWIM, where possible, > the "sprintf" command isn't corrected to Do What "You probably Mean > when you pass it an array as [the] first argument".
The prototype for sprintf is '$@'. This was possibly a bad choice at the time, but that's what we've inherited. Our two main choices are: 1) leave it as-is. In this case, 'sprintf @a' gets compiled as 'sprintf scalar(@a), ()' which likely doesn't DWIM. Hence the warning text in the pod. We can't add anything to the pod explaining why it would be useful, as there are likely no circumstances where it would be useful. Note that a couple of other ops have the '$@' prototype too; in particular, pack and join. So 'join @a' doesn't DWIM either. Arguably we could add a warning if the first arg of sprintf/pack/join is an array. 2) Change sprintf's prototype to '@' so that it becomes a normal list operator. The problem with this is that it changes the first argument from being in scalar context to list context, so for example in $x = sprintf my_fmt(), @args my_fmt() is now called in list context, and may start to return something different. Show quoted text
> I tried searching through a 2.8G CPAN-author tree using zgrep and > 'sprintf[\x20\x09]+@' (using an actual space & tab instead of the hex > values shown) and found no usages. If it isn't possible to > document how the current behavior can be useful, perhaps this > behavior can be changed to "be useful" along the lines of perl's > other DWIM behaviors?
These two searches http://grep.cpan.me/?q=sprintf%5Cs*%40 http://grep.cpan.me/?q=sprintf%5Cs*%5C%28%5Cs*%40 show about 50 CPAN distributions which are (probably incorrectly) doing sprintf @array ... sprintf(@array ... While a search for sprintf(foo(... shows about 100 distributions. In some of those 'foo' is 'qq', but plenty are calling real functions, and so could break if that function behaves differently in list context. So I don't think changing the prototype is a viable option. -- I thought I was wrong once, but I was mistaken.
Subject: Re: [perl #131875] please correct perlfunc to tell when "sprintf @..." is useful
From: L A Walsh <perl-diddler [...] tlinx.org>
To: perlbug-followup [...] perl.org
Date: Mon, 14 Aug 2017 11:57:31 -0700
Dave Mitchell via RT wrote: Show quoted text
> The prototype for sprintf is '$@'. This was possibly a bad choice at the > time, but that's what we've inherited. > > Our two main choices are: > > 1) leave it as-is. > > In this case, 'sprintf @a' gets compiled as 'sprintf scalar(@a), ()' > which likely doesn't DWIM. Hence the warning text in the pod. We can't > add anything to the pod explaining why it would be useful, as there are > likely no circumstances where it would be useful. > > Note that a couple of other ops have the '$@' prototype too; in > particular, pack and join. So 'join @a' doesn't DWIM either. >
---- FWIW, sprintf stick out more, for me, as it doesn't follow fprintf and printf. While pack and join also suffer from not being DWIM, they haven't stuck out as much to me as they don't have parallel functions that operate differentlyl. Show quoted text
> Arguably we could add a warning if the first arg of sprintf/pack/join is > an array. >
--- While I'd prefer the functionality, the warning would be a reasonable way of preventing Show quoted text
> 2) Change sprintf's prototype to '@' so that it becomes a normal list > operator. The problem with this is that it changes the first argument > from being in scalar context to list context, so for example in > > $x = sprintf my_fmt(), @args > > my_fmt() is now called in list context, and may start to return something > different. > >
>> I tried searching through a 2.8G CPAN-author tree using zgrep and >> 'sprintf[\x20\x09]+@' (using an actual space & tab instead of the hex >> values shown) and found no usages. If it isn't possible to >> document how the current behavior can be useful, perhaps this >> behavior can be changed to "be useful" along the lines of perl's >> other DWIM behaviors? >>
> > These two searches > > http://grep.cpan.me/?q=sprintf%5Cs*%40 > http://grep.cpan.me/?q=sprintf%5Cs*%5C%28%5Cs*%40 >
---- Never seen that "tool"/site before. Looks useful.... Show quoted text
> show about 50 CPAN distributions which are (probably incorrectly) doing > > sprintf @array ... > sprintf(@array ... >
---- I'd think it to be a good thing if an automated-type message could be sent to the authors of those uses to examine what might be a potential problem? Show quoted text
> While a search for > > sprintf(foo(... > > shows about 100 distributions. In some of those 'foo' is 'qq', but plenty > are calling real functions, and so could break if that function behaves > differently in list context. > > So I don't think changing the prototype is a viable option. >
----- On the surface, I'd tend to agree, however, one question is -- is how sprintf operates now, how we would *like* it to operate? I.e. Is this something a "use feature/use VERSION" might be called for in order to eliminate the wart in the future? The second issue obviates the need of considering the 1st case, and that is what could a function return in that context that would be valid for sprintf? I.e. sprintf requires a format statement -- it will die at that point if given a reference. The only other option scalar func()... could return would be the number of elements in an array, OR the recently changed output of SCALAR hash-ref (i.e. what was "4/14" would now be seen as "4" if I understand recent change notes(?)). The only way 'foo(...)' would return something different is if it were returning a reference for the SCALAR case OR expecting to return the the result of a SCALAR ARRAY or SCALAR HASH. If it was expecting to return a SCALAR HASH -- then recent changes for what is returned in that case would cause breakage already, no? In the other two cases, SCALAR ARRAY, or a reference, it's still the case that sprintf would generate an error. I.e. can you think of what a function might return that would be affected by context ($ or @), that would be valid input for sprintf? If it is the case that no function can return valid input to sprintf if the function's output is 'mutable' (dependent on scalar or list context), then we still have a case where changing sprintf's prototype would not affect any existing, *working*, code. No?
Date: Mon, 14 Aug 2017 22:02:09 +0100
CC: perlbug-followup [...] perl.org
From: Dave Mitchell <davem [...] iabyn.com>
Subject: Re: [perl #131875] please correct perlfunc to tell when "sprintf @..." is useful
To: L A Walsh <perl-diddler [...] tlinx.org>
Download (untitled) / with headers
text/plain 494b
On Mon, Aug 14, 2017 at 11:57:31AM -0700, L A Walsh wrote: Show quoted text
> I.e. can you think of what a function might return that would be > affected by context ($ or @), that would be valid input for sprintf?
As a trivial (if slightly contrived) example: my @formats = qw(%s%s %s-%s a=%sb=%s); sub f { wantarray ? @formats : $formats[0]; } print scalar f(), "\n"; print f(), "\n"; which outputs %s%s %s%s%s-%sa=%sb=%s -- You never really learn to swear until you learn to drive.
From: L A Walsh <perl-diddler [...] tlinx.org>
Subject: Re: [perl #131875] please correct perlfunc to tell when "sprintf @..." is useful
To: perlbug-followup [...] perl.org
Date: Mon, 14 Aug 2017 14:28:33 -0700
Download (untitled) / with headers
text/plain 2.1k
Dave Mitchell via RT wrote: Show quoted text
> On Mon, Aug 14, 2017 at 11:57:31AM -0700, L A Walsh wrote: >
>> I.e. can you think of what a function might return that would be >> affected by context ($ or @), that would be valid input for sprintf? >>
> > As a trivial (if slightly contrived) example: > > my @formats = qw(%s%s %s-%s a=%sb=%s); > sub f { > wantarray ? @formats : $formats[0]; > } >
---- Considering that we are talking about using 'f's output as a format to sprintf, and not to any, arbitrary function, if you are going to contrive an example, you might, at least use sprintf. I.e. the non-scalar case as passed to the 1st argument of sprintf would never be called. I.e. it is the case that you would already have "faulty" code. If you felt such faulty code was a possibility, might not an equally contrived case show that the recent change of SCALAR $hash from its prior form of "NN/MM" to "NN" also could generate an error in some form? Admittedly, also a bit of a contrived example, but many years ago Intel released a chip that had some simple math err in the chip. If someone had written code with the error in mind, it would no longer work when it was fixed. Do you think Intel gave any thought to how people's programs, relying on a quirk in the chip's language, would develop errors if Intel fixed it? However, if you really believe there is a reasonable chance that someone would use such a feature -- then, at least, such usage could be used as an example in the perldoc as how the current behavior could be useful in perl code. If such usage is considered a "non-pattern", poor, or a 'bug', then it seems deprecating the old behavior (as it encourages obscure or broken code) and fixing the behavior in the future would be the most reasonable step. However, if it is desired to protect such code, then fixing the behavior only in future versions (i.e use 'feature' or use 'PERLVERSION') would be the other option. Either way, it seems the current behavior is only lending it self to either buggy code (as your greps indicate in several cases) or obscure code that would be "more than a little" hard to maintain.
To: L A Walsh <perl-diddler [...] tlinx.org>
From: Dave Mitchell <davem [...] iabyn.com>
Subject: Re: [perl #131875] please correct perlfunc to tell when "sprintf @..." is useful
CC: perlbug-followup [...] perl.org
Date: Tue, 15 Aug 2017 08:12:46 +0100
Download (untitled) / with headers
text/plain 4.1k
On Mon, Aug 14, 2017 at 02:28:33PM -0700, L A Walsh wrote: Show quoted text
> Dave Mitchell via RT wrote:
> > On Mon, Aug 14, 2017 at 11:57:31AM -0700, L A Walsh wrote:
> > > I.e. can you think of what a function might return that would be > > > affected by context ($ or @), that would be valid input for sprintf?
> > > > As a trivial (if slightly contrived) example: > > > > my @formats = qw(%s%s %s-%s a=%sb=%s); > > sub f { wantarray ? @formats : $formats[0]; > > }
> ---- > Considering that we are talking about using > 'f's output as a format to sprintf, and not to any, arbitrary function, > if you are going to contrive an example, you might, at least use > sprintf. > > I.e. the non-scalar case as passed to the 1st argument of > sprintf would never be called. I.e. it is the case that you > would already have "faulty" code.
I used a non-sprintf example purely to demonstrate that a function can can meaningfully return a valid format in scalar context, while also capable of being used in list context. Here's an example with sprintf: my @formats = qw(%s%s %s-%s a=%sb=%s); sub formats { wantarray ? @formats : $formats[0]; } sub simple_output { sprintf formats(), @_; } sub complex_output { my $selector = shift; my ($format) = grep /\Q$selector/, formats(); sprintf $format, @_; } $x = simple_output("a", "b"); $y = complex_output("=", "a", "b"); Show quoted text
> If you felt such faulty code was a possibility,
I don't think my code examples are faulty. Show quoted text
> might not an > equally contrived case show that the recent change of > SCALAR $hash from its prior form of "NN/MM" to "NN" also could > generate an error in some form?
Any non-backwardly compatible change we make requires a value judgement of the benefits versus the risks. It's possible to reach different decisions for different circumstances. Show quoted text
> Admittedly, also a bit of a contrived example, but many years ago > Intel released a chip that had some simple math err in the chip. > If someone had written code with the error in mind, it would no longer > work when it was fixed. Do you think Intel gave any thought to > how people's programs, relying on a quirk in the chip's language, > would develop errors if Intel fixed it?
The two are not alike. The perl case is documented behaviour, which like many things in perl, can produce surprising effects if the coder is not careful. The pentium issue was a bug. No one in their right mind would have written code that needed the bug to be present in order to function. At best, code may have probed for the bug being present, and if so, worked around it. Show quoted text
> However, if you really believe there is a reasonable chance that > someone would use such a feature -- then, at least, > such usage could be used as an example in the perldoc as how the > current behavior could be useful in perl code.
I'm not saying that the current behaviour is particularly useful, merely that it's not a bug, and that therefore we should consider carefully before changing its behaviour. As such there is no need to add an example to perldoc. Show quoted text
> If such usage is considered a "non-pattern", poor, or a 'bug', > then it seems deprecating the old behavior (as it encourages > obscure or broken code) and fixing the behavior in the > future would be the most reasonable step. > > However, if it is desired to protect such code, then fixing the behavior > only in future versions (i.e use 'feature' or use 'PERLVERSION') would > be the other option. > > Either way, it seems the current behavior is only lending it self to > either buggy code (as your greps indicate in several cases) or obscure code > that would be "more than a little" hard to maintain.
There is a third option: to recognise that the first argument being in scalar context is documented behaviour and that some code may rely on it, and leave it as is. But to recognise specifically that having an array as the first argument indicates a coder error, and start to emit a warning in that case. I am in favour of adding a warning. I am not in favour of changing the prototype of sprintf, with or without the protection of a version or 'use feature' guard. -- I thought I was wrong once, but I was mistaken.


This service is sponsored and maintained by Best Practical Solutions and runs on Perl.org infrastructure.

For issues related to this RT instance (aka "perlbug"), please contact perlbug-admin at perl.org