Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trailing space in heredoc delimiter #15427

Open
p5pRT opened this issue Jul 6, 2016 · 9 comments
Open

Trailing space in heredoc delimiter #15427

p5pRT opened this issue Jul 6, 2016 · 9 comments

Comments

@p5pRT
Copy link

p5pRT commented Jul 6, 2016

Migrated from rt.perl.org#128557 (status was 'open')

Searchable as RT128557$

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2016

From @epa

Created by @epa

The end marker for a here-document must be the exact string, without
trailing spaces. If the heredoc begins with <<END but instead of
'END' you have 'END ', Perl treats that as part of the quoted string
without warning.

I appreciate that this is done to be consistent with Unix tradition.
The Unix shell likewise requires an exact line with the terminator
string and no trailing space. However, I suggest that in this day and
age it may be worth a small break with tradition in order to catch the
mistake of a trailing space after the terminator.

Firstly, many commonly used text editors do not show any visual
difference between a line with trailing space and one without, so two
programs which look identical end up with different behaviour.

Secondly, we can no longer assume that Perl programmers are familiar
with the dark arts of Unix shell scripting, so they will not know
about this 'gotcha'.

Thirdly, Perl has already broken with the behaviour of the shell in
the case of a carriage-return character. A trailing carriage return
after the heredoc terminator is silently ignored, even on platforms
where text files use just a newline character for end of line. The
Unix shell does not do this (or at least GNU bash doesn't). Given
that we have already broken with the classical shell heredoc behaviour
for one kind of whitespace character, why not for others?

In my personal opinion the right fix would be to add a deprecation
warning when the heredoc terminator is seen followed by whitespace and
end-of-line. 'In future Perl versions this will be treated as the end
of the heredoc.' Then that change can indeed be made after a release
cycle or two.

An alternative would be to simply add a warning when the heredoc
terminator appears but has whitespace after it. Although I appreciate
the comments that heredocs are intended to be unrestricted with
'anything' appearing in them, I feel the case for a warning is even
stronger here given the invisibilty of the trailing space characters.

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl 5.22.2:

Configured by Red Hat, Inc. at Mon Jun 13 12:25:51 UTC 2016.

Summary of my perl5 (revision 5 version 22 subversion 2) configuration:
   
  Platform:
    osname=linux, osvers=4.4.9-300.fc23.x86_64, archname=x86_64-linux-thread-multi
    uname='linux buildvm-14.phx2.fedoraproject.org 4.4.9-300.fc23.x86_64 #1 smp wed may 4 23:56:27 utc 2016 x86_64 x86_64 x86_64 gnulinux '
    config_args='-des -Doptimize=none -Dccflags=-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches  -m64 -mtune=generic -Dldflags=-Wl,-z,relro  -Dccdlflags=-Wl,--enable-new-dtags -Wl,-z,relro  -Dlddlflags=-shared -Wl,-z,relro  -Dshrpdir=/usr/lib64 -DDEBUGGING=-g -Dversion=5.22.2 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl5 -Dsitearch=/usr/local/lib64/perl5 -Dprivlib=/usr/share/perl5 -Dvendorlib=/usr/share/perl5/vendor_perl -Darchlib=/usr/lib64/perl5 -Dvendorarch=/usr/lib64/perl5/vendor_perl -Darchname=x86_64-linux-thread-multi -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64 -Duseshrplib -Dusethreads -Duseithreads -Dusedtrace=/usr/bin/dtrace -Duselargefiles -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dscriptdir=/usr/bin -Dusesitecustomize'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fwrapv -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='  -g',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fwrapv -fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='5.3.1 20160406 (Red Hat 5.3.1-6)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags ='-Wl,-z,relro  -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib64 /lib64 /usr/lib64 /usr/local/lib /usr/lib /lib/../lib64 /usr/lib/../lib64 /lib
    libs=-lpthread -lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lpthread -lresolv -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.22.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.22'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,--enable-new-dtags -Wl,-z,relro '
    cccdlflags='-fPIC', lddlflags='-shared -Wl,-z,relro  -L/usr/local/lib -fstack-protector-strong'

Locally applied patches:
    Fedora Patch1: Removes date check, Fedora/RHEL specific
    Fedora Patch3: support for libdir64
    Fedora Patch4: use libresolv instead of libbind
    Fedora Patch5: USE_MM_LD_RUN_PATH
    Fedora Patch6: Skip hostname tests, due to builders not being network capable
    Fedora Patch7: Dont run one io test due to random builder failures
    Fedora Patch15: Define SONAME for libperl.so
    Fedora Patch16: Install libperl.so to -Dshrpdir value
    Fedora Patch22: Document Math::BigInt::CalcEmu requires Math::BigInt (CPAN RT#85015)
    Fedora Patch26: Make *DBM_File desctructors thread-safe (RT#61912)
    Fedora Patch27: Make PadlistNAMES() lvalue again (CPAN RT#101063)
    Fedora Patch28: Make magic vtable writable as a work-around for Coro (CPAN RT#101063)
    Fedora Patch29: Fix duplicating PerlIO::encoding when spawning threads (RT#31923)
    Fedora Patch200: Link XS modules to libperl.so with EU::CBuilder on Linux
    Fedora Patch201: Link XS modules to libperl.so with EU::MM on Linux


@INC for perl 5.22.2:
    /home/eda/lib64/perl5/
    /usr/local/lib64/perl5
    /usr/local/share/perl5
    /usr/lib64/perl5/vendor_perl
    /usr/share/perl5/vendor_perl
    /usr/lib64/perl5
    /usr/share/perl5
    .


Environment for perl 5.22.2:
    HOME=/home/eda
    LANG=en_GB.UTF-8
    LANGUAGE (unset)
    LC_COLLATE=C
    LC_CTYPE=en_GB.UTF-8
    LC_MESSAGES=en_GB.UTF-8
    LC_MONETARY=en_GB.UTF-8
    LC_NUMERIC=en_GB.UTF-8
    LC_TIME=en_GB.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/eda/bin:/home/eda/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/sbin:/home/eda/.local/bin:/home/eda/bin:/sbin:/usr/sbin
    PERL5LIB=/home/eda/lib64/perl5/
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Sep 30, 2017

From @jkeenan

On Wed, 06 Jul 2016 13​:50​:48 GMT, eda@​waniasset.com wrote​:

This is a bug report for perl from eda@​waniasset.com,
generated with the help of perlbug 1.40 running under perl 5.22.2.

-----------------------------------------------------------------
[Please describe your issue here]

The end marker for a here-document must be the exact string, without
trailing spaces. If the heredoc begins with <<END but instead of
'END' you have 'END ', Perl treats that as part of the quoted string
without warning.

I am unsure as to the specific case you are discussing.

In the attachment, the first heredoc is defined with '<<END;'. The second is defined with '<<END ;' In both cases there is no whitespace after the terminator 'END'. Both "work" in the sense that both compile and print -- even though the second definition looks weird.

Is that the case you are concerned about? Or something else?

Thank you very much.
--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Sep 30, 2017

@p5pRT
Copy link
Author

p5pRT commented Sep 30, 2017

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Sep 30, 2017

From @Abigail

On Fri, Sep 29, 2017 at 07​:27​:58PM -0700, James E Keenan via RT wrote​:

On Wed, 06 Jul 2016 13​:50​:48 GMT, eda@​waniasset.com wrote​:

This is a bug report for perl from eda@​waniasset.com,
generated with the help of perlbug 1.40 running under perl 5.22.2.

-----------------------------------------------------------------
[Please describe your issue here]

The end marker for a here-document must be the exact string, without
trailing spaces. If the heredoc begins with <<END but instead of
'END' you have 'END ', Perl treats that as part of the quoted string
without warning.

I am unsure as to the specific case you are discussing.

In the attachment, the first heredoc is defined with '<<END;'. The second is defined with '<<END ;' In both cases there is no whitespace after the terminator 'END'. Both "work" in the sense that both compile and print -- even though the second definition looks weird.

Is that the case you are concerned about? Or something else?

I think the perceived problem is with a trailing space after the
terminator *after* the here doc. As in​:

  print << END;
  This is printed

  END
  This as well, because the previous line
  has a trailing space.

  END

I'm not sure how big a problem this is in practise. In many cases, a
trailing space will lead to a program which doesn't compile. For the
remaining cases, where both the program compiles and doesn't behave
badly, a linter (Perl​::Critic?) ought to do the trick.

Abigail

@p5pRT
Copy link
Author

p5pRT commented Sep 30, 2017

From @jkeenan

On Sat, 30 Sep 2017 16​:18​:58 GMT, abigail@​abigail.be wrote​:

On Fri, Sep 29, 2017 at 07​:27​:58PM -0700, James E Keenan via RT wrote​:

On Wed, 06 Jul 2016 13​:50​:48 GMT, eda@​waniasset.com wrote​:

This is a bug report for perl from eda@​waniasset.com,
generated with the help of perlbug 1.40 running under perl 5.22.2.

-----------------------------------------------------------------
[Please describe your issue here]

The end marker for a here-document must be the exact string,
without
trailing spaces. If the heredoc begins with <<END but instead of
'END' you have 'END ', Perl treats that as part of the quoted
string
without warning.

I am unsure as to the specific case you are discussing.

In the attachment, the first heredoc is defined with '<<END;'. The
second is defined with '<<END ;' In both cases there is no
whitespace after the terminator 'END'. Both "work" in the sense that
both compile and print -- even though the second definition looks
weird.

Is that the case you are concerned about? Or something else?

I think the perceived problem is with a trailing space after the
terminator *after* the here doc. As in​:

print << END;
This is printed

END
This as well, because the previous line
has a trailing space.

END

I'm not sure how big a problem this is in practise. In many cases, a
trailing space will lead to a program which doesn't compile. For the
remaining cases, where both the program compiles and doesn't behave
badly, a linter (Perl​::Critic?) ought to do the trick.

I agree. I think the feature request should be rejected.

Thank you very much.
--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Oct 1, 2017

From @xsawyerx

On 09/30/2017 07​:27 PM, James E Keenan via RT wrote​:

On Sat, 30 Sep 2017 16​:18​:58 GMT, abigail@​abigail.be wrote​:

On Fri, Sep 29, 2017 at 07​:27​:58PM -0700, James E Keenan via RT wrote​:

On Wed, 06 Jul 2016 13​:50​:48 GMT, eda@​waniasset.com wrote​:

This is a bug report for perl from eda@​waniasset.com,
generated with the help of perlbug 1.40 running under perl 5.22.2.

-----------------------------------------------------------------
[Please describe your issue here]

The end marker for a here-document must be the exact string,
without
trailing spaces. If the heredoc begins with <<END but instead of
'END' you have 'END ', Perl treats that as part of the quoted
string
without warning.

I am unsure as to the specific case you are discussing.

In the attachment, the first heredoc is defined with '<<END;'. The
second is defined with '<<END ;' In both cases there is no
whitespace after the terminator 'END'. Both "work" in the sense that
both compile and print -- even though the second definition looks
weird.

Is that the case you are concerned about? Or something else?

I think the perceived problem is with a trailing space after the
terminator *after* the here doc. As in​:

print << END;
This is printed

END
This as well, because the previous line
has a trailing space.

END

I'm not sure how big a problem this is in practise. In many cases, a
trailing space will lead to a program which doesn't compile. For the
remaining cases, where both the program compiles and doesn't behave
badly, a linter (Perl​::Critic?) ought to do the trick.

I agree. I think the feature request should be rejected.

Agreed.

@p5pRT
Copy link
Author

p5pRT commented Oct 1, 2017

From @epa

Thanks for your responses. This bug report is based on real-world code; a lot of code (repeated SQL calls, or printing chunks of output) do have repeated heredoc uses, and if those all use the same delimiter (such as END) it is quite easy to silently include a trailing space -- not shown in most editors -- and end up with code that compiles but is wrong.

I filed the bug report because it's quite awkward that a space character (moreover, an end-of-line one, which are the most difficult to see) has such a large effect. But forget the suggestion of adding a warning. That gets bogged down in the usual back and forth about perlcritic (which I love, and run over the codebase regularly -- but not in every single edit-test cycle). Could I ask instead whether the whitespace-sensitivity here is really needed?

What would happen if heredoc delimiters became like every other language token and weren't affected by space characters after them? They will always be special because they do depend on start of line, but couldn't they be more relaxed about whitespace at the end? Nothing else in the language is as fussy, and we know that with the mixture of different editors in use, trailing spaces tend to creep into code over time. Wouldn't it be a sensible incremental improvement to allow whitespace after these heredoc terminators?

Note that the current behaviour is not exactly consistent​: if you put the start of the heredoc as '<<END ' with a trailing space, it doesn't get terminated by the 'END ' you specified. The trailing space on the starting heredoc delimiter is silently stripped. Why not strip on the closing delimiter too?

@p5pRT
Copy link
Author

p5pRT commented Oct 1, 2017

From @Abigail

On Sun, Oct 01, 2017 at 12​:10​:57PM -0700, Ed Avis via RT wrote​:

Thanks for your responses. This bug report is based on real-world code; a lot of code (repeated SQL calls, or printing chunks of output) do have repeated heredoc uses, and if those all use the same delimiter (such as END) it is quite easy to silently include a trailing space -- not shown in most editors -- and end up with code that compiles but is wrong.

I filed the bug report because it's quite awkward that a space character (moreover, an end-of-line one, which are the most difficult to see) has such a large effect. But forget the suggestion of adding a warning. That gets bogged down in the usual back and forth about perlcritic (which I love, and run over the codebase regularly -- but not in every single edit-test cycle). Could I ask instead whether the whitespace-sensitivity here is really needed?

What would happen if heredoc delimiters became like every other language token and weren't affected by space characters after them? They will always be special because they do depend on start of line, but couldn't they be more relaxed about whitespace at the end? Nothing else in the language is as fussy, and we know that with the mixture of different editors in use, trailing spaces tend to creep into code over time. Wouldn't it be a sensible incremental improvement to allow whitespace after these heredoc terminators?

Note that the current behaviour is not exactly consistent​: if you put the start of the heredoc as '<<END ' with a trailing space, it doesn't get terminated by the 'END ' you specified. The trailing space on the starting heredoc delimiter is silently stripped. Why not strip on the closing delimiter too?

This does what you expect it to do​:

  print << "END ";
  The line below does not end with a space​:
  END
  The line below does end with a space​:
  END

Abigail

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants