Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sv.c: Eliminate LGTM warning Comparison result is always the same #16776

Closed
p5pRT opened this issue Nov 29, 2018 · 18 comments
Closed

sv.c: Eliminate LGTM warning Comparison result is always the same #16776

p5pRT opened this issue Nov 29, 2018 · 18 comments

Comments

@p5pRT
Copy link

p5pRT commented Nov 29, 2018

Migrated from rt.perl.org#133703 (status was 'rejected')

Searchable as RT133703$

@p5pRT
Copy link
Author

p5pRT commented Nov 29, 2018

From @jkeenan

This ticket will hold a patch intended to suppress a warning reporting
by LGTM.com analysis of the Perl 5 core distribution.

The relevant part of sv.c is​:

#####
  8486 if (rspara) { /* have to do this both before and after */
  8487 do { /* to make sure file boundaries work right */
  8488 if (PerlIO_eof(fp))
  8489 return 0;
  8490 i = PerlIO_getc(fp);
  8491 if (i != '\n') {
  8492 if (i == -1)
  8493 return 0;
  8494 PerlIO_ungetc(fp,i);
  8495 break;
  8496 }
  8497 } while (i != EOF);
  8498 }
#####

Warning​: "Comparison result is always the same"
Analysis​:
https://lgtm.com/projects/g/Perl/perl5/alerts/?mode=tree&ruleFocus=2154840804
Warnings rule​: https://lgtm.com/rules/2154840804/

I'll attach the patch once I get an RT number.

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Nov 29, 2018

From @jkeenan

Summary of my perl5 (revision 5 version 29 subversion 6) configuration​:
  Derived from​: fdfb42a
  Platform​:
  osname=linux
  osvers=4.15.0-39-generic
  archname=x86_64-linux
  uname='linux zareason 4.15.0-39-generic #42-ubuntu smp tue oct 23 15​:48​:01 utc 2018 x86_64 x86_64 x86_64 gnulinux '
  config_args='-des -Dusedevel'
  hint=recommended
  useposix=true
  d_sigaction=define
  useithreads=undef
  usemultiplicity=undef
  use64bitint=define
  use64bitall=define
  uselongdouble=undef
  usemymalloc=n
  default_inc_excludes_dot=define
  bincompat5005=undef
  Compiler​:
  cc='cc'
  ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
  optimize='-O2'
  cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
  ccversion=''
  gccversion='7.3.0'
  gccosandvers=''
  intsize=4
  longsize=8
  ptrsize=8
  doublesize=8
  byteorder=12345678
  doublekind=3
  d_longlong=define
  longlongsize=8
  d_longdbl=define
  longdblsize=16
  longdblkind=3
  ivtype='long'
  ivsize=8
  nvtype='double'
  nvsize=8
  Off_t='off_t'
  lseeksize=8
  alignbytes=8
  prototype=define
  Linker and Libraries​:
  ld='cc'
  ldflags =' -fstack-protector-strong -L/usr/local/lib'
  libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/7/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /lib64 /usr/lib64
  libs=-lpthread -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc
  perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
  libc=libc-2.27.so
  so=so
  useshrplib=false
  libperl=libperl.a
  gnulibc_version='2.27'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs
  dlext=so
  d_dlsymun=undef
  ccdlflags='-Wl,-E'
  cccdlflags='-fPIC'
  lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl)​:
  Compile-time options​:
  HAS_TIMES
  PERLIO_LAYERS
  PERL_COPY_ON_WRITE
  PERL_DONT_CREATE_GVSV
  PERL_MALLOC_WRAP
  PERL_OP_PARENT
  PERL_PRESERVE_IVUV
  PERL_USE_DEVEL
  USE_64_BIT_ALL
  USE_64_BIT_INT
  USE_LARGE_FILES
  USE_LOCALE
  USE_LOCALE_COLLATE
  USE_LOCALE_CTYPE
  USE_LOCALE_NUMERIC
  USE_LOCALE_TIME
  USE_PERLIO
  USE_PERL_ATOF
  Built under linux
  Compiled at Nov 28 2018 21​:59​:22
  %ENV​:
  PERL2DIR="/home/jkeenan/gitwork/perl2"
  PERLBREW_HOME="/home/jkeenan/.perlbrew"
  PERLBREW_MANPATH="/home/jkeenan/perl5/perlbrew/perls/perl-5.28.0/man"
  PERLBREW_PATH="/home/jkeenan/perl5/perlbrew/bin​:/home/jkeenan/perl5/perlbrew/perls/perl-5.28.0/bin"
  PERLBREW_PERL="perl-5.28.0"
  PERLBREW_ROOT="/home/jkeenan/perl5/perlbrew"
  PERLBREW_SHELLRC_VERSION="0.84"
  PERLBREW_VERSION="0.84"
  PERL_WORKDIR="/home/jkeenan/gitwork/perl"
  @​INC​:
  lib
  /usr/local/lib/perl5/site_perl/5.29.6/x86_64-linux
  /usr/local/lib/perl5/site_perl/5.29.6
  /usr/local/lib/perl5/5.29.6/x86_64-linux
  /usr/local/lib/perl5/5.29.6

@p5pRT
Copy link
Author

p5pRT commented Nov 29, 2018

From @jkeenan

On Thu, 29 Nov 2018 20​:47​:22 GMT, jkeenan@​pobox.com wrote​:

This ticket will hold a patch intended to suppress a warning reporting
by LGTM.com analysis of the Perl 5 core distribution.

The relevant part of sv.c is​:

#####
8486 if (rspara) { /* have to do this both before and after
*/
8487 do { /* to make sure file boundaries work right */
8488 if (PerlIO_eof(fp))
8489 return 0;
8490 i = PerlIO_getc(fp);
8491 if (i != '\n') {
8492 if (i == -1)
8493 return 0;
8494 PerlIO_ungetc(fp,i);
8495 break;
8496 }
8497 } while (i != EOF);
8498 }
#####

Warning​: "Comparison result is always the same"
Analysis​:
https://lgtm.com/projects/g/Perl/perl5/alerts/?mode=tree&ruleFocus=2154840804
Warnings rule​: https://lgtm.com/rules/2154840804/

I'll attach the patch once I get an RT number.

Thank you very much.
Jim Keenan

Branch​: smoke-me/jkeenan/133703-sv-c

Patch​: 133703-0001-If-comparison-is-always-true-no-need-for-do-while-lo.patch

Note​: While the branch has been passing its test, we have to be alert to the possibility that the test suite might never have exercised the code in question.

Thank you very much.
--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Nov 29, 2018

From @jkeenan

133703-0001-If-comparison-is-always-true-no-need-for-do-while-lo.patch
From 3bef985dd028c3af8cfd667cf164452c83159fad Mon Sep 17 00:00:00 2001
From: James E Keenan <jkeenan@cpan.org>
Date: Tue, 27 Nov 2018 22:08:40 -0500
Subject: [PATCH] If comparison is always true, no need for do-while loop.

LGTM analysis: https://lgtm.com/projects/g/Perl/perl5/alerts/?mode=tree&ruleFocus=2154840804

LGTM rule: https://lgtm.com/rules/2154840804/
---
 sv.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/sv.c b/sv.c
index 826228bc92..73b0659d7d 100644
--- a/sv.c
+++ b/sv.c
@@ -8483,18 +8483,16 @@ Perl_sv_gets(pTHX_ SV *const sv, PerlIO *const fp, I32 append)
      * null assign is a placeholder. */
     rslast = rslen ? rsptr[rslen - 1] : '\0';
 
-    if (rspara) {		/* have to do this both before and after */
-	do {			/* to make sure file boundaries work right */
-	    if (PerlIO_eof(fp))
-		return 0;
-	    i = PerlIO_getc(fp);
-	    if (i != '\n') {
-		if (i == -1)
-		    return 0;
-		PerlIO_ungetc(fp,i);
-		break;
-	    }
-	} while (i != EOF);
+    if (rspara) {        /* have to do this both before and after */
+                         /* to make sure file boundaries work right */
+        if (PerlIO_eof(fp))
+            return 0;
+        i = PerlIO_getc(fp);
+        if (i != '\n') {
+            if (i == -1)
+                return 0;
+            PerlIO_ungetc(fp,i);
+        }
     }
 
     /* See if we know enough about I/O mechanism to cheat it ! */
-- 
2.17.1

@p5pRT
Copy link
Author

p5pRT commented Nov 29, 2018

From [Unknown Contact. See original ticket]

On Thu, 29 Nov 2018 20​:47​:22 GMT, jkeenan@​pobox.com wrote​:

This ticket will hold a patch intended to suppress a warning reporting
by LGTM.com analysis of the Perl 5 core distribution.

The relevant part of sv.c is​:

#####
8486 if (rspara) { /* have to do this both before and after
*/
8487 do { /* to make sure file boundaries work right */
8488 if (PerlIO_eof(fp))
8489 return 0;
8490 i = PerlIO_getc(fp);
8491 if (i != '\n') {
8492 if (i == -1)
8493 return 0;
8494 PerlIO_ungetc(fp,i);
8495 break;
8496 }
8497 } while (i != EOF);
8498 }
#####

Warning​: "Comparison result is always the same"
Analysis​:
https://lgtm.com/projects/g/Perl/perl5/alerts/?mode=tree&ruleFocus=2154840804
Warnings rule​: https://lgtm.com/rules/2154840804/

I'll attach the patch once I get an RT number.

Thank you very much.
Jim Keenan

Branch​: smoke-me/jkeenan/133703-sv-c

Patch​: 133703-0001-If-comparison-is-always-true-no-need-for-do-while-lo.patch

Note​: While the branch has been passing its test, we have to be alert to the possibility that the test suite might never have exercised the code in question.

Thank you very much.
--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Nov 29, 2018

From @iabyn

On Thu, Nov 29, 2018 at 12​:53​:11PM -0800, James E Keenan via RT wrote​:

- if (rspara) { /* have to do this both before and after */
- do { /* to make sure file boundaries work right */
- if (PerlIO_eof(fp))
- return 0;
- i = PerlIO_getc(fp);
- if (i != '\n') {
- if (i == -1)
- return 0;
- PerlIO_ungetc(fp,i);
- break;
- }
- } while (i != EOF);
+ if (rspara) { /* have to do this both before and after */
+ /* to make sure file boundaries work right */
+ if (PerlIO_eof(fp))
+ return 0;
+ i = PerlIO_getc(fp);
+ if (i != '\n') {
+ if (i == -1)
+ return 0;
+ PerlIO_ungetc(fp,i);
+ }

That doesn't look right - you've completely removed the while loop, which
is there to gobble all extra \n's at the end of a paragraph when in
paragraph read mode ($/ = ""). With the new code, only a single \n will be
consumed.

Its odd that it doesn't trigger any test failures - its possible that some
other bit of code also gobbles and thus this gobble is redundant.

The condition which the code analyser was complaining about is that
(i != EOF) is always true, which means the code could better be written
as an unconditional loop​:

  do {
  ...
  } while (1);

or more simply,

  while (1) {
  ....
  }

--
Never work with children, animals, or actors.

@p5pRT
Copy link
Author

p5pRT commented Nov 30, 2018

@jkeenan - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Nov 30, 2018

From @jkeenan

On Thu, 29 Nov 2018 23​:19​:36 GMT, davem wrote​:

On Thu, Nov 29, 2018 at 12​:53​:11PM -0800, James E Keenan via RT wrote​:

- if (rspara) { /* have to do this both before and after */
- do { /* to make sure file boundaries work right */
- if (PerlIO_eof(fp))
- return 0;
- i = PerlIO_getc(fp);
- if (i != '\n') {
- if (i == -1)
- return 0;
- PerlIO_ungetc(fp,i);
- break;
- }
- } while (i != EOF);
+ if (rspara) { /* have to do this both before and after */
+ /* to make sure file boundaries work right */
+ if (PerlIO_eof(fp))
+ return 0;
+ i = PerlIO_getc(fp);
+ if (i != '\n') {
+ if (i == -1)
+ return 0;
+ PerlIO_ungetc(fp,i);
+ }

That doesn't look right - you've completely removed the while loop, which
is there to gobble all extra \n's at the end of a paragraph when in
paragraph read mode ($/ = ""). With the new code, only a single \n will be
consumed.

Its odd that it doesn't trigger any test failures - its possible that some
other bit of code also gobbles and thus this gobble is redundant.

The condition which the code analyser was complaining about is that
(i != EOF) is always true, which means the code could better be written
as an unconditional loop​:

do \{
    \.\.\.
\} while \(1\);

or more simply,

while \(1\) \{
    \.\.\.\.
\}

I agree that the absence of test breakage is puzzling. To investigate further, I wrote an additional program, 133703-paragraph-mode.t, attached. This program PASSes in perl-5.28.0 and in my branch.

I then created another branch, jkeenan/davem/133703-sv-c, in which I implemented your suggestion to replace the do..while loop with a simple while(1) loop. This branch PASSes both 133703-paragraph-mode.t and 'make test_harness'.

The fact that everything is PASSing is puzzling.

Thank you very much.
--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Nov 30, 2018

From @jkeenan

133703-0001-If-comparison-is-always-true-while-1-should-suffice.patch
From 06e2209b8023d04c7fae3cc01e4aba4bc279dae0 Mon Sep 17 00:00:00 2001
From: James E Keenan <jkeenan@cpan.org>
Date: Fri, 30 Nov 2018 10:33:07 -0500
Subject: [PATCH] If comparison is always true, while (1) should suffice.

Per suggestion by Dave Mitchell in RT 133703.
---
 sv.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/sv.c b/sv.c
index 826228bc92..3dc5a4fe8d 100644
--- a/sv.c
+++ b/sv.c
@@ -8483,18 +8483,19 @@ Perl_sv_gets(pTHX_ SV *const sv, PerlIO *const fp, I32 append)
      * null assign is a placeholder. */
     rslast = rslen ? rsptr[rslen - 1] : '\0';
 
-    if (rspara) {		/* have to do this both before and after */
-	do {			/* to make sure file boundaries work right */
-	    if (PerlIO_eof(fp))
-		return 0;
-	    i = PerlIO_getc(fp);
-	    if (i != '\n') {
-		if (i == -1)
-		    return 0;
-		PerlIO_ungetc(fp,i);
-		break;
-	    }
-	} while (i != EOF);
+    if (rspara) {        /* have to do this both before and after */
+                         /* to make sure file boundaries work right */
+        while (1) {
+            if (PerlIO_eof(fp))
+                return 0;
+            i = PerlIO_getc(fp);
+            if (i != '\n') {
+                if (i == -1)
+                    return 0;
+                PerlIO_ungetc(fp,i);
+                break;
+            }
+        }
     }
 
     /* See if we know enough about I/O mechanism to cheat it ! */
-- 
2.17.1

@p5pRT
Copy link
Author

p5pRT commented Nov 30, 2018

From @jkeenan

use strict;
use warnings;
use Test​::More;
use Data​::Dumper;
use File​::Temp qw( tempfile );

my ($fh, $filename) = tempfile();
my @​chunks = (
  [ 1..3 ],
  [ 4..6 ],
  [ 7..9 ],
  [ 10 ],
);

open my $OUT, '>', $filename or die;
print $OUT "$_\n" for (
  @​{$chunks[0]},
  ("") x 1,
  @​{$chunks[1]},
  ("") x 2,
  @​{$chunks[2]},
  ("") x 3,
  @​{$chunks[3]}
);
close $OUT or die;

my @​expected = (
  ( join '' => ( map { "$_\n" } ( @​{$chunks[0]}, "" ) ) ),
  ( join '' => ( map { "$_\n" } ( @​{$chunks[1]}, "" ) ) ),
  ( join '' => ( map { "$_\n" } ( @​{$chunks[2]}, "" ) ) ),
  ( join '' => ( map { "$_\n" } ( @​{$chunks[3]} ) ) ),
);

my @​got;
{
  local $/ = ""; # paragraph mode
  open my $IN, '<', $filename or die;
  push @​got, $_ while (<$IN>);
  close $IN or die;
}

is_deeply(\@​got, \@​expected, q|$/ = "" -- Got expected paragraphs|)
  or print STDERR Dumper [ \@​got, \@​expected ];

done_testing();

@p5pRT
Copy link
Author

p5pRT commented Nov 30, 2018

From @jkeenan

On Fri, 30 Nov 2018 15​:56​:08 GMT, jkeenan wrote​:

On Thu, 29 Nov 2018 23​:19​:36 GMT, davem wrote​:

On Thu, Nov 29, 2018 at 12​:53​:11PM -0800, James E Keenan via RT
wrote​:

- if (rspara) { /* have to do this both before and
after */
- do { /* to make sure file boundaries work
right */
- if (PerlIO_eof(fp))
- return 0;
- i = PerlIO_getc(fp);
- if (i != '\n') {
- if (i == -1)
- return 0;
- PerlIO_ungetc(fp,i);
- break;
- }
- } while (i != EOF);
+ if (rspara) { /* have to do this both before and after
*/
+ /* to make sure file boundaries work
right */
+ if (PerlIO_eof(fp))
+ return 0;
+ i = PerlIO_getc(fp);
+ if (i != '\n') {
+ if (i == -1)
+ return 0;
+ PerlIO_ungetc(fp,i);
+ }

That doesn't look right - you've completely removed the while loop,
which
is there to gobble all extra \n's at the end of a paragraph when in
paragraph read mode ($/ = ""). With the new code, only a single \n
will be
consumed.

Its odd that it doesn't trigger any test failures - its possible that
some
other bit of code also gobbles and thus this gobble is redundant.

The condition which the code analyser was complaining about is that
(i != EOF) is always true, which means the code could better be
written
as an unconditional loop​:

do {
...
} while (1);

or more simply,

while (1) {
....
}

I agree that the absence of test breakage is puzzling. To investigate
further, I wrote an additional program, 133703-paragraph-mode.t,
attached. This program PASSes in perl-5.28.0 and in my branch.

I then created another branch, jkeenan/davem/133703-sv-c, in which I
implemented your suggestion to replace the do..while loop with a
simple while(1) loop. This branch PASSes both 133703-paragraph-mode.t
and 'make test_harness'.

The fact that everything is PASSing is puzzling.

In sv.c in blead we have​:

#####
8386 char *
8387 Perl_sv_gets(pTHX_ SV *const sv, PerlIO *const fp, I32 append)
8388 {
8389 const char *rsptr;
8390 STRLEN rslen;
8391 STDCHAR rslast;
8392 STDCHAR *bp;
8393 SSize_t cnt;
8394 int i = 0;
8395 int rspara = 0;
...
8429 if (IN_PERL_COMPILETIME) {
...
8460 else if (RsPARA(PL_rs)) {
8461 rsptr = "\n\n";
8462 rslen = 2;
8463 rspara = 1;
8464 }
...
8486 if (rspara) { /* have to do this both before and after */
8487 do { /* to make sure file boundaries work right */
8488 if (PerlIO_eof(fp))
8489 return 0;
8490 i = PerlIO_getc(fp);
8491 if (i != '\n') {
8492 if (i == -1)
8493 return 0;
8494 PerlIO_ungetc(fp,i);
8495 break;
8496 }
8497 } while (i != EOF);
8498 }
#####

The last subsection above is the one where LGTM saw an unnecessary comparison.

But then further down, towards the end of the definition of Perl_sv_gets, we have​:

#####
8820 if (rspara) { /* have to do this both before and after */
8821 while (i != EOF) { /* to make sure file boundaries work right */
8822 i = PerlIO_getc(fp);
8823 if (i != '\n') {
8824 PerlIO_ungetc(fp,i);
8825 break;
8826 }
8827 }
8828 }
#####

Is it possible that the test suite (e.g., t/base/rs.t) and my additional program are only exercising the latter block?

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Nov 30, 2018

From @Leont

On Fri, Nov 30, 2018 at 5​:14 PM James E Keenan via RT
<perlbug-followup@​perl.org> wrote​:

In sv.c in blead we have​:

#####
8386 char *
8387 Perl_sv_gets(pTHX_ SV *const sv, PerlIO *const fp, I32 append)
8388 {
8389 const char *rsptr;
8390 STRLEN rslen;
8391 STDCHAR rslast;
8392 STDCHAR *bp;
8393 SSize_t cnt;
8394 int i = 0;
8395 int rspara = 0;
...
8429 if (IN_PERL_COMPILETIME) {
...
8460 else if (RsPARA(PL_rs)) {
8461 rsptr = "\n\n";
8462 rslen = 2;
8463 rspara = 1;
8464 }
...
8486 if (rspara) { /* have to do this both before and after */
8487 do { /* to make sure file boundaries work right */
8488 if (PerlIO_eof(fp))
8489 return 0;
8490 i = PerlIO_getc(fp);
8491 if (i != '\n') {
8492 if (i == -1)
8493 return 0;
8494 PerlIO_ungetc(fp,i);
8495 break;
8496 }
8497 } while (i != EOF);
8498 }
#####

The last subsection above is the one where LGTM saw an unnecessary comparison.

But then further down, towards the end of the definition of Perl_sv_gets, we have​:

#####
8820 if (rspara) { /* have to do this both before and after */
8821 while (i != EOF) { /* to make sure file boundaries work right */
8822 i = PerlIO_getc(fp);
8823 if (i != '\n') {
8824 PerlIO_ungetc(fp,i);
8825 break;
8826 }
8827 }
8828 }
#####

Is it possible that the test suite (e.g., t/base/rs.t) and my additional program are only exercising the latter block?

Yes.

The first section is before the actual readline code, and consumes any
newlines iff we're in paragraph mode; then the newline implementation
follows; and then similar code consumes all remaining newlines after
the line. So the most obvious way to trigger the first subsection
would be to create a file that starts with newlines, and read that in
paragraph mode, and indeed that would be broken by your patch.

We clearly should add a test for that.

Leon

@p5pRT
Copy link
Author

p5pRT commented Nov 30, 2018

From @jkeenan

On Fri, 30 Nov 2018 16​:37​:31 GMT, LeonT wrote​:

On Fri, Nov 30, 2018 at 5​:14 PM James E Keenan via RT
<perlbug-followup@​perl.org> wrote​:

In sv.c in blead we have​:

#####
8386 char *
8387 Perl_sv_gets(pTHX_ SV *const sv, PerlIO *const fp, I32 append)
8388 {
8389 const char *rsptr;
8390 STRLEN rslen;
8391 STDCHAR rslast;
8392 STDCHAR *bp;
8393 SSize_t cnt;
8394 int i = 0;
8395 int rspara = 0;
...
8429 if (IN_PERL_COMPILETIME) {
...
8460 else if (RsPARA(PL_rs)) {
8461 rsptr = "\n\n";
8462 rslen = 2;
8463 rspara = 1;
8464 }
...
8486 if (rspara) { /* have to do this both before and after
*/
8487 do { /* to make sure file boundaries work right
*/
8488 if (PerlIO_eof(fp))
8489 return 0;
8490 i = PerlIO_getc(fp);
8491 if (i != '\n') {
8492 if (i == -1)
8493 return 0;
8494 PerlIO_ungetc(fp,i);
8495 break;
8496 }
8497 } while (i != EOF);
8498 }
#####

The last subsection above is the one where LGTM saw an unnecessary
comparison.

But then further down, towards the end of the definition of
Perl_sv_gets, we have​:

#####
8820 if (rspara) { /* have to do this both before and after
*/
8821 while (i != EOF) { /* to make sure file boundaries work
right */
8822 i = PerlIO_getc(fp);
8823 if (i != '\n') {
8824 PerlIO_ungetc(fp,i);
8825 break;
8826 }
8827 }
8828 }
#####

Is it possible that the test suite (e.g., t/base/rs.t) and my
additional program are only exercising the latter block?

Yes.

The first section is before the actual readline code, and consumes any
newlines iff we're in paragraph mode; then the newline implementation
follows; and then similar code consumes all remaining newlines after
the line. So the most obvious way to trigger the first subsection
would be to create a file that starts with newlines, and read that in
paragraph mode, and indeed that would be broken by your patch.

We clearly should add a test for that.

Leon

Please see the latest attachment, 133703-more-paragraph-mode-tests.t. I believe this code exercises the 'if' block at line 8486 of sv.3 in blead.* If in the test file I have set up the expected results correctly, then we have a bug. The output I get from running this test either with perl-5.28.0 (via 'prove') or blead is​:

#####
$ prove -v 133703-more-paragraph-mode-tests.t
133703-more-paragraph-mode-tests.t ..
not ok 1 - Got expected paragraph 1
# Failed test 'Got expected paragraph 1'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '1
# 12345
#
# '
# expected​: '1
# '
not ok 2 - Got expected paragraph 2
# Failed test 'Got expected paragraph 2'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '123456
# '
# expected​: '12345
# '
not ok 3 - Got expected paragraph 3
# Failed test 'Got expected paragraph 3'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: undef
# expected​: '123456
# '
not ok 4 - Got expected paragraph 1
# Failed test 'Got expected paragraph 1'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '1
#
# '
# expected​: '1
# '
not ok 5 - Got expected paragraph 2
# Failed test 'Got expected paragraph 2'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '12345
#
# '
# expected​: '12345
# '
ok 6 - Got expected paragraph 3
not ok 7 - Got expected paragraph 1
# Failed test 'Got expected paragraph 1'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '1
#
# '
# expected​: '1
# '
not ok 8 - Got expected paragraph 2
# Failed test 'Got expected paragraph 2'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '12345
#
# '
# expected​: '12345
# '
ok 9 - Got expected paragraph 3
not ok 10 - Got expected paragraph 1
# Failed test 'Got expected paragraph 1'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '1
# 12345
#
# '
# expected​: '1
# '
not ok 11 - Got expected paragraph 2
# Failed test 'Got expected paragraph 2'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '123456
#
# '
# expected​: '12345
# '
not ok 12 - Got expected paragraph 3
# Failed test 'Got expected paragraph 3'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: undef
# expected​: '123456
# '
1..12
# Looks like you failed 10 tests of 12.
Dubious, test returned 10 (wstat 2560, 0xa00)
Failed 10/12 subtests

Test Summary Report


133703-more-paragraph-mode-tests.t (Wstat​: 2560 Tests​: 12 Failed​: 10)
  Failed tests​: 1-5, 7-8, 10-12
  Non-zero exit status​: 10
Files=1, Tests=12, 0 wallclock secs ( 0.02 usr 0.00 sys + 0.05 cusr 0.01 csys = 0.08 CPU)
Result​: FAIL#####
#####

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Nov 30, 2018

From @jkeenan

#!/usr/bin/env perl
use strict;
use warnings;
use File​::Temp qw( tempfile );
use Test​::More;

my $chunks = [ 1, 12345, 123456 ];
my $teststring;

$teststring = "\n\n\n$chunks->[0]\n$chunks->[1]\n\n$chunks->[2]\n";
test_paragraph_mode($chunks, $teststring);

$teststring = "\n\n\n$chunks->[0]\n\n$chunks->[1]\n\n$chunks->[2]\n";
test_paragraph_mode($chunks, $teststring);

$teststring = "\n\n\n$chunks->[0]\n\n\n$chunks->[1]\n\n$chunks->[2]\n";
test_paragraph_mode($chunks, $teststring);

$teststring = "\n\n\n$chunks->[0]\n$chunks->[1]\n\n$chunks->[2]\n\n";
test_paragraph_mode($chunks, $teststring);

done_testing();

sub test_paragraph_mode {
  my ($chunks, $teststring) = @​_;
  my ($fh, $filename) = tempfile();
  my @​expected = map { "$_\n" } @​{$chunks};

  open my $OUT, '>', $filename or die;
  binmode $OUT;
  print $OUT $teststring;
  close $OUT or die;

  local $/ = ""; # paragraph mode
  open my $IN, '<', $filename or die;
  binmode $IN;
  for (my $i = 0; $i &lt;= $#expected; $i++) {
  my $j = $i+1;
  is(<$IN>, $expected[$i], "Got expected paragraph $j");
  }
  close $IN or die;
}

@p5pRT
Copy link
Author

p5pRT commented Dec 1, 2018

From @jkeenan

On Fri, 30 Nov 2018 23​:36​:06 GMT, jkeenan wrote​:

On Fri, 30 Nov 2018 16​:37​:31 GMT, LeonT wrote​:

On Fri, Nov 30, 2018 at 5​:14 PM James E Keenan via RT
<perlbug-followup@​perl.org> wrote​:

In sv.c in blead we have​:

#####
8386 char *
8387 Perl_sv_gets(pTHX_ SV *const sv, PerlIO *const fp, I32 append)
8388 {
8389 const char *rsptr;
8390 STRLEN rslen;
8391 STDCHAR rslast;
8392 STDCHAR *bp;
8393 SSize_t cnt;
8394 int i = 0;
8395 int rspara = 0;
...
8429 if (IN_PERL_COMPILETIME) {
...
8460 else if (RsPARA(PL_rs)) {
8461 rsptr = "\n\n";
8462 rslen = 2;
8463 rspara = 1;
8464 }
...
8486 if (rspara) { /* have to do this both before and
after
*/
8487 do { /* to make sure file boundaries work right
*/
8488 if (PerlIO_eof(fp))
8489 return 0;
8490 i = PerlIO_getc(fp);
8491 if (i != '\n') {
8492 if (i == -1)
8493 return 0;
8494 PerlIO_ungetc(fp,i);
8495 break;
8496 }
8497 } while (i != EOF);
8498 }
#####

The last subsection above is the one where LGTM saw an unnecessary
comparison.

But then further down, towards the end of the definition of
Perl_sv_gets, we have​:

#####
8820 if (rspara) { /* have to do this both before and
after
*/
8821 while (i != EOF) { /* to make sure file boundaries
work
right */
8822 i = PerlIO_getc(fp);
8823 if (i != '\n') {
8824 PerlIO_ungetc(fp,i);
8825 break;
8826 }
8827 }
8828 }
#####

Is it possible that the test suite (e.g., t/base/rs.t) and my
additional program are only exercising the latter block?

Yes.

The first section is before the actual readline code, and consumes
any
newlines iff we're in paragraph mode; then the newline implementation
follows; and then similar code consumes all remaining newlines after
the line. So the most obvious way to trigger the first subsection
would be to create a file that starts with newlines, and read that in
paragraph mode, and indeed that would be broken by your patch.

We clearly should add a test for that.

Leon

Please see the latest attachment, 133703-more-paragraph-mode-tests.t.
I believe this code exercises the 'if' block at line 8486 of sv.3 in
blead.* If in the test file I have set up the expected results
correctly, then we have a bug. The output I get from running this
test either with perl-5.28.0 (via 'prove') or blead is​:

#####
$ prove -v 133703-more-paragraph-mode-tests.t
133703-more-paragraph-mode-tests.t ..
not ok 1 - Got expected paragraph 1
# Failed test 'Got expected paragraph 1'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '1
# 12345
#
# '
# expected​: '1
# '
not ok 2 - Got expected paragraph 2
# Failed test 'Got expected paragraph 2'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '123456
# '
# expected​: '12345
# '
not ok 3 - Got expected paragraph 3
# Failed test 'Got expected paragraph 3'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: undef
# expected​: '123456
# '
not ok 4 - Got expected paragraph 1
# Failed test 'Got expected paragraph 1'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '1
#
# '
# expected​: '1
# '
not ok 5 - Got expected paragraph 2
# Failed test 'Got expected paragraph 2'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '12345
#
# '
# expected​: '12345
# '
ok 6 - Got expected paragraph 3
not ok 7 - Got expected paragraph 1
# Failed test 'Got expected paragraph 1'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '1
#
# '
# expected​: '1
# '
not ok 8 - Got expected paragraph 2
# Failed test 'Got expected paragraph 2'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '12345
#
# '
# expected​: '12345
# '
ok 9 - Got expected paragraph 3
not ok 10 - Got expected paragraph 1
# Failed test 'Got expected paragraph 1'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '1
# 12345
#
# '
# expected​: '1
# '
not ok 11 - Got expected paragraph 2
# Failed test 'Got expected paragraph 2'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: '123456
#
# '
# expected​: '12345
# '
not ok 12 - Got expected paragraph 3
# Failed test 'Got expected paragraph 3'
# at 133703-more-paragraph-mode-tests.t line 40.
# got​: undef
# expected​: '123456
# '
1..12
# Looks like you failed 10 tests of 12.
Dubious, test returned 10 (wstat 2560, 0xa00)
Failed 10/12 subtests

Test Summary Report
-------------------
133703-more-paragraph-mode-tests.t (Wstat​: 2560 Tests​: 12 Failed​: 10)
Failed tests​: 1-5, 7-8, 10-12
Non-zero exit status​: 10
Files=1, Tests=12, 0 wallclock secs ( 0.02 usr 0.00 sys + 0.05 cusr
0.01 csys = 0.08 CPU)
Result​: FAIL#####
#####

The last test file I attached had some incorrect expectations. Please see the latest attachment, 133703-better-more-paragraph-mode-tests.t, which accounts for the fact that our definition of "paragraph" assumes that the 'graphs are separated by 2 or more newlines but may internally contain single newlines.

My impression is that "local $/ = '';" fails to strip superfluous newlines off the *first* paragraph it finds -- but this needs more eyeballs.

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Dec 1, 2018

From @jkeenan

#!/usr/bin/env perl
#use strict;
use 5.14.0;
use warnings;
use File​::Temp qw( tempfile );
use Test​::More;

my $chunks = [ 1, 12345, 123456 ];
my $teststring;

$teststring = "\n\n\n$chunks->[0]\n\n$chunks->[1]\n\n$chunks->[2]\n";
test_paragraph_mode($chunks, $teststring);
simulate_paragraph_mode($chunks, $teststring);

$teststring = "\n\n\n$chunks->[0]\n\n$chunks->[1]\n\n\n$chunks->[2]\n";
test_paragraph_mode($chunks, $teststring);
simulate_paragraph_mode($chunks, $teststring);

$teststring = "\n\n\n$chunks->[0]\n\n\n$chunks->[1]\n\n$chunks->[2]\n";
test_paragraph_mode($chunks, $teststring);
simulate_paragraph_mode($chunks, $teststring);

$teststring = "\n\n\n$chunks->[0]\n\n$chunks->[1]\n\n$chunks->[2]";
test_paragraph_mode($chunks, $teststring);
simulate_paragraph_mode($chunks, $teststring);

note("XXX​: Paragraphs with internal newlines");

$teststring = "\n\n\n$chunks->[0]\n$chunks->[1]\n\n$chunks->[2]\n";
test_paragraph_mode_internal_newline(
  $teststring,
  [ "$chunks->[0]\n$chunks->[1]\n", "$chunks->[2]\n" ]
);

$teststring = "\n\n\n$chunks->[0]\n$chunks->[1]\n\n\n$chunks->[2]\n";
test_paragraph_mode_internal_newline(
  $teststring,
  [ "$chunks->[0]\n$chunks->[1]\n", "$chunks->[2]\n" ]
);

$teststring = "\n\n\n$chunks->[0]\n$chunks->[1]\n\n$chunks->[2]";
test_paragraph_mode_internal_newline(
  $teststring,
  [ "$chunks->[0]\n$chunks->[1]\n", $chunks->[2] ]
);

done_testing();

##### SUBROUTINES #####

sub test_paragraph_mode {
  my ($chunks, $teststring) = @​_;
  my ($fh, $filename) = tempfile();
  my @​expected = map { "$_\n" } @​{$chunks};

  open my $OUT, '>', $filename or die;
  binmode $OUT;
  print $OUT $teststring;
  close $OUT or die;

  local $/ = ""; # paragraph mode
  open my $IN, '<', $filename or die;
  binmode $IN;
  for (my $i = 0; $i &lt;= $#expected; $i++) {
  my $j = $i+1;
  is(<$IN>, $expected[$i], "Got expected paragraph $j");
  }
  close $IN or die;
}

sub test_paragraph_mode_internal_newline {
  my ($teststring, $expect) = @​_;
  my ($fh, $filename) = tempfile();

  open my $OUT, '>', $filename or die;
  binmode $OUT;
  print $OUT $teststring;
  close $OUT or die;

  local $/ = ""; # paragraph mode
  open my $IN, '<', $filename or die;
  binmode $IN;
  for (my $i = 0; $i &lt;= $#{$expect}; $i++) {
  my $j = $i+1;
  is(<$IN>, $expect->[$i], "Got expected paragraph $j");
  }
  close $IN or die;
}

sub simulate_paragraph_mode {
  my ($chunks, $teststring) = @​_;
  my ($ts) = $teststring =~ s/^\n*//r;
  #print "AAA​: $ts\n";
  $ts =~ tr/\n/\n/s;
  is_deeply([ split /\n/, $ts ], $chunks, "Got expected simulation");
}

@p5pRT
Copy link
Author

p5pRT commented Dec 12, 2018

From @jkeenan

On Sat, 01 Dec 2018 04​:49​:35 GMT, jkeenan wrote​:
[snip]

The last test file I attached had some incorrect expectations. Please
see the latest attachment, 133703-better-more-paragraph-mode-tests.t,
which accounts for the fact that our definition of "paragraph" assumes
that the 'graphs are separated by 2 or more newlines but may
internally contain single newlines.

My impression is that "local $/ = '';" fails to strip superfluous
newlines off the *first* paragraph it finds -- but this needs more
eyeballs.

Thank you very much.

Closing this ticket and replacing it with https://rt-archive.perl.org/perl5/Ticket/Display.html?id=133722.

Thank you very much.
--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Dec 12, 2018

@jkeenan - Status changed from 'open' to 'rejected'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant