clean up code handling various line endings in toke.c #4373

Open
p5pRT opened this issue Sep 4, 2001 · 8 comments

Comments

@p5pRT

p5pRT commented Sep 4, 2001

Migrated from rt.perl.org#7624 (status was 'open')

Searchable as RT7624$

@p5pRT
Author

p5pRT commented Sep 4, 2001

From crowell01@lightandmatter.com

  OK, I'm a Perl newbie, so forgive my audacity, but I think
I've found something that, if not a bug in Perl, is definitely
a misfeature ;-)

  I use Perl on MacOS X, which can use both Unix and non-Unix newlines.
It seems that when Perl is interpreting a .pm file that contains
non-Unix newlines, it doesn't do either of the things that, IMHO,
would be reasonable: (1) complain about non-Unix newlines, or (2)
treat the non-Unix newlines the same way it would treat Unix
newlines. Instead, it seems to treat the whole file as if it
was empty. The result is an error message saying simply that the
module didn't return a true value, which makes it hard to diagnose
the problem -- my editor (BBEdit) doesn't show any hint of which
flavor of newlines a file contains.

  I don't know for sure, but my guess is that it thinks the
whole file is just one long line, i.e. the newlines are interpreted
as if they were equivalent to blanks or tabs. Since the first line is
a #!/usr/bin/perl comment, the whole file is treated as a comment line.

  I don't know if it would be ridiculously hard to make
Perl agnostic about the flavor of newlines, but I think it might
make sense for it at least to issue a warning if it detects that the
whole file contains no Unix newlines whatsoever, but does contain
a bunch of non-Unix newlines.

  Anyhow, I hope I'm not completely wasting your time here.
Thanks for all the work you folks have put into this wonderful
piece of free software!

  Ben Crowell

@p5pRT
Author

p5pRT commented Jul 25, 2008

From p5p@spam.wizbit.be

This issue is not limited to Mac OS X.

Example (on linux):

$ echo -en 'print "1: " . __FILE__;\n# comment\nprint "2: " . __FILE__;\n;1;\n' > Foo_n.pm
$ perl -MFoo_n -le1
1: Foo_n.pm
2: Foo_n.pm

$ echo -en 'print "1: " . __FILE__;\r\n# comment\r\nprint "2: " . __FILE__;\r\n;1;\r\n' > Foo_rn.pm
$ perl -MFoo_rn -le1
1: Foo_rn.pm
2: Foo_rn.pm

$ echo -en 'print "1: " . __FILE__;\r# comment\rprint "2: " . __FILE__;\r;1;\r' > Foo_r.pm
$ perl -MFoo_r -le1
1: Foo_r.pm

(note the missing 2​: Foo_r.pm line)
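
Until the lexer handles this, one possible workaround is to normalize the file before Perl sees it. The following is a minimal sketch, not something proposed in this ticket: `normalize_newlines` is a hypothetical helper, and it assumes an awk that understands the `\r` escape in regular expressions.

```shell
# Hypothetical helper: rewrite \r\n, lone \r, and \n line endings as \n.
# awk splits records on \n, so a \r at the end of a record is the tail
# of a CRLF pair (or the file's final lone \r); any other \r is a bare
# line break.
normalize_newlines() {
    awk '{ sub(/\r$/, ""); gsub(/\r/, "\n"); print }' "$1"
}

# Usage, with the file name from the example above:
#   normalize_newlines Foo_r.pm > Foo_r_fixed.pm
```

The output always ends with a single \n, even when the input's last line had no terminator.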

@p5pRT
Author

p5pRT commented Aug 9, 2013

From @jkeenan

On Tue Sep 04 14:29:26 2001, crowell01@lightandmatter.com wrote:
[snip]

I use Perl on MacOS X, which can use both Unix and non-Unix newlines.
It seems that when Perl is interpreting a .pm file that contains
non-Unix newlines, it doesn't do either of the things that, IMHO,
would be reasonable: (1) complain about non-Unix newlines, or (2)
treat the non-Unix newlines the same way it would treat Unix
newlines. Instead, it seems to treat the whole file as if it
was empty. The result is an error message saying simply that the
module didn't return a true value, which makes it hard to diagnose
the problem -- my editor (BBEdit) doesn't show any hint of which
flavor of newlines a file contains.

I don't know for sure, but my guess is that it thinks the
whole file is just one long line, i.e. the newlines are interpreted
as if they were equivalent to blanks or tabs. Since the first line is
a #!/usr/bin/perl comment, the whole file is treated as a comment line.

I don't know if it would be ridiculously hard to make
Perl agnostic about the flavor of newlines, but I think it might
make sense for it at least to issue a warning if it detects that the
whole file contains no Unix newlines whatsoever, but does contain
a bunch of non-Unix newlines.

And then, on Fri Jul 25 14:37:08 2008, animator wrote:

This issue is not limited to Mac OS X.

Example (on linux):

$ echo -en 'print "1: " . __FILE__;\n# comment\nprint "2: " . __FILE__;\n;1;\n' > Foo_n.pm
$ perl -MFoo_n -le1
1: Foo_n.pm
2: Foo_n.pm

$ echo -en 'print "1: " . __FILE__;\r\n# comment\r\nprint "2: " . __FILE__;\r\n;1;\r\n' > Foo_rn.pm
$ perl -MFoo_rn -le1
1: Foo_rn.pm
2: Foo_rn.pm

$ echo -en 'print "1: " . __FILE__;\r# comment\rprint "2: " . __FILE__;\r;1;\r' > Foo_r.pm
$ perl -MFoo_r -le1
1: Foo_r.pm

(note the missing 2: Foo_r.pm line)

1. Wouldn't it be more precise to say that this is the result of using
'\r' as the newline character?

$ echo -en 'print "1: " . __FILE__;\r# comment\rprint "2: " . __FILE__;\r;1;\r' > Foo_r.pm

[# be sure to run above without the formatting imposed by RT ]

$ od -c Foo_r.pm
0000000   p   r   i   n   t       "   1   :       "       .       _   _
0000020   F   I   L   E   _   _   ;  \r   #       c   o   m   m   e   n
0000040   t  \r   p   r   i   n   t       "   2   :       "       .    
0000060   _   _   F   I   L   E   _   _   ;  \r   ;   1   ;  \r
0000076

$ view Foo_r.pm

print "1: " . __FILE__;^M# comment^Mprint "2: " . __FILE__;^M;1;^M

2. Is this something we have to worry about these days?

Thank you very much.
Jim Keenan

@p5pRT
Author

p5pRT commented Aug 9, 2013

From @cpansprout

On Thu Aug 08 17:02:52 2013, jkeenan wrote:

$ view Foo_r.pm

print "1: " . __FILE__;^M# comment^Mprint "2: " . __FILE__;^M;1;^M

2. Is this something we have to worry about these days?

toke.c (the lexer) has a *lot* of code to handle precisely this. It
just happens to be about the most buggy code I’ve ever seen. :-)

It really does need to be cleaned up. Either we fix the code
(preferable) or delete it (probably too backward-incompatible).

Anything dealing with line breaks and other whitespace in the lexer is
currently very complex, so this is a daunting task.

It’s definitely a to-do item, and this ticket can represent that item.

--

Father Chrysostomos

@xenu xenu removed the Severity Low label Dec 29, 2021
@khwilliamson khwilliamson changed the title handling of non-Unix newlines handling of files which have just \r meaning newlines May 2, 2022
@khwilliamson
Contributor

@xenu says the only platform we ever supported that used \r for its line breaks was classic Mac, and that is no longer supported by Apple. So I think we can close this

@khwilliamson khwilliamson added the Closable? We might be able to close this ticket, but we need to check with the reporter label May 2, 2022
@hvds
Contributor

hvds commented May 2, 2022

@xenu says the only platform we ever supported that used \r for its line breaks was classic Mac, and that is no longer supported by Apple. So I think we can close this

The most recent preceding comment from @cpansprout still seems relevant: if there's cleanup to do of code attempting to handle this, then either this ticket should be left open or a new ticket should be created for that cleanup.

@jkeenan
Contributor

jkeenan commented Jul 1, 2022

@xenu says the only platform we ever supported that used \r for its line breaks was classic Mac, and that is no longer supported by Apple. So I think we can close this

The most recent preceding comment from @cpansprout still seems relevant: if there's cleanup to do of code attempting to handle this, then either this ticket should be left open or a new ticket should be created for that cleanup.

@hvds, if the code referenced by @cpansprout is ever going to be cleaned up, the discussion explaining that cleanup deserves its own ticket -- not a ticket opened more than 20 years ago.

Would you be able to open such a ticket? (I will then put this one out of its misery.)

Thank you very much.
Jim Keenan

@hvds hvds changed the title handling of files which have just \r meaning newlines clean up code handling various line endings in toke.c Jul 1, 2022
@hvds
Contributor

hvds commented Jul 1, 2022

@hvds, if the code referenced by @cpansprout is ever going to be cleaned up, the discussion explaining that cleanup deserves its own ticket -- not a ticket opened more than 20 years ago.

Would you be able to open such a ticket? (I will then put this one out of its misery.)

The message from @cpansprout on Aug 9, 2013 is pretty clear, I see no particular value in copying it to a new ticket, and even less value to divorcing it from the context that gave rise to it.

There's some work to do here, but we're a volunteer organisation and it won't get done unless someone volunteers to do it. In the meantime it costs us very little to leave the ticket open - about the only way we can add value here is to change the subject of the ticket, so I've done that.

@jkeenan jkeenan removed the Closable? We might be able to close this ticket, but we need to check with the reporter label Jul 8, 2022