Skip Menu |
Report information
Id: 132154
Status: open
Priority: 0/
Queue: perl6

Owner: Nobody
Requestors: alex.jakimenko [at] gmail.com
Cc:
AdminCc:

Severity: (no value)
Tag: (no value)
Platform: (no value)
Patch Status: (no value)
VM: (no value)



Subject: slurp is mangling newlines, it should not (slurp ‘foo’)
Download (untitled) / with headers
text/plain 1.1k
This was noticed in this PR: https://github.com/rakudo/rakudo/pull/1171 Basically, slurp is changing \r\n to \n. This may seem like a reasonable thing, except that then it has to also change \r to \n. However, replacing \r\n seems like a meaningless thing to do anyway. See https://docs.perl6.org/language/traps#Splitting_the_Input_Data_Into_Lines for some context. Long story short, .lines is already splitting the text using all possible variants of a newline. There's no need to mangle the data in slurp. For example, let's say you want to split the data strictly by \n. One way to do it is to call .lines on a Handle, then it works fine. But if it happens that you must slurp the whole thing, this is what you have to do: ‘foo’.IO.slurp(:bin).decode.split(“\n”) Notice how we use :bin to prevent it from doing the decoding, only to call .decode later anyway. All that just because .slurp itself is doing unnecessary data mangling. How to resolve this ticket: Option 1: stop doing what we do now. For example, don't do \r\n → \n substitution if it's under “use v6.d” Option 2: include \r and make sure it acts exactly like .lines.join(“\n”)
I'm shocked to see \r\n→\n translation being defended without any reasoning. Why are we doing it at all?

“\r\n => \n transformation is done consistently” consistently what? And why? Even .IO.lines splitting on \r\n is wrong because it cuts \r that may be part of the data (which is fine in the middle of a line, but not on the end).

On 2017-09-25 05:14:26, cpan@zoffix.com wrote:
Show quoted text
> Some discussion from IRC: https://irclog.perlgeek.de/perl6-dev/2017-
> 09-25#i_15212848


RT-Send-CC: perl6-compiler [...] perl.org
Download (untitled) / with headers
text/plain 1.1k
On Thu, 28 Sep 2017 13:56:49 -0700, alex.jakimenko@gmail.com wrote: Show quoted text
> I'm shocked to see \r\n→\n translation being defended without any > reasoning.
Well, it's just a pretty common thing to do. Perl did it for ages. And IIRC we already had tickets on this because Proc::Async didn't do it. Show quoted text
> Why are we doing it at all?
So you could write code that works the same on Window and Linux. Show quoted text
> “\r\n => \n transformation is done consistently” consistently what?
Proc::Async, IO::Handle, Str.encode, and custom decoders do this consistently. The code the OP mentions just does it in IO::Path because it's a fastpath and the `:translate-nl` option is (probably?) not fully exposed to users yet. Show quoted text
> Basically, slurp is changing \r\n to \n. This may seem like a > reasonable thing, except that then it has to also change \r to \n.
AFAIK we don't support any platform that uses `\r` as line separator character. I don't know why Str.lines cuts on lone `\r` Show quoted text
> Long story short, .lines is already splitting the text using all > possible variants of a newline. There's no need to mangle the data in > slurp.
This isn't about making input `.lines` friendly tho.


This service is sponsored and maintained by Best Practical Solutions and runs on Perl.org infrastructure.

For issues related to this RT instance (aka "perlbug"), please contact perlbug-admin at perl.org