Setting $/ to read fixed records can corrupt valid UTF-8 input #10865
Comments
From @nwc10
Created by @nwc10
It's possible to get the perl interpreter to have corrupt internal state on [Command-line -C7 sets UTF-8 on STD{IN,OUT,ERR}, and $/ = \4096 sets reads to
$ ./perl -C7 -e 'print "\x{20AC}" x 1366' | ./perl -C7 -e '$/ = \4096; $_ = <>; printf "%s\n", length $_'
Note that unlike other concerns with the utf8 layer not trapping *in*valid
Clearer to see is:
$ ./perl -C7 -e 'print "\x{20AC}"' | ./perl -C7 -e '$/ = \2; $_ = <>; printf "%s\n", length $_'
The input is truncated at 2 octets:
$ ./perl -C7 -e 'print "\x{20AC}"' | ./perl -C7 -Ilib -MDevel::Peek -e '$/ = \2; $_ = <>; Dump $_'
The dump should look like this:
$ ./perl -C7 -Ilib -MDevel::Peek -e 'Dump "\x{20AC}"'
Curiously there also seems to be a range checking error in the dump code, as a
$ ./perl -C7 -e 'print "\x{A3}"' | ./perl -Ilib -MDevel::Peek -C7 -we '$/ = \1; $_ = <>; Dump $_'
The relevant code for this problem is in S_sv_gets_read_record(). It's not immediately obvious to me what the correct solution is. On the one hand, the user asked for a fixed record length, and on VMS we use
a: refusing to read on UTF-8 file handles. (make it croak)
Or we could try to do what read and sysread do, and treat the length parameter
Nicholas Clark
|
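The arithmetic behind the first pipeline can be checked without depending on how any particular perl build handles $/. The following is a minimal sketch, assuming only the core Encode module; the variable names are illustrative and not taken from the report. U+20AC is three octets in UTF-8, so 1366 copies occupy 4098 octets and a 4096-octet record ends one octet into the last character.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+20AC (EURO SIGN) is three octets in UTF-8: E2 82 AC.
my $euro_octets = encode("UTF-8", "\x{20AC}");
my $stream      = $euro_octets x 1366;            # 4098 octets in total

# A 4096-octet record holds 1365 whole characters plus one stray octet.
my $record    = substr($stream, 0, 4096);
my $whole     = int(4096 / length($euro_octets));
my $left_over = 4096 % length($euro_octets);

# utf8::decode() returns false when the octets are not well-formed UTF-8.
my $copy     = $record;
my $is_valid = utf8::decode($copy) ? 1 : 0;

printf "octets per character: %d\n", length($euro_octets);
printf "whole characters: %d, stray octets: %d\n", $whole, $left_over;
printf "record is well-formed UTF-8: %s\n", $is_valid ? "yes" : "no";
```

A record cut this way cannot be marked as UTF-8 without the string being malformed, which is the corruption the report describes.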
|
From @Hugmeir
On Mon Nov 29 08:05:05 2010, nicholas wrote:
I'd say make it croak, maybe add a "consider using sysread() or binmode |
The RT System itself - Status changed from 'new' to 'open' |
From PeterCMartini@GMail.com
On Sep 28, 2011, at 12:23, "Brian Fraser via RT" <perlbug-followup@perl.org> wrote:
It's certainly never right, but it happens (not every writing process knows how to truncate UTF-8 properly). I like the idea of croaking immediately and pointing to binmode, hopefully with an additional note about how to recover from byte-wise rather than character-wise truncations. I've always thought of $/ as a tool for reading fixed-length packed records anyway, potentially with mixed binary and text, and utf8 mode would never be appropriate for that. |
From @cpansprout
On Mon Nov 29 08:05:05 2010, nicholas wrote:
I choose b: Someone may be using this feature on space-padded (or |
From @ikegami
On Mon, Nov 29, 2010 at 11:05 AM, Nicholas Clark
It seems that the implication is that (a) or (b) would somehow allow records
--
Scenario 1: Let's say a record consists of two 5 byte fields of UTF-8 text,
a) Croaks. (b) and (c) aren't useful.
--
Scenario 2: Let's say a record consists of two 5 character fields, and that
a) Croaks. (c) is useful, but (b) isn't.
--
So: Since (b) and (c) behave the same on binary handles, it seems to me that (c)
Between (a) and (c), I prefer (c) since I don't see enough justification to
- Eric |
From @ikegami
That should be C<< map decode "UTF-8", unpack "A5A5", $_ >>. |
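The byte-oriented approach in the correction above (read the record as octets, split it with unpack, then decode each field) can be sketched as follows. This is only an illustration under stated assumptions: the in-memory file, its contents and the two-field layout are hypothetical, and the map is written with an explicit block rather than the shorthand quoted in the comment.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical data: each record is two 5-octet, space-padded UTF-8 fields.
# "caf\x{E9}" encodes to 5 octets because U+00E9 takes two.
my $file = encode("UTF-8", "caf\x{E9}") . "abc  "
         . "hello"                      . "x    ";

# Open the octets as a binary (:raw) in-memory file.
open(my $fh, '<:raw', \$file) or die "open: $!";

my @records;
{
    local $/ = \10;                       # 10 octets per record
    while (defined(my $rec = <$fh>)) {
        # Split on octet boundaries first, decode each field afterwards.
        push @records, [ map { decode("UTF-8", $_) } unpack("A5A5", $rec) ];
    }
}
close $fh;

printf "%d records read\n", scalar @records;
```

Because the handle carries no decoding layer, the record length is unambiguous, and a character can only be damaged if the writer itself cut a field mid-character.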
From @cpansprout
On Sun Oct 02 22:24:46 2011, ikegami@adaelis.com wrote:
It can be useful if the records are all single fields.
|
From @ikegami
On Sun, Oct 23, 2011 at 4:56 PM, Father Chrysostomos via RT <
ok, so we have: a: Refusing to read on UTF-8 file handles by croaking. handle aaaaaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbb bytes ok: Returns record ok: Returns record ok: Returns utf8 ok: Croaks ok: Returns "record" ok: Returns utf8 ok: Croaks FAIL: Record can't be ok: Returns - Only (c) can handle both records measured in bytes and records measured in But: - Only (b) is backwards compatible with existing behaviour (although the - Eric |
From @davidnicol
On Mon, Oct 24, 2011 at 1:54 PM, Eric Brine <ikegami@adaelis.com> wrote:
Why would anyone possibly want fixed-length records in chars? Because
d: b, but there is a way to turn off the croaking, and when it has e: a, plus also croak earlier, at compile time if possible, by doing |
From @ikegami
On Tue, Feb 21, 2012 at 11:33 PM, David Nicol <davidnicol@gmail.com> wrote:
If I read that correctly, (d) would mean that no strict "utf8"; # or whatever does when ($i==1) { $buf = "\xE2"; } I don't see how returning bytes that don't even exist in the file is of any - Eric |
From @khwilliamson
I can't find my proposal in the record of this ticket, nor anyone I think we need to do something on this for 5.16. At the minimum, we If even that isn't acceptable, we could add this to the |
From @nwc10
On Thu, Mar 01, 2012 at 10:13:15AM -0800, Karl Williamson via RT wrote:
Specifically, the code is emulated on "everything else", but intended to #ifdef VMS perlvar.pod says: On VMS, record reads are done with the equivalent of C<sysread>,
So I'd like to know, if a programmer on VMS sets $/ to read records, but on (and if the answer is "their head examining", that's actually useful, as it Nicholas Clark |
From @ikegami
On Thu, Mar 1, 2012 at 1:13 PM, Karl Williamson via RT <
Sounds like "can't make everyone happy, so make everyone unhappy". It doesn't help those who want $/ to control the number of elements read by It doesn't help those who want $/ to control the number of elements read by |
From @ikegami
On Thu, Mar 1, 2012 at 2:22 PM, Eric Brine <ikegami@adaelis.com> wrote:
- Eric |
From @craigberry
On Mar 1, 2012, at 12:30 PM, Nicholas Clark wrote:
I think that would require making :utf8 into its own layer with its own buffer, which has been discussed over in [perl #100058].
I don't think this code is as meaningful as it used to be since unix I/O is the bottom layer for PerlIO now. Which means that PerlLIO_read and PerlIO_read (differing only by the "L") are really the same thing, i.e., both boil down to read(). I guess we can't simplify this code until and unless using stdio as the bottom layer is truly deprecated and expunged.
Yes, it's pretty daft to expect whole, varying-width characters to stay whole when you can only get a fixed-width chunk at a time and the chunks are measured in bytes. So far the only difference for VMS that I've thought of derives from this note in the CRTL help entry on read(): The read function does not span record boundaries in a So that means that if you set While it might be less of a corner case and more of a mainstream thing to do on VMS, I can't think of any way that this is substantively different from what would happen on any OS when reading through a pipe or a socket or a PerlIO layer or /dev/mumble that has a fixed-sized buffer measured in bytes. What happens on Unix when you have a pipe buffer that is 8192 bytes and you set $/ to 8193 and read a record containing UTF-8 data through the pipe? ________________________________________ "... getting out of a sonnet is much more |
From @sciurius
Nicholas Clark <nick@ccl4.org> writes:
In VMS, and its predecessor RSX, the purpose of a file with fixed width In flat files systems, you can find record NNN by seeking to position -- Johan |
From @ikegami
On Thu, Mar 1, 2012 at 6:17 PM, Craig A. Berry <craigberry@mac.com> wrote:
Perl requests 8K (formerly 4K) chunks until it has received enough. It |
From @craigberry
On Mar 2, 2012, at 3:07 AM, Eric Brine wrote:
I think you're thinking of the PerlIO buffer that I increased from 4K to the larger of 8K and BUFSIZ in 5.14, and which only applies to the perlio layer. But S_sv_gets_read_record calls PerlIO_read, which just retrieves the base layer (formerly stdio, currently unix) and calls its Read method, which is just read(). So there is no buffering under Perl's control. I was thinking of a situation where something external to Perl limits how much data you can get in one read and thus gives you less than the full amount requested by $/. I'm pretty sure you'll get mangled UTF-8 if you happen to be mid-character when you hit the end of the device buffer. To test this, you'd need to know something about the internals of your system's pipe implementation (or other device with a fixed buffer). |
From @nwc10
On Thu, Mar 01, 2012 at 05:17:27PM -0600, Craig A. Berry wrote:
I don't think you're correct on that one. read() is not stdio. It's (at least On Fri, Mar 02, 2012 at 08:11:10AM -0600, Craig A. Berry wrote:
You mean set $/ to \8193
I don't think that discussing this in terms of what non-VMS does with $/ set http://perl5.git.perl.org/perl.git/commitdiff/5b2b9c687790241e85aa7b76 ) In that, the whole bug report is about "what *should* this do?" because what The reason I'm specifically asking "what does a VMS programmer *want*?" is 1) is there a sane VMS native interpretation of "UTF-8 coming from a fixed and only when that's answered is there 2) what do we fake on other platforms? [and I think it's also premature to consider whether this needs :utf8 as a The possibly useful analogy is "what happens with a :utf8 layer on sysread?" goto more_bytes; ie - it's actually a different behaviour. It makes multiple syscalls. Blech. [and, thinking about it now, about 14 years later, possibly that non-VMS Nicholas Clark |
From @craigberry
On Mar 2, 2012, at 1:51 AM, Johan Vromans wrote:
Not really relevant to the discussion of $/ and UTF-8, but to be pedantically correct, relative access to files with fixed-length records is only one of several random access methods available on VMS. The others don't require the records to be fixed length. |
From @craigberry
On Mar 2, 2012, at 9:15 AM, Nicholas Clark wrote:
Yes, I know read() is not stdio and that fread() is. That was my point. Unless I'm really missing something, there is no fread() involved anymore since unix is the bottom PerlIO layer. The comment in the code about avoiding fread() on VMS was only relevant when stdio was the bottom layer, and I believe it may have been the only layer at all when that comment was written. PerlIO_read (from the branch of the else that you snipped): { used to be a wrapper around fread() when stdio was the bottom layer, but is now a wrapper around read(). Which means that both branches of the if do exactly the same thing: call read(). Which means we could get rid of the VMS-specific code in S_sv_gets_read_record but *only* if we were willing to say that stdio can't ever be the bottom layer (as opposed to no longer being the default bottom layer).
Yes, calling read() grabs a single record. My point was just that unless we configure with -UUSEPERLIO, we'll currently get read() from both branches of that if.
No. Over in [perl #100058] I started but never sent a response to David Nicol's question that may be relevant here: On Fri, Oct 14, 2011 at 3:11 PM, David Nicol <davidnicol@gmail.com> wrote:
I work with record-oriented files, fixed-length and variable-length, almost every day and I have done so off and on for many years. My experience is certainly not comprehensive and my memory may be faulty, but there is no scenario I can remember or imagine where any character interpretation at all (even ASCII) would be imposed on a fixed-length record. The record may very well contain structured data, and some of the fields in it may contain character data. Interpreting that data has nothing to do with processing the records and vice versa.
Yeah, now for the hard part. I'm not much of a language designer and not much of a Unicode wonk, but my feeling is that reading a specific number of characters in one go when the characters are of variable size is a problem that is utterly different from and unrelated to dealing with fixed-length records. It is a third way of defining a record that is as different from defining it by length or delimiter as those two ways are from each other. I understand that something must be done to fix the mayhem that results when imposing :utf8 on a byte stream that may get truncated mid-character. I don't know the best way to do that, but I don't think pretending that the byte stream is not a byte stream makes sense.
Again, I'm almost positive we switched that fread() to read() when we switched the default bottom layer from stdio to unix. |
From @ikegami
On Fri, Mar 2, 2012 at 9:11 AM, Craig A. Berry <craigberry@mac.com> wrote:
I'm not "thinking" anything. I'm reporting what Perl does as seen by
Yes, I'm only reporting what happens on a standard unix build. But isn't I was thinking of a situation where something external to Perl limits how
That's exactly the situation I described. Here, let me provide the strace $ strace perl -e'$/=\40; <>;' < /dev/random
No, because Perl will just ask for more. You'll get mangled UTF-8 if you (If we were talking about sysread instead of readline or read, then yes, it - Eric |
From @craigberry
On Mar 2, 2012, at 11:10 AM, Craig A. Berry wrote:
Sigh. This was only about half right. It's true that there is no fread() involved anymore, and read() sits at the bottom in all cases, but PerlIO_read, as we currently do on non-VMS, does buffering, whereas PerlLIO_read (which in the non-threads case is actually just a macro defined as read) does not. On non-VMS the call stack when using $/ = \N looks like: read syscall whereas on VMS, the call stack looks like: read syscall, aka PerlLIO_read |
From @ikegami
On Fri, Mar 2, 2012 at 2:03 PM, Eric Brine <ikegami@adaelis.com> wrote:
I was thinking of a situation where something external to Perl limits how
And here's an example where one character is read using two reads: $ perl -C -e'print "a"x8191, chr(0x2660)' > x $ ls -l x $ perl -le'use open ":std", ":utf8"; $/=\8194; $_=<>; print $_ eq strace: read(0, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"..., 8192) = 8192 |
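The layout in this example (8191 ASCII characters followed by U+2660) can be reproduced in memory to see why the character needs two reads. This is a hedged sketch: it emulates the two 8192-octet chunks with substr rather than issuing real read() calls, and the helper $is_well_formed is a name introduced here for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Same layout as the example: 8191 "a" followed by U+2660 (three octets).
my $octets = encode("UTF-8", ("a" x 8191) . chr(0x2660));

# Emulate two low-level reads through an 8192-octet buffer.
my $first  = substr($octets, 0, 8192);    # ends with the lead octet E2
my $second = substr($octets, 8192);       # the two continuation octets

# Illustrative helper: true if the octets decode as well-formed UTF-8.
my $is_well_formed = sub { my $copy = shift; utf8::decode($copy) ? 1 : 0 };

printf "total octets: %d\n", length($octets);
printf "first chunk well-formed alone:  %d\n", $is_well_formed->($first);
printf "second chunk well-formed alone: %d\n", $is_well_formed->($second);
printf "joined chunks well-formed:      %d\n", $is_well_formed->($first . $second);
```

Neither chunk is valid on its own, so whatever assembles the record has to wait for the second read before treating the data as characters.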
From @demerphq
On 1 March 2012 20:22, Eric Brine <ikegami@adaelis.com> wrote:
I think that controlling the leftmost layer, iow, bytes read from I mean, maybe the person doing the reading has a supposedly null yves -- |
From @craigberry
On Mar 3, 2012, at 1:29 AM, Eric Brine wrote:
Thanks for clarifying my muddy thinking, Eric. I was neglecting the effects of the buffering layer because it's not used for record mode on VMS and I had erroneously convinced myself that it's not used elsewhere either, but it is. As long as the perlio buffer is larger than the requested record size, it looks like it will insulate you from anything external to Perl giving you less than the requested size. So does your second example demonstrate that if you request something larger than the perlio buffer, then you can get caught mid-character on buffer boundaries as well as record boundaries? And does that first 8192-byte chunk get loaded into an SV that is then invalid if its UTF-8 flag is on? |
From @ikegami
On Sat, Mar 3, 2012 at 6:56 PM, Craig A. Berry <craigberry@mac.com> wrote:
then you can get caught mid-character on buffer boundaries as well as
I'm not sure what "caught" means here. It demonstrates that if a character is split across a chunk boundary (which And does that first 8192-byte chunk get loaded into an SV that is then
I don't know the implementation, so I don't know if it's possible for the I think the real question is: What happens if the second read fails? - Eric |
From @tonycoz
On Sun, Oct 02, 2011 at 01:40:51PM -0700, Father Chrysostomos via RT wrote:
Except it only vaguely accidentally works now.* It's not counting bytes in the file, but bytes read from the top most tony@mars:~$ echo 'ABCDEDGH' | iconv -t UTF16LE | ~/perl-v5.15.8-76-g1cd16d8/bin/perl -MDevel::Peek -e 'binmode STDIN, ":encoding(utf16le)"; $/ = \4; $x = <STDIN>; Dump($x)' If that C< $/ = \4; > were returning that number of bytes from the tony@mars:~$ echo '€ABCDEFGH' | iconv -t UTF16LE | ~/perl-v5.15.8-76-g1cd16d8/bin/perl -MDevel::Peek -e 'binmode STDIN, ":encoding(utf16le)"; $/ = \4; $x = <STDIN>; Dump($x)' In this case I got lucky and managed to consume 4 bytes from the file. Expecting byte semantics out of a stream we've deliberately chosen to I think it should behave like read(). Tony * it isn't specifically this message, but several seem to assume that |
From @rjbs
I think the list came up with three likely ways forward. I don't /think/ either was entirely shot
(1) $/=\10 makes <$fh> behave like read $fh, $str, 10 -- meaning that $str is either (a) ten
(2) $/=\10 being applied to a non-single-width encoded filehandle is fatal on read.
(3) $/=\10 is left working as it does now, but there is a warning when it's in effect for a n-s-
Option (1) seemed to get the least discussion, at least that I saw in my re-reading, but I think
I'm not sure how feasible these are for implementation, as I imagine there will be some grotty |
From @ikegami
On Mon, Mar 12, 2012 at 10:23 AM, Ricardo SIGNES via RT <
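For reference, the read() behaviour that option (1) points to can be observed on a handle with a decoding layer: the length argument counts characters rather than octets. The sketch below assumes an in-memory file and the :encoding(UTF-8) layer; the data is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Three EURO SIGNs: 9 octets in the file, 3 characters once decoded.
my $file = encode("UTF-8", "\x{20AC}" x 3);

open(my $fh, '<:encoding(UTF-8)', \$file) or die "open: $!";

# On a handle with a decoding layer, read() counts characters, not octets.
my $got = read($fh, my $buf, 2);
close $fh;

printf "file holds %d octets\n", length($file);
printf "read() returned %d, buffer length is %d\n", $got, length($buf);
```

Under option (1), a record read with $/ = \2 on the same handle would be expected to return the same two whole characters.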
[...]
If it's the same behaviour as C<read($fh, - Eric meaning that $str is either (a) ten units longer Nit: "Long", not "longer". C<readline> doesn't have anything to which to If you're reading a handle that's decoding a variable length encoding, the
Nit: Any of zero, one or more dips can be needed for either "kind" of |
From @Leont
On Mon Mar 12 07:23:03 2012, rjbs wrote:
It doesn't really work like that. You can't know if a handle is Also, how does this work wrt crlf-translation? That is a similar issue Leon |
From @tonycoz
On Mon, Mar 12, 2012 at 07:23:04AM -0700, Ricardo SIGNES via RT wrote:
Sort of implemented in tonyc/readline-fixed. The biggest issue is VMS, since that falls back to directly calling That makes sense for a VMS record-based file, but if it's not record The other issue is that the :utf8 handle can return invalid utf-8 - Tony |
From @rjbs
* Tony Cook <tony@develop-help.com> [2012-03-16T09:27:55]
Keen!
Okay. I will think about that, but hope to hear from Craig [or any other
This is getting worked on elsewhere and I assume that when it is addressed, -- |
From @craigberry
On Fri, Mar 16, 2012 at 9:05 AM, Ricardo Signes
I've browsed Tony's branch and his commit message correctly notes: - VMS is the elephant in the room - the conditional means that the new It is possible to determine whether a file we have open is <http://perl5.git.perl.org/perl.git/commit/8c8488cd4fce90cb5c03fb3f89e89c05e5275498> I think what Tony is suggesting is that if we are reading from a I think that could work for some definition of work. Adding a third |
From @khwilliamson
On 03/16/2012 10:59 AM, Craig A. Berry wrote:
That is my understanding.
I am not convinced that it is the right thing to do to add this new I believe that it's possible to be "too clever by half" in the sense of |
From @tonycoz
On Sat, Mar 17, 2012 at 12:27:55AM +1100, Tony Cook wrote:
Actually, thinking about it further - it doesn't make sense. Currently reading from a file with $/ = \N under VMS ignores So any encoding layer is ignored. But I suspect fixing that for record oriented files would require a Tony |
From @tonycoz
On Fri, Mar 16, 2012 at 01:32:54PM -0600, Karl Williamson wrote:
This doesn't add a new interpretation, it replaces the existing From "stream is utf-8 ? read N bytes of the stream as encoded in UTF-8
The problem is the current behaviour of getline with C<$/ = \N;> is
1) it ignores the abstraction - the user has put the stream in a mode
2) the byte count only ever matches the underlying stream for utf-8
# read a latin1 stream, see the ticket for other examples
3) getline() with $/ = \N can leave the stream in an inconsistent
4) it can produce SVs marked as UTF-8 with invalid UTF-8 data
My change is intended to produce the same behaviour as read() on the
I can think of some other reasonable behaviours:
a) croak() if the stream is in utf-8 mode - but in general (I think)
b) always call PerlLIO_read() (aka sysread()), but in this case it
c) extend the PerlIO_read_record() mentioned in my note on VMS
Now I can't think of a case where I'd use $/ = \N or read($fh,
Tony
* this is probably a bug in sysread()**:
$ perl -e 'print "foo\xE4"' | perl -MDevel::Peek -e 'binmode STDIN, ":encoding(latin1)"; sysread(STDIN, $x, 4); Dump $x'
** even if it's documented to work that way, ugh
*** I think badly encoded UTF-8 SVs should cause a panic when detected |
From @rjbs
* Karl Williamson <public@khwilliamson.com> [2012-03-16T15:32:54]
I would not say that we are adding a new meaning. I think we are making
I agree: I am not a fan of too much DWIM. I don't like the "figure out I don't see this as a weird made-up behavior to avoid an exception. I So, rather than "who would use this?" I wonder "which design is more The alternative is something like, "It means to read X bytes at a time, So: I think this is easier to explain, which is usually a good sign for But: I am open to being told that I'm wrong and nuts, especially if it -- |
From @craigberry
On Fri, Mar 16, 2012 at 5:10 PM, Tony Cook <tony@develop-help.com> wrote:
That's what it already does on VMS, but it currently does it The patch introduces a couple of new test failures (t/io/errno.t and The patch doesn't do anything for the problem with reading |
From @craigberry
0001-Only-handle-PL_rs-differently-on-VMS-for-record-orie.patch
From 4bce77cf02b9db2e4ca0e81ba33d9c3500faa1ff Mon Sep 17 00:00:00 2001
From: "Craig A. Berry" <craigberry@mac.com>
Date: Fri, 16 Mar 2012 14:20:29 -0500
Subject: [PATCH] Only handle PL_rs differently on VMS for record-oriented
files.
For stream-oriented files, the effects of buffering and other
layers should be exactly as they are on other platforms. For true,
record-oriented files, though, setting $/ = \number must provide
exactly one low-level read per record. If a read were ever to
return less than a full record (due to, for example, filling up
the perlio buffer), a subsequent read would get the *next* record,
losing whatever data remained in the partially-read record.
---
sv.c | 29 +++++++++++++++++++----------
1 files changed, 19 insertions(+), 10 deletions(-)
diff --git a/sv.c b/sv.c
index 6a303cc..6a784d7 100644
--- a/sv.c
+++ b/sv.c
@@ -7561,22 +7561,31 @@ S_sv_gets_read_record(pTHX_ SV *const sv, PerlIO *const fp, I32 append)
const STRLEN recsize = SvUV(SvRV(PL_rs)); /* RsRECORD() guarantees > 0. */
/* Grab the size of the record we're getting */
char *buffer = SvGROW(sv, (STRLEN)(recsize + append + 1)) + append;
+
+ /* Go yank in */
#ifdef VMS
+#include <rms.h>
int fd;
-#endif
+ Stat_t st;
- /* Go yank in */
-#ifdef VMS
- /* VMS wants read instead of fread, because fread doesn't respect */
- /* RMS record boundaries. This is not necessarily a good thing to be */
- /* doing, but we've got no other real choice - except avoid stdio
- as implementation - perhaps write a :vms layer ?
- */
+ /* With a true, record-oriented file on VMS, we need to use read directly
+ * to ensure that we respect RMS record boundaries. The user is responsible
+ * for providing a PL_rs value that corresponds to the FAB$W_MRS (maximum
+ * record size) field. N.B. This is likely to produce invalid results on
+ * varying-width character data when a record ends mid-character.
+ */
fd = PerlIO_fileno(fp);
- if (fd != -1) {
+ if (fd != -1
+ && PerlLIO_fstat(fd, &st) == 0
+ && (st.st_fab_rfm == FAB$C_VAR
+ || st.st_fab_rfm == FAB$C_VFC
+ || st.st_fab_rfm == FAB$C_FIX)) {
+
bytesread = PerlLIO_read(fd, buffer, recsize);
}
- else /* in-memory file from PerlIO::Scalar */
+ else /* in-memory file from PerlIO::Scalar
+ * or not a record-oriented file
+ */
#endif
{
bytesread = PerlIO_read(fp, buffer, recsize);
--
1.7.7.GIT
|
From @ikegami
For those advocating $/ to control the number of bytes read at the lowest
If you're going to create fixed-width record binary file whose records
my $rec = encode($enc, $text);
Why do you find it necessary to then go and treat it as a text file
# :encoding($enc)
when it's just as simple as treating it as a binary file?
# :raw
There is no need for $/ to indicate the number of bytes read at the lowest
- Eric
PS - The problem seems to boil down to the inability to tell Perl how to |
From @tonycoz
On Mon, Mar 12, 2012 at 07:23:04AM -0700, Ricardo SIGNES via RT wrote:
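The workflow argued for above (encode explicitly, then handle the file as binary) might look like the round trip below. It is a sketch under assumptions, not code from the thread: the record size of 8 octets, the UTF-8 encoding, the sample strings and the use of in-memory files are all chosen here for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $enc     = "UTF-8";   # assumed encoding for this sketch
my $recsize = 8;         # assumed record size in octets

# Writer: encode first, then pad the octets out to the record size.
my $file = "";
open(my $out, '>:raw', \$file) or die "open: $!";
for my $text ("na\x{EF}ve", "\x{20AC}5") {
    my $rec = encode($enc, $text);
    die "record too long" if length($rec) > $recsize;
    print {$out} pack("A$recsize", $rec);      # space-padded octets
}
close $out;

# Reader: fetch whole records as octets, strip padding, decode afterwards.
open(my $in, '<:raw', \$file) or die "open: $!";
my @texts;
{
    local $/ = \$recsize;
    while (defined(my $rec = <$in>)) {
        push @texts, decode($enc, unpack("A$recsize", $rec));
    }
}
close $in;

printf "%d octets written, %d records read\n", length($file), scalar @texts;
```

The length check in the writer is what keeps a character from being cut; on the reading side the record size is a plain octet count and no decoding layer is involved.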
An alternative approach for 5.16 - document the badness and warn the Also visible at: http://perl5.git.perl.org/perl.git/shortlog/refs/heads/tonyc/readline-fixed-doc Tony |
From @tonycoz
0001-rt-79960-document-how-broken-N-is-for-unicode-stream.patch
From 1d8aef309f72b162395b7beed4462bc5964d973f Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Sun, 18 Mar 2012 10:26:22 +1100
Subject: [PATCH] [rt #79960] document how broken $/ = \N is for unicode streams
It's kind of late in the release process to change how $/ = \N works
for unicode streams, briefly document how broken it is and let the
user know it may change.
---
pod/perlvar.pod | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/pod/perlvar.pod b/pod/perlvar.pod
index 72968f1..ea1f601 100644
--- a/pod/perlvar.pod
+++ b/pod/perlvar.pod
@@ -1348,6 +1348,13 @@ want to read in record mode is probably unusable in line mode.)
Non-VMS systems do normal I/O, so it's safe to mix record and
non-record reads of a file.
+If you perform a record read on a FILE with an encoding layer such as
+C<:encoding(latin1)> or C<:utf8>, you may get an invalid string as a
+result, may leave the FILE positioned between characters in the stream
+and may not be reading the number of bytes from the underlying file
+that you specified. This behaviour may change without warning in a
+future version of perl.
+
See also L<perlport/"Newlines">. Also see L</$.>.
Mnemonic: / delimits line boundaries when quoting poetry.
--
1.7.2.5
|
From @khwilliamson
On 03/17/2012 04:53 AM, Eric Brine wrote:
I agree, and doing otherwise has been called crazy and insane by others I have come to the opinion that we don't have to accommodate this, |
From @khwilliamson
On 03/16/2012 06:25 PM, Ricardo Signes wrote:
I am persuaded by your argument that this is a reasonable thing to do I do think something should be done for 5.16, and that a warning or |
@tonycoz - Status changed from 'open' to 'resolved' |
From @rjbs
* Tony Cook via RT <perlbug-followup@perl.org> [2012-12-10T18:34:04]
Thanks very much! I'm glad this got applied in the end! -- |
Migrated from rt.perl.org#79960 (status was 'resolved')
Searchable as RT79960$