Re: [LONG] Possible utf8 implementation #558

p5pRT · 1999-09-20T02:14:55Z

Migrated from rt.perl.org#1408 (status was 'resolved')

Searchable as RT1408$

p5pRT · 1999-09-20T02:14:55Z

From The RT System itself

I doubt it. Bear in mind \x{beef} is a placeholder for whatever it is
we choose to send out by default. Could just as easily be U+BEEF or
\uBEEF or \N{DEAD COW MEAT} or 뻯 or whatever.

: I don't mind the \x behaviour for error messages, but I'd really hate
: for it to happen when I'm writing what I think is raw data onto a raw socket,
: and the data happens to contain unicode characters.
:
: Especially if I send out '"Content-length: ". length($var) ."\r\n"' before.
:
: If the socket is in raw data mode, and I don't have "use bytes;" in effect,
: it had _better_ die if I try to send a UTF8 string on the wire....

Death is not good. I reject death. I will stay away from trucks today.

If you're trying to send out Content-length without "use bytes" or its
equivalent then you'll get what you deserve. No guarantees will be
made about the length of the resulting string if you make Perl guess
how to translate utf8 to 8-bit. Contrariwise, if you do tell it how to
translate, then it'll do what you expect, up to and including dying
and losing track of some portion of the data you were trying to output.

Larry
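
A minimal sketch of the Content-length point above, assuming a later Perl (5.8 or newer) with the core Encode module, which did not exist when this message was written. The payload in `$var` is made up for illustration. It shows why the character count and the octet count differ once the string holds anything outside ASCII, and that the header needs the octet count taken after an explicit translation.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical payload: 6 characters, two of them outside ASCII.
my $var = "caf\x{e9} \x{beef}";

# Tell Perl explicitly how to translate characters to octets.
my $octets = encode('UTF-8', $var);

my $char_len = length($var);     # counts characters
my $byte_len = length($octets);  # counts octets that would go on the wire

print "characters: $char_len\n";
print "Content-length: $byte_len\r\n";
```

Here length($var) and length($octets) disagree, so a header built from the unencoded string would understate the body size.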

p5pRT · 1999-09-20T02:36:12Z

From The RT System itself

:-)

: If you're trying to send out Content-length without "use bytes" or its
: equivalent then you'll get what you deserve. No guarantees will be
: made about the length of the resulting string if you make Perl guess
: how to translate utf8 to 8-bit. Contrariwise, if you do tell it how to
: translate, then it'll do what you expect, up to and including dying
: and losing track of some portion of the data you were trying to output.

"use bytes;" is inaccurate in many situations. And death isn't _that_
bad, especially seeing as you can throw an "eval{}" around it...

I, as a module author, do not want to be concerned with always testing
every string that gets passed to me. This goes for being compatible with
old code as well, as the old code is DEFINITELY not aware of any utf-8
issues.

At least a -w warning would be desirable. Similar to the warning
generated by:

$ perl -we '$a = "happy"; $b = "sad"; $c = $a + $b'
Name "main::c" used only once: possible typo at -e line 1.
-> Argument "sad" isn't numeric in add at -e line 1.
-> Argument "happy" isn't numeric in add at -e line 1.

Personally, I still think that if "use bytes;" is not used, and "use utf8"
is not used in a module, then it must be assumed that this module may not
know anything about utf8, and if it happens to write a utf8 string to
a socket not marked as being able to write utf8, well then, an error seems
the only *proper* thing to do.

mark

--
markm@nortelnetworks.com/mark@mielke.cc/markm@ncf.ca __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | CUE Development (4Y21)
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ | Nortel Networks
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/
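
For comparison, a rough sketch of the "error instead of guessing" behaviour asked for above, as it can be approximated in later Perls. This assumes a Perl new enough to have the 'utf8' warnings category and fatal warnings; it is not what the 1999 development tree did. The null device stands in for the raw socket, and the variable names are illustrative only.

```perl
use strict;
use warnings;
use warnings FATAL => 'utf8';   # promote "Wide character" warnings to exceptions
use File::Spec;

# A handle with no encoding layer, standing in for the raw socket.
open(my $raw, '>', File::Spec->devnull) or die "open: $!";

my $wide = "\x{beef}";          # cannot be represented in 8 bits

# The eval{} wrapper is the one suggested in the message above.
my $ok  = eval { print {$raw} $wide; 1 };
my $err = $ok ? '' : $@;

print $ok ? "written\n" : "refused: $err";
```

Note that this only catches characters above 0xFF. A string whose characters all fit in 8 bits is written as single bytes without complaint, so the Content-length concern still calls for an explicit encoding step.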

p5pRT · 2003-04-22T15:40:16Z

@iabyn - Status changed from 'stalled' to 'resolved'

p5pRT · 2003-04-22T15:50:20Z

@iabyn - Status changed from 'stalled' to 'resolved'

p5pRT closed this as completed Apr 22, 2003

p5pRT added Severity Low documentation labels Oct 18, 2019

p5pRT mentioned this issue Oct 19, 2019

Program terminated with signal 11, Segmentation fault. #12469

Closed
