Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Carp::croak doesn't quite replace die - missing <fh> line # phrase #11569

Closed
p5pRT opened this issue Aug 9, 2011 · 34 comments
Closed

Carp::croak doesn't quite replace die - missing <fh> line # phrase #11569

p5pRT opened this issue Aug 9, 2011 · 34 comments

Comments

@p5pRT
Copy link

p5pRT commented Aug 9, 2011

Migrated from rt.perl.org#96672 (status was 'resolved')

Searchable as RT96672$

@p5pRT
Copy link
Author

p5pRT commented Aug 9, 2011

From root@overkill.sb.litts.net

Created by tlhackque@yahoo.com

Carp​::croak is documented in the camel book as "working like die does, except misdirecting blame to the caller."

Perldoc says "croak - die of errors (from perspective of caller)".

This is almost true, however careful inspection reveals that croak does not include file position information ( ", <fh> line $." ) as die does (when it can).

It turns out that it *is* possible to determine the current file handle (although I wish it were easier).

Carp should do this and add it to the traceback, as it really is useful information.

Here's how to generate the missing phrase​:

$msg = join( '', @​_ ) . " at $script line $n";
if( defined $. ) {
  local $@​ = '';
  eval {
  die;
  };
  if($@​ =~ /^Died at .*(, &lt;.*?&gt; line \d+).$/ ) {
  $msg .= $1;
  }
}
$msg .= ".\n";

(add traceback)

For extra credit, it would be nice to have an better mechanism to get the current file handle name bound to $. - perhaps $$.?

Test cases (type <cr> to get the output)​:

a) die's behavior​:
  $ perl -MCarp -e 'my $x=<STDIN>; die "xx";'

  xx at -e line 1, <STDIN> line 1.

b) Carp​::croak's behavior​:
  $ perl -MCarp -e 'my $x=<STDIN>; croak "xx";'

  xx at -e line 1

Thanks for your consideration.

Perl Info

Flags:
    category=library
    severity=medium
    module=Carp

This perlbug was built using Perl 5.12.4 in the Fedora build system.
It is being executed now by Perl 5.12.4 - Tue Jun 21 14:57:54 UTC 2011.

Site configuration information for perl 5.12.4:

Configured by Red Hat, Inc. at Tue Jun 21 14:57:54 UTC 2011.

Summary of my perl5 (revision 5 version 12 subversion 4) configuration:
   
  Platform:
    osname=linux, osvers=2.6.32-131.2.1.el6.x86_64, archname=x86_64-linux-thread-multi
    uname='linux x86-02.phx2.fedoraproject.org 2.6.32-131.2.1.el6.x86_64 #1 smp wed may 18 07:07:37 edt 2011 x86_64 x86_64 x86_64 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -Dccdlflags=-Wl,--enable-new-dtags -DDEBUGGING=-g -Dversion=5.12.4 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl5 -Dsitearch=/usr/local/lib64/perl5 -Dprivlib=/usr/share/perl5 -Dvendorlib=/usr/share/perl5/vendor_perl -Darchlib=/usr/lib64/perl5 -Dvendorarch=/usr/lib64/perl5/vendor_perl -Darchname=x86_64-linux-thread-multi -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64 -Duseshrplib -Dusethreads -Duseithreads -Dusedtrace=/usr/bin/dtrace -Duselargefiles -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto!
  -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dscriptdir=/usr/bin'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.6.0 20110530 (Red Hat 4.6.0-9)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -fstack-protector'
    libpth=/usr/local/lib64 /lib64 /usr/lib64
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc -lgdbm_compat
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.14'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,--enable-new-dtags -Wl,-rpath,/usr/lib64/perl5/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic'

Locally applied patches:
    


@INC for perl 5.12.4:
    /usr/local/lib64/perl5
    /usr/local/share/perl5
    /usr/lib64/perl5/vendor_perl
    /usr/share/perl5/vendor_perl
    /usr/lib64/perl5
    /usr/share/perl5
    .


Environment for perl 5.12.4:
    HOME=/root
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/lib64/qt-3.3/bin:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Dec 14, 2011

From @jkeenan

On Tue Aug 09 15​:53​:13 2011, root@​overkill.sb.litts.net wrote​:

Carp​::croak is documented in the camel book as "working like die does,
except misdirecting blame to the caller."

Perldoc says "croak - die of errors (from perspective of caller)".

This is almost true, however careful inspection reveals that croak
does not include file position information ( ", <fh> line $." ) as
die does (when it can).

It turns out that it *is* possible to determine the current file
handle (although I wish it were easier).

Carp should do this and add it to the traceback, as it really is
useful information.

Here's how to generate the missing phrase​:

$msg = join( '', @​_ ) . " at $script line $n";
if( defined $. ) {
local $@​ = '';
eval {
die;
};
if($@​ =~ /^Died at .*(, &lt;.*?&gt; line \d+).$/ ) {
$msg .= $1;
}
}
$msg .= ".\n";

(add traceback)
[snip]

Would it be possible for you to prepare this as a patch to
dist/Carp/lib/Carp.pm?

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Dec 14, 2011

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Dec 14, 2011

From @doy

On Wed, Dec 14, 2011 at 03​:45​:10PM -0800, James E Keenan via RT wrote​:

On Tue Aug 09 15​:53​:13 2011, root@​overkill.sb.litts.net wrote​:

Carp​::croak is documented in the camel book as "working like die does,
except misdirecting blame to the caller."

Perldoc says "croak - die of errors (from perspective of caller)".

This is almost true, however careful inspection reveals that croak
does not include file position information ( ", <fh> line $." ) as
die does (when it can).

It turns out that it *is* possible to determine the current file
handle (although I wish it were easier).

Carp should do this and add it to the traceback, as it really is
useful information.

Here's how to generate the missing phrase​:

$msg = join( '', @​_ ) . " at $script line $n";
if( defined $. ) {
local $@​ = '';
eval {
die;
};
if($@​ =~ /^Died at .*(, &lt;.*?&gt; line \d+).$/ ) {
$msg .= $1;
}
}
$msg .= ".\n";

(add traceback)
[snip]

Would it be possible for you to prepare this as a patch to
dist/Carp/lib/Carp.pm?

Thank you very much.
Jim Keenan

It is also not quite equivalent to die because it is missing the
trailing period​:

  $ perl -E'die foo'
  foo at -e line 1.
  $ perl -E'use Carp; croak foo'
  foo at -e line 1

I wonder if fixing this would be worth it or not.

-doy

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2011

From @cpansprout

On Wed Dec 14 15​:55​:20 2011, doy@​tozt.net wrote​:

On Wed, Dec 14, 2011 at 03​:45​:10PM -0800, James E Keenan via RT wrote​:

On Tue Aug 09 15​:53​:13 2011, root@​overkill.sb.litts.net wrote​:

Carp​::croak is documented in the camel book as "working like die does,
except misdirecting blame to the caller."

Perldoc says "croak - die of errors (from perspective of caller)".

This is almost true, however careful inspection reveals that croak
does not include file position information ( ", <fh> line $." ) as
die does (when it can).

It turns out that it *is* possible to determine the current file
handle (although I wish it were easier).

Carp should do this and add it to the traceback, as it really is
useful information.

Here's how to generate the missing phrase​:

$msg = join( '', @​_ ) . " at $script line $n";
if( defined $. ) {
local $@​ = '';
eval {
die;
};
if($@​ =~ /^Died at .*(, &lt;.*?&gt; line \d+).$/ ) {
$msg .= $1;
}
}
$msg .= ".\n";

(add traceback)
[snip]

Would it be possible for you to prepare this as a patch to
dist/Carp/lib/Carp.pm?

Thank you very much.
Jim Keenan

It is also not quite equivalent to die because it is missing the
trailing period​:

$ perl -E'die foo'
foo at -e line 1.
$ perl -E'use Carp; croak foo'
foo at -e line 1

I wonder if fixing this would be worth it or not.

That has bothered me, too. I think that could be considered a bug fix,
if we change it.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2011

From tlhackque@yahoo.com

On Wed Dec 14 15​:45​:10 2011, jkeenan wrote​:

Would it be possible for you to prepare this as a patch to
dist/Carp/lib/Carp.pm?

Thank you very much.
Jim Keenan

Sure, attached. Against v5.12.4. It's as in the original report,
except use mess instead of msg.

Note this also fixed the missing '.'.

# perl -MCarp -e 'my $x=<STDIN>; sub foo { croak "xx";}; foo();'

xx at -e line 1, <STDIN> line 1.
  main​::foo() called at -e line 1

Enjoy.

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2011

From tlhackque@yahoo.com

CarpFH.patch
--- /usr/share/perl5/Carp.pm~	2011-10-05 12:22:51.000000000 -0400
+++ /usr/share/perl5/Carp.pm	2011-12-14 20:53:52.082610049 -0500
@@ -199,11 +199,21 @@
     my $tid = threads->tid;
     $tid_msg = " thread $tid" if $tid;
   }
 
   my %i = caller_info($i);
-  $mess = "$err at $i{file} line $i{line}$tid_msg\n";
+  $mess = "$err at $i{file} line $i{line}$tid_msg";
+  if( defined $. ) {
+      local $@ = '';
+      eval {
+          die;
+      };
+      if($@ =~ /^Died at .*(, <.*?> line \d+).$/ ) {
+          $mess .= $1;
+      }
+  }
+  $mess .= ".\n";
 
   while (my %i = caller_info(++$i)) {
       $mess .= "\t$i{sub_name} called at $i{file} line $i{line}$tid_msg\n";
   }
   

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2011

From @davidnicol

$ perl -E'die foo'

foo at -e line 1.
$ perl -E'use Carp; croak foo'
foo at -e line 1

Conversely, "die" retains an unnecessary period which should be removed for
cleaner style, as was done in the later Carp​::croak. I motion we resolve
to take no action whatsoever on this terminal punctuation issue.

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2011

From @cpansprout

On Wed Dec 14 21​:11​:34 2011, davidnicol@​gmail.com wrote​:

$ perl -E'die foo'

foo at -e line 1.
$ perl -E'use Carp; croak foo'
foo at -e line 1

Conversely, "die" retains an unnecessary period which should be
removed for
cleaner style, as was done in the later Carp​::croak. I motion we resolve
to take no action whatsoever on this terminal punctuation issue.

I notice you are not following the cleaner style in your prose

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2011

From @davidnicol

On Wed, Dec 14, 2011 at 11​:32 PM, Father Chrysostomos via RT <
perlbug-followup@​perl.org> wrote​:

I notice you are not following the cleaner style in your prose

Am I the only one who has ever imagined that dot to be there to separate
the line number from a column position after it that the original
implementor never got around to putting there? Once I would have been sure
I wasn't.

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2011

From tlhackque@yahoo.com

I reported that croak doesn't replace die as documented.

die has the dot; the patch that I provided actually matches on that dot
when extracting the FH information from a die. So there's an immediate
existence proof that someone counts on the dot in die.

As for croak; I really don't care about this nit. I put the dot into
the patch to better match the output of die. If you want to block a
functional issue over this nit, feel free to remove the dot from the
patch.

Personally, I think the energy spent arguing over this could be better
spent on the "extra credit" issue in the original report​: there ought
to be a more robust way to get the FH info than eval'ing a die and
matching the output...

I'll leave resolution of this in your hands; I've done what I can, and
have no more energy for dot-quibbling...

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2011

From [Unknown Contact. See original ticket]

I reported that croak doesn't replace die as documented.

die has the dot; the patch that I provided actually matches on that dot
when extracting the FH information from a die. So there's an immediate
existence proof that someone counts on the dot in die.

As for croak; I really don't care about this nit. I put the dot into
the patch to better match the output of die. If you want to block a
functional issue over this nit, feel free to remove the dot from the
patch.

Personally, I think the energy spent arguing over this could be better
spent on the "extra credit" issue in the original report​: there ought
to be a more robust way to get the FH info than eval'ing a die and
matching the output...

I'll leave resolution of this in your hands; I've done what I can, and
have no more energy for dot-quibbling...

@p5pRT
Copy link
Author

p5pRT commented Dec 16, 2011

From @cpansprout

On Thu Dec 15 01​:20​:45 2011, tlhackque wrote​:

Personally, I think the energy spent arguing over this could be better
spent on the "extra credit" issue in the original report​: there ought
to be a more robust way to get the FH info than eval'ing a die and
matching the output...

Believe me, I’ve been thinking about that. $$. can’t work, as that
would require changing Perl’s syntax. Maybe an ${^ALL_CAPS} global
variable? ${^LAST_IO}? It’s actually a typeglob, though, not an IO
thingy. (It is a typeglob internally, bugs aside. select() returns a
typeglob. You need a typeglob to get a name out of it.) Maybe
${^LAST_HANDLE}?

I'll leave resolution of this in your hands; I've done what I can, and
have no more energy for dot-quibbling...

I’ve applied your patch without the dot in the interest of getting this
useful change through. The commit id is 89988fb.

Thank you.

I still think we should add the dot, but shouldn’t it apply to every
line of a backtrace?

BTW, we have a policy that every contributor be listed in the AUTHORS
file. We even have tests that fail if we don’t follow it. I have added
you as tlhackque, but if you would like your real name instead (or is
that your real name?), please let me know what it is.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 16, 2011

From tlhackque@yahoo.com

On Thu Dec 15 22​:20​:38 2011, sprout wrote​:

On Thu Dec 15 01​:20​:45 2011, tlhackque wrote​:

Personally, I think the energy spent arguing over this could be
better
spent on the "extra credit" issue in the original report​: there
ought
to be a more robust way to get the FH info than eval'ing a die and
matching the output...

Believe me, I’ve been thinking about that. $$. can’t work, as that
would require changing Perl’s syntax. Maybe an ${^ALL_CAPS} global
variable? ${^LAST_IO}? It’s actually a typeglob, though, not an IO
thingy. (It is a typeglob internally, bugs aside. select() returns a
typeglob. You need a typeglob to get a name out of it.) Maybe
${^LAST_HANDLE}?

I guess it's a question of how much magic to apply. The thought behind
$$. was that dereferencing a numeric quantity is undefined and $. is
already magical. And it's short. But I'm not a Perl internals expert.

${^LAST_FH} or some such would work as a name. For efficiency, perhaps
it should be a dualvar returning the FH name as a string and the FH's
$. in numeric context. Or it could return a list ('MYFH', 161)...
Otherwise basic usage would require a pair of select()s and a $.
reference.

Obviously, once one has the name, it can be symbolically de-referenced
to get the handle if desired. (E.g. one might want to seek backward a
bit and display a few lines of context around an error. Not in Carp,
but in an application-specific handler.)

Documentation for such a variable should probably note that the last FH
is not linked to the select()ed FH - that is, last FH is the last FH on
which IO happened, but it may have been deselected since.

I'll leave resolution of this in your hands; I've done what I can,
and
have no more energy for dot-quibbling...

I’ve applied your patch without the dot in the interest of getting
this
useful change through. The commit id is 89988fb.

Thank you.
Thanks for accepting the patch.

I still think we should add the dot, but shouldn’t it apply to every
line of a backtrace?

I guess that would be more consistent, but slightly riskier. I felt
free to add it where I did because I was already changing the output to
add the missing FH info. Adding it to the backtrace lines could be
seen as a gratuitous change that might break any parser (crash
analyzer) that didn't expect it. If this were day one, I'd put the dot
on every line. Since it wasn't, I wanted to minimize risk and
controversy and maximize the chance the patch would be accepted.

BTW, we have a policy that every contributor be listed in the AUTHORS
file. We even have tests that fail if we don’t follow it. I have
added
you as tlhackque, but if you would like your real name instead (or is
that your real name?), please let me know what it is.

I'm OK either way. My real time is Timothe Litt, and my stable e-mail
address is litt _at_ acm _dot_ org.

@p5pRT
Copy link
Author

p5pRT commented Dec 16, 2011

From @cpansprout

On Fri Dec 16 03​:06​:17 2011, tlhackque wrote​:

I guess it's a question of how much magic to apply. The thought behind
$$. was that dereferencing a numeric quantity is undefined and $. is
already magical. And it's short. But I'm not a Perl internals expert.

Because there is a variable named $$, punctuation variables need braces
for dereferencing (${$.} or $${.}). Otherwise code like
"tmp-".$$.rand() would not work.

$. is magical, but it works like a tied variable (as do most [all?]
magical variables)​: it has internal fetch/store routines. The value
returned by the internal fetch routine is a plain number. So, for
consistency’s sake, dereferencing has to be just like ${"2"}, unless we
make $. return an overloaded object. But there is no precedent for
anything like that elsewhere in the core, and it is good to keep things
at least somewhat consistent.

(I know I’m being verbose, but I thought you might want to see my
reasoning.)

${^LAST_FH} or some such would work as a name.

I like that name.

For efficiency, perhaps
it should be a dualvar returning the FH name as a string and the FH's
$. in numeric context. Or it could return a list ('MYFH', 161)...
Otherwise basic usage would require a pair of select()s and a $.
reference.

I don’t understand that last sentence.

If ${^LAST_FH} is a typeglob (not a problem, BTW, since typeglobs are
actually a special kind of scalar), you can already get the name by
stringifying it and removing the initial *. You don’t need to access
${^LAST_FH} for the line number, since you already have ‘$.’.

Obviously, once one has the name, it can be symbolically de-referenced
to get the handle if desired.

Not necessarily. open my $fh... where $fh is undef creates a new
typeglob not referenced by its name. It is called main​::$fh or
something similar. But *{'main​::$fh'}{IO} doesn’t point to it.

Documentation for such a variable should probably note that the last FH
is not linked to the select()ed FH - that is, last FH is the last FH on
which IO happened, but it may have been deselected since.

Certainly.

I'll leave resolution of this in your hands; I've done what I can,
and
have no more energy for dot-quibbling...

I’ve applied your patch without the dot in the interest of getting
this
useful change through. The commit id is 89988fb.

Thank you.
Thanks for accepting the patch.

I still think we should add the dot, but shouldn’t it apply to every
line of a backtrace?

I guess that would be more consistent, but slightly riskier. I felt
free to add it where I did because I was already changing the output to
add the missing FH info. Adding it to the backtrace lines could be
seen as a gratuitous change that might break any parser (crash
analyzer) that didn't expect it. If this were day one, I'd put the dot
on every line. Since it wasn't, I wanted to minimize risk and
controversy and maximize the chance the patch would be accepted.

Speaking of controversy, the deadline for controversial changes before
5.16 is next Tuesday. I would still like to see the dot, but I think it
will have to wait till after 5.16.

BTW, we have a policy that every contributor be listed in the AUTHORS
file. We even have tests that fail if we don’t follow it. I have
added
you as tlhackque, but if you would like your real name instead (or is
that your real name?), please let me know what it is.

I'm OK either way. My real time is Timothe Litt,

Added with commit 5aaeb4d.

and my stable e-mail
address is litt _at_ acm _dot_ org.

Considering your use of underscores, I did not change the e-mail address
in AUTHORS, which could be abused by spammers. But if you want me to,
just say so.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 16, 2011

From tlhackque@yahoo.com

On Fri Dec 16 08​:39​:20 2011, sprout wrote​:

On Fri Dec 16 03​:06​:17 2011, tlhackque wrote​:

I guess it's a question of how much magic to apply. The thought
behind
$$. was that dereferencing a numeric quantity is undefined and $.
is
already magical. And it's short. But I'm not a Perl internals
expert.

Because there is a variable named $$, punctuation variables need
braces
for dereferencing (${$.} or $${.}). Otherwise code like
"tmp-".$$.rand() would not work.

$. is magical, but it works like a tied variable (as do most [all?]
magical variables)​: it has internal fetch/store routines. The value
returned by the internal fetch routine is a plain number. So, for
consistency’s sake, dereferencing has to be just like ${"2"}, unless
we
make $. return an overloaded object. But there is no precedent for
anything like that elsewhere in the core, and it is good to keep
things
at least somewhat consistent.

(I know I’m being verbose, but I thought you might want to see my
reasoning.)

Fair enough. You're doing the work, your reasoning wins.

${^LAST_FH} or some such would work as a name.

I like that name.

For efficiency, perhaps
it should be a dualvar returning the FH name as a string and the
FH's
$. in numeric context. Or it could return a list ('MYFH', 161)...
Otherwise basic usage would require a pair of select()s and a $.
reference.

I don’t understand that last sentence.

Sorry, I'll expand. Since (as noted later), the LAST_FH can't be
assumed to be select()ed, and $. returns the record/line number of the
select()ed FH, the code in Carp would have to be​:
$LAST_FH = *{${^LAST_FH}}; $curFH = select( LAST_FH ); $errorLine = $.;
select( $curFH ); (Yes, this can be abbreviated, but it's still 2
select calls.) Then to construct the error line it would be (roughly)
$msg .= "&lt;${^LAST_FH}>, line $errorLine".

The alternative would be to make @​{^LAST_FH} return something like (
name, line, handle). Then usage would be (again, roughly) $msg .=
sprintf( "<%s>, line %u", @​{^LAST_FH} ); This seems simpler. (The 3rd
element would make the actual FH accessible for extended error
reporting.) This approach is consistent with the way caller() returns
a bunch of internal state. Advanced users who hate to type would
probably use my( $name, $line, $fh ) = @​{^LAST_FH}; for convenience...

If ${^LAST_FH} is a typeglob (not a problem, BTW, since typeglobs are
actually a special kind of scalar), you can already get the name by
stringifying it and removing the initial *. You don’t need to access
${^LAST_FH} for the line number, since you already have ‘$.’.

Obviously, once one has the name, it can be symbolically de-
referenced
to get the handle if desired.

Not necessarily. open my $fh... where $fh is undef creates a new
typeglob not referenced by its name. It is called main​::$fh or
something similar. But *{'main​::$fh'}{IO} doesn’t point to it.

Documentation for such a variable should probably note that the
last FH
is not linked to the select()ed FH - that is, last FH is the last
FH on
which IO happened, but it may have been deselected since.

Certainly.

I'll leave resolution of this in your hands; I've done what I
can,
and
have no more energy for dot-quibbling...

I’ve applied your patch without the dot in the interest of
getting
this
useful change through. The commit id is 89988fb.

Thank you.
Thanks for accepting the patch.

I still think we should add the dot, but shouldn’t it apply to
every
line of a backtrace?

I guess that would be more consistent, but slightly riskier. I
felt
free to add it where I did because I was already changing the
output to
add the missing FH info. Adding it to the backtrace lines could be
seen as a gratuitous change that might break any parser (crash
analyzer) that didn't expect it. If this were day one, I'd put the
dot
on every line. Since it wasn't, I wanted to minimize risk and
controversy and maximize the chance the patch would be accepted.

Speaking of controversy, the deadline for controversial changes before
5.16 is next Tuesday. I would still like to see the dot, but I think
it
will have to wait till after 5.16.

BTW, we have a policy that every contributor be listed in the
AUTHORS
file. We even have tests that fail if we don’t follow it. I
have
added
you as tlhackque, but if you would like your real name instead
(or is
that your real name?), please let me know what it is.

I'm OK either way. My real time is Timothe Litt,

Added with commit 5aaeb4d.

and my stable e-mail
address is litt _at_ acm _dot_ org.

Considering your use of underscores, I did not change the e-mail
address
in AUTHORS, which could be abused by spammers. But if you want me to,
just say so.

@p5pRT
Copy link
Author

p5pRT commented Dec 16, 2011

From tlhackque@yahoo.com

Sorry, my reply escaped before I was done.

On Fri Dec 16 09​:12​:46 2011, tlhackque wrote​:

On Fri Dec 16 08​:39​:20 2011, sprout wrote​:

On Fri Dec 16 03​:06​:17 2011, tlhackque wrote​:

I guess it's a question of how much magic to apply. The thought
behind
$$. was that dereferencing a numeric quantity is undefined and $.
is
already magical. And it's short. But I'm not a Perl internals
expert.

Because there is a variable named $$, punctuation variables need
braces
for dereferencing (${$.} or $${.}). Otherwise code like
"tmp-".$$.rand() would not work.

$. is magical, but it works like a tied variable (as do most [all?]
magical variables)​: it has internal fetch/store routines. The value
returned by the internal fetch routine is a plain number. So, for
consistency’s sake, dereferencing has to be just like ${"2"},
unless
we
make $. return an overloaded object. But there is no precedent for
anything like that elsewhere in the core, and it is good to keep
things
at least somewhat consistent.

(I know I’m being verbose, but I thought you might want to see my
reasoning.)

Fair enough. You're doing the work, your reasoning wins.

${^LAST_FH} or some such would work as a name.

I like that name.

For efficiency, perhaps
it should be a dualvar returning the FH name as a string and the
FH's
$. in numeric context. Or it could return a list ('MYFH',
161)...
Otherwise basic usage would require a pair of select()s and a $.
reference.

I don’t understand that last sentence.

Sorry, I'll expand. Since (as noted later), the LAST_FH can't be
assumed to be select()ed, and $. returns the record/line number of the
select()ed FH, the code in Carp would have to be​:
$LAST_FH = *{${^LAST_FH}}; $curFH = select( LAST_FH ); $errorLine = $.;
select( $curFH ); (Yes, this can be abbreviated, but it's still 2
select calls.) Then to construct the error line it would be (roughly)
$msg .= "&lt;${^LAST_FH}>, line $errorLine".

The alternative would be to make @​{^LAST_FH} return something like (
name, line, handle). Then usage would be (again, roughly) $msg .=
sprintf( "<%s>, line %u", @​{^LAST_FH} ); This seems simpler. (The 3rd
element would make the actual FH accessible for extended error
reporting.) This approach is consistent with the way caller() returns
a bunch of internal state. Advanced users who hate to type would
probably use my( $name, $line, $fh ) = @​{^LAST_FH}; for convenience...

If ${^LAST_FH} is a typeglob (not a problem, BTW, since typeglobs
are
actually a special kind of scalar), you can already get the name by
stringifying it and removing the initial *. You don’t need to
access
${^LAST_FH} for the line number, since you already have ‘$.’.

As noted above, you only have $. if the LAST_FH is select()ed.
Otherwise you need to select() to get it and select() again to restore
the previous selection.

Obviously, once one has the name, it can be symbolically de-
referenced
to get the handle if desired.

Not necessarily. open my $fh... where $fh is undef creates a new
typeglob not referenced by its name. It is called main​::$fh or
something similar. But *{'main​::$fh'}{IO} doesn’t point to it.

Returning the FH as the third element of a list would handle all these.

Documentation for such a variable should probably note that the
last FH
is not linked to the select()ed FH - that is, last FH is the last
FH on
which IO happened, but it may have been deselected since.

Certainly.

I'll leave resolution of this in your hands; I've done what I
can,
and
have no more energy for dot-quibbling...

I’ve applied your patch without the dot in the interest of
getting
this
useful change through. The commit id is 89988fb.

Thank you.
Thanks for accepting the patch.

I still think we should add the dot, but shouldn’t it apply to
every
line of a backtrace?

I guess that would be more consistent, but slightly riskier. I
felt
free to add it where I did because I was already changing the
output to
add the missing FH info. Adding it to the backtrace lines could
be
seen as a gratuitous change that might break any parser (crash
analyzer) that didn't expect it. If this were day one, I'd put
the
dot
on every line. Since it wasn't, I wanted to minimize risk and
controversy and maximize the chance the patch would be accepted.

Speaking of controversy, the deadline for controversial changes
before
5.16 is next Tuesday. I would still like to see the dot, but I
think
it
will have to wait till after 5.16.

OK, that's in your hands. I wasn't aware of the schedule. Maybe it
should be a separate ticket, since out main concern is getting the
missing information included. The dot is really cosmetic consistency
and less critical.

BTW, we have a policy that every contributor be listed in the
AUTHORS
file. We even have tests that fail if we don’t follow it. I
have
added
you as tlhackque, but if you would like your real name instead
(or is
that your real name?), please let me know what it is.

I'm OK either way. My real time is Timothe Litt,

Added with commit 5aaeb4d.

and my stable e-mail
address is litt _at_ acm _dot_ org.

Considering your use of underscores, I did not change the e-mail
address
in AUTHORS, which could be abused by spammers. But if you want me
to,
just say so.

I was more worried about e-mail harvesting on this (rt) site than the
AUTHORS file. The acm address is associated with my real name;
tlhackque stands alone (for environments where the probability of abuse
is higher.)

I appreciate all your effort on this.

@p5pRT
Copy link
Author

p5pRT commented Dec 16, 2011

From @doy

On Fri, Dec 16, 2011 at 09​:26​:01AM -0800, tlhackque via RT wrote​:

Sorry, I'll expand. Since (as noted later), the LAST_FH can't be
assumed to be select()ed, and $. returns the record/line number of the
select()ed FH, the code in Carp would have to be​:
$LAST_FH = *{${^LAST_FH}}; $curFH = select( LAST_FH ); $errorLine = $.;
select( $curFH ); (Yes, this can be abbreviated, but it's still 2
select calls.) Then to construct the error line it would be (roughly)
$msg .= "&lt;${^LAST_FH}>, line $errorLine".

Alternatively, ${^LAST_FH}->input_line_number.

-doy

@p5pRT
Copy link
Author

p5pRT commented Dec 16, 2011

From tlhackque@yahoo.com

One more thought.

Returning an array would seem to allow using @​. as the variable name.

The use english version could be @​{^LAST_FH}.

@p5pRT
Copy link
Author

p5pRT commented Dec 16, 2011

From [Unknown Contact. See original ticket]

One more thought.

Returning an array would seem to allow using @​. as the variable name.

The use english version could be @​{^LAST_FH}.

@p5pRT
Copy link
Author

p5pRT commented Dec 17, 2011

From @cpansprout

On Fri Dec 16 09​:26​:01 2011, tlhackque wrote​:

Sorry, my reply escaped before I was done.

On Fri Dec 16 09​:12​:46 2011, tlhackque wrote​:

On Fri Dec 16 08​:39​:20 2011, sprout wrote​:

On Fri Dec 16 03​:06​:17 2011, tlhackque wrote​:

I guess it's a question of how much magic to apply. The thought
behind
$$. was that dereferencing a numeric quantity is undefined and $.
is
already magical. And it's short. But I'm not a Perl internals
expert.

Because there is a variable named $$, punctuation variables need
braces
for dereferencing (${$.} or $${.}). Otherwise code like
"tmp-".$$.rand() would not work.

$. is magical, but it works like a tied variable (as do most [all?]
magical variables)​: it has internal fetch/store routines. The value
returned by the internal fetch routine is a plain number. So, for
consistency’s sake, dereferencing has to be just like ${"2"},
unless
we
make $. return an overloaded object. But there is no precedent for
anything like that elsewhere in the core, and it is good to keep
things
at least somewhat consistent.

(I know I’m being verbose, but I thought you might want to see my
reasoning.)

Fair enough. You're doing the work, your reasoning wins.

${^LAST_FH} or some such would work as a name.

I like that name.

For efficiency, perhaps
it should be a dualvar returning the FH name as a string and the
FH's
$. in numeric context. Or it could return a list ('MYFH',
161)...
Otherwise basic usage would require a pair of select()s and a $.
reference.

I don’t understand that last sentence.

Sorry, I'll expand. Since (as noted later), the LAST_FH can't be
assumed to be select()ed, and $. returns the record/line number of the
select()ed FH,

I’ll admit I had to check, but $. is bound to the last-read handle,
according to perlvar and mg.c (the latter being what actually counts).

the code in Carp would have to be​:
$LAST_FH = *{${^LAST_FH}}; $curFH = select( LAST_FH ); $errorLine = $.;
select( $curFH ); (Yes, this can be abbreviated, but it's still 2
select calls.) Then to construct the error line it would be (roughly)
$msg .= "&lt;${^LAST_FH}>, line $errorLine".

The alternative would be to make @​{^LAST_FH} return something like (
name, line, handle). Then usage would be (again, roughly) $msg .=
sprintf( "<%s>, line %u", @​{^LAST_FH} ); This seems simpler.

I’m afraid I have to disagree with you there. Having a special array
with different types of things at different indices seems more
complicated, to my mind.

I was more worried about e-mail harvesting on this (rt) site than the
AUTHORS file. The acm address is associated with my real name;
tlhackque stands alone (for environments where the probability of abuse
is higher.)

I appreciate all your effort on this.

OK. I’ve changed the e-mail address with commit 9fcef2a.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 17, 2011

From @ap

So I’m seeing a whole rigmarole for designing a special magical variable
here​: should it have a short or long punctuation name? Piggyback on one
or be a new one? Should it simply be a dualvar? Or return an array in
some magic way even though it is a scalar? (Why not an array punctuation
variable then?) Does it need to produce the line number or is $. enough?
Maybe it should yield a reference?

Then I see this​:

* tlhackque via RT <perlbug-followup@​perl.org> [2011-12-16 18​:45]​:

This approach is consistent with the way caller() returns a bunch of
internal state.

And my first thought is​: so why not simply make a `lastread` function
and sidestep all that complexity of trying to make a variable do this
job? (And​: haven’t we got enough punctuation variables anyway?)

It could be a weak keyword (preferably) or a feature (and for once
a case of something that would fit into feature.pm without making
semantic trouble).

Regards,
--
Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT
Copy link
Author

p5pRT commented Dec 17, 2011

From @cpansprout

On Fri Dec 16 18​:13​:18 2011, aristotle wrote​:

And my first thought is​: so why not simply make a `lastread` function
and sidestep all that complexity of trying to make a variable do this
job? (And​: haven’t we got enough punctuation variables anyway?)

It could be a weak keyword (preferably) or a feature (and for once
a case of something that would fit into feature.pm without making
semantic trouble).

Would it be an lvalue keyword, like pos?

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 17, 2011

From @davidnicol

On Fri, Dec 16, 2011 at 11​:40 PM, Father Chrysostomos via RT <
perlbug-followup@​perl.org> wrote​:

Would it be an lvalue keyword, like pos?

What possible circumstance could require changing the identity of the
last-read-from file handle?

On the other hand, if you could localize it, you could hide it and restore
it, which is in keeping with the spirit of Carp, giving methods to return
an error from the perspective of the code that called the library. Could it
be tacked on to the end of caller's information, so after Carp has figured
out where it is supposed to report the error happened, it could report what
the last read file was at that point? that would require another
store/delete at every subroutine call and is therefore a non-starter. Carp
could install a hint saying to save the info, which would go in
caller()[11] which would be a dynamic hash that is usually empty, so other
things that need to go there can have names instead of having to be there
always.

@p5pRT
Copy link
Author

p5pRT commented Dec 17, 2011

From @ap

* Father Chrysostomos via RT <perlbug-followup@​perl.org> [2011-12-17 06​:45]​:

Would it be an lvalue keyword, like pos?

Definite maybe. And it can always be made lvalue later if someone finds
a need for it, so I’d just shelf that question for now.

* David Nicol <davidnicol@​gmail.com> [2011-12-17 08​:20]​:

Could it be tacked on to the end of caller's information, so after
Carp has figured out where it is supposed to report the error
happened, it could report what the last read file was at that point?

It sits very oddly in `caller`’s return.

Do you get different values depending on whether you ask for `caller 0`
or `caller 1`? If yes, what for, who needs that? If not – and I don’t
think anyone needs it to –, then this is mistaken unification.

I don’t think that is the right place for it.

Regards,
--
Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT
Copy link
Author

p5pRT commented Dec 17, 2011

From @cpansprout

On Sat Dec 17 03​:16​:09 2011, aristotle wrote​:

* Father Chrysostomos via RT <perlbug-followup@​perl.org> [2011-12-17
06​:45]​:

Would it be an lvalue keyword, like pos?

Definite maybe. And it can always be made lvalue later if someone finds
a need for it, so I’d just shelf that question for now.

I meant that almost as an ironic question. Having a special keyword
that is like a variable would be something new, with no precedent. So I
don’t think we shelve it for now.

Do we want to follow the interface of select(), and have
lastread(lastread) set it? Or do we want to follow the interface of
${^GLOBAL_PHASE}, which could easily allow making it modifiable later,
without too much weirdness.

Another, more troubling question​: I have just discovered that I was
wrong about select(). It returns the name of the handle, if it is
accessible under its name (but I believe there are false positives
sometimes), or a reference to a GV otherwise. Should ${^LAST_FH} or
lastread or whatever do the same?

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 18, 2011

From @cpansprout

Attached is a proof of concept. It still needs tests and docs. It will
probably have to wait till after 5.16.

With it applied​:

$ ./perl -Ilib -le '<foo>; print ${^LAST_FH}; print \${^LAST_FH}; print
${^LAST_FH} =~ s/^\*//r; print ${^LAST_FH}->input_line_number;'
*main​::foo
GLOB(0x823280)
main​::foo
0

So with a single new variable, you can get a real typeglob that you can
select() or whatever. And you can also get the name and current line
number from it.

Who could ask for anything more?

Later we could even allowed it to be assigned to, such that ${^LAST_FH}
= *foo would be more or less equivalent to tell(*foo), but without
touching $!.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 18, 2011

From @cpansprout

Inline Patch
diff --git a/gv.c b/gv.c
index 3a978f2..9bfea53 100644
--- a/gv.c
+++ b/gv.c
@@ -1818,6 +1818,10 @@ Perl_gv_fetchpvn_flags(pTHX_ const char *nambeg, STRLEN full_len, I32 flags,
 		if (strEQ(name2, "LOBAL_PHASE"))
 		    goto ro_magicalize;
 		break;
+	    case '\014':	/* $^LAST_FH */
+		if (strEQ(name2, "AST_FH"))
+		    goto ro_magicalize;
+		break;
             case '\015':        /* $^MATCH */
                 if (strEQ(name2, "ATCH"))
 		    goto magicalize;
diff --git a/mg.c b/mg.c
index c55ca63..b1865b7 100644
--- a/mg.c
+++ b/mg.c
@@ -882,6 +882,15 @@ Perl_magic_get(pTHX_ SV *sv, MAGIC *mg)
     case '\011':		/* ^I */ /* NOT \t in EBCDIC */
 	sv_setpv(sv, PL_inplace); /* Will undefine sv if PL_inplace is NULL */
 	break;
+    case '\014':		/* ^LAST_FH */
+	if (strEQ(remaining, "AST_FH"))
+	    sv_setsv_nomg(
+		sv,
+		PL_last_in_gv && isGV_with_GP(PL_last_in_gv)
+		   ? (SV *)PL_last_in_gv
+		   : NULL
+	    );
+	break;
     case '\017':		/* ^O & ^OPEN */
 	if (nextchar == '\0') {
 	    sv_setpv(sv, PL_osname);

@p5pRT
Copy link
Author

p5pRT commented Dec 19, 2011

From @rjbs

* Father Chrysostomos via RT <perlbug-followup@​perl.org> [2011-12-16T11​:39​:21]

Speaking of controversy, the deadline for controversial changes before
5.16 is next Tuesday. I would still like to see the dot, but I think it
will have to wait till after 5.16.

(I agree on both counts.)

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Sep 15, 2012

From @jkeenan

On Sun Dec 18 00​:20​:57 2011, sprout wrote​:

Attached is a proof of concept. It still needs tests and docs. It will
probably have to wait till after 5.16.

With it applied​:

$ ./perl -Ilib -le '<foo>; print ${^LAST_FH}; print \${^LAST_FH}; print
${^LAST_FH} =~ s/^\*//r; print ${^LAST_FH}->input_line_number;'
*main​::foo
GLOB(0x823280)
main​::foo
0

So with a single new variable, you can get a real typeglob that you can
select() or whatever. And you can also get the name and current line
number from it.

Who could ask for anything more?

Later we could even allowed it to be assigned to, such that ${^LAST_FH}
= *foo would be more or less equivalent to tell(*foo), but without
touching $!.

Are we going to follow up on the approach taken in this patch? Tests?

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Sep 15, 2012

From @cpansprout

On Fri Sep 14 18​:29​:29 2012, jkeenan wrote​:

On Sun Dec 18 00​:20​:57 2011, sprout wrote​:

Attached is a proof of concept. It still needs tests and docs. It will
probably have to wait till after 5.16.

With it applied​:

$ ./perl -Ilib -le '<foo>; print ${^LAST_FH}; print \${^LAST_FH}; print
${^LAST_FH} =~ s/^\*//r; print ${^LAST_FH}->input_line_number;'
*main​::foo
GLOB(0x823280)
main​::foo
0

So with a single new variable, you can get a real typeglob that you can
select() or whatever. And you can also get the name and current line
number from it.

Who could ask for anything more?

Later we could even allowed it to be assigned to, such that ${^LAST_FH}
= *foo would be more or less equivalent to tell(*foo), but without
touching $!.

Are we going to follow up on the approach taken in this patch? Tests?

I had nearly forgotten about this. It was in the back of my mind, but
quite far back. :-)

I found out after writing that patch that I was mistaken in thinking
that methods cannot be called on unblessed globrefs. This works​:

my $globref = \*STDIN;
$globref->input_line_number;

But that only works if *STDIN{IO} is defined. Otherwise it dies. But
the same applies to *STDIN.

So that leads me to ask? What is the best value for ${^LAST_FH}? My
reasoning for making it a typeglob was that method calls would not work
on a reference. But that reasoning is false.

In either case, we are not following existing precedent. select()
returns a string if possible, or a globref otherwise. If we did that,
we would have​:

*${^LAST_FH}; # glob itself, possibly a strict error
\*${^LAST_FH}; # glob reference; possible strict error
ref ${^LAST_FH} ? substr *${^LAST_FH}, 1 : ${^LAST_FH}
  # glob name
${^LAST_FH}->input_line_number
  # line number

That third line is not pretty.

If we always use a reference​:

*${^LAST_FH}; # glob itself
${^LAST_FH}; # reference
substr *${^LAST_FH}, 1; # glob name
${^LAST_FH}->input_line_number # line number

# If ${^LAST_FH} is undef​:
*${^LAST_FH}; # error
${^LAST_FH}; # undef
substr *${^LAST_FH}, 1; # error
${^LAST_FH}->input_line_number # error

If we make ${^LAST_FH} itself a typeglob​:

${^LAST_FH}; # typeglob, but a copy of the
  # original, sharing the same slots
\${^LAST_FH}; # reference to glob copy
substr ${^LAST_FH}, 1; # glob name
${^LAST_FH}->input_line_number # line number

# If ${^LAST_FH} is undef​:
${^LAST_FH}; # undef
\${^LAST_FH}; # \undef
substr ${^LAST_FH}, 1; # empty string; warnings
${^LAST_FH}->input_line_number # error

Despite the errors with undef, I think I now prefer a reference.

Any thoughts?

I can probably cook up a proper patch once I have finished fixing the
most recent CPAN breakage and rewritten smart match and fixed and merged
lexical subs and the new experimental warnings mechanism and finished
fixing heredoc parsing (one more bug left) and utf8 length bugs.

Er, when am I going to get to this? :-)

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Sep 18, 2012

From @rjbs

* Father Chrysostomos via RT <perlbug-followup@​perl.org> [2012-09-15T01​:28​:10]

Despite the errors with undef, I think I now prefer a reference.

Same.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Sep 18, 2012

From @cpansprout

On Mon Sep 17 17​:59​:19 2012, perl.p5p@​rjbs.manxome.org wrote​:

* Father Chrysostomos via RT <perlbug-followup@​perl.org> [2012-09-
15T01​:28​:10]

Despite the errors with undef, I think I now prefer a reference.

Same.

I have now applied it as b932b26.
--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Sep 18, 2012

@cpansprout - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant