[PATCH] v5.8.8 pod2html -- Convert RFC links to point to IETF pages #8820

Closed
p5pRT opened this issue Mar 4, 2007 · 10 comments

Comments

@p5pRT

p5pRT commented Mar 4, 2007

Migrated from rt.perl.org#41691 (status was 'resolved')

Searchable as RT41691$

@p5pRT

p5pRT commented Mar 4, 2007

From @jaalto

Created by jaalto@cante.cante.net

The manual pages often contain references to Internet standards,
such as

  See RFC 822 and HTTP protocol RFC 2616

The following patch converts these to point to corresponding official
IETF pages.

  Note that this patch builds on top of my previous patch, titled:
  "[perl #41687] perlbug AutoReply: [PATCH] v5.8.8 pod2html -- Add
  --[no]fragmentuniq to support more readable <a name=..> refs"

=== modified file 'Html.pm'

Inline Patch
--- Html.pm     2007-03-04 00:43:23 +0000
+++ Html.pm     2007-03-04 11:53:59 +0000
@@ -1538,10 +1538,29 @@
 sub process_text {
     return if $Ignore;
     my( $tref ) = @_;
-    my $res = process_text1( 0, $tref );
+    my $res;
+    $res = process_text1( 0, $tref );
     $$tref = $res;
 }

+sub process_text_rfc_links {
+    my $text = shift;
+
+    #  For every "RFCnnnn" or "RFC nnn" link it to the authoritative
+    #  source. Do not use (i) option here. Require RFC to be written in
+    #  in capital letters.
+
+    $text =~ s{
+         (?=^\S)            # positive lookahead, make sure this is "text" paragraph
+         (.*?)              # $1: Grab leading text
+         (?<=[^<>])         # Make sure this is not an URL already
+           (RFC\s*(\d{1,5}))\b # max 5 digits
+    }
+    {$1<a href="http://www.ietf.org/rfc/rfc$3.txt">$2</a>}gx;
+
+    $text;
+}
+
 sub process_text1($$;$$){
     my( $lev, $rstr, $func, $closing ) = @_;
     my $res = '';
@@ -1730,6 +1749,8 @@
        } else {
            warn "$0: $Podfile: undelimited $func<> in paragraph $Paragraph.\n" unless $Quiet;
        }
+
+        $res = process_text_rfc_links($res);
     }
     return $res;
 }
Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl v5.8.8:

Configured by Debian Project at Wed Dec  6 23:17:41 UTC 2006.

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.18.3, archname=i486-linux-gnu-thread-multi
    uname='linux saens 2.6.18.3 #1 smp sat nov 25 13:39:52 est 2006 i686 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i486-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='4.1.2 20061115 (prerelease) (Debian 4.1.1-20)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.6.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8
    gnulibc_version='2.3.6'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    


@INC for perl v5.8.8:
    /home/jaalto/var/lib/code/perl
    /etc/perl
    /usr/local/lib/perl/5.8.8
    /usr/local/share/perl/5.8.8
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.8
    /usr/share/perl/5.8
    /usr/local/lib/site_perl
    /usr/local/lib/perl/5.8.7
    /usr/local/share/perl/5.8.7
    /usr/local/lib/perl/5.8.4
    /usr/local/share/perl/5.8.4
    .


Environment for perl v5.8.8:
    HOME=/home/jaalto
    LANG (unset)
    LANGUAGE (unset)
    LC_ALL=en_US
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/local/bin:/home/jaalto/var/link/bin:/sbin:/bin:/usr/bin:/usr/sbin:/usr/share/bin:/usr/bin/X11:/usr/games
    PERL5LIB=/home/jaalto/var/lib/code/perl
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT

p5pRT commented Mar 4, 2007

From @jaalto

A small improvement to the original patch. The regexp must not match:

  rfc225.txt

The following patch makes the end boundary stricter.

Jari

=== modified file 'Html.pm'

Inline Patch
--- Html.pm     2007-03-04 11:55:55 +0000
+++ Html.pm     2007-03-04 17:47:19 +0000
@@ -1554,7 +1554,7 @@
          (?=^\S)            # positive lookahead, make sure this is "text" paragraph
          (.*?)              # $1: Grab leading text
          (?<=[^<>])         # Make sure this is not an URL already
-           (RFC\s*(\d{1,5}))\b # max 5 digits
+           (RFC\s*(\d{1,5}))(?=\s) # max 5 digits
     }
     {$1<a href="http://www.ietf.org/rfc/rfc$3.txt">$2</a>}gx;
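
To make the boundary change concrete, here is a small sketch that compares the two end assertions on their own. The helper `rfc_matches` and the sample strings are hypothetical and not part of Html.pm. Because the pattern requires capital letters, the sample uses an upper-case `RFC225.txt` to stand in for the file-name case.

```perl
use strict;
use warnings;

# Hypothetical helper (not in Html.pm): returns 1 if the RFC pattern,
# followed by the given end-boundary assertion, matches anywhere in $text.
sub rfc_matches {
    my ($text, $boundary) = @_;
    return $text =~ /RFC\s*\d{1,5}$boundary/ ? 1 : 0;
}

my $word_boundary = qr/\b/;       # end boundary used by the first patch
my $space_ahead   = qr/(?=\s)/;   # stricter end boundary used by this patch

for my $sample ('see RFC225.txt for details', 'see RFC 822 and others') {
    printf "%-28s word-boundary=%d space-ahead=%d\n", $sample,
        rfc_matches($sample, $word_boundary),
        rfc_matches($sample, $space_ahead);
}
```

The word boundary is satisfied between the last digit and the following dot, so the file name matches. Requiring whitespace after the number rejects it, at the cost of also rejecting a number followed by punctuation.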

@p5pRT

p5pRT commented Mar 4, 2007

@jaalto - Status changed from 'new' to 'open'

@p5pRT

p5pRT commented Mar 15, 2007

From @jaalto

Rsynced to the latest Html.pm as of today, 2007-03-14. Here is a patch
that should apply cleanly.

=== modified file 'Html.pm'

Inline Patch
--- Html.pm     2007-03-14 18:50:47 +0000
+++ Html.pm     2007-03-14 18:53:22 +0000
@@ -1505,6 +1505,24 @@
     $$tref = $res;
 }

+sub process_text_rfc_links {
+    my $text = shift;
+
+    #  For every "RFCnnnn" or "RFC nnn" link it to the authoritative
+    #  source. Do not use (i) option here. Require RFC to be written in
+    #  in capital letters.
+
+    $text =~ s{
+         (?=^\S)            # positive lookahead, make sure this is "text" paragraph
+         (.*?)              # $1: Grab leading text
+         (?<=[^<>])         # Make sure this is not an URL already
+           (RFC\s*(\d{1,5}))(?=\s) # max 5 digits
+    }
+    {$1<a href="http://www.ietf.org/rfc/rfc$3.txt">$2</a>}gx;
+
+    $text;
+}
+
 sub process_text1($$;$$){
     my( $lev, $rstr, $func, $closing ) = @_;
     my $res = '';
@@ -1693,6 +1711,7 @@
        } else {
            warn "$0: $Podfile: undelimited $func<> in paragraph $Paragraph.\n" unless $Quiet;
        }
+        $res = process_text_rfc_links($res);
     }
     return $res;
 }

@p5pRT

p5pRT commented Mar 16, 2007

@rgs - Status changed from 'open' to 'resolved'

@p5pRT p5pRT closed this as completed Mar 16, 2007
@p5pRT

p5pRT commented Mar 17, 2007

From @rgarcia

On 14/03/07, Jari Aalto <jari.aalto@cante.net> wrote:

Rsynced to the latest Html.pm as of today 2007-03-14. Here is patch
that should apply cleanly.

Thanks, applied as #30604 to bleadperl.

@p5pRT

p5pRT commented Mar 17, 2007

From @tamias

On Fri, Mar 16, 2007 at 04:18:40PM +0100, Rafael Garcia-Suarez wrote:

On 14/03/07, Jari Aalto <jari.aalto@cante.net> wrote:

Rsynced to the latest Html.pm as of today 2007-03-14. Here is patch
that should apply cleanly.

Thanks, applied as #30604 to bleadperl.

I should have sent this feedback sooner, sorry...

On Wed, Mar 14, 2007 at 09:56:14PM +0300, Jari Aalto wrote:

Rsynced to the latest Html.pm as of today 2007-03-14. Here is patch
that should apply cleanly.

=== modified file 'Html.pm'
--- Html.pm 2007-03-14 18:50:47 +0000
+++ Html.pm 2007-03-14 18:53:22 +0000

Patches are easier to apply if you use the full path from the base
directory. (lib/Pod/Html.pm in this case)

@@ -1505,6 +1505,24 @@
$$tref = $res;
}

+sub process_text_rfc_links {
+ my $text = shift;
+
+ # For every "RFCnnnn" or "RFC nnn" link it to the authoritative
+ # source. Do not use (i) option here. Require RFC to be written in
+ # in capital letters.

I wasn't sure what you meant by "(i) option" at first; I would call it the
"/i modifier".

+
+ $text =~ s{
+ (?=^\S) # positive lookahead, make sure this is "text" paragraph
+ (.*?) # $1: Grab leading text
+ (?<=[^<>]) # Make sure this is not an URL already
+ (RFC\s*(\d{1,5}))(?=\s) # max 5 digits
+ }
+ {$1<a href="http://www.ietf.org/rfc/rfc$3.txt">$2</a>}gx;

This regex will only match the first RFC reference in the first line of the
text. Further references on the first line, or any references on following
lines, will not be matched.

I don't think the positive lookahead to make sure it's a "text" paragraph
is even necessary; process_text() isn't called on verbatim paragraphs
anyway.

Also, why do you require whitespace after the number? This will fail to
match something like "Refer to RFC 1234."

Try something like this instead:

$text =~ s{
  (?<=[^<>]) # Make sure it's not already a URL
  (RFC\s*(\d{1,5}))(?!\d) # max 5 digits
  }
  {<a href="http://www.ietf.org/rfc/rfc$2.txt">$1</a>}gx;

+
+ $text;

return $text;

Ronald
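
The points raised in this review can be checked with a short side-by-side sketch. The two regexes are taken from the thread; the wrapper subs `link_rfcs_original` and `link_rfcs_suggested` and the sample sentences are hypothetical names and data introduced only for illustration.

```perl
use strict;
use warnings;

# Regex from the patch as first applied: anchored at the start of the text
# and requiring whitespace after the number.
sub link_rfcs_original {
    my $text = shift;
    $text =~ s{
         (?=^\S)
         (.*?)
         (?<=[^<>])
           (RFC\s*(\d{1,5}))(?=\s)
    }
    {$1<a href="http://www.ietf.org/rfc/rfc$3.txt">$2</a>}gx;
    return $text;
}

# Regex suggested in the review: no start anchor, and the number only has
# to be followed by a non-digit (or the end of the text).
sub link_rfcs_suggested {
    my $text = shift;
    $text =~ s{
        (?<=[^<>])
        (RFC\s*(\d{1,5}))(?!\d)
    }
    {<a href="http://www.ietf.org/rfc/rfc$2.txt">$1</a>}gx;
    return $text;
}

for my $sample ("See RFC 822 and RFC 2616 for details.",
                "Refer to RFC 1234.",
                "See RFC 822\nand RFC 2616 too.") {
    print "original:  ", link_rfcs_original($sample),  "\n";
    print "suggested: ", link_rfcs_suggested($sample), "\n\n";
}
```

Since `^` can only match at the start of the text without `/m`, every match of the original regex must begin there, so `/g` never finds a second reference. The suggested form has no such anchor and links each reference independently.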

@p5pRT

p5pRT commented Mar 17, 2007

From @jaalto

rjk-perl-p5p@tamias.net (Ronald J Kimball) writes:

=== modified file 'Html.pm'
--- Html.pm 2007-03-14 18:50:47 +0000
+++ Html.pm 2007-03-14 18:53:22 +0000

Patches are easier to apply if you use the full path from the base
directory. (lib/Pod/Html.pm in this case)

I'm working in a local version control directory that does not
contain all of the Perl sources. One way to apply the patch is:

  (cd lib/Pod/ && patch -p0 < /path/to/that.patch)

This regex will only match the first RFC reference in the first line of the
text. Further references on the first line, or any references on following
lines, will not be matched.

I don't think the positive lookahead to make sure it's a "text" paragraph
is even necessary; process_text() isn't called on verbatim paragraphs
anyway.

Also, why do you require whitespace after the number? This will fail to
match something like "Refer to RFC 1234."

Try something like this instead:

$text =~ s{
(?<=[^<>]) # Make sure it's not already a URL
(RFC\s*(\d{1,5}))(?!\d) # max 5 digits
}
{<a href="http://www.ietf.org/rfc/rfc$2.txt">$1</a>}gx;

Thanks, here is an updated patch (rsync'd against the latest bleadperl, 2007-03-17)

Jari

=== modified file 'Html.pm'

Inline Patch
--- Html.pm     2007-03-17 07:44:10 +0000
+++ Html.pm     2007-03-17 08:07:05 +0000
@@ -1509,16 +1509,14 @@
     my $text = shift;

     #  For every "RFCnnnn" or "RFC nnn" link it to the authoritative
-    #  source. Do not use (i) option here. Require RFC to be written in
+    #  source. Do not use /i modifier here. Require RFC to be written in
     #  in capital letters.

     $text =~ s{
-         (?=^\S)            # positive lookahead, make sure this is "text" paragraph
-         (.*?)              # Grab leading text
-         (?<=[^<>])         # Make sure this is not an URL already
-           (RFC\s*(\d{1,5}))(?=\s) # max 5 digits
+         (?<=[^<>[:alpha:]])         # Make sure this is not an URL already
+         (RFC\s*(\d{1,5}))(?!\d) # max 5 digits
     }
-    {$1<a href="http://www.ietf.org/rfc/rfc$3.txt">$2</a>}gx;
+    {<a href="http://www.ietf.org/rfc/rfc$2.txt" class="rfc">$1</a>}gx;

     $text;
 }
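
As a rough standalone sketch of the simplified substitution: with the leading `(.*?)` group gone, the whole reference is capture group 1 and the number is group 2. The sub name `link_rfcs` and the sample text are hypothetical. This is not the code as committed, only an illustration of how the pattern behaves.

```perl
use strict;
use warnings;

# Hypothetical standalone version of the simplified substitution.
sub link_rfcs {
    my $text = shift;
    $text =~ s{
        (?<=[^<>[:alpha:]])      # not directly after a tag bracket or a letter
        (RFC\s*(\d{1,5}))(?!\d)  # group 1: whole reference, group 2: number
    }
    {<a href="http://www.ietf.org/rfc/rfc$2.txt" class="rfc">$1</a>}gx;
    return $text;
}

my $once  = link_rfcs("Mail uses RFC 822; HTTP uses RFC2616.");
my $twice = link_rfcs($once);

print "$once\n";
print "second pass changes nothing\n" if $once eq $twice;

# The lookbehind needs one preceding character, so a reference at the very
# start of the text is left alone.
print link_rfcs("RFC 822 defines the mail format."), "\n";
```

A second pass leaves the text alone because each linked reference now follows a `>`, and the lower-case `rfc` in the URL and class name never matches the case-sensitive pattern. The last line shows a side effect of the lookbehind: it cannot succeed at position 0, so text that begins with an RFC reference stays unlinked.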

@p5pRT

p5pRT commented Mar 17, 2007

From nospam-abuse@bloodgate.com

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Moin,

On Saturday 17 March 2007 01:19:25 Ronald J Kimball wrote:

On Fri, Mar 16, 2007 at 04:18:40PM +0100, Rafael Garcia-Suarez wrote:

On 14/03/07, Jari Aalto <jari.aalto@cante.net> wrote:

Rsynced to the latest Html.pm as of today 2007-03-14. Here is patch
that should apply cleanly.

Thanks, applied as #30604 to bleadperl.

I should have sent this feedback sooner, sorry...

[snip]

+
+ $text;

return $text;

The return is surely superfluous? Or is there any case where it makes a
difference, except slowing things down?

It might be good to be consistent in the code, though.
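
For reference, a minimal sketch of the two styles in question. A Perl sub returns the value of its last evaluated expression, so for a sub that ends with a bare `$text;` both forms hand back the same value. The sub names below are hypothetical.

```perl
use strict;
use warnings;

# Implicit form: the value of the last evaluated expression is returned.
sub implicit_return {
    my $text = shift;
    $text;
}

# Explicit form: same value, with the intent spelled out.
sub explicit_return {
    my $text = shift;
    return $text;
}

my $same = implicit_return('RFC 822') eq explicit_return('RFC 822');
print $same ? "same value\n" : "different value\n";
```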

All the best,

Tels

- --
Signed on Sat Mar 17 11:03:04 2007 with key 0x93B84C15.
View my photo gallery: http://bloodgate.com/photos
PGP key on http://bloodgate.com/tels.asc or per email.

This email violates EU patent EP0394160:

  [ ########## 66% ####
  ]

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iQEVAwUBRfvLJ3cLPEOTuEwVAQLA5Qf/UpGoIk/M3D5f5z52DaFWBd9r8nQkPP21
94AFK9L485ZBNT0qI5T0g5tSqxEX1jxXkWSSTKW4+scCKOB0HIu8HPNMq5TC/fJO
UASCuSTE1/T+lLTA4AsJNfZ4FDF/d7rocat6Xg5SnQ6qDoVO1VSgUHzEtmTAxXLc
I7fIpWyQDCmZnDqXga0zErvyKfH1SV38FDDsImlt4Fnq4TdxgUSY7fJG0qpChdEG
XfgLyBDgmRLw5gkD5PMeCqloVcqgBCAYQawGmcd9nx30KjB0OWiNdgcxvrygQoe7
BHQ7zfcLGouDQWcnK1awjaVnA4qIbW13R5jSj+iUjqaUicyHN/J1AQ==
=YHp2
-----END PGP SIGNATURE-----

@p5pRT

p5pRT commented Mar 20, 2007

From @rgarcia

On 17/03/07, Jari Aalto <jari.aalto@cante.net> wrote:

Thanks, here is an updated patch (rsync'd against the latest bleadperl, 2007-03-17)

Thanks, applied as #30631.
