[PATCH] Docs: perlfunc: Rewrite `split' #11342
From mfwitten@gmail.com
Subject: [PATCH] Docs: perlfunc: Rewrite `split'
I couldn't stand the way the documentation for `split' was written;
This variation completes sentences, adds new ones, rearranges ideas,
While the original text seemed to be written in a way that touched upon
Signed-off-by: Michael Witten <mfwitten@gmail.com>
pod/perlfunc.pod | 200 ++++++++++++++++++++++++++++++++---------------------
diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod
index 2a1b20a..d1ff454 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -5816,117 +5816,155 @@ X<split>
=item split
+Splits the string EXPR into a list of strings and returns data
+about that list; in scalar context, the return value is the
+number of fields found, and in list context, the return value
+is the list itself.
+
+If EXPR is omitted (requiring LIMIT to be omitted as well), then
+EXPR defaults to the C<$_> string.
+
+Anything in EXPR that matches PATTERN is taken to be a delimiter
+that separates the EXPR into substrings (called "I<fields>") that do
+B<not> include the delimiter; note that a delimiter may be longer
+than one character or even have no characters at all (the empty string,
+a zero-width match). In the case of the empty string, the EXPR is
+split at the match position (between characters). As an example,
+the following:
+
+ print join(':', split('b', 'abc')), "\n";
+
+uses the 'b' as a delimiter to produce the output 'a:c'. However,
+this:
+
+ print join(':', split('', 'abc')), "\n";
+
+uses the empty string as delimiters to produce the output 'a:b:c'; thus,
+the empty string may be used to split EXPR into a list of its component
+characters; as a special case for C<split>, the empty pattern C<//>
+specifically matches the empty string, which is contrary to the normal
+use of an empty pattern to mean the last successful match.
+
+If PATTERN is C</^/>, then it is treated as if it used the
+L<multiline modifier|perlreref/operators> (C</^/m>), since it isn't much
+use otherwise.
+
+As another special PATTERN case, C<split> emulates the default behavior of
+the command line tool B<awk> when the PATTERN is a I<string> composed of a
+single space character (S<C<' '>>); in this case, any leading whitespace
+in EXPR is removed before splitting occurs, and the PATTERN is instead
+treated as if it were the L<match operator|perlop/m_> syntax C</\s+/>; in
+particular, this means that I<any> whitespace (not just a single space
+character) is used as a delimiter. However, this special treatment can be
+avoided by specifying the PATTERN using the match operator syntax (rather
+than a plain string), thereby allowing a single space character to be a
+delimiter: S<C</ />>.
+
+If PATTERN is omitted, then PATTERN defaults to C<' '> (that is, the
+aforementioned special case).
+
+The PATTERN need not be constant; an expression may be used to specify
+a pattern that varies at runtime. However, it takes time to compile a
+regular expression, so such runtime variation
+L<should be optimized|perlretut/Optimizing pattern evaluation> where
+possible by using e.g. the L<compile modifier (//o)|perlreref/operators>.
+
+If LIMIT is specified and positive, it represents the maximum number
+of fields into which the EXPR may be split; in other words, LIMIT is
+one greater than the maximum number of times EXPR may be split; thus,
+the LIMIT value C<1> means that EXPR may be split a maximum of zero
+times, producing a maximum of one field (namely, the entire value of
+EXPR); for instance:
+
+ print join(':', split(//, 'abc', 1)), "\n";
+
+produces the output 'abc', and this:
+
+ print join(':', split(//, 'abc', 2)), "\n";
+
+produces the output 'a:bc', and each of these:
+
+ print join(':', split(//, 'abc', 3)), "\n";
+ print join(':', split(//, 'abc', 4)), "\n";
+
+produces the output 'a:b:c'.
+
+If LIMIT is negative, it is treated as if it were instead arbitrarily
+large; as many fields as possible are produced.
+
+If LIMIT is omitted (or, equivalently, zero), then it is usually
+treated as if it were instead negative but with the exception that
+trailing empty fields are stripped (empty leading fields are always
+preserved); if all fields are empty, then all fields are considered to
+be trailing (and are thus stripped in this case). Thus, the following:
+
+ print join(':', split(',', 'a,b,c,,,')), "\n";
+
+produces the output 'a:b:c', but the following:
+
+ print join(':', split(',', 'a,b,c,,,', -1)), "\n";
+
+produces the output 'a:b:c:::'.
+
+In time-critical applications, it is worthwhile to avoid splitting
+into more fields than necessary. Thus, when assigning to a list,
+if LIMIT is omitted (or zero), then C<split> is implicitly given
+a LIMIT that is one larger than the number of variables in the
+list; for the following, LIMIT is implicitly 4:
+
+ ($login, $passwd, $remainder) = split(/:/);
+
+Note that splitting an EXPR that evaluates to the empty string always
+produces zero fields, regardless of the LIMIT specified.
+
+Empty leading fields are produced when there are positive-width matches at
+the beginning of the string; for instance:
+
+ print join(':', split(/ /, ' abc')), "\n";
+
+produces the output ':abc'. However, a zero-width match at the
+beginning of the string never produces an empty field; for example:
+
+ print join(':', split(//, 'abc'));
+
+produces the output 'a:b:c' (rather than ':a:b:c').
+
+Empty trailing fields, on the other hand, are produced when there is a
+match at the end of the string, regardless of the length of the match
+(of course, unless a non-zero LIMIT is given explicitly, such fields are
+removed, as in the last example); the following:
+
+ print join(':', split(//, 'abc', -1)), "\n";
+
+produces the output 'a:b:c:'.
+
+If the PATTERN contains
+L<regular expression groups|perlretut/Grouping things and hierarchical matching>,
+then for each delimiter, additional fields are produced from the substrings
+captured by each group (in the order in which the groups are specified,
+as per L<backreferences|perlretut/Backreferences>); if any group does not
+match, then it captures C<undef> instead of a substring. Also, note that
+such additional fields are produced whenever there is a delimiter (that
+is, whenever a split occurs), and such additional fields do B<not> count
+towards the LIMIT (in some sense, then, it is better to think of LIMIT as
+one greater than the maximum number of splits that may occur). Consider the
+following expressions evaluated in list context (the returned lists are
+provided in the associated comments):
+
+ split(/-|,/, "1-10,20", 3)
+ # ('1', '10', '20')
+
+ split(/(-|,)/, "1-10,20", 3)
+ # ('1', '-', '10', ',', '20')
+
+ split(/-|(,)/, "1-10,20", 3)
+ # ('1', undef, '10', ',', '20')
+
+ split(/(-)|,/, "1-10,20", 3)
+ # ('1', '-', '10', 'undef', '20')
+
+ split(/(-)|(,)/, "1-10,20", 3)
+ # ('1', '-', undef, '10', undef, ',', '20')
-Splits the string EXPR into a list of strings and returns that list. By
-default, empty leading fields are preserved, and empty trailing ones are
-deleted. (If all fields are empty, they are considered to be trailing.)
-
-In scalar context, returns the number of fields found.
-
-If EXPR is omitted, splits the C<$_> string. If PATTERN is also omitted,
-splits on whitespace (after skipping any leading whitespace). Anything
-matching PATTERN is taken to be a delimiter separating the fields. (Note
-that the delimiter may be longer than one character.)
-
-If LIMIT is specified and positive, it represents the maximum number
-of fields the EXPR will be split into, though the actual number of
-fields returned depends on the number of times PATTERN matches within
-EXPR. If LIMIT is unspecified or zero, trailing null fields are
-stripped (which potential users of C<pop> would do well to remember).
-If LIMIT is negative, it is treated as if an arbitrarily large LIMIT
-had been specified. Note that splitting an EXPR that evaluates to the
-empty string always returns the empty list, regardless of the LIMIT
-specified.
-
-A pattern matching the empty string (not to be confused with
-an empty pattern C<//>, which is just one member of the set of patterns
-matching the epmty string), splits EXPR into individual
-characters. For example:
-
- print join(':', split(/ */, 'hi there')), "\n";
-
-produces the output 'h:i:t:h:e:r:e'.
-
-As a special case for C<split>, the empty pattern C<//> specifically
-matches the empty string; this is not be confused with the normal use
-of an empty pattern to mean the last successful match. So to split
-a string into individual characters, the following:
-
- print join(':', split(//, 'hi there')), "\n";
-
-produces the output 'h:i: :t:h:e:r:e'.
-
-Empty leading fields are produced when there are positive-width matches at
-the beginning of the string; a zero-width match at the beginning of
-the string does not produce an empty field. For example:
-
- print join(':', split(/(?=\w)/, 'hi there!'));
-
-produces the output 'h:i :t:h:e:r:e!'. Empty trailing fields, on the other
-hand, are produced when there is a match at the end of the string (and
-when LIMIT is given and is not 0), regardless of the length of the match.
-For example:
-
- print join(':', split(//, 'hi there!', -1)), "\n";
- print join(':', split(/\W/, 'hi there!', -1)), "\n";
-
-produce the output 'h:i: :t:h:e:r:e:!:' and 'hi:there:', respectively,
-both with an empty trailing field.
-
-The LIMIT parameter can be used to split a line partially
-
- ($login, $passwd, $remainder) = split(/:/, $_, 3);
-
-When assigning to a list, if LIMIT is omitted, or zero, Perl supplies
-a LIMIT one larger than the number of variables in the list, to avoid
-unnecessary work. For the list above LIMIT would have been 4 by
-default. In time critical applications it behooves you not to split
-into more fields than you really need.
-
-If the PATTERN contains parentheses, additional list elements are
-created from each matching substring in the delimiter.
-
- split(/([,-])/, "1-10,20", 3);
-
-produces the list value
-
- (1, '-', 10, ',', 20)
-
-If you had the entire header of a normal Unix email message in $header,
-you could split it up into fields and their values this way:
-
- $header =~ s/\n(?=\s)//g; # fix continuation lines
- %hdrs = (UNIX_FROM => split /^(\S*?):\s*/m, $header);
-
-The pattern C</PATTERN/> may be replaced with an expression to specify
-patterns that vary at runtime. (To do runtime compilation only once,
-use C</$variable/o>.)
-
-As a special case, specifying a PATTERN of space (S<C<' '>>) will split on
-white space just as C<split> with no arguments does. Thus, S<C<split(' ')>> can
-be used to emulate B<awk>'s default behavior, whereas S<C<split(/ /)>>
-will give you as many initial null fields (empty string) as there are leading spaces.
-A C<split> on C</\s+/> is like a S<C<split(' ')>> except that any leading
-whitespace produces a null first field. A C<split> with no arguments
-really does a S<C<split(' ', $_)>> internally.
-
-A PATTERN of C</^/> is treated as if it were C</^/m>, since it isn't
-much use otherwise.
-
-Example:
-
- open(PASSWD, '/etc/passwd');
- while (<PASSWD>) {
- chomp;
- ($login, $passwd, $uid, $gid,
- $gcos, $home, $shell) = split(/:/);
- #...
- }
-
-As with regular pattern matching, any capturing parentheses that are not
-matched in a C<split()> will be set to C<undef> when returned:
-
- @fields = split /(A)|B/, "1A2B3";
- # @fields is (1, 'A', 2, undef, 3)
=item sprintf FORMAT, LIST
X<sprintf>
--
From tchrist@perl.com
I think you'll find that description fits quite a bit of the documentation. It's a real problem.
--tom
The RT System itself - Status changed from 'new' to 'open'
From bmb@Mail.Libs.UGA.EDU
On Sun, May 15, 2011 at 3:09 PM, Michael Witten
On a quick read, I'd like to make one suggestion. You have
A period lets me take a mental breath. With a semicolon, I
--
From tsibley@cpan.org
There's a quoted undef in the second to last split example.
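The distinction matters because a capture group that takes no part in a match yields the undefined value, not the five-character string 'undef'. A minimal sketch, assuming a stock Perl 5 interpreter (the variable names are illustrative only):

```perl
use strict;
use warnings;

# The "," alternative has no capture group, so when it matches,
# group 1 does not participate and contributes undef to the list.
my @fields = split /(-)|,/, "1-10,20", 3;

# Render undef visibly so it cannot be mistaken for the string 'undef'.
my $shown = join ', ', map { defined $_ ? "'$_'" : 'undef' } @fields;
print "$shown\n";    # prints: '1', '-', '10', undef, '20'
```

Testing the fourth element with `defined` is what separates the two cases; a quoted 'undef' would pass that test.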
From mfwitten@gmail.com
Here is an updated patch. Save this message to /tmp/p and then apply it as follows:
$ cd /path/to/perl/repo
8<-----------8<-----------8<-----------8<-----------8<-----------8<-----------
Date: Sun, 10 Apr 2011 23:23:21 +0000
I couldn't stand the way the documentation for `split' was written;
This variation completes sentences, adds new ones, rearranges ideas,
While the original text seemed to be written in a way that touched upon
Thanks to Brad Baxter and Thomas R. Sibley for their constructive
Signed-off-by: Michael Witten <mfwitten@gmail.com>
pod/perlfunc.pod | 206 +++++++++++++++++++++++++++++++++--------------------
diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod
index 2a1b20a..11432d8 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -5816,117 +5816,161 @@ X<split>
=item split
+Splits the string EXPR into a list of strings and returns data
+about that list: In scalar context, the return value is the
+number of fields found, and in list context, the return value
+is the list itself.
+
+If EXPR is omitted (requiring LIMIT to be omitted as well), then
+EXPR defaults to the C<$_> string.
+
+Anything in EXPR that matches PATTERN is taken to be a delimiter
+that separates the EXPR into substrings (called "I<fields>") that
+do B<not> include the delimiter. Note that a delimiter may be
+longer than one character or even have no characters at all (the
+empty string, which is a zero-width match).
+
+The PATTERN need not be constant; an expression may be used
+to specify a pattern that varies at runtime. However, it
+takes time to compile a regular expression, so such runtime
+variation L<should be optimized|perlretut/Optimizing pattern
+evaluation> where possible by using e.g. the L<compile modifier
+(//o)|perlreref/operators>.
+
+If PATTERN matches the empty string, the EXPR is split at the match
+position (between characters). As an example, the following:
+
+ print join(':', split('b', 'abc')), "\n";
+
+uses the 'b' in 'abc' as a delimiter to produce the output 'a:c'.
+However, this:
+
+ print join(':', split('', 'abc')), "\n";
+
+uses empty string matches as delimiters to produce the output
+'a:b:c'; thus, the empty string may be used to split EXPR into a
+list of its component characters.
+
+As a special case for C<split>, the empty pattern given in
+L<match operator|perlop/m_> syntax (C<//>) specifically matches
+the empty string, which is contrary to its usual interpretation
+as the last successful match.
+
+If PATTERN is C</^/>, then it is treated as if it used the
+L<multiline modifier|perlreref/operators> (C</^/m>), since it
+isn't much use otherwise.
+
+As another special case, C<split> emulates the default behavior of
+the command line tool B<awk> when the PATTERN is a I<string> composed
+of a single space character (S<C<' '>>): Any leading whitespace in
+EXPR is removed before splitting occurs, and the PATTERN is instead
+treated as if it were C</\s+/>; in particular, this means that I<any>
+whitespace (not just a single space character) is used as a delimiter.
+However, this special treatment can be avoided by specifying the
+PATTERN using the match operator syntax (rather than a plain string),
+thereby allowing a single space character to be a delimiter: S<C</
+/>>.
+
+If PATTERN is omitted, then PATTERN defaults to a string composed
+of a single space character (S<C<' '>>), which invokes the
+aforementioned B<awk> emulation.
+
+If LIMIT is specified and positive, it represents the maximum number
+of fields into which the EXPR may be split; in other words, LIMIT is
+one greater than the maximum number of times EXPR may be split. Thus,
+the LIMIT value C<1> means that EXPR may be split a maximum of zero
+times, producing a maximum of one field (namely, the entire value of
+EXPR). For instance:
+
+ print join(':', split(//, 'abc', 1)), "\n";
+
+produces the output 'abc', and this:
+
+ print join(':', split(//, 'abc', 2)), "\n";
+
+produces the output 'a:bc', and each of these:
+
+ print join(':', split(//, 'abc', 3)), "\n";
+ print join(':', split(//, 'abc', 4)), "\n";
+
+produces the output 'a:b:c'.
+
+If LIMIT is negative, it is treated as if it were instead arbitrarily
+large; as many fields as possible are produced.
+
+If LIMIT is omitted (or, equivalently, zero), then it is usually
+treated as if it were instead negative but with the exception that
+trailing empty fields are stripped (empty leading fields are always
+preserved); if all fields are empty, then all fields are considered to
+be trailing (and are thus stripped in this case). Thus, the following:
+
+ print join(':', split(',', 'a,b,c,,,')), "\n";
+
+produces the output 'a:b:c', but the following:
+
+ print join(':', split(',', 'a,b,c,,,', -1)), "\n";
+
+produces the output 'a:b:c:::'.
+
+In time-critical applications, it is worthwhile to avoid splitting
+into more fields than necessary. Thus, when assigning to a list,
+if LIMIT is omitted (or zero), then LIMIT is treated as though it
+were one larger than the number of variables in the list; for the
+following, LIMIT is implicitly 4:
+
+ ($login, $passwd, $remainder) = split(/:/);
+
+Note that splitting an EXPR that evaluates to the empty string always
+produces zero fields, regardless of the LIMIT specified.
+
+An empty leading field is produced when there is a positive-width
+match at the beginning of EXPR. For instance:
+
+ print join(':', split(/ /, ' abc')), "\n";
+
+produces the output ':abc'. However, a zero-width match at the
+beginning of EXPR never produces an empty field, so that:
+
+ print join(':', split(//, ' abc'));
+
+produces the output S<' :a:b:c'> (rather than S<': :a:b:c'>).
+
+An empty trailing field, on the other hand, is produced when there is a
+match at the end of EXPR, regardless of the length of the match
+(of course, unless a non-zero LIMIT is given explicitly, such fields are
+removed, as in the last example). Thus:
+
+ print join(':', split(//, ' abc', -1)), "\n";
+
+produces the output S<' :a:b:c:'>.
+
+If the PATTERN contains
+L<regular expression groups|perlretut/Grouping things and hierarchical matching>,
+then for each delimiter, an additional field is produced for each substring
+captured by a group (in the order in which the groups are specified,
+as per L<backreferences|perlretut/Backreferences>); if any group does not
+match, then it captures the C<undef> value instead of a substring. Also, note
+that any such additional field is produced whenever there is a delimiter (that
+is, whenever a split occurs), and such an additional field does B<not> count
+towards the LIMIT (in some sense, then, it is better to think of LIMIT as
+one greater than the maximum number of splits that may occur). Consider the
+following expressions evaluated in list context (each returned list is provided
+in the associated comment):
+
+ split(/-|,/, "1-10,20", 3)
+ # ('1', '10', '20')
+
+ split(/(-|,)/, "1-10,20", 3)
+ # ('1', '-', '10', ',', '20')
+
+ split(/-|(,)/, "1-10,20", 3)
+ # ('1', undef, '10', ',', '20')
+
+ split(/(-)|,/, "1-10,20", 3)
+ # ('1', '-', '10', undef, '20')
+
+ split(/(-)|(,)/, "1-10,20", 3)
+ # ('1', '-', undef, '10', undef, ',', '20')
-Splits the string EXPR into a list of strings and returns that list. By
-default, empty leading fields are preserved, and empty trailing ones are
-deleted. (If all fields are empty, they are considered to be trailing.)
-
-In scalar context, returns the number of fields found.
-
-If EXPR is omitted, splits the C<$_> string. If PATTERN is also omitted,
-splits on whitespace (after skipping any leading whitespace). Anything
-matching PATTERN is taken to be a delimiter separating the fields. (Note
-that the delimiter may be longer than one character.)
-
-If LIMIT is specified and positive, it represents the maximum number
-of fields the EXPR will be split into, though the actual number of
-fields returned depends on the number of times PATTERN matches within
-EXPR. If LIMIT is unspecified or zero, trailing null fields are
-stripped (which potential users of C<pop> would do well to remember).
-If LIMIT is negative, it is treated as if an arbitrarily large LIMIT
-had been specified. Note that splitting an EXPR that evaluates to the
-empty string always returns the empty list, regardless of the LIMIT
-specified.
-
-A pattern matching the empty string (not to be confused with
-an empty pattern C<//>, which is just one member of the set of patterns
-matching the epmty string), splits EXPR into individual
-characters. For example:
-
- print join(':', split(/ */, 'hi there')), "\n";
-
-produces the output 'h:i:t:h:e:r:e'.
-
-As a special case for C<split>, the empty pattern C<//> specifically
-matches the empty string; this is not be confused with the normal use
-of an empty pattern to mean the last successful match. So to split
-a string into individual characters, the following:
-
- print join(':', split(//, 'hi there')), "\n";
-
-produces the output 'h:i: :t:h:e:r:e'.
-
-Empty leading fields are produced when there are positive-width matches at
-the beginning of the string; a zero-width match at the beginning of
-the string does not produce an empty field. For example:
-
- print join(':', split(/(?=\w)/, 'hi there!'));
-
-produces the output 'h:i :t:h:e:r:e!'. Empty trailing fields, on the other
-hand, are produced when there is a match at the end of the string (and
-when LIMIT is given and is not 0), regardless of the length of the match.
-For example:
-
- print join(':', split(//, 'hi there!', -1)), "\n";
- print join(':', split(/\W/, 'hi there!', -1)), "\n";
-
-produce the output 'h:i: :t:h:e:r:e:!:' and 'hi:there:', respectively,
-both with an empty trailing field.
-
-The LIMIT parameter can be used to split a line partially
-
- ($login, $passwd, $remainder) = split(/:/, $_, 3);
-
-When assigning to a list, if LIMIT is omitted, or zero, Perl supplies
-a LIMIT one larger than the number of variables in the list, to avoid
-unnecessary work. For the list above LIMIT would have been 4 by
-default. In time critical applications it behooves you not to split
-into more fields than you really need.
-
-If the PATTERN contains parentheses, additional list elements are
-created from each matching substring in the delimiter.
-
- split(/([,-])/, "1-10,20", 3);
-
-produces the list value
-
- (1, '-', 10, ',', 20)
-
-If you had the entire header of a normal Unix email message in $header,
-you could split it up into fields and their values this way:
-
- $header =~ s/\n(?=\s)//g; # fix continuation lines
- %hdrs = (UNIX_FROM => split /^(\S*?):\s*/m, $header);
-
-The pattern C</PATTERN/> may be replaced with an expression to specify
-patterns that vary at runtime. (To do runtime compilation only once,
-use C</$variable/o>.)
-
-As a special case, specifying a PATTERN of space (S<C<' '>>) will split on
-white space just as C<split> with no arguments does. Thus, S<C<split(' ')>> can
-be used to emulate B<awk>'s default behavior, whereas S<C<split(/ /)>>
-will give you as many initial null fields (empty string) as there are leading spaces.
-A C<split> on C</\s+/> is like a S<C<split(' ')>> except that any leading
-whitespace produces a null first field. A C<split> with no arguments
-really does a S<C<split(' ', $_)>> internally.
-
-A PATTERN of C</^/> is treated as if it were C</^/m>, since it isn't
-much use otherwise.
-
-Example:
-
- open(PASSWD, '/etc/passwd');
- while (<PASSWD>) {
- chomp;
- ($login, $passwd, $uid, $gid,
- $gcos, $home, $shell) = split(/:/);
- #...
- }
-
-As with regular pattern matching, any capturing parentheses that are not
-matched in a C<split()> will be set to C<undef> when returned:
-
- @fields = split /(A)|B/, "1A2B3";
- # @fields is (1, 'A', 2, undef, 3)
=item sprintf FORMAT, LIST
X<sprintf>
--
1.7.4.18.g68fe8
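The implicit LIMIT described in the patch changes what the last variable receives, which is easy to miss. A small sketch, assuming a stock Perl 5 interpreter and a made-up passwd-style record (the record and variable names are illustrative, not taken from the patch):

```perl
use strict;
use warnings;

my $record = 'root:x:0:0:root:/root:/bin/sh';

# No LIMIT given: for three variables an implicit LIMIT of 4 applies,
# so the third variable gets only the third field.
my ($login, $passwd, $remainder) = split /:/, $record;

# Explicit LIMIT of 3: the third field keeps the unsplit rest of the line.
my ($login2, $passwd2, $rest) = split /:/, $record, 3;

print "$remainder\n";    # prints: 0
print "$rest\n";         # prints: 0:0:root:/root:/bin/sh
```

So the old example with an explicit 3 and the new example without a LIMIT are not interchangeable when the caller wants the remainder of the line.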
From bmb@Mail.Libs.UGA.EDU
I won't belabor this point beyond this post, but I did want
Note that every change I'm suggesting is simply to replace
Cheers
...
I think the following sentence is fine.
    The PATTERN need not be constant; an expression may be used
But I'd change this:
    However, this:
    print join(':', split('', 'abc')), "\n";
    uses empty string matches as delimiters to produce the output
to this:
    However, this:
    print join(':', split('', 'abc')), "\n";
    uses empty string matches as delimiters to produce the output
I would change this:
    As another special case, C<split> emulates the default behavior of
to this:
    As another special case, C<split> emulates the default behavior of
And change this:
    If LIMIT is specified and positive, it represents the maximum number
to this:
    If LIMIT is specified and positive, it represents the maximum number
I think this is fine:
    If LIMIT is negative, it is treated as if it were instead arbitrarily
But I'd change this:
    If LIMIT is omitted (or, equivalently, zero), then it is usually
to this:
    If LIMIT is omitted (or, equivalently, zero), then it is usually
This:
    In time-critical applications, it is worthwhile to avoid splitting
    ($login, $passwd, $remainder) = split(/:/);
to this:
    In time-critical applications, it is worthwhile to avoid splitting
    ($login, $passwd, $remainder) = split(/:/);
And this:
    If the PATTERN contains
to this:
    If the PATTERN contains
--
From bmb@Mail.Libs.UGA.EDU
On Tue, May 17, 2011 at 3:56 PM, Michael Witten <mfwitten@gmail.com> wrote:
FWIW, I'm uneasy about this patch. After getting over the semicolons,
    Splits the string EXPR into a list of substrings (or I<fields>).
    If EXPR and LIMIT are omitted, EXPR defaults to the C<$_> string.
But then I started not being able to really say whether the patch
I wonder if a split tutorial might be a better way to include the extra
Regards,
Brad
From mfwitten@gmail.com
On Wed, May 18, 2011 at 00:50, Brad Baxter <bmb@mail.libs.uga.edu> wrote:
Do not think that I don't appreciate the fact that you've taken the
I do agree that I was a little too liberal with the semicolons on the
To me, the semicolon provides a means for connecting 2 highly related
From mfwitten@gmail.com
On Tue, 17 May 2011 22:05:21 -0400, Brad Baxter wrote:
I think the problem here is that I use `strings' and then `fields'. First,
    Anything in EXPR that matches PATTERN is taken to be a delimiter
Therefore, I think it is best to view the first paragraph as a quick
    Splits the string EXPR into a list of fields and returns data
Oh, come on, now; don't be so melodramatic.
I get the feeling that you are a lot like I am: The second things feel
However, that is the very purpose of discussion: To figure out what is
I don't think you are in a position to make that judgment until you have
I took another look at the existing text and my resolve is stronger than
Sincerely,
From @cpansprout
On Tue May 17 13:06:39 2011, mfwitten wrote:
I have some criticism to offer, which I hope you will find more
There is no need to mention /o. In fact, I find it confusing. /o does
I don’t think that’s a valid pod link. It should be
The split " " behaviour only applies to a literal string, not to a
s/thereby allowing/thereby only allowing/
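To make the difference between the literal string and the two regex forms concrete, here is a minimal sketch assuming a stock Perl 5 interpreter; the sample line and variable names are illustrative only:

```perl
use strict;
use warnings;

my $line = "  foo bar  baz";

# Literal single-space string: awk-style, leading whitespace is skipped
# and any run of whitespace separates fields.
my @awk_style = split ' ', $line;      # ('foo', 'bar', 'baz')

# Match-operator syntax: every single space is its own separator.
my @one_space = split / /, $line;      # ('', '', 'foo', 'bar', '', 'baz')

# /\s+/ collapses runs but still yields an empty leading field.
my @ws_runs = split /\s+/, $line;      # ('', 'foo', 'bar', 'baz')

printf "%d %d %d\n", scalar @awk_style, scalar @one_space, scalar @ws_runs;
# prints: 3 6 4
```

Only the literal `' '` case is shown here; how a pattern held in a variable behaves is not covered by this sketch.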
Please say ‘capturing groups’. Just ‘groups’ would include (?:...).
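The wording matters because only capturing groups add fields to the result. A short sketch, assuming a stock Perl 5 interpreter (variable names are illustrative):

```perl
use strict;
use warnings;

# A capturing group contributes the matched separator as an extra field.
my @capturing = split /(,)/, "a,b";          # ('a', ',', 'b')

# A non-capturing group (?:...) only groups; no extra field appears.
my @non_capturing = split /(?:,)/, "a,b";    # ('a', 'b')

printf "%d %d\n", scalar @capturing, scalar @non_capturing;
# prints: 3 2
```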
I think that parenthetical remark can be removed, as it repeats
Also, do you want to document this? :-)
    ()=@==split" ","Just another Perl hacker,\n";
From @ap
* Brad Baxter <bmb@mail.libs.uga.edu> [2011-05-18 04:10]:
“Returns data about the list” for something usually used to get
Agreed. In fact, I suggest cutting away even more:
    Splits the string EXPR into a list of strings and returns the
    If only PATTERN is given, EXPR defaults to C<$_>.
Regards,
From bmb@Mail.Libs.UGA.EDU
On Wed, May 18, 2011 at 12:01 PM, Michael Witten <mfwitten@gmail.com> wrote:
And I think that the fact those sentences are in the same paragraph
--
From bmb@Mail.Libs.UGA.EDU
On Wed, May 18, 2011 at 12:34 PM, Michael Witten <mfwitten@gmail.com> wrote:
Not my intention. :-) More accurately, I'm not sure I have the spare
--
From mfwitten@gmail.com
On Sun, 22 May 2011 13:34:36 -0700, Father Chrysostomos wrote:
I figured as much, but I felt that it would be best to be told to
As for `qr//', I actually did mention it in a previous, unpublished
However, removing the discussion of `/o' (and leaving out qr//)
    The PATTERN need not be constant; an expression may be used to
Sincerely,
From mfwitten@gmail.com
On Sun, 22 May 2011 13:34:36 -0700, Father Chrysostomos wrote:
Well, when I was first writing this patch, I was going to do exactly
    "L<name>" -- a hyperlink
Thus, the obvious choice (your suggestion) is incorrect. At the time, I
    "E<escape>" -- a character escape
    ...
    · "E<sol>" = a literal / (solidus)
    The above [is] optional except in other formatting codes,
like this:
    L<match operator|perlop/mE<sol>PATTERNE<sol>msixpodualgc>
which ends up producing an HTML hyperlink with the following `href':
    /pod/perlop.html#me_sol_patterne_sol_msixpodualgc
Unfortunately, that doesn't actually link to anything because the
    m_
which is what I ended up using directly. In fact, if you use the following [disallowed] POD text:
    L<match operator|perlop/m/anything you want to write>
then you'll get a link that works, because `pod2html' produces an HTML
    m_
Consequently, I'm actually led to believe that the `pod2html' is
    =item m/PATTERN/msixpodualgc
produce the following corresponding HTML anchor values:
    m_
How does that make sense? They should probably produce whatever these corresponding POD lines
    L<match operator|perlop/mE<sol>PATTERNE<sol>msixpodualgc>
So... Where do we go from here?
Sincerely,
From mfwitten@gmail.com
On Sun, 22 May 2011 13:34:36 -0700, Father Chrysostomos wrote:
Here's my attempt to incorporate those points:
    As another special case, C<split> emulates the default behavior of the
    If PATTERN is omitted, then PATTERN defaults to a literal string composed
Sincerely,
From tchrist@perl.com
such as C<" "> or C<"\x20">
Leading whitespace in EXPR is ignored, and the PATTERN C</\s+/> is used
any stretch of one or more whitespace characters
!!!!!!!!!!!!!!s/delimiter/separator/!!!!!!!!!!!!!!!!!!!!
avoid this by specifying the pattern C</ /> instead of the string C<" ">
!!!!!!!!!!!!!!s/delimiter/separator/!!!!!!!!!!!!!!!!!!!!
If omitted, PATTERN defaults to a single space, C<" ">, triggering
--tom
From tchrist@perl.com
As another special case, when PATTERN is a single space character as a
--tom
From mfwitten@gmail.com
Thanks for taking a look, Tom.
On Tue, 24 May 2011 13:54:10 -0600, Tom Christiansen <tchrist@perl.com> wrote:
I'll add that as an example.
YOU would actually probably want to say:
    Any stretch of one or more leading white space characters
as per below :-)
In any case, I like my wording better because it is more explicit
For one thing, I dislike mixing "one" with "more" because it leaves in limbo
    any stretch of at least one whitespace character
In any case, I already use `leading whitespace' with that same meaning, and
Actually, I tried to use `delimiter' very strictly throughout this text, and
    Anything in EXPR that matches PATTERN is taken to be a delimiter
Of course, that is actually a very loose definition, and I think I can do
I'll make an improvement there and then send an email about it.
How about incorporating that into the examples? See below.
How about incorporating that into the awk emulation introduction?
    As another special case, C<split> emulates the default behavior of the
Is that POD link to `core_perl::constant' correct?
Sincerely,
From tchrist@perl.com
But that's wrong. A delimiter is a surrounder, not a separator.
I feel much more strongly about this than I have here expressed,
--tom
From mfwitten@gmail.com
On Wed, May 25, 2011 at 16:03, Tom Christiansen <tchrist@perl.com> wrote:
Perhaps so. I can buy that; I'll use `separator' then.
What on earth do you mean? Are you holding back, good man?
From tchrist@perl.com
Good. Thanks. Check perlglossary:
    alternatives BLOCK delimiter field LIST separator terminator
Why certainly, but it's better that way. But as you insist, here is a taste.
Consider the string:
    :foo:bar:
You get a different number of fields if that string is considered
"Separator", "terminator", and "delimiter" are three distinct terms for
If you collapse the distinction to make any two of those terms map to
I feel that it is important in technical works to preserve meaningful
That would just make more work for us all, not less.
--tom
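The field counts for that string can be checked directly. The sketch below assumes a stock Perl 5 interpreter; the `split` calls show what Perl does with ":" as a separator, while the two regex matches are only one illustrative reading of "terminator" and "delimiter" and are an assumption, not something stated in the thread:

```perl
use strict;
use warnings;

my $str = ':foo:bar:';

# split treats ":" as a separator. A negative LIMIT keeps every field,
# including the empty ones at both ends.
my @separated = split /:/, $str, -1;         # ('', 'foo', 'bar', '')

# With LIMIT omitted, trailing empty fields are stripped, while the
# leading empty field is preserved.
my @default = split /:/, $str;               # ('', 'foo', 'bar')

# Assumed reading: ":" as a terminator ends each field.
my @terminated = $str =~ /([^:]*):/g;        # ('', 'foo', 'bar')

# Assumed reading: ":" as a delimiter surrounds each field.
my @delimited = $str =~ /:([^:]*)(?=:)/g;    # ('foo', 'bar')

printf "%d %d %d %d\n",
    scalar @separated, scalar @default, scalar @terminated, scalar @delimited;
# prints: 4 3 3 2
```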
From mfwitten@gmail.com
On Wed, May 25, 2011 at 16:30, Tom Christiansen <tchrist@perl.com> wrote:
Awesome. Perhaps all of the documentation should link to the glossary
Outstanding! I, for one, appreciate your robust expostulation.
From @cpansprout
On Wed May 25 09:44:41 2011, mfwitten wrote:
I’ve taken your patch and incorporated suggestions from others as best I
I have applied the result as bd46758.
Thank you.
-- Father Chrysostomos
@cpansprout - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#90632 (status was 'resolved')
Searchable as RT90632$