split() with capturing parentheses captures "" instead of undef #4508

Closed
p5pRT opened this issue Oct 16, 2001 · 2 comments

p5pRT commented Oct 16, 2001

Migrated from rt.perl.org#7823 (status was 'resolved')

Searchable as RT7823$


p5pRT commented Oct 16, 2001

From @mjdominus

If I do

  @parens = ('-A-B-' =~ /(A)()|(B)()/g);

I get @parens = ('A', '', undef, undef,
  undef, undef, 'B', '')
 
because on the first match, $3 and $4 are not used, and on the second
match, $1 and $2 are not used. This is correct behavior.
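
A minimal sketch for checking this, using the same pattern and string as above. The `@shown` array and the quoting convention are only illustrative; they make `undef` and `''` visually distinct in the output.

```perl
use strict;
use warnings;

# List-context match with /g: every successful match contributes one
# value per capture group, in group order.
my @parens = ('-A-B-' =~ /(A)()|(B)()/g);

# Render each element so that undef and the empty string look different.
my @shown = map { defined $_ ? "'$_'" : 'undef' } @parens;
print join(', ', @shown), "\n";
# prints: 'A', '', undef, undef, undef, undef, 'B', ''
```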

However, the behavior for split() is not analogous.

  @a = split /(A)()|(B)()/, '-A-B-', -1;

This should construct

  @a = ('-',
  'A', '', undef, undef,
  '-',
  undef, undef, 'B', '',
  '-');

However:

  use Data::Dumper;
  print Dumper(\@a);

Says:

  # Indentation modified for comparison with example above
  $VAR1 = [
  '-',
  'A', '', '', '',
  '-',
  '', '', 'B','',
  '-'
  ];

The elements that should be undefined are instead defined empty strings.
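
To see which behavior a given perl has without reading Dumper output by eye, something like the following sketch could be used. It runs the same split and reports the indices of undefined elements; `@undef_idx` is just an illustrative name. On a perl with the behavior reported here, the list of undefined indices would be empty. The current perlfunc documentation for split says a group that does not match yields undef, so a perl that follows it should report indices 3, 4, 6 and 7.

```perl
use strict;
use warnings;

# Same split as in the report; the -1 limit keeps trailing empty fields.
my @a = split /(A)()|(B)()/, '-A-B-', -1;

# Collect the positions that hold undef rather than a defined value.
my @undef_idx = grep { !defined $a[$_] } 0 .. $#a;

print scalar(@a), " elements; undefined at: ",
    (@undef_idx ? "@undef_idx" : '(none)'), "\n";
```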

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl v5.6.1:

Configured by mjd at Mon Apr  9 13:10:50 EDT 2001.

Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
  Platform:
    osname=linux, osvers=2.2.16, archname=i586-linux
    uname='linux plover 2.2.16 #5 wed sep 27 19:05:46 edt 2000 i586 unknown '
    config_args='-des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lc -lposix -lcrypt -lutil
    perllibs=-lnsl -ldl -lm -lc -lposix -lcrypt -lutil
    libc=/lib/libc-2.1.3.so, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    


@INC for perl v5.6.1:
    /usr/local/lib/perl5/5.6.1/i586-linux
    /usr/local/lib/perl5/5.6.1
    /usr/local/lib/perl5/site_perl/5.6.1/i586-linux
    /usr/local/lib/perl5/site_perl/5.6.1
    /usr/local/lib/perl5/site_perl/5.6.0/i586-linux
    /usr/local/lib/perl5/site_perl/5.6.0
    /usr/local/lib/perl5/site_perl
    .


Environment for perl v5.6.1:
    HOME=/home/mjd
    LANG (unset)
    LANGUAGE (unset)
    LC_ALL=POSIX
    LD_LIBRARY_PATH=/lib:/usr/lib:/usr/X11R6/lib
    LOGDIR (unset)
    PATH=/home/mjd/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11/bin:/usr/games:/sbin:/usr/sbin:/usr/local/bin/X11:/usr/local/bin/mh:/data/mysql/bin:/usr/local/bin/pbm:/usr/local/bin/ezmlm:/home/mjd/TPI/bin:/usr/local/teTeX/bin:/usr/local/mysql/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash


p5pRT commented Oct 24, 2001

From [Unknown Contact. See original ticket]

If I do

   @parens = ('-A-B-' =~ /(A)()|(B)()/g);

I get @parens = ('A', '', undef, undef,
undef, undef, 'B', '')

because on the first match, $3 and $4 are not used, and on the second
match, $1 and $2 are not used. This is correct behavior.

However, the behavior for split() is not analogous.

   @a = split /(A)()|(B)()/, '-A-B-', -1;

That has been patched for bleadperl.

--
Jeff "japhy" Pinyan japhy@pobox.com http://www.pobox.com/~japhy/
RPI Acacia brother #734 http://www.perlmonks.org/ http://www.cpan.org/
** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
