"\" does not escape meta chars if also delim #12144

Open
p5pRT opened this issue May 29, 2012 · 12 comments

Comments

p5pRT commented May 29, 2012

Migrated from rt.perl.org#113420 (status was 'open')

Searchable as RT113420$

p5pRT commented May 29, 2012

From @ikegami

Created by @ikegami

About "\", perlre says:

"Quote the next metacharacter."

"So anything that looks like \\, \(, \), \<, \>, \{, or \} is always
interpreted as a literal character, not a metacharacter."

"Any single character matches itself, unless it is a metacharacter with a
special meaning described here or above. You can cause characters that
normally function as metacharacters to be interpreted literally by
prefixing them with a "\" (e.g., "\." matches a ".", not any character;
"\\" matches a "\"). This escape mechanism is also required for the
character used as the pattern delimiter."

Yet when "\" is used to escape a char that's both delimiter and meta, the
escaped character doesn't cease being meta as documented.

perl -E" say 'a' =~ m/\./ ? 'XXX' : 'ok' "
ok

perl -E" say 'a' =~ m.\.. ? 'XXX' : 'ok' "
XXX

- Eric
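
The two one-liners above use Windows `cmd` quoting. A minimal script form of the same check, which sidesteps shell quoting, might look like the sketch below. The `%result` hash is only an illustrative name, and the outcomes described in the comments are the ones reported above for perl 5.14.2; other versions may behave differently.

```perl
use strict;
use warnings;

# The same pattern text, \. , written under two different delimiters.
my %result = (
    # "/" delimiter: the regex engine sees \. (a literal dot), so 'a' fails.
    slash => ('a' =~ m/\./ ? 'XXX' : 'ok'),
    # "." delimiter: the backslash only protects the delimiter, so the engine
    # sees a bare . (any character) and 'a' matches.
    dot   => ('a' =~ m.\.. ? 'XXX' : 'ok'),
);

print "$_: $result{$_}\n" for sort keys %result;
```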

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl 5.14.2:

Configured by gecko at Fri Oct  7 15:46:51 2011.

Summary of my perl5 (revision 5 version 14 subversion 2) configuration:

  Platform:
    osname=MSWin32, osvers=5.2, archname=MSWin32-x86-multi-thread
    uname=''
    config_args='undef'
    hint=recommended, useposix=true, d_sigaction=undef
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cl', ccflags ='-nologo -GF -W3 -MD -Zi -DNDEBUG -O1 -DWIN32
-D_CONSOLE -DNO_STRICT -DPERL_TEXTMODE_SCRIPTS -DUSE_SITECUSTOMIZE
-DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO
-D_USE_32BIT_TIME_T',
    optimize='-MD -Zi -DNDEBUG -O1',
    cppflags='-DWIN32'
    ccversion='16.0.40219', gccversion='', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=8
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='__int64',
lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='link', ldflags ='-nologo -nodefaultlib -debug -opt:ref,icf
-libpath:"C:\progs\perl5142-ap1402\lib\CORE"  -machine:x86'
    libpth=\lib
    libs=oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib
comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib  netapi32.lib
uuid.lib ws2_32.lib mpr.lib winmm.lib  version.lib odbc32.lib odbccp32.lib
comctl32.lib msvcrt.lib
    perllibs=oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib
comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib  netapi32.lib
uuid.lib ws2_32.lib mpr.lib winmm.lib  version.lib odbc32.lib odbccp32.lib
comctl32.lib msvcrt.lib
    libc=msvcrt.lib, so=dll, useshrplib=true, libperl=perl514.lib
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -debug
-opt:ref,icf  -libpath:"C:\progs\perl5142-ap1402\lib\CORE"  -machine:x86'

Locally applied patches:
    ACTIVEPERL_LOCAL_PATCHES_ENTRY


@INC for perl 5.14.2:
    C:/Progs/perl5142-ap1402/site/lib
    C:/Progs/perl5142-ap1402/lib
    .


Environment for perl 5.14.2:
    HOME (unset)
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=c:\Program Files (x86)\Microsoft Visual Studio
10.0\Common7\IDE\;c:\Program Files (x86)\Microsoft Visual Studio
10.0\VC\BIN;c:\Program Files (x86)\Microsoft Visual Studio
10.0\Common7\Tools;C:\Windows\Microsoft.NET\Framework\v4.0.30319;C:\Windows\Microsoft.NET\Framework\v3.5;c:\Program
Files (x86)\Microsoft Visual Studio 10.0\VC\VCPackages;C:\Program Files
(x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools;C:\Program Files
(x86)\Microsoft
SDKs\Windows\v7.0A\bin;C:\progs\perl5142-ap1402\site\bin;C:\progs\perl5142-ap1402\bin;C:\Program
Files (x86)\NVIDIA
Corporation\PhysX\Common;c:\bin;C:\Progs\perl5140-ap1400\site\bin;C:\Progs\perl5140-ap1400\bin;C:\Program
Files
(x86)\UltraEdit;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program
Files\Common Files\Microsoft Shared\Windows Live;C:\Program Files
(x86)\Common Files\Microsoft Shared\Windows Live;C:\Program
Files\WIDCOMM\Bluetooth Software\;C:\Program Files\WIDCOMM\Bluetooth
Software\syswow64;C:\Program Files (x86)\Common Files\Ulead
Systems\MPEG;C:\Program Files\Microsoft Windows Performance
Toolkit\;C:\Program Files (x86)\Windows Live\Shared;C:\Program Files
(x86)\QuickTime\QTSystem\
    PERL_BADLANG (unset)
    SHELL (unset)

p5pRT commented May 29, 2012

From @demerphq

On 29 May 2012 08:39, Eric Brine <perlbug-followup@perl.org> wrote:

> # New Ticket Created by  "Eric Brine"
> # Please include the string:  [perl #113420]
> # in the subject line of all future correspondence about this issue.
> # <URL: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=113420 >
>
> This is a bug report for perl from ikegami@adaelis.com,
> generated with the help of perlbug 1.39 running under perl 5.14.2.
>
> -----------------------------------------------------------------
> [Please describe your issue here]
>
> About "\", perlre says:
>
> "Quote the next metacharacter."
>
> "So anything that looks like \\, \(, \), \<, \>, \{, or \} is always
> interpreted as a literal character, not a metacharacter."
>
> "Any single character matches itself, unless it is a metacharacter with a
> special meaning described here or above. You can cause characters that
> normally function as metacharacters to be interpreted literally by
> prefixing them with a "\" (e.g., "\." matches a ".", not any character;
> "\\" matches a "\"). This escape mechanism is also required for the
> character used as the pattern delimiter."
>
> Yet when "\" is used to escape a char that's both delimiter and meta, the
> escaped character doesn't cease being meta as documented.
>
> perl -E" say 'a' =~ m/\./ ? 'XXX' : 'ok' "
> ok
>
> perl -E" say 'a' =~ m.\.. ? 'XXX' : 'ok' "
> XXX

I don't think this is a bug, but I do plan to work on changing it to DWIM.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented May 29, 2012

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented May 29, 2012

From vadim.konovalov@alcatel-lucent.com

> > perl -E" say 'a' =~ m/\./ ? 'XXX' : 'ok' "
> > ok
> >
> > perl -E" say 'a' =~ m.\.. ? 'XXX' : 'ok' "
> > XXX
>
> I don't think this is a bug, but I do plan to work on changing
> it to DWIM.

What is DWIM here?

IMO, if anything, the explanation that Eric refers to should be expanded
to cover this situation with delimiters.
(Yet I am slightly against adding more bits to the already good perlre.pod.)

p5pRT commented May 29, 2012

From @demerphq

On 29 May 2012 09:56, Konovalov, Vadim (Vadim)** CTR **
<vadim.konovalov@alcatel-lucent.com> wrote:

> > > perl -E" say 'a' =~ m/\./ ? 'XXX' : 'ok' "
> > > ok
> > >
> > > perl -E" say 'a' =~ m.\.. ? 'XXX' : 'ok' "
> > > XXX
> >
> > I don't think this is a bug, but I do plan to work on changing
> > it to DWIM.
>
> what is DWIM here?

DWIM is that an escaped metacharacter that happens to match the
delimiter is passed through to the regex engine verbatim.
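
To make the distinction concrete, here is a rough model of the two behaviours in plain Perl. The helpers `unescape_delim` and `dwim_pattern` are hypothetical names used only for illustration. They mimic the rule perlop describes (the backslash is dropped from a backslash-plus-delimiter pair, while `\\` is left alone) and the proposal as worded above; they are not the tokenizer's actual code.

```perl
use strict;
use warnings;

# Model of the behaviour reported in this ticket: drop the backslash from
# "\<delim>" and leave every other backslash pair (including "\\") untouched.
sub unescape_delim {
    my ($raw, $delim) = @_;
    (my $pat = $raw) =~ s/\\(.)/$1 eq $delim ? $1 : "\\$1"/gse;
    return $pat;
}

# Model of the proposed DWIM behaviour: if the delimiter is also a regex
# metacharacter, hand the text to the regex engine unchanged.
sub dwim_pattern {
    my ($raw, $delim) = @_;
    my $is_meta = index('^$.|?*+()[]{}', $delim) >= 0;
    return $is_meta ? $raw : unescape_delim($raw, $delim);
}

my $raw  = '\.';                        # the text between the delimiters of m.\..
my $now  = unescape_delim($raw, '.');   # a bare dot: any character
my $dwim = dwim_pattern($raw, '.');     # backslash-dot: a literal dot

printf "current: %-3s matches 'a'? %s\n", $now,  ('a' =~ /$now/  ? 'yes' : 'no');
printf "DWIM:    %-3s matches 'a'? %s\n", $dwim, ('a' =~ /$dwim/ ? 'yes' : 'no');
```

In this model the two behaviours differ only when the delimiter is itself a metacharacter; for a delimiter such as `/` or `,` both helpers return the same pattern.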

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented May 29, 2012

From @ikegami

On Tue, May 29, 2012 at 3:48 AM, demerphq <demerphq@gmail.com> wrote:

> I don't think this is a bug, but I do plan to work on changing it to DWIM.

You pointed out elsewhere that perlop documents the actual behaviour.

When the docs don't agree with what actually happens, it's a bug.

When the docs don't agree with each other, it's a bug.

The bug could be as simple as: perlre talks about literals/operators when
it shouldn't. (perlre talks about literals when it mentions delimiters.
Only literals/operators have delimiters.)

On Tue, May 29, 2012 at 3:56 AM, Konovalov, Vadim (Vadim)** CTR **
<vadim.konovalov@alcatel-lucent.com> wrote:

> (yet I am slightly against adding more bits into already good perlre.pod)

This can be addressed by simply removing it from perlre, although that would
leave us with correct but unclear and dangerous documentation.

- Eric

p5pRT commented May 29, 2012

From @demerphq

On 29 May 2012 10:49, Eric Brine <ikegami@adaelis.com> wrote:

> On Tue, May 29, 2012 at 3:48 AM, demerphq <demerphq@gmail.com> wrote:
>
> > I don't think this is a bug, but I do plan to work on changing it to DWIM.
>
> You pointed out elsewhere that perlop documents the actual behaviour.
>
> When the docs don't agree with what actually happens, it's a bug.

The docs seem to agree with what actually happens.

> When the docs don't agree with each other, it's a bug.

I haven't so far been convinced that the docs disagree with each
other. Forgive me for being thick, but could you reframe what you think
is contradictory?

I mean, we are kinda quibbling here as we both agree the behavior
should change, er, don't we?

> The bug could be as simple as: perlre talks about literals/operators when it
> shouldn't. (perlre talks about literals when it mentions delimiters. Only
> literals/operators have delimiters)

Interesting angle.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented May 29, 2012

From @cpansprout

On Tue May 29 00:49:24 2012, demerphq wrote:

> On 29 May 2012 08:39, Eric Brine <perlbug-followup@perl.org> wrote:
>
> > perl -E" say 'a' =~ m/\./ ? 'XXX' : 'ok' "
> > ok
> >
> > perl -E" say 'a' =~ m.\.. ? 'XXX' : 'ok' "
> > XXX
>
> I don't think this is a bug, but I do plan to work on changing it to DWIM.

I think that will break a lot of code.

--

Father Chrysostomos

p5pRT commented May 29, 2012

From @dmcbride

On Tuesday May 29 2012 9:18:06 AM Father Chrysostomos via RT wrote:

> On Tue May 29 00:49:24 2012, demerphq wrote:
>
> > On 29 May 2012 08:39, Eric Brine <perlbug-followup@perl.org> wrote:
> >
> > > perl -E" say 'a' =~ m/\./ ? 'XXX' : 'ok' "
> > > ok
> > >
> > > perl -E" say 'a' =~ m.\.. ? 'XXX' : 'ok' "
> > > XXX
> >
> > I don't think this is a bug, but I do plan to work on changing it to DWIM.
>
> I think that will break a lot of code.

Honestly, I would have expected the need to extra-escape in the second
situation:

m.\\\..

However, as has been pointed out already, this doesn't (currently) work,
either.

I'm not sure fixing that would break much/any code in practice. And I'm not
sure it's really much of an improvement over the workarounds:

$dot = qr/\./;
m.$dot.

or

m.[\.].
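
A small script can check the two workarounds and the extra-escaped form side by side. This is only a sketch: the `%matched` hash and its key names are illustrative, and the comments describe what the regex engine is assumed to receive under the delimiter handling discussed in this ticket.

```perl
use strict;
use warnings;

# Workaround 1: compile the literal dot with "/" delimiters, then interpolate.
my $dot = qr/\./;

my %matched;
for my $str ('a', '.') {
    $matched{$str} = {
        qr_var => ($str =~ m.$dot.  ? 1 : 0),
        # Workaround 2: a bracketed class; the engine sees [.] here.
        class  => ($str =~ m.[\.]. ? 1 : 0),
        # The extra-escaped form: the engine sees \\. (a backslash followed
        # by any character), so it cannot match a lone dot.
        extra  => ($str =~ m.\\\.. ? 1 : 0),
    };
}

for my $str (sort keys %matched) {
    my $r = $matched{$str};
    print "'$str': qr_var=$r->{qr_var} class=$r->{class} extra=$r->{extra}\n";
}
```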

p5pRT commented May 29, 2012

From @demerphq

On 29 May 2012 18:18, Father Chrysostomos via RT
<perlbug-followup@perl.org> wrote:

> On Tue May 29 00:49:24 2012, demerphq wrote:
>
> > On 29 May 2012 08:39, Eric Brine <perlbug-followup@perl.org> wrote:
> >
> > > perl -E" say 'a' =~ m/\./ ? 'XXX' : 'ok' "
> > > ok
> > >
> > > perl -E" say 'a' =~ m.\.. ? 'XXX' : 'ok' "
> > > XXX
> >
> > I don't think this is a bug, but I do plan to work on changing it to DWIM.
>
> I think that will break a lot of code.

Hrm, I hope not. At least no less than, say, retiring the empty pattern.

OTOH, that comment might make it just a bit easier to say "it's too
much work" :-)

cheers
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented May 29, 2012

From @cpansprout

On Tue May 29 13:15:11 2012, demerphq wrote:

> On 29 May 2012 18:18, Father Chrysostomos via RT
> <perlbug-followup@perl.org> wrote:
>
> > On Tue May 29 00:49:24 2012, demerphq wrote:
> >
> > > On 29 May 2012 08:39, Eric Brine <perlbug-followup@perl.org> wrote:
> > >
> > > > perl -E" say 'a' =~ m/\./ ? 'XXX' : 'ok' "
> > > > ok
> > > >
> > > > perl -E" say 'a' =~ m.\.. ? 'XXX' : 'ok' "
> > > > XXX
> > >
> > > I don't think this is a bug, but I do plan to work on changing it to
> > > DWIM.
> >
> > I think that will break a lot of code.
>
> Hrm, I hope not. At least no less than say, retiring the empty pattern.

I’ve known about that aspect of parsing for about eight years. I learnt
it by reading the documentation.

Since then, I’ve written regular expressions with that in mind, thinking
it was just common knowledge.

I tend to pick my delimiters more or less at random, so I won’t be
surprised if this breaks my code in several places, the breakage being
so subtle I won’t notice it till it causes a real problem.

This is the sort of change that worries me.
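
As a hypothetical illustration of the kind of code that could change meaning, consider a pattern written with `|` delimiters that relies on `\|` reaching the regex engine as a bare alternation bar. The word list and variable names below are made up for the example, and the proposed behaviour is emulated with `/` delimiters rather than taken from any real implementation.

```perl
use strict;
use warnings;

my @input = ('cat', 'dog', 'cat|dog');

# Written with "|" delimiters, relying on "\|" reaching the engine as a bare
# "|" (alternation), which is the behaviour reported in this ticket.
my @current = grep { $_ =~ m|^(?:cat\|dog)\z| } @input;

# What the engine would see if "\|" were passed through verbatim: a literal
# pipe. Emulated here by writing the same text with "/" delimiters.
my @proposed = grep { $_ =~ m/^(?:cat\|dog)\z/ } @input;

print "current:  @current\n";
print "proposed: @proposed\n";
```

The two lists have no element in common, which is the subtle sort of breakage described above: the code still compiles and runs, it just matches different strings.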

> OTOH, that comment might make it just a bit easier to say "it's too
> much work" :-)

:-)

--

Father Chrysostomos

p5pRT commented May 30, 2012

From 2bfjdsla52kztwejndzdstsxl9athp@gmail.com

Quoth Darin McBride:

> On Tuesday May 29 2012 9:18:06 AM Father Chrysostomos via RT wrote:
>
> > On Tue May 29 00:49:24 2012, demerphq wrote:
> >
> > > On 29 May 2012 08:39, Eric Brine <perlbug-followup@perl.org> wrote:
> > >
> > > > perl -E" say 'a' =~ m/\./ ? 'XXX' : 'ok' "
> > > > ok
> > > >
> > > > perl -E" say 'a' =~ m.\.. ? 'XXX' : 'ok' "
> > > > XXX
> > >
> > > I don't think this is a bug, but I do plan to work on changing it to DWIM.
> >
> > I think that will break a lot of code.
>
> Honestly, I would have expected the need to extra-escape in the second
> situation:
>
> m.\\\..
>
> However, as has been pointed out already, this doesn't (currently) work,
> either.
>
> I'm not sure fixing that would break much/any code in practice. And I'm not
> sure it's really much of an improvement over the workarounds:
>
> $dot = qr/\./;
> m.$dot.
>
> or
>
> m.[\.].

or even

  m.\Q\.\E.
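
A quick check of this form, as a sketch: it assumes the delimiter's backslash is removed first, so that `\Q...\E` is left to quote the bare dot. The test strings and the `@hits` name are illustrative only.

```perl
use strict;
use warnings;

# With "." as the delimiter, the text left for interpolation is \Q.\E, and
# \Q...\E quotes the dot, so only strings containing a literal "." match.
my @hits = grep { $_ =~ m.\Q\.\E. } ('a', '.', 'a.b');

print "matched: @hits\n";
```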

/Bo Lindbergh
