Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File::Spec->canonpath("a\\..\\..\\b") returns wrong value for Win 32 #7133

Closed
p5pRT opened this issue Feb 24, 2004 · 35 comments
Closed

File::Spec->canonpath("a\\..\\..\\b") returns wrong value for Win 32 #7133

p5pRT opened this issue Feb 24, 2004 · 35 comments

Comments

@p5pRT
Copy link

p5pRT commented Feb 24, 2004

Migrated from rt.perl.org#27052 (status was 'resolved')

Searchable as RT27052$

@p5pRT
Copy link
Author

p5pRT commented Feb 24, 2004

From Tom.Dinger@Scansoft.com

Created by tom.dinger@scansoft.com

Under Windows (using File​::Spec​::Win32), the call
File​::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value​: 'b\\c'.
It should return '..\\b\\c'.

This bug first appeared in the Perl 5.8.1 release; Perl 5.8.0 does not
have this problem.

In Perl 5.8.0, the behavior of canonpath() for Win32.pm was very similar
to that for Unix.pm​: no ".." directory entries were removed from the
path. For UNIX systems this is appropriate, as symbolic and hard links
may make "a/b/../c" different than "a/c".

In Perl 5.8.1, a number of multiple-dot transformations were added to the
Win32 support​: changing "a\\...\\b" into "a\\..\\..\\b", removing redundant
double-dots, etc. If all paths handed to canonpath() are absolute paths,
then it works correctly, but if the path is a relative path, it may make
a mistake when trying to eliminate the ".." path elements -- apparently
it sometimes treats a relative path as absolute, and since '\\..\\a' is the
same as 'a' on Windows, it was changing '..\\a' to 'a' as well in some
situations.

Compounding the problem is that canonpath() is used by other functions
as well, so catfile() and catpath() also suffer errors.

Attached is acontext-diff of the changes I have made and tested for the
canonpath() function in lib/File/Spec/Win32.pm, and additional tests for
the problems (and fix) in lib/File/Spec/t/Spec.t.

###################################################################

Inline Patch
--- lib\File\Spec\Win32.pm.orig	Tue Sep 16 01:54:20 2003
+++ lib\File\Spec\Win32.pm	Tue Feb 24 13:05:22 2004
@@ -5,7 +5,7 @@
 use vars qw(@ISA $VERSION);
 require File::Spec::Unix;
 
-$VERSION = '1.4';
+$VERSION = '1.5';
 
 @ISA = qw(File::Spec::Unix);
 
@@ -140,28 +140,53 @@
     $path =~ s{^\\\.\.$}{\\};                      # \..    -> \
     1 while $path =~ s{^\\\.\.}{};                 # \..\xx -> \xx
 
-    my ($vol,$dirs,$file) = $self->splitpath($path);
-    my @dirs = $self->splitdir($dirs);
-    my (@base_dirs, @path_dirs);
-    my $dest = \@base_dirs;
-    for my $dir (@dirs){
-	$dest = \@path_dirs if $dir eq $self->updir;
-	push @$dest, $dir;
-    }
-    # for each .. in @path_dirs pop one item from 
-    # @base_dirs
-    while (my $dir = shift @path_dirs){ 
-	unless ($dir eq $self->updir){
-	    unshift @path_dirs, $dir;
-	    last;
-	}
-	pop @base_dirs;
-    }
-    $path = $self->catpath( 
-			   $vol, 
-			   $self->catdir(@base_dirs, @path_dirs), 
-			   $file
-			  );
+    my ($vol,$dirs,$file) = $self->splitpath($path,1);
+
+    # The previous version of the code did completely the wrong thing in
+    # at least one important case:
+    #
+    #   input         result   should be
+    #   -----------   ------   ---------
+    #   a\..\..\b\c   b\c      ..\b\c
+    #
+    # So this is a simpler rewrite.
+
+    my @dirs = $self->splitdir( $dirs );
+
+    # We walk through the list, looking for pairs ("x\..") to toss,
+    # e.g. "a\x\..\b" --> "a\b".
+    # But if the previous element is '..' we cannot toss this one.
+    # If the previous element is '', we can toss this one, but not the
+    # previous one (it stands for the root of the file system).
+
+    for ( my $i = 1; $i < @dirs; )
+    {
+        if ( ($i == 0) or ($dirs[$i] ne $self->updir) )
+        {
+            ++$i;   # skip this one, nothing to do.
+        }
+        elsif ( ($i == 1) and ($dirs[$i-1] eq '') )
+        {
+            # E.g. "C:\..\stuff" --> "C:\stuff"
+            splice @dirs, $i, 1;  # and leave $i alone.
+        }
+        elsif ( $dirs[$i-1] ne $self->updir )
+        {
+            # E.g. "a\b\..\c" --> "a\c"
+            splice @dirs, $i - 1, 2;
+            --$i;   # the next one to check
+        }
+        else
+        {
+            ++$i; # don't remove this one.
+        }
+    }
+
+    $path = $self->catpath( $vol, $self->catdir(@dirs), $file );
+
+#   $path .= '.' if substr($path,-1,1) eq ':'; # change "X:" into "X:."
+    $path .= '.' if $path eq '';               # change "" into "."
+
     return $path;
 }
 
--- lib\File\Spec\t\Spec.t.orig	Fri Dec 19 09:01:46 2003
+++ lib\File\Spec\t\Spec.t	Tue Feb 24 13:07:32 2004
@@ -201,6 +201,10 @@
 [ "Win32->catfile('.\\a','b','c')",      'a\\b\\c'  ],
 [ "Win32->catfile('c')",                'c' ],
 [ "Win32->catfile('.\\c')",              'c' ],
+[ "Win32->catfile('a','b','..','.\\c')",      'a\\c'  ],
+[ "Win32->catfile('a','..','b','.\\c')",      'b\\c'  ],
+[ "Win32->catfile('a','..','..','b','.\\c')", '..\\b\\c'  ],
+[ "Win32->catfile('..','a','b','.\\c')", '..\\a\\b\\c'  ],
 
 
 [ "Win32->canonpath('')",               ''                    ],
@@ -224,6 +228,15 @@
 [ "Win32->canonpath('\\..\\')",         '\\'                  ],
 [ "Win32->canonpath('/../')",           '\\'                  ],
 [ "Win32->canonpath('/..\\')",          '\\'                  ],
+[ "Win32->canonpath('..\\a')",          '..\\a'               ],
+[ "Win32->canonpath('..\\..\\a')",      '..\\..\\a'           ],
+[ "Win32->canonpath('a\\..\\..\\b')",   '..\\b'               ],
+[ "Win32->canonpath('a\\..\\..\\b\\..\\c\\d\\..\\e')",   '..\\c\\e'  ],
+[ "Win32->canonpath('c:\\a\\..')",      'C:\\'                ],
+[ "Win32->canonpath('c:a\\..')",        'C:'                  ],
+[ "Win32->canonpath('c:\\a\\..\\')",    'C:\\'                ],
+[ "Win32->canonpath('c:a\\..\\')",      'C:'                  ],
+[ "Win32->canonpath('a\\..\\')",        '.'                   ],
 [ "Win32->can('_cwd')",                 '/CODE/'              ],
 
 # FakeWin32 subclass (see below) just sets CWD to C:\one\two
###################################################################
Perl Info

Flags:
    category=library
    severity=high

Site configuration information for perl v5.8.3:

Configured by ActiveState at Tue Feb  3 00:28:38 2004.

Summary of my perl5 (revision 5 version 8 subversion 3) configuration:
  Platform:
    osname=MSWin32, osvers=4.0, archname=MSWin32-x86-multi-thread
    uname=''
    config_args='undef'
    hint=recommended, useposix=true, d_sigaction=undef
    usethreads=undef use5005threads=undef useithreads=define
usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cl', ccflags ='-nologo -Gf -W3 -MD -Zi -DNDEBUG -O1 -DWIN32
-D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT  -DNO_HASH_SEED
-DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO
-DPERL_MSVCRT_READFIX',
    optimize='-MD -Zi -DNDEBUG -O1',
    cppflags='-DWIN32'
    ccversion='', gccversion='', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=10
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='__int64',
lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='link', ldflags ='-nologo -nodefaultlib -debug -opt:ref,icf
-libpath:"C:\Perl583\lib\CORE"  -machine:x86'
    libpth=C:\PROGRA~1\MICROS~3\VC98\lib
    libs=  oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib
comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib  netapi32.lib
uuid.lib wsock32.lib mpr.lib winmm.lib  version.lib odbc32.lib odbccp32.lib
msvcrt.lib
    perllibs=  oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib
comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib  netapi32.lib
uuid.lib wsock32.lib mpr.lib winmm.lib  version.lib odbc32.lib odbccp32.lib
msvcrt.lib
    libc=msvcrt.lib, so=dll, useshrplib=yes, libperl=perl58.lib
    gnulibc_version='undef'
  Dynamic Linking:
    dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -debug
-opt:ref,icf  -libpath:"C:\Perl583\lib\CORE"  -machine:x86'

Locally applied patches:
    ACTIVEPERL_LOCAL_PATCHES_ENTRY
    22218 Remove the caveat about detached threads crashing on Windows
    22201 Avoid threads+win32 crash by freeing Perl interpreter slightly
later
    22169 Display 'out of memeory' errors using low-level I/O
    22159 Upgrade to Time::Hires 1.55
    22120 Make 'Configure -Dcf_by=...' work
    22051 Upgrade to Time::HiRes 1.54
    21540 Fix backward-compatibility issues in if.pm


@INC for perl v5.8.3:
    c:\perl583\site\lib
    c:\perl583\lib
    c:\lib\perl5
    c:\lib\perl
    c:/Perl583/lib
    c:/Perl583/site/lib
    .


Environment for perl v5.8.3:
    HOME (unset)
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
 
PATH=C:\VisStd71\Common7\IDE;C:\VisStd71\VC7\BIN;C:\VisStd71\Common7\Tools;C
:\VisStd71\Common7\Tools\bin\prerelease;C:\VisStd71\Common7\Tools\bin;C:\Vis
Std71\SDK\v1.1\bin;C:\VisStd71\SDK\v1.1\v1.1.4322;c:\ScanSoft\ntbin;c:\ScanS
oft\bin;c:\bin\win32;c:\bin\nt;c:\bin\arepa;c:\bin\dos;C:\WINNT\system32;C:\
WINNT;C:\WINNT\System32\Wbem
    PERL5LIB=c:\perl583\site\lib;c:\perl583\lib;c:\lib\perl5;c:\lib\perl
    PERL_BADLANG (unset)
    PERL_DIR=C:\perl582
    SHELL (unset)

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2005

From @schwern

[tomdinger - Tue Feb 24 10​:15​:24 2004]​:

Under Windows (using File​::Spec​::Win32), the call
File​::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value​: 'b\\c'.
It should return '..\\b\\c'.

Hey Ken, want to have a look at this? There's a patch and everything!

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2005

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2005

From @kenahoo

On Jul 6, 2005, at 11​:29 AM, Michael G Schwern via RT wrote​:

[tomdinger - Tue Feb 24 10​:15​:24 2004]​:

Under Windows (using File​::Spec​::Win32), the call
File​::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value​:
'b\\c'.
It should return '..\\b\\c'.

Hey Ken, want to have a look at this? There's a patch and everything!

If Win32 is going to be handled like Unix, then it should return the
input verbatim (".." is not cleaned up). Is there a reason it should
be different on Win32?

  -Ken

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2005

From @schwern

On Wed, Jul 06, 2005 at 02​:06​:19PM -0500, Ken Williams wrote​:

On Jul 6, 2005, at 11​:29 AM, Michael G Schwern via RT wrote​:

[tomdinger - Tue Feb 24 10​:15​:24 2004]​:

Under Windows (using File​::Spec​::Win32), the call
File​::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value​:
'b\\c'.
It should return '..\\b\\c'.

Hey Ken, want to have a look at this? There's a patch and everything!

If Win32 is going to be handled like Unix, then it should return the
input verbatim (".." is not cleaned up). Is there a reason it should
be different on Win32?

That's not the issue. b\c is right out wrong. Examine the path closer.

  a\..\..\b\c

"a\.." cancels out leaving "..\b\c".

Or, if you don't want to clean up .. it can be a\..\..\b\c. But b\c is
clearly wrong.

PS Offhand the reason I can think for cleaning up .. on Win32 is because it
tends to not have symlinks to worry about so a\..\a\ should equal a\

--
Michael G Schwern schwern@​pobox.com http​://www.pobox.com/~schwern
Just call me 'Moron Sugar'.
  http​://www.somethingpositive.net/sp05182002.shtml

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2005

From @kenahoo

On Jul 6, 2005, at 2​:50 PM, Michael G Schwern wrote​:

On Wed, Jul 06, 2005 at 02​:06​:19PM -0500, Ken Williams wrote​:

On Jul 6, 2005, at 11​:29 AM, Michael G Schwern via RT wrote​:

[tomdinger - Tue Feb 24 10​:15​:24 2004]​:

Under Windows (using File​::Spec​::Win32), the call
File​::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value​:
'b\\c'.
It should return '..\\b\\c'.

Hey Ken, want to have a look at this? There's a patch and
everything!

If Win32 is going to be handled like Unix, then it should return the
input verbatim (".." is not cleaned up). Is there a reason it should
be different on Win32?

That's not the issue. b\c is right out wrong. Examine the path
closer.

a\\\.\.\\\.\.\\b\\c

"a\.." cancels out leaving "..\b\c".

Right, I understand the bug, I'm just trying to get the right fix.
Note that I suggested that it return that particular input string
verbatim.

PS Offhand the reason I can think for cleaning up .. on Win32 is
because it
tends to not have symlinks to worry about so a\..\a\ should equal a\

"tends not to" isn't a good enough reason in this case. If it treats
them differently, though, that might be a good enough reason.

  -Ken

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2005

From @kenahoo

On Jul 6, 2005, at 4​:00 PM, Dinger, Tom wrote​:

To be perfectly honest, I don't care which way it is "fixed", as long
as the
result still points to the right file.

Of course. That's what I'm asking​: is "bar" guaranteed on Windows to
be "the right file" when the input is "foo\\..\\bar"? On Unix, it's
not.

I've never disputed the bug. The current behavior is obviously wrong.

  -Ken

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2005

From perl@nevcal.com

On approximately 7/6/2005 1​:10 PM, came the following characters from
the keyboard of Ken Williams​:

On Jul 6, 2005, at 2​:50 PM, Michael G Schwern wrote​:

That's not the issue. b\c is right out wrong. Examine the path closer.

a\\\.\.\\\.\.\\b\\c

"a\.." cancels out leaving "..\b\c".

Right, I understand the bug, I'm just trying to get the right fix. Note
that I suggested that it return that particular input string verbatim.

PS Offhand the reason I can think for cleaning up .. on Win32 is
because it
tends to not have symlinks to worry about so a\..\a\ should equal a\

"tends not to" isn't a good enough reason in this case. If it treats
them differently, though, that might be a good enough reason.

So how do Windows' "shortcuts" fit into this? They act a lot like
softlinks interactively in Windows Explorer (but not, AFAICT, from the
file system)?

And then the newer versions of NTFS support "reparse points" and
"junction points". How do they fit into this? I have no experience
with these, but it sounds like they act a lot more like Unix links.

Hence potentially making a\.. elimination problematical. Which may bea
good reason for name canonicalization on Windows to not attempt to
remove .. sequences... there is lots of Windows code out there that does
do that, of course, I've written some of it myself... but as Windows
changes, compatibility to itself is not guaranteed.

--
Glenn -- http​://nevcal.com/

Having identified a vast realm of ignorance, Wolfram is saying that much
of this realm lies forever outside the light cone of human knowledge.
  -- Michael Swaine, Dr Dobbs Journal, Sept 2002

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2005

From Tom.Dinger@Scansoft.com

To be perfectly honest, I don't care which way it is "fixed", as long as the
result still points to the right file.

And now some history​: the version of File​::Spec​::Win32 in Perl 5.8.0 did no
".." processing in canonpath(), and that was fine. As of Perl 5.8.1, the
canonpath() started doing the broken ".." processing. So, this patch assumed
that the ".." simplification was desired, and fixed that.

Either roll back that method to the 5.8.0 version, or apply the patch. The
current behavior is the worst of the three choices, IMHO ...

TD

-----Original Message-----
From​: Ken Williams
To​: Michael G Schwern
Cc​: Tom.Dinger@​Scansoft.com; perlbug-followup@​perl.org
Sent​: 7/6/2005 4​:10 PM
Subject​: Re​: [perl #27052] File​::Spec->canonpath("a\\..\\..\\b") returns
wrong value for Win 32

On Jul 6, 2005, at 2​:50 PM, Michael G Schwern wrote​:

On Wed, Jul 06, 2005 at 02​:06​:19PM -0500, Ken Williams wrote​:

On Jul 6, 2005, at 11​:29 AM, Michael G Schwern via RT wrote​:

[tomdinger - Tue Feb 24 10​:15​:24 2004]​:

Under Windows (using File​::Spec​::Win32), the call
File​::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value​:
'b\\c'.
It should return '..\\b\\c'.

Hey Ken, want to have a look at this? There's a patch and
everything!

If Win32 is going to be handled like Unix, then it should return the
input verbatim (".." is not cleaned up). Is there a reason it should
be different on Win32?

That's not the issue. b\c is right out wrong. Examine the path
closer.

a\\\.\.\\\.\.\\b\\c

"a\.." cancels out leaving "..\b\c".

Right, I understand the bug, I'm just trying to get the right fix.
Note that I suggested that it return that particular input string
verbatim.

PS Offhand the reason I can think for cleaning up .. on Win32 is
because it
tends to not have symlinks to worry about so a\..\a\ should equal a\

"tends not to" isn't a good enough reason in this case. If it treats
them differently, though, that might be a good enough reason.

  -Ken

@p5pRT
Copy link
Author

p5pRT commented Jul 7, 2005

From @JohnPeacock

Glenn Linderman wrote​:

So how do Windows' "shortcuts" fit into this? They act a lot like
softlinks interactively in Windows Explorer (but not, AFAICT, from the
file system)?

Shortcuts are an abomination; don't try and support them (AFAIK only Explorer
and newer versions of Office can deal with them).

And then the newer versions of NTFS support "reparse points" and
"junction points". How do they fit into this? I have no experience
with these, but it sounds like they act a lot more like Unix links.

NTFS has had hard links for files for some time, but only with a developer tool.
  Junctions are more like soft links, but only for directories (including remote
folders on different machines). Reparse points are filesystem objects that give
additional behaviors to files and folders (it is how they implemented junctions
actually). None of which has any bearing on software running on top of NTFS,
which has no reason to know that there is anything special about these files (in
that sense they are more like hardlinks under *nix).

/These aren't the droids we're looking for. Move along, move along!/

John

--
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4720 Boston Way
Lanham, MD 20706
301-459-3366 x.5010
fax 301-429-5747

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From @demerphq

On 7/6/05, Ken Williams <ken@​mathforum.org> wrote​:

On Jul 6, 2005, at 4​:00 PM, Dinger, Tom wrote​:

To be perfectly honest, I don't care which way it is "fixed", as long
as the
result still points to the right file.

Of course. That's what I'm asking​: is "bar" guaranteed on Windows to
be "the right file" when the input is "foo\\..\\bar"? On Unix, it's
not.

I've never disputed the bug. The current behavior is obviously wrong.

Im not sure if this is useful, but many of the things that File​::Spec
tries to do on win32 are actually supported directly by the Win32 API.
IMO at least some of File​::Spec's behaviour could take advantage of
this API.

Win32​::GetFullPathName() is the one i have in mind when I say this.

perl -e "use Win32; print Win32​::GetFullPathName(qq[foo\\..\\bar]);"

outputs "CWD\bar".

So if you strip off the CWD from the result of
Win32​::GetFullPathName() you get the OS'es solution of this problem,
which should bypass all of these issues. Ie, crudely​:

use Win32;
use Cwd qw(cwd);

sub canonpath {
  my $path=shift;
 
  $ret=Win32​::GetFullPathName($path);
  if ($path!~/^\w​:/) {
  (my $cwd=cwd()."/")=~s!/!\\!g;
  while (substr($cwd,0,1) eq substr($ret,0,1)) {
  substr($cwd,0,1,"");
  substr($ret,0,1,"");
  }
  }
  return $ret;
}

print canonpath("foo/../../../bar/baz");

BTW, i say crudely, because I dont think that canonpath is very well
defined. Is it a relative path or not? Should it function differently?

Likewise, File​::Spec​::rel2abs() should be rewritten to be a
passthrough to Win32​::GetFullPathName().

Anyway, the point is that using Win32​::GetFullPathName() is available
to resolve a bunch of these issues (at least as far as Win32 goes).

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From @kenahoo

On Jul 8, 2005, at 8​:51 AM, yves orton via RT wrote​:

Im not sure if this is useful, but many of the things that File​::Spec
tries to do on win32 are actually supported directly by the Win32 API.
IMO at least some of File​::Spec's behaviour could take advantage of
this API.

Yeah, very true. However, the Win32 docs say that GetFullPathName()
first appeared in perl 5.6, so before that we'd still need to emulate.
Alternatively, we could add an XS implementation of it for 5.005 on
Windows, since the underlying C API should be available.

BTW, i say crudely, because I dont think that canonpath is very well
defined. Is it a relative path or not?

If the input is relative, the output is relative. If the input is
absolute, the output is absolute.

Likewise, File​::Spec​::rel2abs() should be rewritten to be a
passthrough to Win32​::GetFullPathName().

True. The only problem is that it'll make it harder for me to test,
because I don't test on Windows.

Your basic point is well-taken, though - we should indeed use Win32​::*
stuff whenever appropriate.

  -Ken

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From nick@ing-simmons.net

Ken Williams <ken@​mathforum.org> writes​:

On Jul 6, 2005, at 11​:29 AM, Michael G Schwern via RT wrote​:

[tomdinger - Tue Feb 24 10​:15​:24 2004]​:

Under Windows (using File​::Spec​::Win32), the call
File​::Spec->canonpath('a\\..\\..\\b\\c') returns the wrong value​:
'b\\c'.
It should return '..\\b\\c'.

Hey Ken, want to have a look at this? There's a patch and everything!

If Win32 is going to be handled like Unix, then it should return the
input verbatim (".." is not cleaned up). Is there a reason it should
be different on Win32?

Without (sym)links Win32 has simpler .. semantics than Unix.
But there are still pitfalls crossing "mount" points.

-Ken

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From @schwern

On Fri, Jul 08, 2005 at 03​:50​:49PM +0200, demerphq wrote​:

Im not sure if this is useful, but many of the things that File​::Spec
tries to do on win32 are actually supported directly by the Win32 API.
IMO at least some of File​::Spec's behaviour could take advantage of
this API.

Win32​::GetFullPathName() is the one i have in mind when I say this.

perl -e "use Win32; print Win32​::GetFullPathName(qq[foo\\..\\bar]);"

outputs "CWD\bar".

So if you strip off the CWD from the result of
Win32​::GetFullPathName() you get the OS'es solution of this problem,
which should bypass all of these issues.

Does it? It still leaves us asking the question​: can we assume foo\..\bar ==
bar on Windows? Just because a system call does it that way doesn't mean
its right.

Anyhow, should this discussion drag on any longer without resolution there's
a simple yardstick to use​: Which retains the most information? Not
collapsing .. does. So given that the current implementation is clearly
wrong, and if we can't decide between the two right implementations, pick
the one that's safest to at least give a correct answer. Then we can
discuss some more and maybe have switch to the other one.

--
Michael G Schwern schwern@​pobox.com http​://www.pobox.com/~schwern
'All anyone gets in a mirror is themselves,' she said. 'But what you
gets in a good gumbo is everything.'
  -- "Witches Abroad" by Terry Prachett

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From @demerphq

On 7/8/05, Michael G Schwern <schwern@​pobox.com> wrote​:

On Fri, Jul 08, 2005 at 03​:50​:49PM +0200, demerphq wrote​:

Im not sure if this is useful, but many of the things that File​::Spec
tries to do on win32 are actually supported directly by the Win32 API.
IMO at least some of File​::Spec's behaviour could take advantage of
this API.

Win32​::GetFullPathName() is the one i have in mind when I say this.

perl -e "use Win32; print Win32​::GetFullPathName(qq[foo\\..\\bar]);"

outputs "CWD\bar".

So if you strip off the CWD from the result of
Win32​::GetFullPathName() you get the OS'es solution of this problem,
which should bypass all of these issues.

Does it? It still leaves us asking the question​: can we assume foo\..\bar ==
bar on Windows? Just because a system call does it that way doesn't mean
its right.

Well, i suppose you are correct. Im not entirely sure what scenario I
should be testing here, but i beleive the problem you are thinking of
is due to symlinks to a directory? If so then the win32 equivelent
would be a junction I think and in that case yes, foo\..\bar == bar.

D​:\dev\junct>junction foo

Junction v1.03 - Win2K junction creator and reparse point viewer
Copyright (C) 2000-2002 Mark Russinovich
Systems Internals - http​://www.sysinternals.com

D​:\dev\junct\foo​: JUNCTION
  Substitute Name​: d​:\dev\test

D​:\dev\junct>echo Test1 > foo\..\test.echo

D​:\dev\junct>type test.echo
Test1

D​:\dev\junct>cd ..

D​:\dev>echo test2 > test\..\test.echo

D​:\dev>type test.echo
test2

Anyhow, should this discussion drag on any longer without resolution there's
a simple yardstick to use​: Which retains the most information? Not
collapsing .. does. So given that the current implementation is clearly
wrong, and if we can't decide between the two right implementations, pick
the one that's safest to at least give a correct answer. Then we can
discuss some more and maybe have switch to the other one.

Sure. My comment was mostly that by using the Win32 API's one can do
this type of stuff more reliably than with File​::Spec. With the
emphasis intended to be more on "this type of stuff" than on the
particular problem that lead to this thread.

To be honest i would really like to see the expected behaviour of
canonpath when called on a relative path explicitly defined. IMO
canonpath should act more like rel2abs, insofar as it should support
an optional $base argument to use instead of CWD when trying to clean
up a relative path.

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From @schwern

On Fri, Jul 08, 2005 at 11​:07​:22PM +0200, demerphq wrote​:

should be testing here, but i beleive the problem you are thinking of
is due to symlinks to a directory? If so then the win32 equivelent
would be a junction I think and in that case yes, foo\..\bar == bar.

I have the creeping feeling that there's an argument to be made here, but
I never fully understood the symlinks vs .. cannonpath argument so I'll
hope someone else jumps in and makes it.

To be honest i would really like to see the expected behaviour of
canonpath when called on a relative path explicitly defined. IMO
canonpath should act more like rel2abs, insofar as it should support
an optional $base argument to use instead of CWD when trying to clean
up a relative path.

canonpath() should never be inserting the CWD when cleaning up. "./bar" is
the same as "bar" but "$CWD/bar" is not! Its important that cannonical
relative paths remain relative.

--
Michael G Schwern schwern@​pobox.com http​://www.pobox.com/~schwern
'All anyone gets in a mirror is themselves,' she said. 'But what you
gets in a good gumbo is everything.'
  -- "Witches Abroad" by Terry Prachett

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From @demerphq

On 7/8/05, Michael G Schwern <schwern@​pobox.com> wrote​:

On Fri, Jul 08, 2005 at 11​:07​:22PM +0200, demerphq wrote​:

should be testing here, but i beleive the problem you are thinking of
is due to symlinks to a directory? If so then the win32 equivelent
would be a junction I think and in that case yes, foo\..\bar == bar.

I have the creeping feeling that there's an argument to be made here, but
I never fully understood the symlinks vs .. cannonpath argument so I'll
hope someone else jumps in and makes it.

Me too. Not being all that familiar with *nix file systems I made my
best stab at an answer as it seems to pertain to Win32. I think John
Peacock also said something to this effect.

To be honest i would really like to see the expected behaviour of
canonpath when called on a relative path explicitly defined. IMO
canonpath should act more like rel2abs, insofar as it should support
an optional $base argument to use instead of CWD when trying to clean
up a relative path.

canonpath() should never be inserting the CWD when cleaning up. "./bar" is
the same as "bar" but "$CWD/bar" is not! Its important that cannonical
relative paths remain relative.

Sorry, i guess I didnt express myself properly. You cant clean up a
relative path properly without knowing where it is relative to.
Consider the following path​:

  ..\..\foo

If we are in \bar then ..\..\foo is the same as ..\foo and \foo but
if we are in \bar\baz\bat then its not the same as either as it maps
to \bar\foo.

yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From perl@nevcal.com

On approximately 7/8/2005 1​:53 PM, came the following characters from
the keyboard of Michael G Schwern​:

On Fri, Jul 08, 2005 at 03​:50​:49PM +0200, demerphq wrote​:

Im not sure if this is useful, but many of the things that File​::Spec
tries to do on win32 are actually supported directly by the Win32 API.
IMO at least some of File​::Spec's behaviour could take advantage of
this API.

Win32​::GetFullPathName() is the one i have in mind when I say this.

perl -e "use Win32; print Win32​::GetFullPathName(qq[foo\\..\\bar]);"

outputs "CWD\bar".

So if you strip off the CWD from the result of
Win32​::GetFullPathName() you get the OS'es solution of this problem,
which should bypass all of these issues.

Does it? It still leaves us asking the question​: can we assume foo\..\bar ==
bar on Windows? Just because a system call does it that way doesn't mean
its right.

In the presence of junction points, I think not. The system call would
have the opportunity to understand and check for the existance of
junction points, but whether it does or not, is anyone's guess, without
trying it.

Anyhow, should this discussion drag on any longer without resolution there's
a simple yardstick to use​: Which retains the most information? Not
collapsing .. does. So given that the current implementation is clearly
wrong, and if we can't decide between the two right implementations, pick
the one that's safest to at least give a correct answer. Then we can
discuss some more and maybe have switch to the other one.

--
Glenn -- http​://nevcal.com/

Having identified a vast realm of ignorance, Wolfram is saying that much
of this realm lies forever outside the light cone of human knowledge.
  -- Michael Swaine, Dr Dobbs Journal, Sept 2002

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From @demerphq

On 7/8/05, Glenn Linderman <perl@​nevcal.com> wrote​:

On approximately 7/8/2005 1​:53 PM, came the following characters from
the keyboard of Michael G Schwern​:

On Fri, Jul 08, 2005 at 03​:50​:49PM +0200, demerphq wrote​:

Im not sure if this is useful, but many of the things that File​::Spec
tries to do on win32 are actually supported directly by the Win32 API.
IMO at least some of File​::Spec's behaviour could take advantage of
this API.

Win32​::GetFullPathName() is the one i have in mind when I say this.

perl -e "use Win32; print Win32​::GetFullPathName(qq[foo\\..\\bar]);"

outputs "CWD\bar".

So if you strip off the CWD from the result of
Win32​::GetFullPathName() you get the OS'es solution of this problem,
which should bypass all of these issues.

Does it? It still leaves us asking the question​: can we assume foo\..\bar ==
bar on Windows? Just because a system call does it that way doesn't mean
its right.

In the presence of junction points, I think not. The system call would
have the opportunity to understand and check for the existance of
junction points, but whether it does or not, is anyone's guess, without
trying it.

Doesn't look like there is a problem with this. I guess ".." is
resolved by the OS relative to ".", and not a hard link to a specific
directory. (But that is pure speculation.)

D​:\dev\junct\foo>perl -MWin32 -e"print Win32​::GetFullPathName('.');"
D​:\dev\junct\foo
D​:\dev\junct\foo>cd ..\..\test

D​:\dev\test>perl -MWin32 -e"print Win32​::GetFullPathName('.');"
D​:\dev\test
D​:\dev\test>cd ..

D​:\dev>junction junct\foo

Junction v1.03 - Win2K junction creator and reparse point viewer
Copyright (C) 2000-2002 Mark Russinovich
Systems Internals - http​://www.sysinternals.com

D​:\dev\junct\foo​: JUNCTION
  Substitute Name​: d​:\dev\test

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From perl@nevcal.com

On approximately 7/8/2005 2​:07 PM, came the following characters from
the keyboard of demerphq​:

On 7/8/05, Michael G Schwern <schwern@​pobox.com> wrote​:

On Fri, Jul 08, 2005 at 03​:50​:49PM +0200, demerphq wrote​:

Im not sure if this is useful, but many of the things that File​::Spec
tries to do on win32 are actually supported directly by the Win32 API.
IMO at least some of File​::Spec's behaviour could take advantage of
this API.

Win32​::GetFullPathName() is the one i have in mind when I say this.

perl -e "use Win32; print Win32​::GetFullPathName(qq[foo\\..\\bar]);"

outputs "CWD\bar".

So if you strip off the CWD from the result of
Win32​::GetFullPathName() you get the OS'es solution of this problem,
which should bypass all of these issues.

Does it? It still leaves us asking the question​: can we assume foo\..\bar ==
bar on Windows? Just because a system call does it that way doesn't mean
its right.

Well, i suppose you are correct. Im not entirely sure what scenario I
should be testing here, but i beleive the problem you are thinking of
is due to symlinks to a directory? If so then the win32 equivelent
would be a junction I think and in that case yes, foo\..\bar == bar.

D​:\dev\junct>junction foo

Junction v1.03 - Win2K junction creator and reparse point viewer
Copyright (C) 2000-2002 Mark Russinovich
Systems Internals - http​://www.sysinternals.com

D​:\dev\junct\foo​: JUNCTION
Substitute Name​: d​:\dev\test

D​:\dev\junct>echo Test1 > foo\..\test.echo

D​:\dev\junct>type test.echo
Test1

D​:\dev\junct>cd ..

D​:\dev>echo test2 > test\..\test.echo

D​:\dev>type test.echo
test2

OK, you have just proven that the file is the same whether accessed by
one name or the other. Now that you have the junction set up, how about
reporting on the results of​:

d​:
md \dev\junct\bar
cd \dev\junct
cd foo\..\bar
cd ..
cd foo
cd ..
cd \dev\test
cd foo\..\bar
cd ..
cd foo
cd ..

showing all the prompts that display the current path name in the sequence?

And even then, one would have to test the function of GetFullPathName
separately, but if a discrepancy shows up in the above test, then
foo\..\bar != bar all the time.

Anyhow, should this discussion drag on any longer without resolution there's
a simple yardstick to use​: Which retains the most information? Not
collapsing .. does.

Actually, if you knew that GetFullPathName was sensitive to and properly
handled junction points, and it collapsed the name, you would get more
information that way.

So given that the current implementation is clearly
wrong, and if we can't decide between the two right implementations, pick
the one that's safest to at least give a correct answer. Then we can
discuss some more and maybe have switch to the other one.

Sure. My comment was mostly that by using the Win32 API's one can do
this type of stuff more reliably than with File​::Spec. With the
emphasis intended to be more on "this type of stuff" than on the
particular problem that lead to this thread.

To be honest i would really like to see the expected behaviour of
canonpath when called on a relative path explicitly defined. IMO
canonpath should act more like rel2abs, insofar as it should support
an optional $base argument to use instead of CWD when trying to clean
up a relative path.

--
Glenn -- http​://nevcal.com/

Having identified a vast realm of ignorance, Wolfram is saying that much
of this realm lies forever outside the light cone of human knowledge.
  -- Michael Swaine, Dr Dobbs Journal, Sept 2002

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From @kenahoo

On Jul 8, 2005, at 4​:26 PM, Michael G Schwern wrote​:

On Fri, Jul 08, 2005 at 11​:07​:22PM +0200, demerphq wrote​:

should be testing here, but i beleive the problem you are thinking of
is due to symlinks to a directory? If so then the win32 equivelent
would be a junction I think and in that case yes, foo\..\bar == bar.

I have the creeping feeling that there's an argument to be made here,
but
I never fully understood the symlinks vs .. cannonpath argument so I'll
hope someone else jumps in and makes it.

It's been in the File​::Spec docs for a couple versions now, because
it's widely misunderstood​:

  http​://search.cpan.org/~kwilliams/PathTools-3.09/lib/File/Spec/Unix.pm

  -Ken

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From @schwern

On Fri, Jul 08, 2005 at 11​:49​:59PM +0200, demerphq wrote​:

canonpath() should never be inserting the CWD when cleaning up. "./bar" is
the same as "bar" but "$CWD/bar" is not! Its important that cannonical
relative paths remain relative.

Sorry, i guess I didnt express myself properly. You cant clean up a
relative path properly without knowing where it is relative to.
Consider the following path​:

..\..\foo

If we are in \bar then ..\..\foo is the same as ..\foo and \foo but
if we are in \bar\baz\bat then its not the same as either as it maps
to \bar\foo.

Reason #982983 to leave .. alone.

Your proposed "cleanup" is actually changing the meaning of the path.
..\..\foo means "go up two levels and then down into foo". ..\foo means
"go up one level and then down into foo". They represent different things.
canonpath() is about producing filepaths with the same meaning and NOT
about determining if they might point to the same file given the current
CWD. For that you can just run it through rel2abs().

Would would be nice is some sort of collapse() method which does collapse
.. so that canonpath("foo/../bar") == 'foo/../bar' but
collapse(canonpath("foo/../bar")) == 'bar'.

I like this solution. canonpath() can remain strict and work the same
across platforms. The user can decide if they want .. collapsed or not.
We stop having to second guess the user's needs.

--
Michael G Schwern schwern@​pobox.com http​://www.pobox.com/~schwern
You are wicked and wrong to have broken inside and peeked at the
implementation and then relied upon it.
  -- tchrist in <31832.969261130@​chthon>

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From @kenahoo

On Jul 8, 2005, at 4​:50 PM, yves orton via RT wrote​:

Sorry, i guess I didnt express myself properly. You cant clean up a
relative path properly without knowing where it is relative to.
Consider the following path​:

..\..\foo

If we are in \bar then ..\..\foo is the same as ..\foo and \foo but
if we are in \bar\baz\bat then its not the same as either as it maps
to \bar\foo.

The current working directory is not considered in canonpath(), period.
  canonpath("../../foo") is "../../foo".

  -Ken

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From @schwern

Ok, enough dithering. Let's squash this.

Attached is a patch which fixes this bug so canonpath("a\\..\\..\\b\\c")
returns "..\\b\\c". The question of whether or not canonpath() should
collapse .. is out of the scope of this bug as it currently contains
quite explicit code and comments to do so.

The patch replaces the collapse code in Win32->canonpath() with
Unix->_collapse() (I put it in Unix because its platform independent)
that does a better job. It also neatly encapsulates the tricky
collapsing code from the tricky canonpath code.

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2005

From @schwern

bug27052.patch
--- old-PathTools-3.09/lib/File/Spec/Unix.pm	2005-07-08 16:22:17.000000000 -0700
+++ new-PathTools-3.09/lib/File/Spec/Unix.pm	2005-07-08 16:23:47.000000000 -0700
@@ -472,4 +472,40 @@
     Cwd::cwd();
 }
 
+
+# Internal method to reduce xx\..\yy -> yy
+sub _collapse {
+    my($fs, $path) = @_;
+
+    my $updir  = $fs->updir;
+    my $curdir = $fs->curdir;
+
+    my($vol, $dirs, $file) = $fs->splitpath($path);
+    my @dirs = $fs->splitdir($dirs);
+
+    my @collapsed;
+    push @collapsed, $curdir unless $fs->file_name_is_absolute($path);
+
+    foreach my $dir (@dirs) {
+        if( $dir eq $updir              and   # if we have an updir
+            @collapsed                  and   # and something to collapse
+            length $collapsed[-1]       and   # and its not the rootdir
+            $collapsed[-1] ne $updir    and   # nor another updir
+            $collapsed[-1] ne $curdir         # nor the curdir
+          ) 
+        {                                     # then
+            pop @collapsed;                   # collapse
+        }
+        else {                                # else
+            push @collapsed, $dir;            # just hang onto it
+        }
+    }
+
+    return $fs->catpath($vol,
+                        $fs->catdir(@collapsed),
+                        $file
+                       );
+}
+
+
 1;

--- old-PathTools-3.09/lib/File/Spec/Win32.pm	2005-07-08 16:22:17.000000000 -0700
+++ new-PathTools-3.09/lib/File/Spec/Win32.pm	2005-07-08 16:24:10.000000000 -0700
@@ -145,29 +145,7 @@
     return $path unless $path =~ /\.\./;    # too few .'s to cleanup
     return $path if $path =~ /\.\.\.\./;    # too many .'s to cleanup
 
-    my ($vol,$dirs,$file) = $self->splitpath($path);
-    my @dirs = $self->splitdir($dirs);
-    my (@base_dirs, @path_dirs);
-    my $dest = \@base_dirs;
-    for my $dir (@dirs){
-	$dest = \@path_dirs if $dir eq $self->updir;
-	push @$dest, $dir;
-    }
-    # for each .. in @path_dirs pop one item from 
-    # @base_dirs
-    while (my $dir = shift @path_dirs){ 
-	unless ($dir eq $self->updir){
-	    unshift @path_dirs, $dir;
-	    last;
-	}
-	pop @base_dirs;
-    }
-    $path = $self->catpath( 
-			   $vol, 
-			   $self->catdir(@base_dirs, @path_dirs), 
-			   $file
-			  );
-    return $path;
+    return $self->_collapse($path);
 }
 
 =item splitpath

--- old-PathTools-3.09/t/Spec.t	2005-07-08 16:22:17.000000000 -0700
+++ new-PathTools-3.09/t/Spec.t	2005-07-08 16:31:53.000000000 -0700
@@ -92,14 +92,16 @@
 [ "Unix->catdir('','d1','d2','d3')",    '/d1/d2/d3' ],
 [ "Unix->catdir('d1','d2','d3')",       'd1/d2/d3'  ],
 
-[ "Unix->canonpath('')",                                      ''          ],
 [ "Unix->canonpath('///../../..//./././a//b/.././c/././')",   '/a/b/../c' ],
-[ "Unix->canonpath('/.')",                                    '/'         ],
-[ "Unix->canonpath('/./')",                                   '/'         ],
-[ "Unix->canonpath('/a/./')",                                 '/a'        ],
-[ "Unix->canonpath('/a/.')",                                  '/a'        ],
-[ "Unix->canonpath('/../../')",                               '/'         ],
-[ "Unix->canonpath('/../..')",                                '/'         ],
+[ "Unix->canonpath('')",                       ''               ],
+# rt.perl.org 27052
+[ "Unix->canonpath('a/../../b/c')",            'a/../../b/c'    ],
+[ "Unix->canonpath('/.')",                     '/'              ],
+[ "Unix->canonpath('/./')",                    '/'              ],
+[ "Unix->canonpath('/a/./')",                  '/a'             ],
+[ "Unix->canonpath('/a/.')",                   '/a'             ],
+[ "Unix->canonpath('/../../')",                '/'              ],
+[ "Unix->canonpath('/../..')",                 '/'              ],
 
 [  "Unix->abs2rel('/t1/t2/t3','/t1/t2/t3')",          ''                   ],
 [  "Unix->abs2rel('/t1/t2/t4','/t1/t2/t3')",          '../t4'              ],
@@ -214,6 +216,8 @@
 [ "Win32->canonpath('a:')",             'A:'                  ],
 [ "Win32->canonpath('A:f')",            'A:f'                 ],
 [ "Win32->canonpath('A:/')",            'A:\\'                ],
+# rt.perl.org 27052
+[ "Win32->canonpath('a\\..\\..\\b\\c')", '..\\b\\c'           ],
 [ "Win32->canonpath('//a\\b//c')",      '\\\\a\\b\\c'         ],
 [ "Win32->canonpath('/a/..../c')",      '\\a\\....\\c'        ],
 [ "Win32->canonpath('//a/b\\c')",       '\\\\a\\b\\c'         ],

@p5pRT
Copy link
Author

p5pRT commented Jul 9, 2005

From @demerphq

On 7/9/05, Ken Williams <ken@​mathforum.org> wrote​:

On Jul 8, 2005, at 4​:50 PM, yves orton via RT wrote​:

Sorry, i guess I didnt express myself properly. You cant clean up a
relative path properly without knowing where it is relative to.
Consider the following path​:

..\..\foo

If we are in \bar then ..\..\foo is the same as ..\foo and \foo but
if we are in \bar\baz\bat then its not the same as either as it maps
to \bar\foo.

The current working directory is not considered in canonpath(), period.
canonpath("../../foo") is "../../foo".

As I said before, the docs DONT specify what canonpath() is for very
well. All of this stuff is inferred or implied. Personally i dont feel
bad in coming up with a interpretation of what canonpath() is for that
differs from your own, or Schwerns, when the docs do not actually
explicitly say what it does.

One can infer almost any behaviour from "cleans up the path". And IMO,
whether looking at cwd or not is included in "does not look at the
filesystem" is unclear.

Im happy with the behaviour you outline, and i think i like Schwern's
idea of a collapse path or whatever. But id like to see this stuff
spelled out in the File​::Spec docs properly.

OTOH, i still think canonpath for absolute paths on Win32 should use
GetFullPathName().

yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Jul 9, 2005

From @demerphq

On 7/9/05, Glenn Linderman <perl@​nevcal.com> wrote​:

On approximately 7/8/2005 2​:07 PM, came the following characters from
the keyboard of demerphq​:

On 7/8/05, Michael G Schwern <schwern@​pobox.com> wrote​:

On Fri, Jul 08, 2005 at 03​:50​:49PM +0200, demerphq wrote​:

Im not sure if this is useful, but many of the things that File​::Spec
tries to do on win32 are actually supported directly by the Win32 API.
IMO at least some of File​::Spec's behaviour could take advantage of
this API.

Win32​::GetFullPathName() is the one i have in mind when I say this.

perl -e "use Win32; print Win32​::GetFullPathName(qq[foo\\..\\bar]);"

outputs "CWD\bar".

So if you strip off the CWD from the result of
Win32​::GetFullPathName() you get the OS'es solution of this problem,
which should bypass all of these issues.

Does it? It still leaves us asking the question​: can we assume foo\..\bar ==
bar on Windows? Just because a system call does it that way doesn't mean
its right.

Well, i suppose you are correct. Im not entirely sure what scenario I
should be testing here, but i beleive the problem you are thinking of
is due to symlinks to a directory? If so then the win32 equivelent
would be a junction I think and in that case yes, foo\..\bar == bar.

D​:\dev\junct>junction foo

Junction v1.03 - Win2K junction creator and reparse point viewer
Copyright (C) 2000-2002 Mark Russinovich
Systems Internals - http​://www.sysinternals.com

D​:\dev\junct\foo​: JUNCTION
Substitute Name​: d​:\dev\test

D​:\dev\junct>echo Test1 > foo\..\test.echo

D​:\dev\junct>type test.echo
Test1

D​:\dev\junct>cd ..

D​:\dev>echo test2 > test\..\test.echo

D​:\dev>type test.echo
test2

OK, you have just proven that the file is the same whether accessed by
one name or the other. Now that you have the junction set up, how about
reporting on the results of​:

d​:
md \dev\junct\bar
cd \dev\junct
cd foo\..\bar
cd ..
cd foo
cd ..
cd \dev\test
cd foo\..\bar
cd ..
cd foo
cd ..

showing all the prompts that display the current path name in the sequence?

im not sure if your script exploited the structure i set up properly or not.

I had d​:\dev\junct\foo -> d​:\dev\test

so creating a directory in junct doesnt create it in test. Heres the
full output with me adding directories to resolve the "cant find
directory" errors. I ran the script you requested three times as
"junct.bat".

D​:\dev>junction junct\foo

Junction v1.03 - Win2K junction creator and reparse point viewer
Copyright (C) 2000-2002 Mark Russinovich
Systems Internals - http​://www.sysinternals.com

D​:\dev\junct\foo​: JUNCTION
  Substitute Name​: d​:\dev\test

D​:\dev>junct.bat

D​:\dev>d​:

D​:\dev>md \dev\junct\bar

D​:\dev>cd \dev\junct

D​:\dev\junct>cd foo\..\bar

D​:\dev\junct\bar>cd ..

D​:\dev\junct>cd foo

D​:\dev\junct\foo>cd ..

D​:\dev\junct>cd \dev\test

D​:\dev\test>cd foo\..\bar
The system cannot find the path specified.

D​:\dev\test>cd ..

D​:\dev>cd foo
The system cannot find the path specified.

D​:\dev>cd ..

D​:\>cd dev\test

D​:\dev\test>dir
Volume in drive D is Data
Volume Serial Number is C836-51DB

Directory of D​:\dev\test

2005-07-08 16​:24 <DIR> .
2005-07-08 16​:24 <DIR> ..
2005-07-08 16​:24 6 foo.txt
  1 File(s) 6 bytes
  2 Dir(s) 69,569,888,256 bytes free

D​:\dev\test>md foo

D​:\dev\test>cd ..

D​:\dev>junct.bat

D​:\dev>d​:

D​:\dev>md \dev\junct\bar
A subdirectory or file \dev\junct\bar already exists.

D​:\dev>cd \dev\junct

D​:\dev\junct>cd foo\..\bar

D​:\dev\junct\bar>cd ..

D​:\dev\junct>cd foo

D​:\dev\junct\foo>cd ..

D​:\dev\junct>cd \dev\test

D​:\dev\test>cd foo\..\bar
The system cannot find the path specified.

D​:\dev\test>cd ..

D​:\dev>cd foo
The system cannot find the path specified.

D​:\dev>cd ..

D​:\>cd dev\test

D​:\dev\test>dir
Volume in drive D is Data
Volume Serial Number is C836-51DB

Directory of D​:\dev\test

2005-07-09 08​:35 <DIR> .
2005-07-09 08​:35 <DIR> ..
2005-07-09 08​:35 <DIR> foo
2005-07-08 16​:24 6 foo.txt
  1 File(s) 6 bytes
  3 Dir(s) 69,569,888,256 bytes free

D​:\dev\test>cd foo

D​:\dev\test\foo>cd ..

D​:\dev\test>md bar

D​:\dev\test>cd ..

D​:\dev>junct.bat

D​:\dev>d​:

D​:\dev>md \dev\junct\bar
A subdirectory or file \dev\junct\bar already exists.

D​:\dev>cd \dev\junct

D​:\dev\junct>cd foo\..\bar

D​:\dev\junct\bar>cd ..

D​:\dev\junct>cd foo

D​:\dev\junct\foo>cd ..

D​:\dev\junct>cd \dev\test

D​:\dev\test>cd foo\..\bar

D​:\dev\test\bar>cd ..

D​:\dev\test>cd foo

D​:\dev\test\foo>cd ..

D​:\dev\test>

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Jul 9, 2005

From perl@nevcal.com

On approximately 7/8/2005 11​:40 PM, came the following characters from
the keyboard of demerphq​:

On 7/9/05, Glenn Linderman <perl@​nevcal.com> wrote​:

On approximately 7/8/2005 2​:07 PM, came the following characters from
the keyboard of demerphq​:

On 7/8/05, Michael G Schwern <schwern@​pobox.com> wrote​:

On Fri, Jul 08, 2005 at 03​:50​:49PM +0200, demerphq wrote​:

Im not sure if this is useful, but many of the things that File​::Spec
tries to do on win32 are actually supported directly by the Win32 API.
IMO at least some of File​::Spec's behaviour could take advantage of
this API.

Win32​::GetFullPathName() is the one i have in mind when I say this.

perl -e "use Win32; print Win32​::GetFullPathName(qq[foo\\..\\bar]);"

outputs "CWD\bar".

So if you strip off the CWD from the result of
Win32​::GetFullPathName() you get the OS'es solution of this problem,
which should bypass all of these issues.

Does it? It still leaves us asking the question​: can we assume foo\..\bar ==
bar on Windows? Just because a system call does it that way doesn't mean
its right.

Well, i suppose you are correct. Im not entirely sure what scenario I
should be testing here, but i beleive the problem you are thinking of
is due to symlinks to a directory? If so then the win32 equivelent
would be a junction I think and in that case yes, foo\..\bar == bar.

D​:\dev\junct>junction foo

Junction v1.03 - Win2K junction creator and reparse point viewer
Copyright (C) 2000-2002 Mark Russinovich
Systems Internals - http​://www.sysinternals.com

D​:\dev\junct\foo​: JUNCTION
Substitute Name​: d​:\dev\test

D​:\dev\junct>echo Test1 > foo\..\test.echo

D​:\dev\junct>type test.echo
Test1

D​:\dev\junct>cd ..

D​:\dev>echo test2 > test\..\test.echo

D​:\dev>type test.echo
test2

OK, you have just proven that the file is the same whether accessed by
one name or the other. Now that you have the junction set up, how about
reporting on the results of​:

d​:
md \dev\junct\bar
cd \dev\junct
cd foo\..\bar
cd ..
cd foo
cd ..
cd \dev\test
cd foo\..\bar
cd ..
cd foo
cd ..

showing all the prompts that display the current path name in the sequence?

im not sure if your script exploited the structure i set up properly or not.

I had d​:\dev\junct\foo -> d​:\dev\test

Oops. No it probably doesn't. You used foo differently than in my
mental image, and then I confused myself by having two mental images of foo.

Although I think that in spite of my confusion, and the resulting
failure of some of the commands (which you tried to fix, thanks), that
the information I desired to obtain is demonstrated by the fragment of
your testing that I quote below...

so creating a directory in junct doesnt create it in test. Heres the
full output with me adding directories to resolve the "cant find
directory" errors. I ran the script you requested three times as
"junct.bat".

D​:\dev>junction junct\foo

Junction v1.03 - Win2K junction creator and reparse point viewer
Copyright (C) 2000-2002 Mark Russinovich
Systems Internals - http​://www.sysinternals.com

D​:\dev\junct\foo​: JUNCTION
Substitute Name​: d​:\dev\test

D​:\dev>junct.bat

D​:\dev>d​:

D​:\dev>md \dev\junct\bar

D​:\dev>cd \dev\junct

D​:\dev\junct>cd foo\..\bar

D​:\dev\junct\bar>cd ..

D​:\dev\junct>cd foo

D​:\dev\junct\foo>cd ..

D​:\dev\junct>cd \dev\test

At this point the prompt being D​:\dev\junct indicates that the previous
"cd .." doesn't take one over to D​:\dev\test. But, it appears that
CMD's CD command is likely looking at D​:\dev\junct\foo and lexically
removing one path to return to .. rather than following the file system
.. entry to get there...

I guess there would be more interesting tests, perhaps a hybrid of what
you did, and what I was trying to do...

d​:
cd \dev
echo a1 > junct\test.txt
echo b2 > test.txt
rem verify they are different files
type junct\test.txt
type test.txt
rem see which one we get
type junct\foo\..\test.txt

--
Glenn -- http​://nevcal.com/

Having identified a vast realm of ignorance, Wolfram is saying that much
of this realm lies forever outside the light cone of human knowledge.
  -- Michael Swaine, Dr Dobbs Journal, Sept 2002

@p5pRT
Copy link
Author

p5pRT commented Jul 11, 2005

From @schwern

On Sat, Jul 09, 2005 at 08​:14​:47AM +0200, demerphq wrote​:

The current working directory is not considered in canonpath(), period.
canonpath("../../foo") is "../../foo".

As I said before, the docs DONT specify what canonpath() is for very
well. All of this stuff is inferred or implied. Personally i dont feel
bad in coming up with a interpretation of what canonpath() is for that
differs from your own, or Schwerns, when the docs do not actually
explicitly say what it does.

One can infer almost any behaviour from "cleans up the path". And IMO,
whether looking at cwd or not is included in "does not look at the
filesystem" is unclear.

Yes, the docs stink. Patches welcome.

OTOH, i still think canonpath for absolute paths on Win32 should use
GetFullPathName().

A) Why change what works?
B) The docs for GetFullPathName() read more like rel2abs() than canonpath().

http​://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/getfullpathname.asp
GetFullPathName merges the name of the current drive and directory with a specified file name to determine the full path and file name of a specified file.

C) Its not specified in the GetFullPathName() that it does the sort of
canonicalizing we expect. It might *happen* to do it now but who knows
later.

D) File​::Spec​::Win32 will no longer work on non-Windows platforms making
cross-platform development and testing just that more frustrating.

--
Michael G Schwern schwern@​pobox.com http​://www.pobox.com/~schwern
Don't try the paranormal until you know what's normal.
  -- "Lords and Ladies" by Terry Prachett

@p5pRT
Copy link
Author

p5pRT commented Jul 12, 2005

From @kenahoo

On Jul 8, 2005, at 4​:07 PM, demerphq wrote​:

Im not entirely sure what scenario I
should be testing here, but i beleive the problem you are thinking of
is due to symlinks to a directory? If so then the win32 equivelent
would be a junction I think and in that case yes, foo\..\bar == bar.

Could you try a scenario like the one outlined in the canonpath() docs
here?

  http​://search.cpan.org/~kwilliams/PathTools-3.09/lib/File/Spec/Unix.pm

  -Ken

@p5pRT
Copy link
Author

p5pRT commented Jul 12, 2005

From @schwern

Ok, enough dithering. Let's kill this bug.

File​::Spec​::Win32->canonpath() currently contains code to collapse .. so
whether or not it should continue to do so in the future is outside the
scope of this bug. That code is also busted and is the source of this bug.

Attached is a patch to fix this bug. It replaces the collapsing code in
canonpath() with a more reliable method. It also moves the code into
its own method to simplify canonpath(). _collapse() goes into
File​::Spec​::Unix because it is not platform specific. Tests have been
added for this bug in both Win32 and Unix.

@p5pRT
Copy link
Author

p5pRT commented Jul 12, 2005

From @schwern

fs.patch
diff -rN -u old-PathTools-3.09/lib/File/Spec/Unix.pm new-PathTools-3.09/lib/File/Spec/Unix.pm
--- old-PathTools-3.09/lib/File/Spec/Unix.pm	2005-07-08 16:22:17.000000000 -0700
+++ new-PathTools-3.09/lib/File/Spec/Unix.pm	2005-07-08 16:23:47.000000000 -0700
@@ -472,4 +472,40 @@
     Cwd::cwd();
 }
 
+
+# Internal method to reduce xx\..\yy -> yy
+sub _collapse {
+    my($fs, $path) = @_;
+
+    my $updir  = $fs->updir;
+    my $curdir = $fs->curdir;
+
+    my($vol, $dirs, $file) = $fs->splitpath($path);
+    my @dirs = $fs->splitdir($dirs);
+
+    my @collapsed;
+    push @collapsed, $curdir unless $fs->file_name_is_absolute($path);
+
+    foreach my $dir (@dirs) {
+        if( $dir eq $updir              and   # if we have an updir
+            @collapsed                  and   # and something to collapse
+            length $collapsed[-1]       and   # and its not the rootdir
+            $collapsed[-1] ne $updir    and   # nor another updir
+            $collapsed[-1] ne $curdir         # nor the curdir
+          ) 
+        {                                     # then
+            pop @collapsed;                   # collapse
+        }
+        else {                                # else
+            push @collapsed, $dir;            # just hang onto it
+        }
+    }
+
+    return $fs->catpath($vol,
+                        $fs->catdir(@collapsed),
+                        $file
+                       );
+}
+
+
 1;
diff -rN -u old-PathTools-3.09/lib/File/Spec/Win32.pm new-PathTools-3.09/lib/File/Spec/Win32.pm
--- old-PathTools-3.09/lib/File/Spec/Win32.pm	2005-07-08 16:22:17.000000000 -0700
+++ new-PathTools-3.09/lib/File/Spec/Win32.pm	2005-07-08 16:24:10.000000000 -0700
@@ -145,29 +145,7 @@
     return $path unless $path =~ /\.\./;    # too few .'s to cleanup
     return $path if $path =~ /\.\.\.\./;    # too many .'s to cleanup
 
-    my ($vol,$dirs,$file) = $self->splitpath($path);
-    my @dirs = $self->splitdir($dirs);
-    my (@base_dirs, @path_dirs);
-    my $dest = \@base_dirs;
-    for my $dir (@dirs){
-	$dest = \@path_dirs if $dir eq $self->updir;
-	push @$dest, $dir;
-    }
-    # for each .. in @path_dirs pop one item from 
-    # @base_dirs
-    while (my $dir = shift @path_dirs){ 
-	unless ($dir eq $self->updir){
-	    unshift @path_dirs, $dir;
-	    last;
-	}
-	pop @base_dirs;
-    }
-    $path = $self->catpath( 
-			   $vol, 
-			   $self->catdir(@base_dirs, @path_dirs), 
-			   $file
-			  );
-    return $path;
+    return $self->_collapse($path);
 }
 
 =item splitpath
diff -rN -u old-PathTools-3.09/t/Spec.t new-PathTools-3.09/t/Spec.t
--- old-PathTools-3.09/t/Spec.t	2005-07-08 16:22:17.000000000 -0700
+++ new-PathTools-3.09/t/Spec.t	2005-07-08 16:31:53.000000000 -0700
@@ -92,14 +92,16 @@
 [ "Unix->catdir('','d1','d2','d3')",    '/d1/d2/d3' ],
 [ "Unix->catdir('d1','d2','d3')",       'd1/d2/d3'  ],
 
-[ "Unix->canonpath('')",                                      ''          ],
 [ "Unix->canonpath('///../../..//./././a//b/.././c/././')",   '/a/b/../c' ],
-[ "Unix->canonpath('/.')",                                    '/'         ],
-[ "Unix->canonpath('/./')",                                   '/'         ],
-[ "Unix->canonpath('/a/./')",                                 '/a'        ],
-[ "Unix->canonpath('/a/.')",                                  '/a'        ],
-[ "Unix->canonpath('/../../')",                               '/'         ],
-[ "Unix->canonpath('/../..')",                                '/'         ],
+[ "Unix->canonpath('')",                       ''               ],
+# rt.perl.org 27052
+[ "Unix->canonpath('a/../../b/c')",            'a/../../b/c'    ],
+[ "Unix->canonpath('/.')",                     '/'              ],
+[ "Unix->canonpath('/./')",                    '/'              ],
+[ "Unix->canonpath('/a/./')",                  '/a'             ],
+[ "Unix->canonpath('/a/.')",                   '/a'             ],
+[ "Unix->canonpath('/../../')",                '/'              ],
+[ "Unix->canonpath('/../..')",                 '/'              ],
 
 [  "Unix->abs2rel('/t1/t2/t3','/t1/t2/t3')",          ''                   ],
 [  "Unix->abs2rel('/t1/t2/t4','/t1/t2/t3')",          '../t4'              ],
@@ -214,6 +216,8 @@
 [ "Win32->canonpath('a:')",             'A:'                  ],
 [ "Win32->canonpath('A:f')",            'A:f'                 ],
 [ "Win32->canonpath('A:/')",            'A:\\'                ],
+# rt.perl.org 27052
+[ "Win32->canonpath('a\\..\\..\\b\\c')", '..\\b\\c'           ],
 [ "Win32->canonpath('//a\\b//c')",      '\\\\a\\b\\c'         ],
 [ "Win32->canonpath('/a/..../c')",      '\\a\\....\\c'        ],
 [ "Win32->canonpath('//a/b\\c')",       '\\\\a\\b\\c'         ],

@p5pRT
Copy link
Author

p5pRT commented Jul 19, 2005

From @rgarcia

On 7/12/05, Michael G Schwern via RT <perlbug-followup@​perl.org> wrote​:

Ok, enough dithering. Let's kill this bug.

File​::Spec​::Win32->canonpath() currently contains code to collapse .. so
whether or not it should continue to do so in the future is outside the
scope of this bug. That code is also busted and is the source of this bug.

Attached is a patch to fix this bug. It replaces the collapsing code in
canonpath() with a more reliable method. It also moves the code into
its own method to simplify canonpath(). _collapse() goes into
File​::Spec​::Unix because it is not platform specific. Tests have been
added for this bug in both Win32 and Unix.

What's the status of this patch ? applied to File​::Spec ? rejected ?
updated ? (catching up over there...)

@p5pRT
Copy link
Author

p5pRT commented Jul 21, 2005

From @kenahoo

On Jul 19, 2005, at 3​:06 AM, Rafael Garcia-Suarez wrote​:

On 7/12/05, Michael G Schwern via RT <perlbug-followup@​perl.org> wrote​:

Ok, enough dithering. Let's kill this bug.

File​::Spec​::Win32->canonpath() currently contains code to collapse ..
so
whether or not it should continue to do so in the future is outside
the
scope of this bug. That code is also busted and is the source of
this bug.

Attached is a patch to fix this bug. It replaces the collapsing code
in
canonpath() with a more reliable method. It also moves the code into
its own method to simplify canonpath(). _collapse() goes into
File​::Spec​::Unix because it is not platform specific. Tests have been
added for this bug in both Win32 and Unix.

What's the status of this patch ? applied to File​::Spec ? rejected ?
updated ? (catching up over there...)

Finally applied.

  -Ken

@p5pRT
Copy link
Author

p5pRT commented Jun 21, 2008

p5p@spam.wizbit.be - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant