Safety for -i option #15216

p5pRT · 2016-03-06T10:32:46Z

Migrated from rt.perl.org#127663 (status was 'resolved')

Searchable as RT127663$

p5pRT · 2016-03-06T10:32:46Z

From @jiangyy

inplace-safety.rep

p5pRT · 2016-03-06T10:32:46Z

From @jiangyy

Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software,
Dept. of Computer Science, Nanjing University

p5pRT · 2016-03-06T13:47:17Z

From @jkeenan

On Sun Mar 06 02:32:46 2016, jiangyy@outlook.com wrote:

Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software,
Dept. of Computer Science, Nanjing University

Since the bug report was attached with a file extension which RT reports as a binary file, the report may not be visible. I am re-attaching as a plain-text file.

Thank you very much.

--
James E Keenan (jkeenan@cpan.org)

p5pRT · 2016-03-06T13:47:17Z

From @jkeenan

To: perlbug@perl.org
Subject: Safety for -i option
From: jiangyy@outlook.com
Message-Id: <5.22.1_10199_1457258711@ubuntuvm>
Reply-To: jiangyy@outlook.com

This is a bug report for perl from jiangyy@outlook.com,
generated with the help of perlbug 1.40 running under perl 5.22.1.

Like sed, perl can be used with -i to change files in-place.

However, our tool discovered that the saving procedure is not as
safe as sed. The system call trace (from 5.22.1):

open("file.txt", O_RDONLY|O_LARGEFILE) = 3
_llseek(3, 0, [0], SEEK_CUR) = 0
unlink("file.txt") = 0
open("file.txt", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 4
read(3, ...)
read(3, ...)
write(4, ...)
...

If the program terminates in between, the file system runs out of
space (when the replacement text is longer), or the system crashes, the
contents may be lost (in the worst case, completely gone due to the unlink).

sed uses a temporary file to get the job and rename it. But it seems
difficult to work considering portability. Many infrastructures (e.g.,
gtk and qt) provide portable solution, but seems not to apply with perl.
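
The temporary-file-and-rename approach described above can be sketched roughly as follows. This is a minimal Python illustration, not sed's or perl's actual code: the function name `rewrite_via_rename` and the `transform` callback are hypothetical, and it assumes the temporary file can be created in the same directory as the target so that the final rename stays on one filesystem.

```python
import os
import tempfile


def rewrite_via_rename(path, transform):
    """Rewrite `path` line by line, swapping it in only once the new copy is complete."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the rename does not
    # cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".inplace-")
    try:
        with os.fdopen(fd, "w") as out, open(path) as src:
            for line in src:
                out.write(transform(line))
            out.flush()
            os.fsync(out.fileno())  # push the new contents to disk before the swap
        # Carry over the original permission bits (mkstemp creates the file 0600).
        os.chmod(tmp_path, os.stat(path).st_mode & 0o7777)
        # On POSIX this rename is atomic: readers see either the old or the new file.
        os.replace(tmp_path, path)
    except BaseException:
        # Any failure leaves the original untouched; discard the partial copy.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

For example, `rewrite_via_rename("file.txt", lambda line: line.replace("old", "new"))` would behave like a simple in-place substitution, except that the original file keeps its name and contents until the replacement is fully written.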

Thank you for your attention!

Flags:
category=core
severity=medium

Site configuration information for perl 5.22.1:

Configured by jyy at Sun Mar 6 04:34:47 EST 2016.

Summary of my perl5 (revision 5 version 22 subversion 1) configuration:

Platform:
osname=linux, osvers=4.2.0-27-generic, archname=i686-linux
uname='linux ubuntuvm 4.2.0-27-generic #32~14.04.1-ubuntu smp fri jan 22 15:32:27 utc 2016 i686 i686 i686 gnulinux '
config_args='-ds -e'
hint=recommended, useposix=true, d_sigaction=define
useithreads=undef, usemultiplicity=undef
use64bitint=undef, use64bitall=undef, uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O2',
cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
ccversion='', gccversion='4.8.4', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234, doublekind=3
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12, longdblkind=3
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib/gcc/i686-linux-gnu/4.8/include-fixed /usr/include/i386-linux-gnu /usr/lib /lib/i386-linux-gnu /lib/../lib /usr/lib/i386-linux-gnu /usr/lib/../lib /lib
libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version='2.19'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'

@INC for perl 5.22.1:
/usr/local/lib/perl5/site_perl/5.22.1/i686-linux
/usr/local/lib/perl5/site_perl/5.22.1
/usr/local/lib/perl5/5.22.1/i686-linux
/usr/local/lib/perl5/5.22.1
.

Environment for perl 5.22.1:
HOME=/home/jyy
LANG=en_US.UTF-8
LANGUAGE (unset)
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
PERL_BADLANG (unset)
SHELL=/bin/bash

p5pRT · 2016-03-06T13:58:12Z

From @jkeenan

On Sun Mar 06 02:32:46 2016, jiangyy@outlook.com wrote:

Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software,
Dept. of Computer Science, Nanjing University

From original report:
#####
Like sed, perl can be used with -i to change files in-place. However, our tool discovered that the saving procedure is not as safe as sed. The system call trace (from 5.22.1):

open("file.txt", O_RDONLY|O_LARGEFILE) = 3
_llseek(3, 0, [0], SEEK_CUR) = 0
unlink("file.txt") = 0

open("file.txt", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 4
read(3, ...)
read(3, ...)
write(4, ...) ...
#####

Can you supply us with: (a) the list of commands you invoked at the command-line to get these results; (b) some idea of the size of the file in question relative to the size of memory?

Thank you very much.

--
James E Keenan (jkeenan@cpan.org)

p5pRT · 2016-03-06T13:58:12Z

The RT System itself - Status changed from 'new' to 'open'

p5pRT · 2016-03-06T15:15:04Z

From @jhi

Can you supply us with: (a) the list of commands you invoked at the
command-line to get these results; (b) some idea of the size of the
file in question relative to the size of memory?

Thank you very much.

Also:

(c) You said: "sed uses a temporary file to get the job and rename it. But it seems
difficult to work considering portability. Many infrastructures (e.g.,
gtk and qt) provide portable solution, but seems not to apply with perl."

Can you elaborate on the portable solutions that gtk and qt provide?

p5pRT · 2016-03-06T18:54:20Z

From @mauke

On Sun Mar 06 05:58:12 2016, jkeenan wrote:

On Sun Mar 06 02:32:46 2016, jiangyy@outlook.com wrote:

Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software,
Dept. of Computer Science, Nanjing University

From original report:
#####
Like sed, perl can be used with -i to change files in-place. However,
our tool discovered that the saving procedure is not as safe as sed.
The system call trace (from 5.22.1):

open("file.txt", O_RDONLY|O_LARGEFILE) = 3
_llseek(3, 0, [0], SEEK_CUR) = 0
unlink("file.txt") = 0

open("file.txt", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 4
read(3, ...)
read(3, ...)
write(4, ...) ...
#####

Can you supply us with: (a) the list of commands you invoked at the
command-line to get these results; (b) some idea of the size of the
file in question relative to the size of memory?

Here's my attempt:

% echo hi > tmp.txt
% strace -o strace.log perl -i -pe '' tmp.txt

Excerpt from strace.log:

open("tmp.txt", O_RDONLY|O_LARGEFILE) = 3
ioctl(3, TCGETS, 0xbfdd070c) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(3, 0, [0], SEEK_CUR) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=3, ...}) = 0
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
unlink("tmp.txt") = 0
open("tmp.txt", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = 4
ioctl(4, TCGETS, 0xbfdd070c) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, [0], SEEK_CUR) = 0
fstat64(4, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
fstat64(4, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
fchmod(4, 0100644) = 0
read(3, "hi\n", 8192) = 3
read(3, "", 8192) = 0
write(4, "hi\n", 3) = 3
close(4) = 0
close(3) = 0

So, the file is tiny in this case (not sure why that matters?). Perl opens the input file (fd #3), unlinks it, then opens the same name again (fd #4), then streams data from fd 3 to fd 4.

If perl dies after the unlink() but before it is done writing to fd 4 and closing it, you get a truncated (or completely missing) output file.
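
To make the window visible, the traced sequence can be replayed step by step. The following is a rough Python sketch under stated assumptions, not perl's implementation: the function name `replay_unlink_sequence` is hypothetical, and it assumes a POSIX system (as in the trace) where an open file can be unlinked and `os.fchmod` is available. After each step it records the size of the file under its original name, or `None` if the name does not exist.

```python
import os


def replay_unlink_sequence(path):
    """Mirror the traced calls; record the size of `path` (None if missing) per step."""

    def size_or_none():
        return os.path.getsize(path) if os.path.exists(path) else None

    steps = []
    src = os.open(path, os.O_RDONLY)               # open("tmp.txt", O_RDONLY) = 3
    mode = os.fstat(src).st_mode & 0o7777          # fstat64(3, ...)
    steps.append(("open input", size_or_none()))
    os.unlink(path)                                # unlink("tmp.txt")
    steps.append(("unlink", size_or_none()))       # the name no longer exists
    dst = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    os.fchmod(dst, mode)                           # fchmod(4, ...)
    steps.append(("create output", size_or_none()))  # empty until data is written
    while True:
        chunk = os.read(src, 8192)                 # read(3, ..., 8192)
        if not chunk:
            break
        view = memoryview(chunk)
        while view:                                # os.write may write only part of the buffer
            view = view[os.write(dst, view):]
    os.close(dst)
    os.close(src)
    steps.append(("copy done", size_or_none()))
    return steps
```

Between the "unlink" and "copy done" entries, the name is either missing or refers to an incomplete file, which is the interval in which an interruption loses data.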

p5pRT · 2016-03-06T21:27:19Z

From @rjbs

See also https://rt-archive.perl.org/perl5/Ticket/Display.html?id=57512

--
rjbs

p5pRT · 2016-03-06T22:03:32Z

From @jhi

On Sun Mar 06 07:15:04 2016, jhi wrote:

Can you supply us with: (a) the list of commands you invoked at the
command-line to get these results; (b) some idea of the size of the
file in question relative to the size of memory?

Thank you very much.

Also:

(c) You said: "sed uses a temporary file to get the job and rename it.
But it seems
difficult to work considering portability. Many infrastructures (e.g.,
gtk and qt) provide portable solution, but seems not to apply with
perl."

Can you elaborate on the portable solutions that gtk and qt provide?

I got the below email from jiangyy@outlook.com:

-- cut here --

Hi Jarkko,

My reply to bug #127663 is not appearing in the bug tracking system (I just replied to the mail, sending it to perlbug-followup@perl.org with the subject “Re: [perl #127663] Safety for -i option”, and I have no idea why that did not work). I have listed my comments below. Maybe you can post them.

Sed just uses rename() to replace the file with a temporary one; it seems to assume a POSIX runtime, and under POSIX this is safe. Gtk provides g_file_replace(), and Qt provides QSaveFile. Both are portable. We extensively tested these two implementations, and both handle file overwrites safely.

We believe that perl is extremely portable software, and the semantics of rename() may differ on other platforms, so this should be handled with care (though I’m not an expert on portability).

Regards,
Yanyan Jiang 蒋炎岩
Institute of Computer Software,
Dept. of Computer Science, Nanjing University

p5pRT · 2016-03-07T04:19:09Z

From @jiangyy

Can you supply us with: (a) the list of commands you invoked at the command-line to get these results; (b) some idea of the size of the file in question relative to the size of memory?

For perl, I just used a simple case of in-place text replacement:

perl5.22.1 -i -pe 's/old/new/g' file.txt

I get the system-call trace via

strace COMMAND

The file is small (just kilobytes). If the program terminates just after unlink(), the file is gone. I simulated this by killing the process immediately after unlink(), and the file was indeed gone. If the file is huge, the overwrite itself can cause inconsistency (the first half is updated, the second half is old, and there is some corruption in the middle).
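
A comparable experiment can be approximated without perl. The sketch below is only an illustration under assumptions: it is Python rather than the tool used for the report, the names `CHILD` and `survives_kill_after_unlink` are hypothetical, and it assumes a POSIX system where an open file can be unlinked. A child process opens the file, unlinks it, and then exits abruptly, standing in for a kill that arrives right after unlink().

```python
import os
import subprocess
import sys
import tempfile

# Child script: open the input, unlink it, then exit abruptly before any
# output is written, standing in for a kill right after unlink().
CHILD = r"""
import os, sys
path = sys.argv[1]
fd = os.open(path, os.O_RDONLY)
os.unlink(path)
os._exit(1)  # no cleanup, no output file
"""


def survives_kill_after_unlink(contents):
    """Return True if the file still exists after the simulated kill."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "file.txt")
        with open(path, "w") as handle:
            handle.write(contents)
        # Run the child to completion; its non-zero exit status is expected.
        subprocess.run([sys.executable, "-c", CHILD, path], check=False)
        return os.path.exists(path)
```

On such a system, `survives_kill_after_unlink("old text\n")` returns `False`: once the child is gone, nothing refers to the old contents any more.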

Can you elaborate on the portable solutions that gtk and qt provide?

Sed just uses rename() to replace the file with a temporary one; it seems to assume a POSIX runtime, and under POSIX this is safe. Gtk provides g_file_replace(), and Qt provides QSaveFile. Both are portable. We extensively tested these two implementations, and both handle file overwrites safely.

We believe that perl is extremely portable software, and the semantics of rename() may differ on other platforms, so this should be handled with care (though I’m not an expert on portability).
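
The write-then-commit behaviour that such save helpers offer can be approximated in a few lines. The following is a hedged Python sketch in the spirit of that idea, not the Gtk or Qt API: the name `safe_overwrite` is hypothetical, and it relies on `os.replace`, which Python documents as atomic on POSIX when it succeeds. The new contents replace the target only if the writing block finishes without an error; otherwise the partial file is dropped and the original stays as it was.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def safe_overwrite(path):
    """Yield a writable file; commit it over `path` only if the block succeeds."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temporary file in the same directory, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    handle = os.fdopen(fd, "w")
    try:
        yield handle
        handle.flush()
        os.fsync(handle.fileno())    # make the new contents durable first
        handle.close()
        os.replace(tmp_path, path)   # commit: swap the finished file into place
    except BaseException:
        handle.close()               # cancel: drop the partial file
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

A caller would write `with safe_overwrite("file.txt") as out: out.write(new_text)`; raising inside the block leaves `file.txt` unchanged. The sketch does not preserve the original file's permissions or ownership, which a real implementation would have to consider.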
