Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

edit <> files in place is not atomic #6177

Closed
p5pRT opened this issue Dec 21, 2002 · 13 comments
Closed

edit <> files in place is not atomic #6177

p5pRT opened this issue Dec 21, 2002 · 13 comments

Comments

@p5pRT
Copy link

p5pRT commented Dec 21, 2002

Migrated from rt.perl.org#19333 (status was 'resolved')

Searchable as RT19333$

@p5pRT
Copy link
Author

p5pRT commented Dec 21, 2002

From eperez@it.uc3m.es

The operation​:
perl -pi -e 's/foo/bar/' file
should be atomic.

If something while perl is working hangs the system like a crash or a
power failure all the data is lost.

This is what perl currently does (simplified Linux strace)​:
open("file", O_RDONLY|O_LARGEFILE) = original
unlink("file")
open("file", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0666) = new
read(original,...
write(new,...
close(new)
close(original)

Thus is the atomic approach​:
open("file", O_RDONLY|O_LARGEFILE) = original
open("file.new", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0666) = new
read(original,...
write(new,...
close(new)
close(original)
rename("file.new", "file")

This way you always keep a good copy if something fails.

@p5pRT
Copy link
Author

p5pRT commented Dec 21, 2002

From eperez@it.uc3m.es

This is the patch I'm working.
It's segfaulting.
As I'm not a perl core coder I don't know where's the problem.

Be aware it removes some features, so some tests don't pass.

But this is what a working patch should look like.

Could some perl core coder tell me how to fix this code?

@p5pRT
Copy link
Author

p5pRT commented Dec 21, 2002

From eperez@it.uc3m.es

perl_inplace_atomic.diff
--- perl/doio.c
+++ perl/doio.c
@@ -745,87 +745,9 @@
 		    do_close(gv,FALSE);
 		    continue;
 		}
-		if (*PL_inplace) {
-		    char *star = strchr(PL_inplace, '*');
-		    if (star) {
-			char *begin = PL_inplace;
-			sv_setpvn(sv, "", 0);
-			do {
-			    sv_catpvn(sv, begin, star - begin);
-			    sv_catpvn(sv, PL_oldname, oldlen);
-			    begin = ++star;
-			} while ((star = strchr(begin, '*')));
-			if (*begin)
-			    sv_catpv(sv,begin);
-		    }
-		    else {
-			sv_catpv(sv,PL_inplace);
-		    }
-#ifndef FLEXFILENAMES
-		    if ((PerlLIO_stat(SvPVX(sv),&PL_statbuf) >= 0
-			 && PL_statbuf.st_dev == filedev
-			 && PL_statbuf.st_ino == fileino)
-#ifdef DJGPP
-			|| ((_djstat_fail_bits & _STFAIL_TRUENAME)!=0)
-#endif
-                      )
-		    {
-			if (ckWARN_d(WARN_INPLACE))	
-			    Perl_warner(aTHX_ packWARN(WARN_INPLACE),
-			      "Can't do inplace edit: %s would not be unique",
-			      SvPVX(sv));
-			do_close(gv,FALSE);
-			continue;
-		    }
-#endif
-#ifdef HAS_RENAME
-#if !defined(DOSISH) && !defined(__CYGWIN__) && !defined(EPOC)
-		    if (PerlLIO_rename(PL_oldname,SvPVX(sv)) < 0) {
-		        if (ckWARN_d(WARN_INPLACE))	
-			    Perl_warner(aTHX_ packWARN(WARN_INPLACE),
-			      "Can't rename %s to %s: %s, skipping file",
-			      PL_oldname, SvPVX(sv), Strerror(errno) );
-			do_close(gv,FALSE);
-			continue;
-		    }
-#else
-		    do_close(gv,FALSE);
-		    (void)PerlLIO_unlink(SvPVX(sv));
-		    (void)PerlLIO_rename(PL_oldname,SvPVX(sv));
-		    do_open(gv,SvPVX(sv),SvCUR(sv),PL_inplace!=0,O_RDONLY,0,Nullfp);
-#endif /* DOSISH */
-#else
-		    (void)UNLINK(SvPVX(sv));
-		    if (link(PL_oldname,SvPVX(sv)) < 0) {
-		        if (ckWARN_d(WARN_INPLACE))	
-			    Perl_warner(aTHX_ packWARN(WARN_INPLACE),
-			      "Can't rename %s to %s: %s, skipping file",
-			      PL_oldname, SvPVX(sv), Strerror(errno) );
-			do_close(gv,FALSE);
-			continue;
-		    }
-		    (void)UNLINK(PL_oldname);
-#endif
-		}
-		else {
-#if !defined(DOSISH) && !defined(AMIGAOS)
-#  ifndef VMS  /* Don't delete; use automatic file versioning */
-		    if (UNLINK(PL_oldname) < 0) {
-		        if (ckWARN_d(WARN_INPLACE))	
-			    Perl_warner(aTHX_ packWARN(WARN_INPLACE),
-			      "Can't remove %s: %s, skipping file",
-			      PL_oldname, Strerror(errno) );
-			do_close(gv,FALSE);
-			continue;
-		    }
-#  endif
-#else
-		    Perl_croak(aTHX_ "Can't do inplace edit without backup");
-#endif
-		}
 
-		sv_setpvn(sv,">",!PL_inplace);
-		sv_catpvn(sv,PL_oldname,oldlen);
+		sv_setpvn(sv,PL_oldname,oldlen);
+		sv_catpvn(sv,".new",4);
 		SETERRNO(0,0);		/* in case sprintf set errno */
 #ifdef VMS
 		if (!do_open(PL_argvoutgv,SvPVX(sv),SvCUR(sv),PL_inplace!=0,
@@ -883,7 +805,37 @@
     if (io && (IoFLAGS(io) & IOf_ARGV))
 	IoFLAGS(io) |= IOf_START;
     if (PL_inplace) {
+	STRLEN oldlen;
 	(void)do_close(PL_argvoutgv,FALSE);
+	    /* We shouldn't get here if there where any problem (like out of space) writing the file or closing them */
+	PL_oldname = SvPVx(GvSV(gv), oldlen);
+	sv_setpvn(sv,PL_oldname,oldlen);
+	sv_catpvn(sv,".new",4);
+#ifdef HAS_RENAME
+#if !defined(DOSISH) && !defined(__CYGWIN__) && !defined(EPOC)
+		    if (PerlLIO_rename(SvPVX(sv),PL_oldname) < 0) {
+		        if (ckWARN_d(WARN_INPLACE))	
+			    Perl_warner(aTHX_ packWARN(WARN_INPLACE),
+			      "Can't rename %s to %s: %s, skipping file",
+			      SvPVX(sv), PL_oldname, Strerror(errno) );
+			do_close(gv,FALSE);
+		    }
+#else
+		    /* (void)PerlLIO_unlink(SvPVX(sv)); */ /* Don't know if this is needed, but if needed it would be not atomic, at least it wont lose data */
+		    (void)PerlLIO_rename(SvPVX(sv),PL_oldname);
+#endif /* DOSISH */
+#else
+		    (void)UNLINK(PL_oldname);
+		    if (link(SvPVX(sv),PL_oldname) < 0) {
+		        if (ckWARN_d(WARN_INPLACE))	
+			    Perl_warner(aTHX_ packWARN(WARN_INPLACE),
+			      "Can't rename %s to %s: %s, skipping file",
+			      PL_oldname, SvPVX(sv), Strerror(errno) );
+			do_close(gv,FALSE);
+		    } else {
+			(void)UNLINK(SvPVX(sv));
+		    }
+#endif
 	if (io && (IoFLAGS(io) & IOf_ARGV)
 	    && PL_argvout_stack && AvFILLp(PL_argvout_stack) >= 0)
 	{

@p5pRT
Copy link
Author

p5pRT commented Nov 24, 2011

From @jkeenan

On Sat Dec 21 10​:32​:12 2002, eperez wrote​:

The operation​:
perl -pi -e 's/foo/bar/' file
should be atomic.

If something while perl is working hangs the system like a crash or a
power failure all the data is lost.

As of Perl 5.14.2, 'pods/perlrun.pod' says this about the '-i' option
###
-i[*extension*]
  specifies that files processed by the "<>" construct are to be
  edited in-place. It does this by renaming the input file, opening
  the output file by the original name, and selecting that output
  file as the default for print() statements. The extension, if
  supplied, is used to modify the name of the old file to make a
  backup copy, following these rules​:

  If no extension is supplied, no backup is made and the current file
  is overwritten.
###

This seems to me to guarantee the atomicity the original poster was seeking.

Am I correct in this conclusion? If so, then the ticket may be closed.

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Nov 27, 2011

From PeterCMartini@GMail.com

On Thu, Nov 24, 2011 at 2​:30 PM, James E Keenan via RT <
perlbug-followup@​perl.org> wrote​:

On Sat Dec 21 10​:32​:12 2002, eperez wrote​:

The operation​:
perl -pi -e 's/foo/bar/' file
should be atomic.

If something while perl is working hangs the system like a crash or a
power failure all the data is lost.

As of Perl 5.14.2, 'pods/perlrun.pod' says this about the '-i' option
###
-i[*extension*]
specifies that files processed by the "<>" construct are to be
edited in-place. It does this by renaming the input file, opening
the output file by the original name, and selecting that output
file as the default for print() statements. The extension, if
supplied, is used to modify the name of the old file to make a
backup copy, following these rules​:

If no extension is supplied, no backup is made and the current file
is overwritten.
###

This seems to me to guarantee the atomicity the original poster was
seeking.

Am I correct in this conclusion? If so, then the ticket may be closed.

Thank you very much.
Jim Keenan

The issue is what happens if the program crashes in the middle of the
operation. In most OSes* If no extension is supplied and perl dies
unexpectedly, you're out of luck - the original file has been replaced by
whatever perl had managed to write out so far. The request in this ticket
is to change that to silently using a default extension. I'd propose a
different solution​: update perlrun.pod to point out the following two
caveats if no extension is supplied​:

1. If the script dies any time after opening the file, the file will now
only have what had been processed up to that point.
2. Even if no extension is supplied and the file appears to be written in
place, the filesystem doesn't count the space used by the original file as
free until perl exits (or more technically, until the filehandle closes).
I vaguely recall this being brought up as a separate bug on this list,
since it can lead to data loss if the filesystem is close to full.

I'm not sure how much detail to go into in perlrun, otherwise I'd have a
patch attached!

* See http​://perldoc.perl.org/perlcygwin.html; Windows does what the
requester wants, with the .bak extension, because it does not support doing
it any other way. I don't see a note about that in perlwin32.pod, but I
believe it would hold there as well?

@p5pRT
Copy link
Author

p5pRT commented Nov 27, 2011

From @rjbs

On Sat Nov 26 23​:25​:16 2011, pcm wrote​:

...
different solution​: update perlrun.pod to point out the following two
caveats if no extension is supplied​:
...

We can add this, but they're not really "perl facts" so much as "how
file I/O generally works." I'm not excited about the idea of adding
this kind of warning everywhere it might be relevant. Is this case
really special, or is this just one user who didn't think through the
ramifications of his program?

@p5pRT
Copy link
Author

p5pRT commented Dec 11, 2011

From @wolfsage

On Sun Nov 27 03​:49​:39 2011, rjbs wrote​:

On Sat Nov 26 23​:25​:16 2011, pcm wrote​:

...
different solution​: update perlrun.pod to point out the following two
caveats if no extension is supplied​:
...

We can add this, but they're not really "perl facts" so much as "how
file I/O generally works." I'm not excited about the idea of adding
this kind of warning everywhere it might be relevant. Is this case
really special, or is this just one user who didn't think through the
ramifications of his program?

I think adding the warning to perlrun.pod at least would be good so it's
clear that if you don't want to lose your data, you should make backups.

I realize that this really goes without saying, because if you mess up
the search/replace pattern you can really screw up your data, but
sometimes it's good to re-iterate or make obvious things we may consider
common knowledge.

I don't think we should make backing up the data default behavior
though, but I suppose my only reasoning for that is because "This is how
it's always been."

-- Matthew Horsfall (alh)

@p5pRT
Copy link
Author

p5pRT commented Sep 22, 2012

From @jkeenan

On Sun Dec 11 15​:23​:08 2011, alh wrote​:

On Sun Nov 27 03​:49​:39 2011, rjbs wrote​:

On Sat Nov 26 23​:25​:16 2011, pcm wrote​:

...
different solution​: update perlrun.pod to point out the following two
caveats if no extension is supplied​:
...

We can add this, but they're not really "perl facts" so much as "how
file I/O generally works." I'm not excited about the idea of adding
this kind of warning everywhere it might be relevant. Is this case
really special, or is this just one user who didn't think through the
ramifications of his program?

I think adding the warning to perlrun.pod at least would be good so it's
clear that if you don't want to lose your data, you should make backups.

I realize that this really goes without saying, because if you mess up
the search/replace pattern you can really screw up your data, but
sometimes it's good to re-iterate or make obvious things we may consider
common knowledge.

I don't think we should make backing up the data default behavior
though, but I suppose my only reasoning for that is because "This is how
it's always been."

-- Matthew Horsfall (alh)

rjbs, alh​:

Can you come to some resolution of these issues so that we can patch if
needed and then close this RT?

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Sep 28, 2012

From @rjbs

* James E Keenan via RT <perlbug-followup@​perl.org> [2012-09-21T22​:48​:07]

Can you come to some resolution of these issues so that we can patch if
needed and then close this RT?

I relent. A warning is appropriate.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Sep 28, 2012

From @jkeenan

On Thu Sep 27 19​:00​:19 2012, perl.p5p@​rjbs.manxome.org wrote​:

* James E Keenan via RT <perlbug-followup@​perl.org> [2012-09-21T22​:48​:07]

Can you come to some resolution of these issues so that we can patch if
needed and then close this RT?

I relent. A warning is appropriate.

alh, pcm​: Can either of you submit a *brief* patch?

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Oct 9, 2012

From PeterCMartini@GMail.com

On Thu, Sep 27, 2012 at 10​:27 PM, James E Keenan via RT <
perlbug-followup@​perl.org> wrote​:

On Thu Sep 27 19​:00​:19 2012, perl.p5p@​rjbs.manxome.org wrote​:

* James E Keenan via RT <perlbug-followup@​perl.org>
[2012-09-21T22​:48​:07]

Can you come to some resolution of these issues so that we can patch if
needed and then close this RT?

I relent. A warning is appropriate.

alh, pcm​: Can either of you submit a *brief* patch?

Something like this?

From a09ace7d638055c4a5ecf3cce1d41de3347bfcda Mon Sep 17 00​:00​:00 2001
From​: Peter Martini <PeterCMartini@​GMail.com>
Date​: Mon, 8 Oct 2012 22​:31​:37 -0400
Subject​: [PATCH] Clarify that in-place editing actually creates a new file.
If the in-place editing dies, the original is gone.
Another implication of this is that hard links on UNIX
won't work properly, since a new inode will be generated -
I think that's a little too specific to spell out in the docs
though.


pod/perlrun.pod | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

Inline Patch
diff --git a/pod/perlrun.pod b/pod/perlrun.pod
index 9ed678c..1f32cd5 100644
--- a/pod/perlrun.pod
+++ b/pod/perlrun.pod
@@ -506,8 +506,10 @@ default for print() statements.  The extension, if
supplied, is used to modify the name of the old file to make a backup copy\, following these rules​:

-If no extension is supplied, no backup is made and the current file is
-overwritten.
+If no extension is supplied, and your system supports it, the original
+I<file> is kept open without a name while the output is redirected to
+a new file with the original I<filename>. When perl exits, cleanly or not,
+the original I<file> is unlinked.

If the extension doesn't contain a C<*>, then it is appended to the
end of the current filename as a suffix. If the extension does
--
1.7.1

Thank you very much.
Jim Keenan

---
via perlbug​: queue​: perl5 status​: open
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=19333

@p5pRT
Copy link
Author

p5pRT commented Oct 9, 2012

From @cpansprout

On Mon Oct 08 19​:47​:55 2012, pcm wrote​:

On Thu, Sep 27, 2012 at 10​:27 PM, James E Keenan via RT <
perlbug-followup@​perl.org> wrote​:

On Thu Sep 27 19​:00​:19 2012, perl.p5p@​rjbs.manxome.org wrote​:

* James E Keenan via RT <perlbug-followup@​perl.org>
[2012-09-21T22​:48​:07]

Can you come to some resolution of these issues so that we can
patch if
needed and then close this RT?

I relent. A warning is appropriate.

alh, pcm​: Can either of you submit a *brief* patch?

Something like this?

From a09ace7d638055c4a5ecf3cce1d41de3347bfcda Mon Sep 17 00​:00​:00 2001
From​: Peter Martini <PeterCMartini@​GMail.com>
Date​: Mon, 8 Oct 2012 22​:31​:37 -0400
Subject​: [PATCH] Clarify that in-place editing actually creates a new
file.
If the in-place editing dies, the original is gone.
Another implication of this is that hard links on UNIX
won't work properly, since a new inode will be generated -
I think that's a little too specific to spell out in the docs
though.

---
pod/perlrun.pod | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/pod/perlrun.pod b/pod/perlrun.pod
index 9ed678c..1f32cd5 100644
--- a/pod/perlrun.pod
+++ b/pod/perlrun.pod
@​@​ -506,8 +506,10 @​@​ default for print() statements. The extension,
if
supplied, is used to
modify the name of the old file to make a backup copy, following
these
rules​:

-If no extension is supplied, no backup is made and the current file
is
-overwritten.
+If no extension is supplied, and your system supports it, the
original
+I<file> is kept open without a name while the output is redirected to
+a new file with the original I<filename>. When perl exits, cleanly
or not,
+the original I<file> is unlinked.

If the extension doesn't contain a C<*>, then it is appended to the
end of the current filename as a suffix. If the extension does
--
1.7.1

Good enough for me. Thank you. Applied as 479e5f8.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Oct 9, 2012

@cpansprout - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant