48

I have not been able to verify this experimentally, and I could not find the answer in the man pages either.

Say I have two processes, one of which moves (renames) file1 from directory1 to directory2. The other process, running concurrently, copies the contents of directory1 and directory2 to another location. Is it possible for the copy to end up with file1 in both directory1 and directory2, i.e. directory1 is copied before the move and directory2 after the move performed by the first process?

Basically, is rename() an atomic system call?

Thanks

Guy Avraham
Lipika Deka

4 Answers

28

Yes and no.

rename() is atomic assuming the OS does not crash. It cannot be split by any other filesystem operation.

If the system crashes you might see a ln() operation instead, i.e. a hard link, with both the old and the new name referring to the file.

Also note that when operating on a network filesystem, you might get ENOENT even though the operation actually succeeded. A local filesystem can't do that to you.
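To picture what "seeing a ln() instead" means, here is a minimal sketch that emulates a rename as two separate steps, a hard link followed by an unlink. It only illustrates the observable state; it is not a claim about how any particular kernel implements rename(). The helper name `rename_in_two_steps` and the file names are made up for the example, and it assumes a POSIX filesystem that supports hard links.

```python
import os
import tempfile


def rename_in_two_steps(oldpath, newpath):
    """Hypothetical helper: emulate a rename as link() followed by unlink().

    Returns the link count observed between the two steps.
    """
    os.link(oldpath, newpath)                  # step 1: both names point at the same inode
    links_midway = os.stat(newpath).st_nlink   # a crash here would leave both names behind
    os.unlink(oldpath)                         # step 2: drop the old name
    return links_midway


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "file1")
        new = os.path.join(d, "file1.moved")
        with open(old, "w") as f:
            f.write("payload")
        print("link count between the steps:", rename_in_two_steps(old, new))
        print("old exists:", os.path.exists(old), "new exists:", os.path.exists(new))
```

If the second step never runs, what remains on disk is exactly what a plain hard link would have produced: two names for one file, with the contents intact.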

Joshua
  • @Joshua... Thanks. Are there any references where I can learn about the atomicity of operations? Moreover, the following question suggests the absence of atomicity. I am not locking explicitly. – Lipika Deka Aug 14 '11 at 04:10
  • Read the man pages in section 2 of the online manuals for: link(), rename(), creat(), open(), unlink(), read(), write(). There are two APIs for explicit locking but programs that ignore them waltz right through locks. Also, do not trust flock() and friends on network filesystems (no reference: you are expected to know this is an unsolvable problem in computer science and the implementation can fail). O_EXCL didn't always exist which made things harder in the past. – Joshua Aug 15 '11 at 03:09
  • 8
    My source for "If the system crashes you might see a ln() operation instead." is the kernel source code itself. – Joshua Aug 15 '11 at 03:09
  • 1
    The ln() operation creates a hard link. Once the link exists in the new location, a remove() is performed on the old location. That's your rename, although both operations are applied atomically. – Alexis Wilke Aug 27 '12 at 05:13
  • Note, though, that when overwriting it may not really be atomic. From the rename(2) man page: "However, when overwriting there will probably be a window in which both oldpath and newpath refer to the file being renamed." – Alexis Wilke Aug 27 '12 at 05:23
25

This is a very late answer, but yes, rename() is atomic, just not in the sense of your question. Under Linux, rename(2) says:

However, when overwriting there will probably be a window in which both oldpath and newpath refer to the file being renamed.

But rename() is still atomic in a very important sense: if you use it to overwrite a file, then you will end up with either the old or the new version and nothing else.

[update: but as @jonas-wielicki points out in the comments, you need to make sure the file you are renaming actually has up-to-date contents, using fsync() and friends.]

If newpath already exists it will be atomically replaced (subject to a few conditions; see ERRORS below), so that there is no point at which another process attempting to access newpath will find it missing.

If you look at the ERRORS section, you will find that the rename might fail, but a failure never breaks the atomicity.

This is all from the Linux man page. What I don't know is if you do a rename() on a network file-system where the server runs a different OS. Does the client have a hope in hell of guaranteeing atomicity then? I doubt it.
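The overwrite case described above (write the new contents elsewhere, make sure they are on disk, then rename over the old file) can be sketched as follows. This is a minimal illustration assuming a POSIX system and a local filesystem; the helper name `atomic_write` is hypothetical. `os.replace()` is Python's wrapper that renames over an existing target, and the temporary file is created in the same directory so that source and target are on the same filesystem.

```python
import os
import tempfile


def atomic_write(path, data):
    """Hypothetical helper: replace `path` with `data` (bytes).

    Other processes opening `path` see either the old or the new contents.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file in the same directory, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # push Python's buffer to the OS
            os.fsync(tmp.fileno())   # ask the OS to push the data to disk
        os.replace(tmp_path, path)   # rename over the old file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Assumption: fsync on the directory makes the rename itself durable (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "config")
        atomic_write(target, b"old version")
        atomic_write(target, b"new version")
        with open(target, "rb") as f:
            print(f.read())
```

Skipping the fsync() before the rename is what opens the door to the empty-file-after-crash scenario: the new name can reach the disk before the data does.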

Adrian Ratnapala
  • 2
    Does the same apply to renaming a folder? – proteneer May 09 '14 at 23:11
  • 1
    @proteneer Yes, but probably not in the way you want. You are not allowed to overwrite an existing directory unless it is empty. For the empty directory, I guess you will have your atomicity guarantees. – Adrian Ratnapala May 10 '14 at 10:49
  • 4
    I’d like to add that in the overwrite use-case, it’s important to call flush before rename to ensure that the data is actually written to the file. Otherwise, in the event of a crash, one might end up with only an empty file (rename() succeeded, and data was not written to disk yet, then crash -> empty file). – Jonas Schäfer Dec 30 '16 at 15:10
7

I'm not sure the "basically" part of your question is valid. Unless you have some kind of synchronization between the two, it doesn't matter how atomic rename is. If the directory copy gets there before the rename, you are going to have file1 in both places.
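To make the race concrete, here is a small single-process sketch that forces the unlucky interleaving by hand: the "copier" lists directory1, the "mover" then renames file1, and the "copier" lists directory2 afterwards. The function name is made up for illustration; directory1, directory2 and file1 follow the question.

```python
import os
import tempfile


def snapshot_with_rename_in_between(root):
    """Hypothetical demo: list directory1, rename file1, then list directory2."""
    dir1 = os.path.join(root, "directory1")
    dir2 = os.path.join(root, "directory2")
    os.mkdir(dir1)
    os.mkdir(dir2)
    with open(os.path.join(dir1, "file1"), "w") as f:
        f.write("data")

    seen_in_dir1 = os.listdir(dir1)           # copier: reads directory1 first
    os.rename(os.path.join(dir1, "file1"),    # mover: the rename happens in between
              os.path.join(dir2, "file1"))
    seen_in_dir2 = os.listdir(dir2)           # copier: reads directory2 afterwards
    return seen_in_dir1, seen_in_dir2


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        in_dir1, in_dir2 = snapshot_with_rename_in_between(root)
        print("copier saw in directory1:", in_dir1)
        print("copier saw in directory2:", in_dir2)
```

The rename itself is never observed half-done, yet the copier's two listings both contain file1. If the copier read directory2 first and directory1 second, the same interleaving would make it miss file1 entirely.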

I'm not sure whether you meant threads or processes. There are locking mechanisms for both, but thread locks are by far the simplest because they don't have to cross process boundaries.

boatcoder
  • Done correctly, rename() is a perfectly legitimate way to divvy up workload between worker processes. – Joshua Aug 15 '11 at 03:10
  • 3
    @Joshua: Yes, but Mark0978 is right: the process described by the OP is racy *even though* `rename()` on one filesystem is atomic (because reading the contents of two different directories is not atomic, so the rename could happen after you have read directory1 and before you have read directory2). – caf Aug 15 '11 at 07:32
  • @caf: Mark is using a strawman argument. You do rename() from both processes. – Joshua Aug 15 '11 at 17:51
  • 1
    It's not a strawman argument. He wants rename to be atomic, hinting at the fact he doesn't want it interrupted while doing the rename. He hints at using multiple processes or threads (not sure that he knows the difference between the two) and wanting to make sure a rename doesn't get caught in the middle of a copy. All of these point to someone that needs to better understand race conditions. He's asking the WRONG question here, worrying about the wrong thing. The color of the auto doesn't matter when it is going off a cliff to burn in the canyon below. – boatcoder Aug 15 '11 at 21:53
  • 3
    @Joshua: You need to re-read the original question. It has one process doing a `rename()`, racing with another process that is *"copying the contents of directory1 and directory2 to another location"*. That second process requires at least distinct *"read the contents of directory1"* and *"read the contents of directory2"* steps, even *before* it does any copying/renaming of its own, and it is *these* steps that can race with the `rename()` in the first process. – caf Aug 15 '11 at 23:30
  • 1
    Unless Juggler did not mean to say "copy" for the second process, I have to agree that the atomicity of the rename() has nothing to do with the mentioned problem. – Alexis Wilke Aug 27 '12 at 05:14
0

The GNU libc manual says:

One useful feature of rename is that the meaning of newname changes “atomically” from any previously existing file by that name to its new meaning (i.e., the file that was called oldname). There is no instant at which newname is non-existent “in between” the old meaning and the new meaning. If there is a system crash during the operation, it is possible for both names to still exist; but newname will always be intact if it exists at all.