62

I thought fsync() calls fflush() internally, so calling fsync() on a stream's file descriptor would be enough. But I get an unexpected result when the code runs over network I/O.

My code snippet:

FILE* fp = fopen(file, "wb");
/* multiple fputs() calls like: */
fputs(buf, fp);
...
...
fputs(buf.c_str(), fp);
/* get fd of the FILE pointer */
fd = fileno(fp);
#ifndef WIN32
ret = fsync(fd);
#else
ret = _commit(fd);
fclose(fp);

But it seems _commit() is not flushing the data (I tried this on Windows, writing to a filesystem exported from Linux).

When I changed the code to be:

FILE* fp = fopen(file, "wb");
/* multiple fputs() calls like: */
fputs(buf, fp);   
...   
...
fputs(buf.c_str(), fp);
/* fflush the data */
fflush(fp);
fclose(fp);

it flushes the data.

I am wondering if _commit() does the same thing as fflush(). Any inputs?

S.S. Anne
Adil
  • What is the problem you were seeing with the first example? – rogerdpack Nov 08 '12 at 00:16
  • 1
    @rogerdpack in the first example, data written to the stream through fputs() is not synced / committed to disk even when _commit() is called on the fd (file descriptor). This test was done on a cluster system where the remote Linux filesystem is exported as CIFS and used on a Windows machine, and node failover is tested during the write. When the node recovers, the file size is found to be zero. – Adil Nov 09 '12 at 12:26
  • Where is `#endif`? – binki Nov 25 '16 at 18:56
  • 1
    The question was internally consistent, and the answers consistent with it, prior to revision #5, which fundamentally changed the nature of the question. Rolled back to revision #4. – John Bollinger Jan 12 '17 at 14:38

6 Answers

94

fflush() works on a FILE*; it just flushes the internal buffers of the FILE* in your application out to the OS.

fsync() works at a lower level: it tells the OS to flush its buffers to the physical media.

OSs heavily cache data you write to a file. If the OS forced every write to hit the drive, things would be very slow. fsync() (among other things) allows you to control when the data should hit the drive.

Furthermore, fsync()/_commit() works on a file descriptor. It has no knowledge of a FILE* and can't flush its buffers. A FILE* lives in your application; file descriptors typically live in the OS kernel.
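The split between the FILE* buffer and the OS can be observed directly. The following is a minimal POSIX sketch, not a definitive test: `os_visible_size` and `show_stdio_buffering` are hypothetical names, and it assumes the stream is fully buffered, which is the usual default for a regular file.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file as the OS currently sees it, or -1 on error. */
long os_visible_size(FILE *fp)
{
    struct stat st;
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    return (long)st.st_size;
}

/* Writes "hello" to path and records the OS-visible file size
 * before and after fflush(). Returns 0 on success, -1 on error. */
int show_stdio_buffering(const char *path, long *before, long *after)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    if (fputs("hello", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    /* The bytes are typically still in the FILE* buffer here,
     * so the OS usually reports a size of 0. */
    *before = os_visible_size(fp);
    if (fflush(fp) != 0) {
        fclose(fp);
        return -1;
    }
    /* After fflush() the bytes have been handed to the OS. */
    *after = os_visible_size(fp);
    return fclose(fp) == 0 ? 0 : -1;
}
```

Calling fsync() on the descriptor before the fflush() would only sync what the OS has already received, which at that point is usually nothing.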

Daniel Porteous
nos
  • Thanks, I was thinking along the same lines. So if we are using FILE*, the same can be achieved by fflush() followed by fsync(). – Adil Feb 26 '10 at 09:48
  • No, because you cannot `fsync` a `FILE*`. – pattivacek Mar 26 '14 at 16:37
  • 8
    @patrickvacek actually you can get the file descriptor from `FILE *` using `int fileno(FILE * stream);` from `stdio.h`. – jotik Aug 11 '14 at 13:48
  • 2
    @jotik: You should use *either* (standard) `FILE *` functions, *or* (operating system) file handles. You don't mix them. And `fileno()` is not a standard function. Unfortunately, people have been notoriously lax with "expanding" standard headers... – DevSolar Feb 04 '15 at 15:32
  • 3
    @DevSolar - `fileno()` is part of the POSIX standard, so although it's not necessarily portable, it is standard on some platforms. – Josh Kelley Aug 19 '15 at 20:06
8

The standard C function fflush() and the POSIX system call fsync() are conceptually somewhat similar. fflush() operates on C file streams (FILE objects), and is therefore portable. fsync() operates on POSIX file descriptors. Both cause buffered data to be sent to a destination.

On a POSIX system, each C file stream has an associated file descriptor, and all the operations on a C file stream will be implemented by delegating, when necessary, to POSIX system calls that operate on the file descriptor.

One might think that a call to fflush on a POSIX system would cause a write of any data in the buffer of the file stream, followed by a call of fsync() for the file descriptor of that file stream. So on a POSIX system there would be no need to follow a call to fflush with a call to fsync(fileno(fp)). But is that the case: is there a call to fsync from fflush?

No, calling fflush on a POSIX system does not imply that fsync will be called.

The C standard for fflush says (emphasis added) it

causes any unwritten data for [the] stream to be delivered to the host environment *to be written* to the file

Saying that the data is to be written, rather than that it is written, implies that further buffering by the host environment is permitted. That buffering by the "host environment" could include, for a POSIX environment, the internal buffering that fsync flushes. So a close reading of the C standard suggests that the standard does not require the POSIX implementation to call fsync.

The POSIX standard description of fflush does not declare, as an extension of the C semantics, that fsync is called.

Raedwald
2

To put it simply:

Use fsync() with non-stream files (integer file descriptors).

Use fflush() with file streams.

Also, here are the summaries from the man pages:

int fflush(FILE *stream); // flush a stream; takes a FILE*

int fsync(int fd); // synchronize a file's in-core state with storage device; takes an int
pulse
  • So does `fflush()` implicitly call `fsync()` for you? – binki Nov 25 '16 at 18:57
  • 1
    @binki no - it calls `write()` on the buffered data – Guillaume Jul 27 '17 at 15:14
  • Ah, it looks like the question was edited to ask the wrong thing and it was fixed after I looked at it (when I looked at it the question was “does `fflush()` call `fsync()` for you?”, now it’s back to the nonsensical “does `fsync()` call `fflush()` for you?”, which is impossible because you can map a `FILE*` onto an `fd` but not map an `fd` to a `FILE*`). So @Guillaume, you’re saying that one would need to do `fflush()` and *then* `fsync()` (unless you’re about to `fclose()` which will do both for you anyway, oh, `fclose()` doesn’t automatically flush, xD). – binki Jul 27 '17 at 15:21
  • @binki if you want to make sure all written data (written by both buffered and unbuffered I/O) is on the physical media backing the file, you would need to call both `fflush()` and `fsync()`, yes. In general, people who care about that do not use buffered I/O (using `setvbuf()` for example), so they usually just do an `fsync()`. – Guillaume Jul 27 '17 at 16:06
1

fflush() and fsync() can be used to try to ensure data is written to the storage media (but this is not always possible):

  1. first use fflush(fp) on the output stream (fp being a FILE * obtained from fopen or one of the standard streams stdout or stderr) to write the contents of the buffer associated with the stream to the OS.
  2. then use fsync(fileno(fp)) to tell the OS to write its own buffers to the storage media.

Note however that fileno() and fsync() are POSIX functions that might not be available on all systems, notably Microsoft legacy systems where alternatives may be named _fileno(), _fsync() or _commit()...
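The two steps above can be sketched as a small helper. This is a minimal sketch rather than a definitive implementation: `flush_and_sync` is a hypothetical name, and the Windows branch assumes `_commit()` from `<io.h>` together with `_fileno()`.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#ifdef _WIN32
#include <io.h>      /* _commit() */
#else
#include <unistd.h>  /* fsync() */
#endif

/* Hypothetical helper: push the stream's buffer to the OS, then ask
 * the OS to write its own buffers to storage.
 * Returns 0 on success, -1 if either step fails. */
int flush_and_sync(FILE *fp)
{
    /* Step 1: FILE* buffer -> OS. */
    if (fflush(fp) != 0)
        return -1;
    /* Step 2: OS buffers -> storage media. */
#ifdef _WIN32
    return _commit(_fileno(fp)) == 0 ? 0 : -1;
#else
    return fsync(fileno(fp)) == 0 ? 0 : -1;
#endif
}
```

A caller would invoke `flush_and_sync(fp)` after the last fputs() and before fclose(fp), and treat a nonzero return as a failed write.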

chqrlie
0

To force the commitment of recent changes to disk, use the sync() or fsync() functions.

fsync() will synchronize all of the given file's data and metadata with the permanent storage device. It should be called just before the corresponding file is closed.

sync() will commit all modified files to disk.
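When a file is written through a raw descriptor instead of a FILE*, there is no stream buffer to flush, so fsync() just before close() is the only extra step. A minimal POSIX sketch follows; `write_durably` is a hypothetical name, and the loop does not retry a write() interrupted by a signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write text to path through a file descriptor
 * and fsync() it just before closing.
 * Returns 0 on success, -1 on error. */
int write_durably(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = text;
    size_t left = strlen(text);
    while (left > 0) {
        /* write() may accept fewer bytes than requested. */
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    /* Ask the OS to commit this file's data and metadata to storage. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd) == 0 ? 0 : -1;
}
```

sync() takes no descriptor and applies to all modified files, so it is not needed when only one file matters.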

0

I think the following excerpt from the Python documentation (https://docs.python.org/2/library/os.html) clarifies it very well.

os.fsync(fd) Force write of file with filedescriptor fd to disk. On Unix, this calls the native fsync() function; on Windows, the MS _commit() function.

If you’re starting with a Python file object f, first do f.flush(), and then do os.fsync(f.fileno()), to ensure that all internal buffers associated with f are written to disk.

Availability: Unix, and Windows starting in 2.2.3.

poordeveloper