I seem to recall cases in lower-level languages where opening a file more than once in a program could result in a shared seek pointer. After experimenting in Python a bit, this doesn't seem to be happening for me:

$ cat file.txt
first line!
second
third
fourth
and fifth
>>> f1 = open('file.txt')
>>> f2 = open('file.txt')
>>> f1.readline()
'first line!\n'
>>> f2.read()
'first line!\nsecond\nthird\nfourth\nand fifth\n'
>>> f1.readline()
'second\n'
>>> f2.read()
''
>>> f2.seek(0)
>>> f1.readline()
'third\n'

Is this behavior known to be safe? I'm having a hard time finding a source saying that it's okay, and it would help a lot if I could depend on this.

I'm not seeing the position as an attribute of the file object; if it were there, I'd have more confidence in this. I know it could be kept internally in the iterator, but I don't know how .tell() would get to it in that case.

>>> dir(f1)
['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__',
 '__init__', '__iter__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
 '__setattr__', '__str__', 'close', 'closed', 'encoding', 'fileno', 'flush',
 'isatty', 'mode', 'name', 'newlines', 'next', 'read', 'readinto', 'readline',
 'readlines', 'seek', 'softspace', 'tell', 'truncate', 'write', 'writelines',
 'xreadlines']

UPDATE
On page 161 of The Python Essential Reference it states

The same file can be opened more than once in the same program (or in different programs). Each instance of the open file has its own file pointer that can be manipulated independently.

So it does in fact seem to be safe, defined behavior.

Ryan Haining
  • In Python, each call to `open()` creates a new file object (an iterator), so you're safe. – Ashwini Chaudhary Jan 24 '13 at 21:01
  • I don't know of any platform where you could have a problem maintaining different seek pointers here. But… is it acceptable for your use case that your code may raise an exception opening `f2` in some cases on Windows, even though it never fails on Unix? – abarnert Jan 24 '13 at 21:58
  • @abarnert I only expect this to ever run on windows, and only when reading. Will opening on Windows acquire exclusive locks normally? – Ryan Haining Jan 25 '13 at 15:24
  • @xhainingx: IIRC, it will not acquire any locks, which means as long as no other code acquires any locks to the file, you're fine. However, I can't actually find that documented anywhere. [`fopen`](http://msdn.microsoft.com/en-us/library/yeby3zcb(v=vs.71).aspx) does not mention sharing at all. If you look inside, it's just calling [`_fsopen`](http://msdn.microsoft.com/en-us/library/8f30b0db(v=vs.71).aspx) with `_SH_DENYNO` (at least with VC7 and VC10), but nothing seems to guarantee that anywhere, or that the `shflag` here and `dwShareMode` in `CreateFile` interact the way you'd hope… – abarnert Jan 25 '13 at 19:28
  • @xhainingx: My suggestion would be to test and make sure it works, and, if you don't care about what happens when you interact with other apps trying to open the same file, don't worry about it. – abarnert Jan 25 '13 at 19:31
  • @abarnert let's say I'm only on unix, and I have two threads both reading from the same file at once, which is being written to by another thread. assuming that one thread reading works, is there any reason why adding a second reading thread wouldn't? – Ryan Haining Jan 26 '13 at 00:06
  • @xhainingx: On Unix, it's a lot simpler. POSIX defines exactly how things work, and basically, even if a particular Unix does have Windows-style locking, it's not allowed to use it here. Just as you can `tail -f foo` in 4 different terminals and they don't interfere with each other, you can open the file in 2 threads and read in both of them and they won't interfere with each other. – abarnert Jan 26 '13 at 00:57
  • @abarnert okay great! I'm basically implementing tail -f in python in multiple threads so that's perfect. thanks a lot – Ryan Haining Jan 26 '13 at 01:05
  • open twice for read, you mean, presumably. Not for write/truncate, or append. See also [open file for both reading and writing?](https://stackoverflow.com/questions/6648493/open-file-for-both-reading-and-writing), [Open file for reading and writing with truncate](https://stackoverflow.com/questions/33466635/open-file-for-reading-and-writing-with-truncate) – smci May 01 '18 at 10:43

1 Answer

On a modern OS (post-1969 for UNIX-like OSs, or post-2000 for Windows, and probably before that but I'm counting Win2K as the first "modern" Windows), each instance of an open file (that is, each separate open of the file, which gets its own file descriptor) has its own seek pointer. There is no magic in Python's file class that would cause instances to share state; file is a wrapper for an ordinary C file handle (a FILE *), which itself encapsulates an OS file descriptor, and the implementations of file.tell() and file.seek() call the corresponding C stdio functions. (For the messy details see CPython's fileobject.c.) There can be differences between the C library behavior and the underlying OS's behavior, but in this particular case that's not a factor.

If you're using IronPython or Jython, it's going to use the standard .Net or Java file object for its underlying implementation, which in turn is going to use the standard C library or OS implementation.

So your approach is fine unless you are somehow running Python on some non-standard OS with bizarre I/O behavior.
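As a quick check of the claim that each open gets its own seek pointer, here is a minimal sketch. It is written for Python 3 (the session in the question is Python 2) and uses binary mode so that tell() reports plain byte offsets. The helper names write_sample, positions_after_two_opens and position_after_dup are made up for this illustration. For contrast, the last helper duplicates a descriptor with os.dup(), which, unlike a second open(), refers to the same open file and therefore shares its offset.

```python
import os
import tempfile

# Same contents as file.txt in the question.
SAMPLE = b"first line!\nsecond\nthird\nfourth\nand fifth\n"


def write_sample(directory):
    """Create the sample file in `directory` and return its path."""
    path = os.path.join(directory, "file.txt")
    with open(path, "wb") as f:
        f.write(SAMPLE)
    return path


def positions_after_two_opens(path):
    """Open the file twice and read one line from the first handle only."""
    with open(path, "rb") as f1, open(path, "rb") as f2:
        f1.readline()
        # Each open() call has its own seek pointer, so f2 has not moved.
        return f1.tell(), f2.tell()


def position_after_dup(path):
    """Read via one descriptor, then report the offset seen by its duplicate."""
    fd1 = os.open(path, os.O_RDONLY)
    fd2 = os.dup(fd1)  # a duplicate of the same open file, not a new open
    try:
        os.read(fd1, 5)
        # lseek with offset 0 from the current position just reports it.
        return os.lseek(fd2, 0, os.SEEK_CUR)
    finally:
        os.close(fd1)
        os.close(fd2)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample_path = write_sample(tmp)
        print(positions_after_two_opens(sample_path))
        print(position_after_dup(sample_path))
```

The practical takeaway is that sharing only shows up when a descriptor is duplicated (or inherited); two independent open() calls on the same path do not interfere with each other's position.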

You may get unexpected results when writing if you don't flush in a timely manner; data can hang out in memory for some time before it actually hits the disk and is available to the other file descriptors you've opened on the same file. As abarnert points out in a comment, that's problematic anyway, except in very simple cases.
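The flush caveat can be seen with a small sketch like the one below. It is Python 3, and reads_around_flush is a hypothetical helper name. It assumes the default buffered binary mode, where a write this small stays in the writer's in-memory buffer until flush() is called; an unbuffered file or a very large write would behave differently.

```python
import os
import tempfile


def reads_around_flush(path):
    """Return what a second handle reads before and after the writer flushes."""
    with open(path, "wb") as writer, open(path, "rb") as reader:
        writer.write(b"hello\n")  # small write: held in the writer's buffer
        before = reader.read()    # nothing has reached the OS yet
        writer.flush()            # hand the buffered bytes to the OS
        after = reader.read()     # the same reader now sees the data
        return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(reads_around_flush(os.path.join(tmp, "log.txt")))
```

If every handle is opened read-only, as in the question, none of this applies.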

kindall
  • If you open all of the files read-only, the "flush" issue never comes up. And if you _don't_ open all the files read-only, you'll have more serious problems. – abarnert Jan 24 '13 at 21:48
  • More importantly, most of the details here are wrong. `file` is _not_ a wrapper for an ordinary file descriptor; it's a wrapper for a `FILE *`. And `tell` and `seek` don't call straight through to the corresponding low-level OS functions; they call the corresponding mid-level `stdio` functions. And this makes a very big difference on Windows, because the rules for how, e.g., file locking works in `fopen` vs. `CreateFile` are completely different. – abarnert Jan 24 '13 at 21:53