77

So for creating files I use the following:

fileHandle = open('fileName', 'w')

I then write the contents to the file and close it. In the next step I process the file. At the end of the program, I am left with a "physical file" that I need to delete.

Is there a way to write a "virtual" file that behaves exactly like a "physical" one (allowing it to be manipulated the same way) but does not exist at the end of the run in Python?

martineau
Steve Grafton

5 Answers

75

You might want to consider using tempfile.SpooledTemporaryFile, which gives you the best of both worlds: it starts out as a memory-based virtual file but automatically switches to a physical disk-based file if the data held in memory exceeds a specified size.

Another nice feature is that (when using memory) it automatically uses either an io.BytesIO or an io.StringIO depending on the mode, so you can read and write either Unicode strings or binary data (bytes).

The only tricky part is that you need to avoid closing the file between steps, because doing so would cause it to be deleted from memory or disk. Instead, you can rewind it to the beginning by calling its seek(0) method.

When you are completely done with the file and close it, it will automatically be deleted from disk if the amount of data in it caused it to be rolled-over to a physical file.
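
As a minimal sketch of the write, rewind, and process workflow described above (the `max_size` value of 1024 bytes and the sample text are arbitrary assumptions, not values from the question):

```python
import tempfile

# max_size is the in-memory threshold in bytes; 1024 is an arbitrary choice.
with tempfile.SpooledTemporaryFile(max_size=1024, mode='w+') as tmp:
    # Step 1: write the contents (kept in memory while under max_size).
    tmp.write('first line\n')
    tmp.write('second line\n')

    # Step 2: rewind instead of closing, then process the contents.
    tmp.seek(0)
    lines = [line.rstrip('\n') for line in tmp]

# Leaving the with block closes the file and discards its data.
print(lines)
```

Using the file as a context manager means it is closed, and therefore cleaned up, even if the processing step raises an exception.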

martineau
  • Relevant examples: https://stackoverflow.com/questions/8577137/creating-a-tmp-file-in-python – Anton Tarasenko Oct 24 '17 at 14:33
  • 3
    Coming here from another question, it is worth noting that there is no filename for this temporary in-memory file (one needs to operate on the handler). The solution is great for OP usage though (+1). What is unfortunately missing in the module is `tempfile.NamedSpooledTemporaryFile()` (a combination of `NamedTemporaryFile()` and `SpooledTemporaryFile()`) – WoJ Nov 09 '18 at 14:16
  • @WoJ: Thanks for the +1. I think the reason there's no `tempfile.NamedSpooledTemporaryFile()` is that it doesn't make sense to say something that starts out as a memory-based virtual file could be guaranteed to have a visible name in the file system, although it _might_ have one at some point should its size exceed the specified `max_size` threshold. The source code for the `tempfile` module is in `python/Lib/tempfile.py`, which might be helpful if you wanted to implement something yourself with the desired behavior (whatever that might be when the data is currently in memory). – martineau Nov 09 '18 at 18:37
53

You have StringIO and BytesIO in the io module.

StringIO behaves like a file opened in text mode, reading and writing unicode strings (equivalent to opening a file with io.open(filename, mode, encoding='...')). BytesIO behaves like a file opened in binary mode (mode='[rw]b') and can read and write bytes.

Python 2:

In [4]: f = io.BytesIO('test')
In [5]: type(f.read())
Out[5]: str
In [6]: f = io.StringIO(u'test')
In [7]: type(f.read())
Out[7]: unicode

Python 3:

In [2]: f = io.BytesIO(b'test')
In [3]: type(f.read())
Out[3]: builtins.bytes
In [4]: f = io.StringIO('test')
In [5]: type(f.read())
Out[5]: builtins.str
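
To illustrate that such an object can stand in for a real file in a later processing step, here is a small Python 3 sketch. The `count_words` function is a hypothetical processing step invented for this example; it is not from the question.

```python
import io

def count_words(file_obj):
    # Hypothetical processing step; accepts any text-mode file-like object.
    return sum(len(line.split()) for line in file_obj)

virtual = io.StringIO()
virtual.write('one two\nthree\n')
virtual.seek(0)  # rewind before reading, as you would with a real file
total = count_words(virtual)
virtual.close()  # discards the in-memory buffer

print(total)
```

The same function would work unchanged on a handle returned by open(), since it only relies on iterating over lines.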
Viktor Kerkez
  • 11
    It should be noted that should you need to interface with code that needs filenames, then: [If all your legacy code can take is a filename, then a `StringIO` instance is not the way to go. Use the `tempfile` module to generate a temporary filename instead.](http://stackoverflow.com/questions/11892623/python-stringio-and-compatibility-with-with-statement-context-manager/11892712#11892712) – sdaau Jun 29 '14 at 20:17
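
A rough sketch of the filename-based approach mentioned in this comment, assuming the legacy code only accepts a path. `legacy_count_lines` is a made-up stand-in for such code, and the `.txt` suffix is an arbitrary choice.

```python
import os
import tempfile

def legacy_count_lines(path):
    # Hypothetical stand-in for code that only accepts a filename.
    with open(path) as f:
        return sum(1 for _ in f)

# delete=False keeps the file on disk after close(), so it can be reopened
# by name (reopening a still-open temporary file may not work on Windows).
tmp = tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False)
try:
    tmp.write('a\nb\nc\n')
    tmp.close()
    count = legacy_count_lines(tmp.name)
finally:
    tmp.close()          # harmless if the file is already closed
    os.remove(tmp.name)  # explicit cleanup, since delete=False

print(count)
```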
13

You can use StringIO as a virtual file. This example is adapted from the official documentation:

from io import StringIO

output = StringIO()
output.write('First line.\n')
print('Second line.', file=output)

# Retrieve file contents -- this will be
# 'First line.\nSecond line.\n'
contents = output.getvalue()

# Close object and discard memory buffer --
# .getvalue() will now raise an exception.
output.close()
Max Schmitt
0

There is the StringIO module (io.StringIO in Python 3). Read its documentation; it should be easy to use.

Bear in mind, though, that this would keep the "file's" contents in memory. If you have too much data, it would probably be better to create a real file, e.g. in /tmp, and delete it afterwards.
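
One possible way to get a real file that is deleted afterwards, sketched under the assumption that a system-chosen temporary directory is acceptable instead of a hard-coded /tmp. The file name `data.txt` is purely illustrative.

```python
import os
import tempfile

# The directory and everything inside it are removed when the block exits.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.txt')  # hypothetical file name

    with open(path, 'w') as f:
        f.write('data that may be too large to keep in memory\n')

    # The file can be closed and reopened by name as often as needed.
    with open(path) as f:
        size = len(f.read())

still_there = os.path.exists(path)
print(size, still_there)
```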

nickie
0

If you mean writing to memory instead of a file, you can simply write the text to a buffer and use the following function:

def write(text):
  global buffer
  buffer += text + '\n'  # Add a linefeed as you would if you were writing to a file

buffer = ""  # Initialize the buffer
write("My name is Steve Grafton")

At the end, you will have a buffer that is the same as if you had written your data to a file, then opened the file and read all its contents into a buffer. Moreover, you can use the buffer during the process (before you have finished writing) and search in it, as if you had created a file for both reading and writing, only that in this case your pointer will

Apostolos