140

I'm using a Python library that does something to an object

do_something(my_object)

and changes it. While doing so, it prints some statistics to stdout, and I'd like to get hold of this information. The proper solution would be to change do_something() to return the relevant information,

out = do_something(my_object)

but it will be a while before the devs of do_something() get to this issue. As a workaround, I thought about parsing whatever do_something() writes to stdout.

How can I capture stdout output between two points in the code, e.g.,

start_capturing()
do_something(my_object)
out = end_capturing()

?

Nico Schlömer

3 Answers

216

Try this context manager:

from io import StringIO 
import sys

class Capturing(list):
    def __enter__(self):
        self._stdout = sys.stdout
        sys.stdout = self._stringio = StringIO()
        return self
    def __exit__(self, *args):
        self.extend(self._stringio.getvalue().splitlines())
        del self._stringio    # free up some memory
        sys.stdout = self._stdout

Usage:

with Capturing() as output:
    do_something(my_object)

output is now a list containing the lines printed by the function call.

Advanced usage:

What may not be obvious is that this can be done more than once and the results concatenated:

with Capturing() as output:
    print('hello world')

print('displays on screen')

with Capturing(output) as output:  # note the constructor argument
    print('hello world2')

print('done')
print('output:', output)

Output:

displays on screen                     
done                                   
output: ['hello world', 'hello world2']

Update: Python 3.4 added redirect_stdout() to contextlib, and redirect_stderr() followed in Python 3.5. You can use io.StringIO with these to achieve a similar result (though Capturing being a list as well as a context manager is arguably more convenient).
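As a minimal sketch of that approach, the helper below captures stdout and stderr separately and returns each as a list of lines, similar to what Capturing produces. The names capture_output and noisy are hypothetical, with noisy standing in for do_something(); redirect_stderr requires Python 3.5 or later.

```python
import io
import sys
from contextlib import redirect_stderr, redirect_stdout


def capture_output(func, *args, **kwargs):
    """Call func and return (result, stdout_lines, stderr_lines)."""
    out, err = io.StringIO(), io.StringIO()
    # Both redirections are undone when the with block exits,
    # even if func raises.
    with redirect_stdout(out), redirect_stderr(err):
        result = func(*args, **kwargs)
    return result, out.getvalue().splitlines(), err.getvalue().splitlines()


def noisy():
    # Hypothetical stand-in for do_something().
    print("to stdout")
    print("to stderr", file=sys.stderr)
    return 42


if __name__ == "__main__":
    print(capture_output(noisy))
```

Like Capturing, this only sees output that goes through Python's sys.stdout and sys.stderr objects.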

Antti Haapala
kindall
  • Thanks! And thanks for adding the advanced section... I originally used a slice assignment to stick the captured text into the list, then I bonked myself in the head and used `.extend()` instead so it could be used concatenatively, just as you noticed. :-) – kindall May 15 '13 at 21:58
  • P.S. If it's going to be used repeatedly, I'd suggest adding a `self._stringio.truncate(0)` after the `self.extend()` call in the `__exit__()` method to release some of the memory held by the `_stringio` member. – martineau May 16 '13 at 00:43
  • @martineau A new `StringIO` instance will be created each time the context manager is used anyway, and the old one will go away at that time, but you could set `self._stringio` to `None` (or `del` it) in `__exit__()` if the interim memory usage is a problem. – kindall May 16 '13 at 14:59
  • Oh, yeah, that's right...my mistake. The previous contents were interfering with something experimental I was doing to the first instance before the second was created, which (mis)led me into thinking the results would accumulate. The fact that both have the same name compounded the illusion. ;-) – martineau May 16 '13 at 15:27
  • @kindall That's a very smart solution! Works like a charm, even if in my case I had to redirect `stderr` as well, as I wanted to capture the `argparse` error message. – Joël Mar 26 '14 at 14:27
  • Great answer, thanks. For Python 3, use `from io import StringIO` instead of first line in context manager. – Wtower Nov 02 '15 at 12:33
  • Is this thread-safe? What happens if some other thread/call uses print() while do_something runs? – Derorrist Nov 17 '15 at 09:42
  • Answering myself.. Not thread-safe. If another thread calls for print, it will trickle into 'output' – Derorrist Nov 17 '15 at 10:27
  • Yep, not thread-safe, although printing to stdout can itself get a little wonky with threads anyway. – kindall Nov 17 '15 at 17:16
  • Unfortunately this doesn't work with C modules (such as `fontforge`) – Clément Dec 05 '15 at 17:45
  • @kindall, this is a great idea. Thank you for this! – kennes Jun 16 '16 at 20:34
  • Is the `del` statement necessary? I would have thought that after `__exit__` is finished, everything goes out of scope and will be promptly garbage collected. – user1071847 Oct 18 '17 at 13:19
  • Why would it go out of scope? – kindall Oct 18 '17 at 15:56
  • This answer will not work for output from C shared libraries, see [this answer](https://stackoverflow.com/questions/24277488/in-python-how-to-capture-the-stdout-from-a-c-shared-library-to-a-variable/29834357) instead. – craymichael Aug 13 '18 at 21:03
  • Fantastic, exactly what I needed for my use case. – Btibert3 Oct 18 '20 at 02:27
  • is it possible to send the `output ` to another object in real-time ? – Pablo Apr 28 '21 at 20:17
115

In Python >= 3.4, contextlib contains a redirect_stdout context manager. It can be used to answer your question like so:

import io
from contextlib import redirect_stdout

f = io.StringIO()
with redirect_stdout(f):
    do_something(my_object)
out = f.getvalue()
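Once the text is in a string, the statistics can be parsed out of it. The sketch below assumes the function prints lines of the form "name: number"; the real output format of do_something() is not given in the question, so the stand-in function, the regular expression, and the name capture_stats are all hypothetical.

```python
import io
import re
from contextlib import redirect_stdout


def do_something(obj):
    # Hypothetical stand-in: modifies obj and prints some statistics.
    obj.append(0)
    print("iterations: 12")
    print("residual: 3.5e-07")


def capture_stats(func, *args):
    """Run func(*args) and return the 'name: number' lines it printed as a dict."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        func(*args)
    stats = {}
    for line in buf.getvalue().splitlines():
        match = re.fullmatch(r"\s*(\w+)\s*:\s*(\S+)\s*", line)
        if match:
            # Assumes every value is numeric; float() raises ValueError otherwise.
            stats[match.group(1)] = float(match.group(2))
    return stats


if __name__ == "__main__":
    print(capture_stats(do_something, []))
```

Parsing printed text is fragile by nature, so the pattern would need adjusting whenever the library changes its output.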

From the docs:

Context manager for temporarily redirecting sys.stdout to another file or file-like object.

This tool adds flexibility to existing functions or classes whose output is hardwired to stdout.

For example, the output of help() normally is sent to sys.stdout. You can capture that output in a string by redirecting the output to an io.StringIO object:

f = io.StringIO()
with redirect_stdout(f):
    help(pow)
s = f.getvalue()

To send the output of help() to a file on disk, redirect the output to a regular file:

with open('help.txt', 'w') as f:
    with redirect_stdout(f):
        help(pow)

To send the output of help() to sys.stderr:

with redirect_stdout(sys.stderr):
    help(pow)

Note that the global side effect on sys.stdout means that this context manager is not suitable for use in library code and most threaded applications. It also has no effect on the output of subprocesses. However, it is still a useful approach for many utility scripts.

This context manager is reentrant.
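The limitation regarding subprocesses also applies to C extensions that write to file descriptor 1 directly, because redirect_stdout only swaps the Python-level sys.stdout object. A possible workaround is to redirect the descriptor itself with os.dup2. The following is a sketch under that assumption, not part of contextlib; the name capture_fd_stdout is hypothetical, and like redirect_stdout it changes process-wide state, so it is not thread-safe.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def capture_fd_stdout():
    """Redirect file descriptor 1 to a temporary file.

    Yields a list; on exit, the captured text is appended to it as one string.
    """
    captured = []
    sys.stdout.flush()        # push out anything Python has buffered so far
    saved_fd = os.dup(1)      # keep a handle on the real stdout
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), 1)  # fd 1 now points at the temporary file
        try:
            yield captured
        finally:
            sys.stdout.flush()
            os.dup2(saved_fd, 1)  # restore the real stdout
            os.close(saved_fd)
            tmp.seek(0)
            captured.append(tmp.read().decode(errors="replace"))


if __name__ == "__main__":
    with capture_fd_stdout() as captured:
        # Bypasses sys.stdout entirely, as a C library would.
        os.write(1, b"written straight to fd 1\n")
    print(captured)
```

Output from print() is also captured this way as long as sys.stdout is still the default object backed by descriptor 1.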

ForeverWintr
  • when tried `f = io.StringIO() with redirect_stdout(f): logger = getLogger('test_logger') logger.debug('Test debug message') out = f.getvalue() self.assertEqual(out, 'DEBUG:test_logger:Test debug message')` . It gives me an error: `AssertionError: '' != 'Test debug message'` – Eziz Durdyyev Dec 12 '19 at 15:19
  • which means i did something wrong or it could not catch stdout log. – Eziz Durdyyev Dec 12 '19 at 15:21
  • @EzizDurdyyev, `logger.debug` doesn't write to stdout by default. If you replace your log call with `print()` you should see the message. – ForeverWintr Dec 12 '19 at 20:51
  • Yeah, I know, but I do make it write to stdout like so: `stream_handler = logging.StreamHandler(sys.stdout)`. And add that handler to my logger. so it should write to stdout and `redirect_stdout` should catch it, right? – Eziz Durdyyev Dec 13 '19 at 09:55
  • I suspect the issue is with the way you've configured your logger. I would verify that it prints to stdout without the redirect_stdout. If it does, maybe the buffer isn't being flushed until the context manager exits. – ForeverWintr Dec 13 '19 at 17:14
  • When your code does not redirect either std_out or std_err (I mean: I still have output to console and nothing is saved to disk), what does it mean? – Antonio Sesto May 14 '21 at 10:03
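One possible explanation for the logging behavior discussed in the comments above (an assumption, since the commenter's full setup is not shown): logging.StreamHandler keeps a reference to the stream object it was given at construction time. A handler created with sys.stdout before the redirect keeps writing to the original stream, so nothing reaches the StringIO. The sketch below contrasts the two orderings; the function names are hypothetical.

```python
import io
import logging
import sys
from contextlib import redirect_stdout

logger = logging.getLogger("test_logger")
logger.setLevel(logging.DEBUG)  # the default effective level would drop debug()
logger.propagate = False        # keep the root logger out of the picture


def handler_created_before_redirect():
    handler = logging.StreamHandler(sys.stdout)  # bound to the original stdout
    logger.addHandler(handler)
    buf = io.StringIO()
    try:
        with redirect_stdout(buf):
            logger.debug("Test debug message")   # goes to the original stdout
    finally:
        logger.removeHandler(handler)
    return buf.getvalue()


def handler_created_inside_redirect():
    buf = io.StringIO()
    with redirect_stdout(buf):
        handler = logging.StreamHandler(sys.stdout)  # sys.stdout is buf here
        logger.addHandler(handler)
        try:
            logger.debug("Test debug message")
        finally:
            logger.removeHandler(handler)
    return buf.getvalue()
```

Passing the StringIO to StreamHandler directly avoids the ordering problem altogether and needs no redirect.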
1

Here is a solution that uses an OS pipe and a background reader thread to deliver each captured line to a callback as it is written.

import threading
import sys
import os

class Capturing():
    def __init__(self):
        self._stdout = None
        self._stderr = None
        self._r = None
        self._w = None
        self._thread = None
        self._on_readline_cb = None

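    def _handler(self):
        # Background thread: forward each line to the callback until the
        # write end is closed, at which point readline() returns '' (EOF).
        try:
            while True:
                line = self._r.readline()
                if not line:
                    break
                if self._on_readline_cb:
                    self._on_readline_cb(line)
        except Exception:
            # Stop reading on any error instead of crashing the thread.
            pass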

    def print(self, s, end=""):
        print(s, file=self._stdout, end=end)

    def on_readline(self, callback):
        self._on_readline_cb = callback

    def start(self):
        self._stdout = sys.stdout
        self._stderr = sys.stderr
        r, w = os.pipe()
        r, w = os.fdopen(r, 'r'), os.fdopen(w, 'w', 1)
        self._r = r
        self._w = w
        sys.stdout = self._w
        sys.stderr = self._w
        self._thread = threading.Thread(target=self._handler)
        self._thread.start()

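    def stop(self):
        # Restore the original streams first so nothing writes to a closed pipe.
        sys.stdout = self._stdout
        sys.stderr = self._stderr
        self._w.close()  # signals EOF to the reader thread
        if self._thread: self._thread.join()
        self._r.close()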

Example usage:

from Capturing import Capturing  # assumes the class above is saved as Capturing.py
import time

capturing = Capturing()

def on_read(line):
    # do something with the line
    capturing.print("got line: "+line)

capturing.on_readline(on_read)
capturing.start()
print("hello 1")
time.sleep(1)
print("hello 2")
time.sleep(1)
print("hello 3")
capturing.stop()
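If only Python-level output needs to be streamed, a lighter alternative that avoids the pipe and the thread is a file-like object that forwards each completed line to a callback. This is a sketch of a different technique from the class above; CallbackWriter and flush_pending are hypothetical names, and the callback runs synchronously in whichever thread calls print().

```python
import io
from contextlib import redirect_stdout


class CallbackWriter(io.TextIOBase):
    """File-like object that passes each completed line to a callback."""

    def __init__(self, callback):
        super().__init__()
        self._callback = callback
        self._pending = ""  # text received so far without a trailing newline

    def writable(self):
        return True

    def write(self, text):
        self._pending += text
        while "\n" in self._pending:
            line, self._pending = self._pending.split("\n", 1)
            self._callback(line)
        return len(text)

    def flush_pending(self):
        # Deliver a trailing partial line, if any.
        if self._pending:
            self._callback(self._pending)
            self._pending = ""


if __name__ == "__main__":
    lines = []
    with redirect_stdout(CallbackWriter(lines.append)) as writer:
        print("hello 1")
        print("hello 2")
        writer.flush_pending()
    print(lines)
```

Neither this nor the pipe-based class sees output written directly to file descriptor 1, since both replace only sys.stdout.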
miXo