
Introduction

Suppose I have this C code:

#include <stdio.h>

// Of course, these functions are simplified for the purposes of this question.
// The actual functions are more complex and may receive additional arguments.

void printout() {
    puts("Hello");
}
void printhere(FILE* f) {
    fputs("Hello\n", f);
}

That I'm compiling as a shared object (DLL): gcc -Wall -std=c99 -fPIC -shared example.c -o example.so

And then I'm importing it into Python 3.x running inside Jupyter or IPython notebook:

import ctypes
example = ctypes.cdll.LoadLibrary('./example.so')

printout = example.printout
printout.argtypes = ()
printout.restype = None

printhere = example.printhere
printhere.argtypes = (ctypes.c_void_p,)  # Should have been FILE* instead
printhere.restype = None

Question

How can I execute both printout() and printhere() C functions (through ctypes) and get the output printed inside the Jupyter/IPython notebook?

If possible, I want to avoid writing more C code. I would prefer a pure-Python solution.

I also would prefer to avoid writing to a temporary file. Writing to a pipe/socket might be reasonable, though.

The expected state, the current state

If I type the following code in one Notebook cell:

print("Hi")           # Python-style print
printout()            # C-style print
printhere(something)  # C-style print
print("Bye")          # Python-style print

I want to get this output:

Hi
Hello
Hello
Bye

But, instead, I only get the Python-style output results inside the notebook. The C-style output gets printed to the terminal that started the notebook process.

Research

As far as I know, inside Jupyter/IPython notebook, the sys.stdout is not a wrapper to any file:

import sys

sys.stdout

# Output in command-line Python/IPython shell:
<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
# Output in IPython Notebook:
<IPython.kernel.zmq.iostream.OutStream at 0x7f39c6930438>
# Output in Jupyter:
<ipykernel.iostream.OutStream at 0x7f6dc8f2de80>

sys.stdout.fileno()

# Output in command-line Python/IPython shell:
1
# Output in Jupyter and IPython notebook:
UnsupportedOperation: IOStream has no fileno.
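The check above can be wrapped in a small helper. This is only a sketch: `stream_fd` is a hypothetical name, not something provided by IPython or Jupyter, and an in-memory `io.StringIO` stands in for the notebook's `OutStream`.

```python
import io
import os


def stream_fd(stream):
    """Return the stream's file descriptor, or None if it has none."""
    try:
        return stream.fileno()
    except (AttributeError, OSError, ValueError):
        # io.UnsupportedOperation subclasses both OSError and ValueError.
        return None


# An in-memory stream has no descriptor to redirect.
print(stream_fd(io.StringIO()))

# A stream opened on a real file does.
with open(os.devnull, "w") as f:
    print(stream_fd(f) is not None)
```

When the helper returns None, redirecting at the file-descriptor level has to target fd 1 directly rather than `sys.stdout.fileno()`.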

Related questions and links:

The following two links use similar solutions that involve creating a temporary file. However, care must be taken when implementing such a solution to make sure both Python-style and C-style output get printed in the correct order.

Is it possible to avoid a temporary file?

I tried finding a solution using C open_memstream() and assigning the returned FILE* to stdout, but it did not work because stdout cannot be assigned.

Then I tried getting the fileno() of the stream returned by open_memstream(), but I can't because it has no file descriptor.

Then I looked at freopen(), but its API requires passing a filename.

Then I looked at Python's standard library and found tempfile.SpooledTemporaryFile(), which is a temporary file-like object in memory. However, it gets written to the disk as soon as fileno() is called.
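The rollover behaviour can be observed with a short experiment. This sketch assumes CPython, because it peeks at the private `_rolled` attribute, which is an implementation detail and not part of the documented API.

```python
import tempfile

with tempfile.SpooledTemporaryFile(max_size=1024) as spooled:
    spooled.write(b"Hello\n")

    # Private CPython attribute (assumption): False while the data is in memory.
    in_memory_before = not spooled._rolled

    # Asking for a descriptor forces the data onto a real file.
    fd = spooled.fileno()
    rolled_after = spooled._rolled

print(in_memory_before, rolled_after)
```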

So far, I couldn't find any memory-only solution. Most likely, we will need to use a temporary file anyway. (Which is not a big deal, but just some extra overhead and extra cleanup that I'd prefer to avoid.)

It may be possible to use os.pipe(), but that seems difficult to do without forking.

Denilson Sá Maia
  • You can do something similar to the temporary file approach but making the stdout fd the write end of a pipe instead. Then a separate Python thread can pull data off the read end of the pipe, and send it to the redirected sys.stdout. For most practical cases, this will get the ordering close enough to be useful. If you need it to be more precise, you should set `sys.stdout` and `sys.stderr` back to the originals so that Python output goes through the pipe as well. – Thomas K Mar 02 '16 at 13:02
  • For reading `stdout`, the suggestion from @ThomasK should work. `os.dup2` the write end of the pipe to file 1. For `printhere`, you can call `libc.fdopen(1, b'wb')` to get a new `FILE` for fd 1. Set `restype` to an opaque `FILE` pointer, e.g. `class FILE(ctypes.Structure): pass;` `PFILE = ctypes.POINTER(FILE);` `libc.fdopen.restype = PFILE`. For the C lib, use `libc = ctypes.CDLL(ctypes.util.find_library('c'), use_errno=True)`. If the call fails (i.e. the result is boolean `False`), use `err = ctypes.get_errno();` `raise OSError(err, os.strerror(err))` to raise a formatted exception. – Eryk Sun Mar 02 '16 at 15:02
  • For jupyter/ipython, I guess a solution is to let `sys.stdout, sys.stderr = sys.__stdout__, sys.__stderr__` first, then it becomes the normal case. Finally set them back to stored `ipykernel.iostream.OutStream`s. – Syrtis Major May 14 '17 at 08:09
  • I think that for Jupyter users, @minrk has made what @ThomasK describes in his comment above easy with 'wurlitzer', see [here](https://notebook.community/minrk/wurlitzer/Demo). Easiest: run in a cell `%pip install wurlitzer` and then `%load_ext wurlitzer` (for classic notebooks, at least) before running your C code. (I found this OP while researching [this question](https://discourse.jupyter.org/t/when-calling-printf-by-ctypes-jupyter-does-not-show-output-from-printf/7319?u=fomightez) at the Jupyter Discourse Forum.) – Wayne Dec 31 '20 at 18:58
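The pipe-and-thread idea from the comments above could look roughly like the sketch below. It assumes a POSIX system; `capture_fd_via_pipe` is a hypothetical helper name, and `os.write(1, ...)` stands in for output produced by C code. It does not flush the C library's own stdio buffer, so real C output may also need an `fflush` before the descriptor is restored.

```python
import os
import sys
import threading
from contextlib import contextmanager


@contextmanager
def capture_fd_via_pipe(fd=1):
    """Redirect `fd` into a pipe and collect everything written to it."""
    chunks = []
    read_end, write_end = os.pipe()
    saved_fd = os.dup(fd)

    def drain():
        # Keeps the pipe from filling up; stops once the write end is closed.
        while True:
            data = os.read(read_end, 4096)
            if not data:
                break
            chunks.append(data)

    reader = threading.Thread(target=drain, daemon=True)
    sys.stdout.flush()
    os.dup2(write_end, fd)
    os.close(write_end)  # fd now holds the only copy of the write end
    reader.start()
    try:
        yield chunks
    finally:
        os.dup2(saved_fd, fd)  # drops the write end, so the reader sees EOF
        os.close(saved_fd)
        reader.join()
        os.close(read_end)


with capture_fd_via_pipe() as chunks:
    os.write(1, b"Hello\n")  # stands in for output written by C code

captured = b"".join(chunks)
sys.stdout.write(captured.decode())
```

The list is only guaranteed to be complete after the `with` block exits, because the reader thread is joined in the `finally` clause.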

2 Answers


I've finally developed a solution. It requires wrapping the entire cell inside a context manager (or wrapping only the C code). It also uses a temporary file, since I couldn't find any solution without using one.

The full notebook is available as a GitHub Gist: https://gist.github.com/denilsonsa/9c8f5c44bf2038fd000f


Part 1: Preparing the C library in Python

import ctypes
import ctypes.util  # find_library lives in this submodule

# use_errno parameter is optional, because I'm not checking errno anyway.
libc = ctypes.CDLL(ctypes.util.find_library('c'), use_errno=True)

class FILE(ctypes.Structure):
    pass

FILE_p = ctypes.POINTER(FILE)

# Alternatively, we can just use:
# FILE_p = ctypes.c_void_p

# These variables, defined inside the C library, are readonly.
cstdin = FILE_p.in_dll(libc, 'stdin')
cstdout = FILE_p.in_dll(libc, 'stdout')
cstderr = FILE_p.in_dll(libc, 'stderr')

# C function to disable buffering.
csetbuf = libc.setbuf
csetbuf.argtypes = (FILE_p, ctypes.c_char_p)
csetbuf.restype = None

# C function to flush the C library buffer.
cfflush = libc.fflush
cfflush.argtypes = (FILE_p,)
cfflush.restype = ctypes.c_int

Part 2: Building our own context manager to capture stdout

import io
import os
import sys
import tempfile
from contextlib import contextmanager

@contextmanager
def capture_c_stdout(encoding='utf8'):
    # Flushing, it's a good practice.
    sys.stdout.flush()
    cfflush(cstdout)

    # We need to use an actual file because we need the file descriptor number.
    with tempfile.TemporaryFile(buffering=0) as temp:
        # Saving a copy of the original stdout.
        prev_sys_stdout = sys.stdout
        prev_stdout_fd = os.dup(1)
        os.close(1)

        # Duplicating the temporary file fd into the stdout fd.
        # In other words, replacing the stdout.
        os.dup2(temp.fileno(), 1)

        # Replacing sys.stdout for Python code.
        #
        # IPython Notebook version of sys.stdout is actually an
        # in-memory OutStream, so it does not have a file descriptor.
        # We need to replace sys.stdout so that interleaved Python
        # and C output gets captured in the correct order.
        #
        # We enable line_buffering to force a flush after each line.
        # And write_through to force all data to be passed through the
        # wrapper directly into the binary temporary file.
        temp_wrapper = io.TextIOWrapper(
            temp, encoding=encoding, line_buffering=True, write_through=True)
        sys.stdout = temp_wrapper

        # Disabling buffering of C stdout.
        csetbuf(cstdout, None)

        yield

        # Must flush to clear the C library buffer.
        cfflush(cstdout)

        # Restoring stdout.
        os.dup2(prev_stdout_fd, 1)
        os.close(prev_stdout_fd)
        sys.stdout = prev_sys_stdout

        # Printing the captured output.
        temp_wrapper.seek(0)
        print(temp_wrapper.read(), end='')

Part Fun: Using it!

libfoo = ctypes.CDLL('./foo.so')

printout = libfoo.printout
printout.argtypes = ()
printout.restype = None

printhere = libfoo.printhere
printhere.argtypes = (FILE_p,)
printhere.restype = None


print('Python Before capturing')
printout()  # Not captured, goes to the terminal

with capture_c_stdout():
    print('Python First')
    printout()
    print('Python Second')
    printhere(cstdout)
    print('Python Third')

print('Python After capturing')
printout()  # Not captured, goes to the terminal

Output:

Python Before capturing
Python First
C printout puts
Python Second
C printhere fputs
Python Third
Python After capturing

Credits and further work

This solution is the fruit of reading all the links in the question, plus a lot of trial and error.

This solution only redirects stdout; it could be interesting to redirect both stdout and stderr. For now, I'm leaving this as an exercise to the reader. ;)

Also, there is no exception handling in this solution (at least not yet).
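A minimal sketch of how the restore step could be protected with try/finally is shown below. It is not the code from the Gist: it only handles the file-descriptor part (the `sys.stdout` swap is omitted), `capture_fd1_safely` is a hypothetical name, and it assumes a POSIX system where `ctypes.CDLL(None)` exposes the C library already loaded into the process. It calls `fflush(NULL)`, which in standard C flushes every open output stream, so it does not need the `stdout` symbol.

```python
import ctypes
import os
import sys
import tempfile
from contextlib import contextmanager

# Assumption: POSIX. CDLL(None) gives access to the already-loaded C library.
libc = ctypes.CDLL(None)


@contextmanager
def capture_fd1_safely():
    captured = bytearray()
    sys.stdout.flush()
    libc.fflush(None)  # fflush(NULL): flush all C output streams
    saved_fd = os.dup(1)
    with tempfile.TemporaryFile() as temp:
        os.dup2(temp.fileno(), 1)
        try:
            yield captured
        finally:
            # Runs even when the body raises, so fd 1 is always restored.
            libc.fflush(None)
            os.dup2(saved_fd, 1)
            os.close(saved_fd)
            temp.seek(0)
            captured.extend(temp.read())


try:
    with capture_fd1_safely() as captured:
        libc.puts(b"Hello")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

sys.stdout.write(captured.decode())
```

The buffer is filled in the `finally` clause, so it should be read only after the `with` block has exited.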

Denilson Sá Maia
  • You can also look 'python.boost' to combine C++ with python. – Konstantin Purtov Mar 02 '16 at 18:48
  • `cstdin = FILE_p.in_dll(libc, 'stdin')`: this is non-portable, so I hope you're only planning to support your development (and similar) platforms. You can't use this on Windows. – Eryk Sun Mar 02 '16 at 18:59
  • You should be able to use the write end of a pipe from `os.pipe` instead of a temp file. But a pipe needs a worker thread to read it into the capture buffer, else it blocks when full. – Eryk Sun Mar 02 '16 at 19:08
  • @eryksun: Indeed, this code works on Linux and maybe other POSIX systems. Please, feel free to suggest alternatives to make it more portable. To be honest, I don't even know what makes it not working on Windows. Also, if you can find a way to use `os.pipe` and a worker thread, please, submit another answer! I tried using `os.pipe` and `os.fork`, but the results were disastrous (but I may have done something wrong; and I don't have that non-working code anymore). – Denilson Sá Maia Mar 03 '16 at 01:45
  • It's not just a matter of Windows. The C standard doesn't specify that `stdin`, `stdout`, and `stderr` are symbols that can be accessed via `dlsym`, `GetProcAddress`, and so on. So each platform's C runtime is free to implement this as it sees fit. For Windows, prior to the Universal CRT (used by 3.5+) you had to know the layout of a `FILE`, or at least its size. With the new universal CRT it's done via an `__acrt_iob_func` function that takes the file number 0, 1, or 2 and returns the `FILE` pointer. – Eryk Sun Mar 03 '16 at 02:49
  • Also, `find_library('c')` no longer works for Windows Python 3.5+. It returns `None`. You have to hard code `libc = CDLL('ucrtbase', use_errno=True)`. – Eryk Sun Mar 03 '16 at 02:56
  • To get the correct library on Windows, see @ErykSun's excellent answer to https://stackoverflow.com/questions/17942874/stdout-redirection-with-ctypes. When there are multiple possible files this can be confusing. In my case I tried `msvcrt` first; although it had `fflush`, the flushing commands did not work since it was not the C runtime used by the DLL – prusswan Sep 09 '20 at 21:10
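One way to avoid looking up the `stdout` symbol is the `fdopen` approach suggested in the comments on the question. The sketch below assumes a POSIX system; `c_stream_for_fd` is a hypothetical helper, and the fallback to `CDLL(None)` when `find_library` returns nothing is an added assumption. The descriptor is duplicated first so that closing the stream does not close the original descriptor.

```python
import ctypes
import ctypes.util
import os

# Assumption: POSIX. Fall back to the symbols already loaded into the process
# if find_library cannot locate the C library.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None, use_errno=True)


class FILE(ctypes.Structure):
    pass  # opaque: only pointers to it are ever used


FILE_p = ctypes.POINTER(FILE)

libc.fdopen.argtypes = (ctypes.c_int, ctypes.c_char_p)
libc.fdopen.restype = FILE_p
libc.fputs.argtypes = (ctypes.c_char_p, FILE_p)
libc.fputs.restype = ctypes.c_int
libc.fclose.argtypes = (FILE_p,)
libc.fclose.restype = ctypes.c_int


def c_stream_for_fd(fd, mode=b"w"):
    """Return a new FILE* wrapping a duplicate of `fd`."""
    dup_fd = os.dup(fd)
    stream = libc.fdopen(dup_fd, mode)
    if not stream:  # a NULL pointer is falsy
        err = ctypes.get_errno()
        os.close(dup_fd)
        raise OSError(err, os.strerror(err))
    return stream


# Demonstration on a pipe, so the result can be read back.
read_end, write_end = os.pipe()
stream = c_stream_for_fd(write_end)
libc.fputs(b"Hello\n", stream)
libc.fclose(stream)  # flushes, then closes only the duplicate
os.close(write_end)
received = os.read(read_end, 100)
os.close(read_end)
print(received)
```

With the library from the question, a call such as `printhere(c_stream_for_fd(1))` would then write to whatever fd 1 currently points at, provided `printhere.argtypes` is set to `(FILE_p,)`.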

I spent a whole afternoon revising it for Python 2. It's tricky; the key is to reopen the temp file with io.open. Then I tried a better solution: just write a Logger class for Python's stdout.

# -*- coding: utf-8 -*-

import ctypes
# from ctypes import *
from ctypes import util

# use_errno parameter is optional, because I'm not checking errno anyway.
libraryC = ctypes.util.find_library('c')
libc = ctypes.CDLL(libraryC, use_errno=True)


# libc = cdll.msvcrt


class FILE(ctypes.Structure):
    pass


FILE_p = ctypes.POINTER(FILE)

# Alternatively, we can just use:
# FILE_p = ctypes.c_void_p

# These variables, defined inside the C library, are readonly.
##cstdin = FILE_p.in_dll(libc, 'stdin')
##cstdout = FILE_p.in_dll(libc, 'stdout')
##cstderr = FILE_p.in_dll(libc, 'stderr')

# C function to disable buffering.
csetbuf = libc.setbuf
csetbuf.argtypes = (FILE_p, ctypes.c_char_p)
csetbuf.restype = None

# C function to flush the C library buffer.
cfflush = libc.fflush
cfflush.argtypes = (FILE_p,)
cfflush.restype = ctypes.c_int

import io
import os
import sys
import tempfile
from contextlib import contextmanager
#import cStringIO


def read_as_encoding(fileno, encoding="utf-8"):
    fp = io.open(fileno, mode="r+", encoding=encoding, closefd=False)
    return fp


class Logger(object):
    def __init__(self, file, encoding='utf-8'):
        self.file = file
        self.encoding = encoding

    def write(self, message):
        self.file.flush()  # Need to flush
        # python2 temp file is always binary
        # msg_unicode = message.('utf-8')
        self.file.write(message)


@contextmanager
def capture_c_stdout(on_output, on_error=None, encoding='utf8'):
    # Flushing, it's a good practice.
    sys.stdout.flush()
    sys.stderr.flush()
    ##cfflush(cstdout)
    # cfflush(cstdcerr)

    # We need to use an actual file because we need the file descriptor number.
    with tempfile.NamedTemporaryFile() as temp:
        with tempfile.NamedTemporaryFile() as temp_err:
            # print "TempName:", temp.name
            # print "TempErrName:", temp_err.name

            # Saving a copy of the original stdout.
            prev_sys_stdout = sys.stdout
            prev_stdout_fd = os.dup(1)
            os.close(1)
            # Duplicating the temporary file fd into the stdout fd.
            # In other words, replacing the stdout.
            os.dup2(temp.fileno(), 1)

            if on_error:
                prev_sys_stderr = sys.stderr
                prev_stderr_fd = os.dup(2)
                os.close(2)
                os.dup2(temp_err.fileno(), 2)

            # Replacing sys.stdout for Python code.
            #
            # IPython Notebook version of sys.stdout is actually an
            # in-memory OutStream, so it does not have a file descriptor.
            # We need to replace sys.stdout so that interleaved Python
            # and C output gets captured in the correct order.
            #
            # We enable line_buffering to force a flush after each line.
            # And write_through to force all data to be passed through the
            # wrapper directly into the binary temporary file.
            # No need for TextIOWrapper in Python 2: according to the official
            # documentation, the temp file is always binary there.
            ##temp_wrapper = io.TextIOWrapper(
            ##   read_as_encoding(temp.fileno(), encoding=encoding), encoding=encoding, line_buffering=True) ##, write_through=True)

            # temp_wrapper_python = io.TextIOWrapper(
            #    read_as_encoding(temp.fileno(), encoding=encoding), encoding='ascii', line_buffering=True)
            temp_wrapper_python = Logger(temp, encoding=encoding)
            sys.stdout = temp_wrapper_python

            if on_error:
                # temp_wrapper_err = io.TextIOWrapper(
                #   read_as_encoding(temp_err.fileno(), encoding=encoding), encoding=encoding, line_buffering=True) ##, write_through=True)
                temp_wrapper_python_err = Logger(temp_err, encoding=encoding)
                # string_str_err = cStringIO.StringIO()
                sys.stderr = temp_wrapper_python_err

            # Disabling buffering of C stdout.
            ##csetbuf(cstdout, None)

            yield

            # Must flush to clear the C library buffer.
            ##cfflush(cstdout)

            # Restoring stdout.
            os.dup2(prev_stdout_fd, 1)
            os.close(prev_stdout_fd)
            sys.stdout = prev_sys_stdout

            if on_error:
                os.dup2(prev_stderr_fd, 2)
                os.close(prev_stderr_fd)
                sys.stderr = prev_sys_stderr

            # Printing the captured output.
            # temp_wrapper.seek(0)
            # print "Reading: "
            # print temp_wrapper.read()
            if on_output:
                temp.flush()
                temp.seek(0)
                on_output(temp.read())
            temp.close()

            if on_error:
                temp_err.flush()
                temp_err.seek(0)
                on_error(temp_err.read())
                temp_err.close()


import repo_checker_cpp


def on_capture_output(input_stream):
    if input_stream:
        print "Here is captured stdout: \n", input_stream


def on_capture_err(input_stream):
    if input_stream:
        print "Here is captured stderr: \n", input_stream


if __name__ == '__main__':
    with capture_c_stdout(on_capture_output, on_capture_err) as custom_output:  # redirection here
        # repo_checker_cpp is a ctypes.CDll module
        print >> sys.stderr, "Hello World in python err\n"
        repo_checker_cpp.test_exception()  # throws an exception and captures it inside the cpp module, then outputs to std::cerr
        print "Hello World in python\n"
        repo_checker_cpp.hello_world()  # simple std::cout << "Hello World" << std::endl; std::cerr << "Hello World in cerr" << std::endl;


I couldn't get lines like cstdin = FILE_p.in_dll(libc, 'stdin') to work. I commented them out with ## to indicate that they were originally written by Denilson. Thank you, Denilson, for your work.

It works fine on my Windows 10 + Python 2.7 setup, and outputs:

Here is captured stdout: 
Hello World in python
Hello World(C++)


Here is captured stderr: 
Hello World in python err
RepoCheckCpp_TestException, Reason: ensure failed : false
xxxxx\repocheckercpp.cpp(38)
context variables:
    error : This is a test exception


Hello World(C++) in cerr

Everything is captured perfectly.

Alen Wesker
  • Plus, you can also refer to https://stackoverflow.com/questions/24277488/in-python-how-to-capture-the-stdout-from-a-c-shared-library-to-a-variable and https://eli.thegreenplace.net/2015/redirecting-all-kinds-of-stdout-in-python/ But I don't like the idea of using a pipe, because if you do not drain the pipe in time, it will block the original C process. Also, Python has the GIL, so the pipe solution may work for a less complicated project. Since I have run into the blocking problem, I prefer using a temp file instead of a pipe. – Alen Wesker Jun 27 '19 at 03:50
  • Was able to get your version to work with Python 3 + cffi on Windows 10. Somehow Denilson's version did not work for me – prusswan Sep 09 '20 at 17:38
  • Also, flushing is necessary depending on the behavior of the DLL you are working with, otherwise not all the output will be redirected. For current versions of Python (3.5+) on Windows, refer to https://stackoverflow.com/questions/17942874/stdout-redirection-with-ctypes on which C runtime to load in order for `libc.fflush` to work. – prusswan Sep 09 '20 at 21:17