1

I'm learning subprocess, but I'm a little confused by this code:

import subprocess

proc = subprocess.Popen('lspci', stdout=subprocess.PIPE)
for line in proc.stdout:
    print(line)

Output:

b'00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller (rev 09)\n'
b'00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09)\n'

As you can see, the output is formatted. But I don't know why there is a b'' around each line and a \n at the end.

If I run this command in my terminal, these characters don't appear.

Normal output:

00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09)

How could I remove them?

Casimir Crystal
skami
  • You could try changing `print(line)` to `print(type(line))` and `print(type(line.decode()))` and look at the output. As the answers say, `line` is a bytes object. – Casimir Crystal Oct 30 '15 at 12:13

3 Answers

2

You're probably using Python 3. It separates binary data from text, so a pipe opened in the default binary mode gives you real bytes objects rather than strings. To get the string you want, you just need to decode it:

print(line.decode("utf8"))  # or whichever encoding the command actually emits

You will probably also want to strip the trailing newline (\n): each line read from the pipe keeps it, and print() then adds one of its own:

print(line.decode("utf8").strip())
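Putting both steps together, here is a minimal sketch. It assumes UTF-8 output, and it uses a small child Python process as a stand-in for `lspci` (the `cmd` list and its two sample lines are made up for illustration) so that it runs on any machine:

```python
import subprocess
import sys

# Hypothetical stand-in for `lspci`: a child Python process printing two lines.
cmd = [sys.executable, "-c",
       "print('00:00.0 Host bridge'); print('00:02.0 VGA controller')"]

lines = []
with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
    for raw in proc.stdout:
        # raw is a bytes object that still ends with the line terminator
        text = raw.decode("utf-8").rstrip("\r\n")  # bytes -> str, drop the line ending
        lines.append(text)
        print(text)
```

`rstrip("\r\n")` removes only the line ending, whereas `strip()` would also remove leading whitespace, which may or may not be what you want.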
dwanderson
  • Oh is it? I didn't know for certain, I just ran into this once and a coworker said `iso 8859-1` "should generally do what I want". But if there's an actual recommendation, let's go with that instead. Thanks! – dwanderson Oct 30 '15 at 12:03
  • Actually, there are lots of encodings. If you'd like to learn more, [here's the Python documentation](https://docs.python.org/3/howto/unicode.html) :) – Casimir Crystal Oct 30 '15 at 12:06
  • @KevinGuan It's not that `utf8` is recommended; it depends entirely on what the command is producing. You have to know if the output uses ISO-8859-1, or UTF-8, or some other encoding for its output, and use the correct character set to decode it. – chepner Oct 30 '15 at 12:13
  • @chepner Oh, you're right. That depends on the subprocess's output encoding :) – Casimir Crystal Oct 30 '15 at 12:23
  • Yes, I'm using Python 3. It's the first time I'm using it. Thanks for your code. It's working great ;) – skami Oct 30 '15 at 13:10
  • @skami Yeah, for the most part, python2.7 and python3.3+ are pretty similar (to the average user), but there are a few gotchas, and dealing with `bytes` vs `str`s is definitely one of them that pops up all the time. – dwanderson Oct 30 '15 at 13:47
  • @dwanderson: the character encoding may depend on locale, to avoid mojibake, [enable text mode by passing `universal_newlines=True`](http://stackoverflow.com/a/33453867/4279) – jfs Oct 31 '15 at 16:32
1

b'' is a text representation for bytes objects in Python 3.

To print bytes as is, use a binary stream -- sys.stdout.buffer:

#!/usr/bin/env python3
import sys
from subprocess import Popen, PIPE

with Popen('lspci', stdout=PIPE) as process:  # bufsize=1 (line buffering) is only supported in text mode
    for line in process.stdout: # b'\n'-terminated lines
        sys.stdout.buffer.write(line)
        # do something with line here..

To get the output as text (Unicode string), you could use universal_newlines=True parameter:

#!/usr/bin/env python3
from subprocess import Popen, PIPE

with Popen('lspci', stdout=PIPE, bufsize=1, universal_newlines=True) as process:
    for line in process.stdout: # b'\n', b'\r\n', b'\r' are recognized as newline
        print(line, end='')
        # do something with line here..

In this mode the output is decoded using the character encoding returned by locale.getpreferredencoding(False).
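To check what that means on a given machine, something like the following sketch may help. The child command is a made-up stand-in for `lspci` so the snippet runs anywhere; the names `cmd`, `encoding`, and `first_line` are only illustrative:

```python
import locale
import subprocess
import sys

# Hypothetical stand-in for `lspci`: a child Python process printing one line.
cmd = [sys.executable, "-c", "print('hello')"]

# The encoding that text mode falls back to when none is given explicitly.
encoding = locale.getpreferredencoding(False)
print(encoding)

with subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True) as process:
    first_line = process.stdout.readline()

# In text mode the pipe yields str, not bytes, and line endings become '\n'.
print(type(first_line))
```

On Python 3.7 and later, `text=True` is accepted as an alias for `universal_newlines=True`.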

If the child process uses a different encoding, then you could specify it explicitly using io.TextIOWrapper():

#!/usr/bin/env python3
import io
from subprocess import Popen, PIPE

with Popen('lspci', stdout=PIPE) as process:  # the pipe stays binary; TextIOWrapper does the decoding
    for line in io.TextIOWrapper(process.stdout, encoding='utf-8'):
        print(line, end='')
        # do something with line here..
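On Python 3.6 and later, `Popen` also accepts an `encoding` argument, which should have much the same effect as wrapping the pipe yourself. A minimal sketch, assuming the child writes UTF-8 (the child command here is a made-up stand-in that emits one non-ASCII line):

```python
import subprocess
import sys

# Hypothetical child that writes the UTF-8 bytes for 'café' plus a newline.
cmd = [sys.executable, "-c",
       "import sys; sys.stdout.buffer.write('caf\\u00e9\\n'.encode('utf-8'))"]

# Passing encoding= switches the pipe to text mode with that encoding.
with subprocess.Popen(cmd, stdout=subprocess.PIPE, encoding="utf-8") as process:
    decoded = [line for line in process.stdout]

for line in decoded:
    print(line, end='')
```

Stating the encoding explicitly keeps the result independent of the locale of the machine running the parent script.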

For Python 2 code and links to possible issues, see Python: read streaming input from subprocess.communicate()

jfs
0

I think you're using Python 3:

The b prefix stands for bytes. It indicates a byte sequence, which is equivalent to a normal string (str) in Python 2.6 and 2.7.

see https://docs.python.org/3/reference/lexical_analysis.html#literals
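A small sketch of the difference in Python 3, using a shortened sample line (the value of `data` is made up for illustration):

```python
# A bytes literal, like the lines read from a binary pipe.
data = b"00:00.0 Host bridge\n"

# Decoding turns the bytes into text (str).
text = data.decode("ascii")

print(repr(data))                                # the repr carries the b'' prefix
print(type(data).__name__, type(text).__name__)  # bytes str
print(data[0], text[0])                          # indexing bytes gives an int: 48 0
```

Printing a bytes object shows its repr, which is why the `b''` and the escaped `\n` appear in the output from the question.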

A.H