
I have a number of Python scripts that I want to pipe together, back to back, about 1000 times, changing the input file on each run.

I was previously doing this with a bash shell script, but I now need it to work on a Windows machine.

Here is the Python, with the line in question commented out:

namecount = 0
for file in files:
    in_filestring = "test_" + str(namecount)
    out_filestring = "out_" + str(namecount)
    namecount += 1
    # Run this on the command line: python pre.py < in_filestring | filter.py | a_filter.py > out_filestring

Can I use os.system (shown below) here, or is there a better method? I ask because I am currently reading the subprocess documentation at http://docs.python.org/2/library/subprocess.html. Apparently subprocess replaces the outdated os.system, but I don't understand how to use it yet.

import os
os.system('system command you want to run')
SwimBikeRun

3 Answers


subprocess.call should be fine. The basic form takes the program and its arguments as a list:

call(["program", "arg1", "arg2"])

The documentation is here: http://docs.python.org/2/library/subprocess.html#using-the-subprocess-module.

In your case, try something like this:

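from subprocess import call
...
...
# "<", "|" and ">" are shell syntax, so pass one command string and set shell=True
cmd = "python pre.py < %s | python filter.py | python a_filter.py > %s" % (in_filestring, out_filestring)
call(cmd, shell=True)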
Naffi

For calling multiple programs connected by pipes, os.system is the easiest way. You could also use subprocess.Popen, but then you have to connect inputs and outputs yourself like this:

import subprocess
p = subprocess.Popen(["echo", "asdf"], stdout=subprocess.PIPE)
q = subprocess.Popen(["sed", "s/a/g/"], stdin=p.stdout, stdout=subprocess.PIPE)
p.stdout.close()  # drop our copy of the pipe so p notices if q exits early
q.stdout.read()
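
For the three-stage pipeline in the question, the same wiring can take its first input from a file and send its last output to a file. The following is a minimal sketch rather than a tested drop-in: `run_pipeline` is a hypothetical helper (not part of subprocess), and it assumes every script reads stdin and writes stdout.

```python
import subprocess


def run_pipeline(commands, in_path, out_path):
    """Run commands as a pipeline: in_path -> cmd1 | cmd2 | ... -> out_path.

    `commands` is a list of argument lists. Returns the exit code of
    every stage, in order.
    """
    procs = []
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        prev_out = src
        for i, argv in enumerate(commands):
            is_last = i == len(commands) - 1
            proc = subprocess.Popen(
                argv,
                stdin=prev_out,
                stdout=dst if is_last else subprocess.PIPE,
            )
            # Close our copy of the upstream pipe; the child now owns it.
            if prev_out is not src:
                prev_out.close()
            prev_out = proc.stdout
            procs.append(proc)
        return [p.wait() for p in procs]


# Assumed usage inside the loop from the question:
# run_pipeline(
#     [["python", "pre.py"], ["python", "filter.py"], ["python", "a_filter.py"]],
#     in_filestring,
#     out_filestring,
# )
```

Because no shell is involved, this should behave the same on Windows and Linux, provided `python` is on the PATH.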

There is a comprehensive answer to a similar question.

But since the programs you want to call are Python programs, you could check whether they can be imported and run inside your own process.

If they aren't written that way already, you can convert them into functions that take an iterable of lines as input and return a generator as output. Then you can wire them up like this:

output_file.writelines(a_filter(filter(pre(input_file))))
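
To make that concrete, here is a small sketch of what such generator functions might look like. The bodies of the three stages are invented for illustration, since the real scripts are not shown, and the middle stage is named `filter_lines` here only to avoid shadowing the built-in `filter`.

```python
import io


def pre(lines):
    """Hypothetical first stage: strip trailing whitespace from each line."""
    for line in lines:
        yield line.rstrip() + "\n"


def filter_lines(lines):
    """Hypothetical second stage: drop lines that are empty."""
    for line in lines:
        if line.strip():
            yield line


def a_filter(lines):
    """Hypothetical third stage: upper-case each line."""
    for line in lines:
        yield line.upper()


def process(input_file, output_file):
    # Each line is pulled through all three stages lazily, one at a time.
    output_file.writelines(a_filter(filter_lines(pre(input_file))))


# In-memory files stand in for the real input and output files.
src = io.StringIO(u"a  \n\nb\n")
dst = io.StringIO()
process(src, dst)
print(repr(dst.getvalue()))
```

With real files, `process(open(in_filestring), open(out_filestring, "w"))` inside the question's loop would play the same role, ideally wrapped in `with` blocks so the files get closed.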

That saves you the overhead of starting a thousand processes. As a bonus, you could use the multiprocessing module's pool to parallelize your workload.

Thomas Fenzl

One problem with os.system() is that the command's output goes straight to the console, even when you don't want it printed. For example, if you want to run the ls command and capture its output in a variable, or write it to a file from Python, os.system() doesn't help. Use Popen instead.

Popen really does make os.system() outdated. It is a bit harder to understand, but it is more useful.
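
As a rough illustration of capturing output in a variable, the sketch below wraps Popen in a hypothetical `capture` helper. It runs a small child Python process as a stand-in for `ls`, which is an assumption made so the example also runs on Windows.

```python
import subprocess
import sys


def capture(argv):
    """Run argv and return (exit_code, stdout_text) without printing anything."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    output, _ = proc.communicate()  # bytes the child wrote to stdout
    return proc.returncode, output.decode().strip()


# Stand-in for `ls`: a child Python process that prints one line.
code, text = capture([sys.executable, "-c", "print('hello')"])
```

After this runs, `text` holds the child's output and nothing has been echoed to the console, which is the part os.system() cannot do.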

Aswin Murugesh