Application appears to run multiple times depending on input size

Question

I have 2 python scripts https://gist.github.com/2233477.

rsgen.py generates "random" inputs for use in simulate.py
simulate.py does the actual simulation

Thing is, when I increase the input size from rsgen.py with the --numReferences param, I get a different number of output lines:

# ./rsgen.py --numReferences 1000 > rs.txt; cat rs.txt | xargs ./simulate.py
Number of page faults : 59

# ./rsgen.py --numReferences 100000 > rs.txt; cat rs.txt | xargs ./simulate.py
Number of page faults : 873
Number of page faults : 848
Number of page faults : 823
Number of page faults : 103

# ./rsgen.py --numReferences 1000000 > rs.txt; cat rs.txt | xargs ./simulate.py
Number of page faults : 866
Number of page faults : 869
Number of page faults : 876
Number of page faults : 907
Number of page faults : 910
Number of page faults : 1001
Number of page faults : 845
...

Notice that as I increase numReferences, simulate.py appears to run more times. Why is that? I am expecting just one line of "Number of page faults : ..."

score 2 · Accepted Answer · answered Mar 29 '12 at 06:24


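This is most likely xargs splitting your arguments into batches. xargs keeps every command line it builds under a size limit (which can never exceed the system's ARG_MAX), so once the input no longer fits on one command line, it invokes your script several times, each time with a portion of the args. Hence the multiple "Number of page faults" lines.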

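You can tune the batch size with the -n (or --max-args) and -s (or --max-chars) flags of xargs, but no setting can push a single command line past the system limit.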

A better way altogether would be to have simulate.py accept a file argument, so you could do something like this:

./rsgen.py --numReferences N > rs.txt
./simulate.py -f rs.txt

It would probably be faster too, since it avoids the xargs overhead, and it is not subject to the command-line length limit.
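As a rough sketch of what that could look like (this is not the code from the gist): the `-f`/`--file` option and the helper names `read_references`, `parse_args` and `main` are made up for illustration, and I am assuming rs.txt holds whitespace-separated tokens, which is how xargs was splitting it anyway.

```python
import argparse


def read_references(path):
    """Return the whitespace-separated tokens stored in the file at `path`."""
    # The `with` block closes the file automatically once reading is done.
    with open(path) as handle:
        return handle.read().split()


def parse_args(argv=None):
    # Hypothetical command-line interface: a single required -f/--file option.
    parser = argparse.ArgumentParser(description="simulate.py reading its input from a file")
    parser.add_argument("-f", "--file", required=True,
                        help="file holding the output of rsgen.py")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    references = read_references(args.file)
    # The actual simulation would run here, once, over the whole list.
    print("Read %d references" % len(references))
    return len(references)
```

Ending the script with a call to `main()` would let you run it as `./simulate.py -f rs.txt`. Because the whole input arrives through the file rather than through argv, the script runs exactly once no matter how large numReferences gets.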

answered Mar 29 '12 at 06:24

Preet Kukreti


OK, so I use a [`FileType`](http://docs.python.org/dev/library/argparse.html#filetype-objects) for the file argument. But how do I read from that? Sorry, I am new to Python and it isn't obvious to me how I would read from or write to that `FileType` object – Jiew Meng Mar 29 '12 at 11:33
@JiewMeng You would read the filename in as a string and then call `open(filename)` to get a file object. You can then call `.readlines()` on it, and you must `.close()` it once you are finished reading. See [this question](http://stackoverflow.com/a/8010133/1086804) – Preet Kukreti Mar 29 '12 at 12:17
Oh, but then how does the `FileType` object come into play? It appears that I could just use a string variable then? – Jiew Meng Mar 30 '12 at 00:22
`FileType` in `argparse` is just a syntactic shortcut that automatically opens the file named by a command-line argument, ready for reading. If you are new to handling files, I would actually suggest you don't use `FileType` in `argparse` and instead open the file "the old way" via `open()` using a filename string (which you read from an argument), since there is much more documentation and there are many more examples to help you if you get stuck. – Preet Kukreti Mar 30 '12 at 00:29
OK, I found that `FileType` will return a [`TextIOWrapper`](http://docs.python.org/library/io.html#io.TextIOWrapper), and I can call its `read()` or `readline()` methods – Jiew Meng Mar 31 '12 at 01:50
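
For reference, here is a minimal sketch of the `FileType` variant discussed in these comments. The `-f`/`--file` option and the `load` helper are hypothetical names, and the file is assumed to contain whitespace-separated tokens.

```python
import argparse

parser = argparse.ArgumentParser()
# type=argparse.FileType("r") makes argparse open the named file for reading,
# so args.file is already an open file object rather than a string.
parser.add_argument("-f", "--file", type=argparse.FileType("r"), required=True)


def load(argv=None):
    """Parse the command line and return the tokens read from the given file."""
    args = parser.parse_args(argv)
    # Using the file object as a context manager closes it when we are done.
    with args.file as handle:
        return handle.read().split()
```

With `open()` the argument stays a plain string and you open the file yourself. With `FileType`, argparse opens it during parsing, so a missing file is reported as a command-line error.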
