I've been trying to get a standardized estimate of FLOPS across all of the computers on which I've deployed a Python distributed processing program. While I can currently calculate pystones just fine, pystones are not particularly well known, and I'm not sure how accurate they really are.
So I need a way to calculate FLOPS (or a module that already does it) on a variety of machines, which may have any variety of CPUs, etc. Since Python is an interpreted language, simply timing a fixed number of floating-point operations measures mostly interpreter overhead, so it won't come anywhere near what something like Linpack reports. While I don't need exactly the same estimates as one of the big names in benchmarking, I'd like them to be reasonably close.
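To make the naive approach concrete, here is a minimal sketch of what I mean by timing a set number of operations in pure Python. The function name `naive_flops` and the default operation count are placeholders I made up for illustration, and the number it produces mostly reflects loop and interpreter overhead rather than what the hardware can actually do.

```python
import time


def naive_flops(n_ops=1_000_000):
    """Estimate floating-point operations per second from a pure Python loop.

    Hypothetical helper for illustration only: interpreter overhead dominates,
    so the result is far below a Linpack-style figure for the same machine.
    """
    iterations = n_ops // 2
    x = 1.0
    start = time.perf_counter()
    for _ in range(iterations):
        # One multiply and one add, so two floating-point operations per pass.
        x = x * 1.0000001 + 0.0000001
    elapsed = time.perf_counter() - start
    return (iterations * 2) / elapsed


if __name__ == "__main__":
    print(f"{naive_flops() / 1e6:.1f} MFLOPS (interpreter-bound)")
```

Running this on two machines gives numbers that can be compared with each other, but as far as I can tell they say more about the interpreter than about the CPU's floating-point throughput.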
So, is there a way, or a pre-existing module, that lets me get FLOPS? Otherwise, my only options will be compiling the benchmark with Cython, or trying to estimate the capabilities based on CPU clock speed...