Questions tagged [flops]

FLOPS (FLoating point Operations Per Second): a unit of measurement used to quantify the performance of the implementation of a numerical algorithm.

See Wikipedia page on FLOPS.
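As a rough illustration of the definition above, the sketch below computes FLOPS as a floating-point operation count divided by elapsed seconds, and a theoretical peak as cores × clock rate × FLOPs per cycle. The function names and the hardware figures (4 cores, 2.4 GHz, 4 FLOPs per cycle) are hypothetical values chosen for the example, not taken from any question listed here. The timing loop runs in the Python interpreter, so its result mostly reflects interpreter overhead rather than what the hardware can do.

```python
import time


def achieved_flops(op_count: int, seconds: float) -> float:
    """FLOPS = floating-point operations performed / elapsed seconds."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return op_count / seconds


def theoretical_peak_flops(cores: int, clock_hz: float, flops_per_cycle: int) -> float:
    """Peak FLOPS = cores * clock rate (Hz) * FLOPs per cycle per core."""
    return cores * clock_hz * flops_per_cycle


def time_multiply_add(n: int) -> tuple[int, float]:
    """Run n iterations of one multiply plus one add; return (op count, seconds)."""
    x = 0.0
    start = time.perf_counter()
    for _ in range(n):
        x = x * 0.5 + 1.0  # counted as 2 floating-point operations
    elapsed = time.perf_counter() - start
    return 2 * n, elapsed


if __name__ == "__main__":
    # Hypothetical machine: 4 cores at 2.4 GHz, 4 FLOPs per cycle per core.
    peak = theoretical_peak_flops(cores=4, clock_hz=2.4e9, flops_per_cycle=4)
    print(f"theoretical peak: {peak / 1e9:.1f} GFLOPS")

    ops, seconds = time_multiply_add(1_000_000)
    if seconds > 0:
        print(f"measured in pure Python: {achieved_flops(ops, seconds) / 1e6:.1f} MFLOPS")
```

The gap between the two printed numbers is typically several orders of magnitude, which is one reason measured FLOPS says as much about the implementation as about the processor.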

121 questions
667
votes
4 answers

How do I achieve the theoretical maximum of 4 FLOPs per cycle?

How can the theoretical peak performance of 4 floating point operations (double precision) per cycle be achieved on a modern x86-64 Intel CPU? As far as I understand it takes three cycles for an SSE add and five cycles for a mul to complete on most…
user1059432
  • 7,158
  • 3
  • 17
  • 16
58
votes
2 answers

FLOPS per cycle for sandy-bridge and haswell SSE2/AVX/AVX2

I'm confused about how many FLOPs per cycle per core Sandy Bridge and Haswell can achieve. As I understand it, it should be 4 FLOPs per cycle per core for SSE and 8 FLOPs per cycle per core for AVX/AVX2. This seems to be verified…
user2088790
46
votes
9 answers

What is FLOP/s and is it a good measure of performance?

I've been asked to measure the performance of a Fortran program that solves differential equations on a multi-CPU system. My employer insists that I measure FLOP/s (floating-point operations per second) and compare the results with benchmarks (LINPACK)…
caglarozdag
  • 639
  • 1
  • 8
  • 13
29
votes
6 answers

What's the relative speed of floating point add vs. floating point multiply

A decade or two ago, it was worthwhile to write numerical code to avoid using multiplies and divides and use addition and subtraction instead. A good example is using forward differences to evaluate a polynomial curve instead of computing the…
J. Peterson
  • 1,854
  • 1
  • 24
  • 19
15
votes
3 answers

What is FLOPS in field of deep learning?

What is FLOPS in the field of deep learning? Why don't we just use the term FLO? We use the term FLOPS to measure the number of operations of a frozen deep learning network. According to Wikipedia, FLOPS = floating point operations per second. When we test…
ladofa
  • 313
  • 1
  • 2
  • 9
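The distinction behind the question above is between FLOPs (a count of operations) and FLOPS (a rate per second). A minimal sketch of the count for one fully connected layer follows, under a common convention that each multiply-accumulate (MAC) is counted as 2 FLOPs and each bias addition as 1. The function names and layer sizes are hypothetical, and papers and tools differ on whether they report MACs or FLOPs.

```python
def dense_layer_macs(in_features: int, out_features: int) -> int:
    """Multiply-accumulate operations for one input vector through a dense layer."""
    return in_features * out_features


def dense_layer_flops(in_features: int, out_features: int, bias: bool = True) -> int:
    """FLOP count assuming 1 MAC = 2 FLOPs, plus one add per output for the bias."""
    flops = 2 * dense_layer_macs(in_features, out_features)
    if bias:
        flops += out_features
    return flops


if __name__ == "__main__":
    # Hypothetical layer: 784 inputs, 10 outputs.
    print(dense_layer_macs(784, 10))   # 7840 MACs
    print(dense_layer_flops(784, 10))  # 15690 FLOPs (15680 + 10 bias adds)
```

Dividing such a count by a measured inference time gives a rate in FLOPS, which is the quantity hardware vendors quote.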
13
votes
4 answers

how to calculate a Mobilenet FLOPs in Keras

run_meta = tf.RunMetadata() with tf.Session(graph=tf.Graph()) as sess: K.set_session(sess) with tf.device('/cpu:0'): base_model = MobileNet(alpha=1, weights=None, input_tensor=tf.placeholder('float32', shape=(1,224,224,3))) opts…
Y. Han
  • 131
  • 1
  • 4
12
votes
7 answers

How to compare performance of two pieces of codes

I have a friendly competition with a couple of guys in the field of programming, and recently we have become very interested in writing efficient code. Our challenge was to try to optimize the code (in the sense of CPU time and complexity) at any cost…
Pouya
  • 1,187
  • 2
  • 17
  • 40
11
votes
1 answer

Counting the number of multiply-add operations (MAC) in Caffe CNN's architecture

Lately I've been benchmarking some CNNs regarding time, # of multiply-add operations (MAC), # of parameters and model size. I have seen some similar SO questions (here and here) and in the latter, they suggest using Netscope CNN Analyzer. This tool…
rafaspadilha
  • 609
  • 6
  • 20
10
votes
3 answers

How many FLOPs does tanh need?

I would like to compute how many FLOPs each layer of LeNet-5 (paper) needs. Some papers give FLOPs for other architectures in total (1, 2, 3). However, those papers don't give details on how to compute the number of FLOPs and I have no idea how many…
Martin Thoma
  • 91,837
  • 114
  • 489
  • 768
10
votes
2 answers

Determine FLOPS of our ASM program

We had to implement an ASM program for multiplying sparse matrices in the coordinate scheme format (COOS) as well as in the compressed row format (CSR). Now that we have implemented all these algorithms we want to know how much more performant they…
tzwickl
  • 1,183
  • 1
  • 12
  • 28
9
votes
3 answers

Do RFID tags have a processor?

Do RFID tags have a "real" processor capable of simple computations? If so, what is the processing power of today's RFID processors?
qertoip
  • 1,780
  • 1
  • 16
  • 28
8
votes
5 answers

How to measure FLOPS

How do I measure FLOPS or IOPS? If I measure the time for ordinary floating-point addition/multiplication, is that equivalent to FLOPS?
Madhumitha B
  • 123
  • 1
  • 2
  • 10
8
votes
5 answers

What counts as a flop?

Say I have a C program that in pseudocode is: For i=0 to 10; x++; a=2+x*5; next. Is the number of FLOPs for this (1 [x++] + 1 [x*5] + 1 [2+(x*5)]) * 10 [loop], for 30 FLOPs? I am having trouble understanding what a flop is. Note the [...] are…
Joshua Enfield
  • 15,822
  • 9
  • 45
  • 91
8
votes
1 answer

floating point operations per cycle - intel

I have been looking for quite a while and cannot seem to find an official/conclusive figure quoting the number of single precision floating point operations/clock cycle that an Intel Xeon quadcore can complete. I have an Intel Xeon quadcore E5530…
user3495341
  • 113
  • 1
  • 1
  • 7
6
votes
4 answers

Counting Flops for a code!

This is really taking up my time. I could not find a simple way to estimate the FLOPs for the following code (the loop). How many FLOPs are there in a single iteration of the loop: float func(float * atominfo, float energygridItem, int xindex, int yindex) { …
usman
  • 1,217
  • 1
  • 14
  • 21