Questions tagged [flops]

FLOPS (FLoating point Operations Per Second): a unit of measurement used to quantify the performance of the implementation of a numerical algorithm.

See Wikipedia page on FLOPS.

121 questions
6
votes
2 answers

Why are math libraries often compared by FLOPS?

Math libraries are very often compared based on FLOPS. What information is being conveyed to me when I'm shown a plot of FLOPS vs size with sets of points for several different math libraries? FLOPS as a measure of performance would make more sense…
Praxeolitic
  • 17,768
  • 12
  • 57
  • 109
6
votes
2 answers

Effect of compiler optimizations on FLOPs and L2/L3 cache miss rate using PAPI

So we've been tasked with an assignment to compile some code (we're supposed to treat it as a black box), using different Intel compiler optimization flags (-O1 and -O3) as well as vectorization flags (-xhost and -no-vec) and to observe changes…
kfkhalili
  • 866
  • 9
  • 22
6
votes
1 answer

Python FLOPS calculation

I've been trying to get a standardized estimate of FLOPS across all of the computers that I've implemented a Python distributed processing program on. While I currently can calculate pystones quite fine, pystones are not particularly well known, and…
Doc Sohl
  • 165
  • 1
  • 10
5
votes
2 answers

Measuring FLOPs of an application with the Linux perf tool

I want to measure the amount of floating point and arithmetic operations executed by some application with 'perf', the new command line interface command to the Linux performance counter subsystem. (For testing purposes I use a simple dummy app…
5
votes
1 answer

Understanding FMA instructions performance

I'm trying to understand how I can max out the number of operations I can get on my CPU. I'm writing a simple matrix multiplication program, and I have a Skylake processor. I was looking at the Wikipedia page for the FLOPS information on this…
Peter L.
  • 147
  • 1
  • 1
  • 6
5
votes
1 answer

For XMM/YMM FP operation on Intel Haswell, can FMA be used in place of ADD?

This question is about packed, single-precision floating-point operations with XMM/YMM registers on Haswell. According to the awesome, awesome table put together by Agner Fog, I know that MUL can be done on either port p0 or p1 (with a reciprocal throughput of 0.5), while…
codechimp
  • 1,086
  • 10
  • 18
4
votes
2 answers

Why is TensorFlow's FLOPs count 2 times Caffe's MACC?

I'm trying to rewrite a model from Caffe to TensorFlow. To make sure I did not make a mistake, I counted the MACCs and FLOPs and found something interesting: for example, given a 112x112x3 input image and a conv2d with 32 3x3 kernels, stride=1,…
MarStarck
  • 383
  • 3
  • 12
4
votes
1 answer

Gigaflops of a processor

I discovered my computer has NVIDIA CUDA technology and I want to measure its processing power, on both the CPU and the GPU. Instead of searching for a program to do this, I want to have a deeper understanding of how it works. What kind of code (C/C++) do I need?
rigon
  • 1,018
  • 3
  • 12
  • 31
3
votes
0 answers

Programmatic way of counting floating point operations (Java)

I'm looking for a programmatic way of counting the number of floating point operations (flops) in a call to a function, in Java. There are several closely related questions, asking about what floating points are, and how to do big-O computational…
kabdulla
  • 4,549
  • 1
  • 12
  • 28
3
votes
1 answer

Estimating the efficiency of GPU in FLOPS (CUDA SAMPLES)

It seems to me that I don't completely understand the concept of FLOPS. In the CUDA samples, there is a matrix multiplication example (0_Simple/matrixMul). In this example the number of FLOPs (floating-point operations) per matrix multiplication…
3
votes
1 answer

GPU FLOPS and FPS

I am modelling a GPU (cannot disclose which) to estimate the performance of OpenCL and OpenGL applications. The model can reasonably estimate the FLOPS of the executing app/kernel/code. Is there a way to estimate frames per second from the…
Umair
  • 68
  • 6
3
votes
3 answers

Really slow loop with vector-scalar multiplication in MATLAB

Have I done something wrong or is vector-by-scalar multiplication really so costly? Doesn't MATLAB (ver 2012a or higher) optimize the code somehow to prevent such curiosities? >> tic; for i=1:100000; x = sin(i)*[1; 1]; end; toc; Elapsed time is…
Przemek M
  • 55
  • 6
3
votes
9 answers

FLOPS: what really is a FLOP?

I came from this thread: FLOPS Intel core and testing it with C (innerproduct). As I began writing simple test scripts, a few questions came into my mind. Why floating point? What is so significant about floating point that we have to consider? Why…
user185732
3
votes
1 answer

Why are floating point operations considered expensive?

I read that gprof (function profiling) and other profiling methods can return the number of floating point operations that take place during the execution of a program, so I was wondering: why are FLOPs so much more expensive than regular operations?
Siddhartha
  • 3,899
  • 6
  • 40
  • 61
2
votes
1 answer

On GPU, is it possible to get more flops by combining double and float operations?

Suppose a GPU can do N1 single precision operations per second and N2 double precision operations per second. Is it possible, by mixing (independent) single and double precision operations, to achieve N1+N2 total operations per second, or at least…
nat chouf
  • 697
  • 5
  • 10