Highest Voted 'compiler-optimization' Questions

2323

votes

10 answers

Why are elementwise additions much faster in separate loops than in a combined loop?

Suppose a1, b1, c1, and d1 point to heap memory, and my numerical code has the following core loop. const int n = 100000; for (int j = 0; j < n; j++) { a1[j] += b1[j]; c1[j] += d1[j]; } This loop is executed 10,000 times via another outer…

asked Dec 17 '11 at 20:40

Johannes Gerer

24,320
5
24
33
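
For readers comparing the two forms named in the title, here is a minimal sketch of the combined loop from the excerpt next to a split version. The element type `double` and the function names `add_combined` and `add_split` are assumptions for illustration; the excerpt does not state them. Both produce the same results. Which one runs faster depends on the hardware, array sizes and compiler, so it has to be measured.

```c
#include <stddef.h>

/* Combined form, as in the excerpt: one pass touches all four arrays.
 * The element type (double) is an assumption. */
void add_combined(double *a1, const double *b1,
                  double *c1, const double *d1, size_t n)
{
    for (size_t j = 0; j < n; j++) {
        a1[j] += b1[j];
        c1[j] += d1[j];
    }
}

/* Split form: two passes, each touching only two arrays. */
void add_split(double *a1, const double *b1,
               double *c1, const double *d1, size_t n)
{
    for (size_t j = 0; j < n; j++)
        a1[j] += b1[j];
    for (size_t j = 0; j < n; j++)
        c1[j] += d1[j];
}
```

A fair comparison would time both functions over many repetitions on heap-allocated arrays, as the question describes.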

2194

votes

12 answers

Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)?

I am doing some numerical optimization on a scientific application. One thing I noticed is that GCC will optimize the call pow(a,2) by compiling it into a*a, but the call pow(a,6) is not optimized and will actually call the library function pow,…

gcc assembly floating-point compiler-optimization fast-math

asked Jun 21 '11 at 18:49

xis

22,592
8
39
55
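
The two expressions in the title can be written out as below. This is a sketch with hypothetical function names (`pow6_chain`, `pow6_grouped`). The chained form is evaluated left to right as five multiplications, while the grouped form needs only three. Floating-point multiplication is not associative in general, so the two forms may round differently for some inputs. A compiler following strict IEEE semantics therefore cannot swap one for the other unless a relaxed mode such as the one in the question's fast-math tag is enabled.

```c
/* Left-to-right chain: ((((a*a)*a)*a)*a)*a, five multiplications. */
double pow6_chain(double a)
{
    return a * a * a * a * a * a;
}

/* Grouped form: (a*a*a)*(a*a*a), three multiplications. */
double pow6_grouped(double a)
{
    double a3 = a * a * a;
    return a3 * a3;
}
```

For inputs whose intermediate products are exactly representable, such as small integers, both functions return the same value.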

1499

votes

11 answers

Replacing a 32-bit loop counter with 64-bit introduces crazy performance deviations with _mm_popcnt_u64 on Intel CPUs

I was looking for the fastest way to popcount large arrays of data. I encountered a very weird effect: Changing the loop variable from unsigned to uint64_t made the performance drop by 50% on my PC. The Benchmark #include #include…

c++ performance assembly x86 compiler-optimization

asked Aug 01 '14 at 10:33

gexicide

35,369
19
80
136

956

votes

9 answers

Swift Beta performance: sorting arrays

I was implementing an algorithm in Swift Beta and noticed that the performance was very poor. After digging deeper I realized that one of the bottlenecks was something as simple as sorting arrays. The relevant part is here: let n = 1000000 var x = …

swift performance sorting xcode6 compiler-optimization

asked Jun 07 '14 at 23:53

Jukka Suomela

11,423
4
32
45

468

votes

6 answers

Why does GCC generate 15-20% faster code if I optimize for size instead of speed?

I first noticed in 2009 that GCC (at least on my projects and on my machines) has a tendency to generate noticeably faster code if I optimize for size (-Os) instead of speed (-O2 or -O3), and I have been wondering ever since why. I have managed…

c++ performance gcc x86-64 compiler-optimization

asked Oct 19 '13 at 20:36

Ali

51,545
25
157
246

430

votes

2 answers

Why do we use volatile keyword?

Possible Duplicate: Why does volatile exist? I have never used it, but I wonder why people use it. What exactly does it do? I searched the forum, but I found only C# or Java topics.

c++ volatile compiler-optimization

asked Dec 14 '10 at 09:14

Nawaz

327,095
105
629
812

342

votes

1 answer

Why does the Rust compiler not optimize code assuming that two mutable references cannot alias?

As far as I know, reference/pointer aliasing can hinder the compiler's ability to generate optimized code, since they must ensure the generated binary behaves correctly in the case where the two references/pointers indeed alias. For instance, in the…

rust compiler-optimization llvm-codegen

asked Jul 29 '19 at 17:57

Zhiyao

3,078
2
7
16

300

votes

12 answers

How to compile Tensorflow with SSE4.2 and AVX instructions?

This is the message received from running a script to check if Tensorflow is working: I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcublas.so.8.0 locally I tensorflow/stream_executor/dso_loader.cc:125]…

tensorflow x86 compiler-optimization simd compiler-options

asked Dec 22 '16 at 23:21

GabrielChu

5,418
8
23
35

197

votes

2 answers

What is &&& operation in C

#include <stdio.h> volatile int i; int main() { int c; for (i = 0; i < 3; i++) { c = i &&& i; printf("%d\n", c); } return 0; } The output of the above program compiled using gcc is 0 1 1 With the -Wall or…

c++ c operators compiler-optimization gcc-warning

asked Dec 19 '12 at 06:48

manav m-n

10,236
21
66
95
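
The output 0 1 1 in the excerpt follows from how C tokenizes `&&&`. The lexer takes the longest token first, so it reads `&&` followed by `&`, and `i &&& i` means `i && (&i)`. The sketch below spaces the tokens out. The helper name `and_and_addr` is hypothetical. The address of an object is never null, so the expression reduces to whether `i` is nonzero.

```c
volatile int i;

/* Stores value in i, then evaluates the same token sequence as
 * `i &&& i`, written with explicit spacing and parentheses. */
int and_and_addr(int value)
{
    i = value;
    return i && (&i);   /* logical AND of i and the (non-null) address of i */
}
```

Compilers may warn that the address always evaluates as true.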

185

votes

3 answers

Why does GCC generate such radically different assembly for nearly the same C code?

While writing an optimized ftol function I found some very odd behaviour in GCC 4.6.1. Let me show you the code first (for clarity I marked the differences): fast_trunc_one, C: int fast_trunc_one(int i) { int mantissa, exponent, sign, r; …

c gcc assembly x86 compiler-optimization

asked Apr 20 '12 at 16:59

orlp

98,226
29
187
285

182

votes

3 answers

Why can lambdas be better optimized by the compiler than plain functions?

In his book The C++ Standard Library (Second Edition) Nicolai Josuttis states that lambdas can be better optimized by the compiler than plain functions. In addition, C++ compilers optimize lambdas better than they do ordinary functions. (Page…

c++ optimization c++11 lambda compiler-optimization

asked Dec 05 '12 at 11:38

Stephan Dollberg

27,667
11
72
104

178

votes

5 answers

How to see which flags -march=native will activate?

I'm compiling my C++ app using GCC 4.3. Instead of manually selecting the optimization flags I'm using -march=native, which in theory should add all optimization flags applicable to the hardware I'm compiling on. But how can I check which flags is…

gcc g++ compiler-optimization compiler-flags

asked Mar 29 '11 at 09:14

vartec

118,560
34
206
238

175

votes

4 answers

Can I hint the optimizer by giving the range of an integer?

I am using an int type to store a value. By the semantics of the program, the value always varies in a very small range (0 - 36), and int (not a char) is used only because of the CPU efficiency. It seems like many special arithmetical optimizations…

c++ optimization integer range compiler-optimization

asked Nov 06 '16 at 08:00

rolevax

1,560
1
10
21
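
One way to express such a range to GCC or Clang is sketched below. It is an illustration, not necessarily what the answers recommend. It uses the `__builtin_unreachable()` builtin, and the names `assume_small` and `third_of` are hypothetical. Marking the out-of-range branch unreachable lets the optimizer assume the value lies in 0..36, though whether it uses that information depends on the compiler and version. Passing a value outside the range is undefined behavior.

```c
/* Tells the optimizer that v is in [0, 36]. Calling this with any
 * other value is undefined behavior. GCC/Clang builtin. */
static inline int assume_small(int v)
{
    if (v < 0 || v > 36)
        __builtin_unreachable();
    return v;
}

/* Example arithmetic the compiler may simplify given the known range. */
int third_of(int v)
{
    v = assume_small(v);
    return v / 3;
}
```

Comparing the generated assembly with and without the `assume_small` call shows whether a given compiler uses the hint.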

151

votes

2 answers

Limits of Nat type in Shapeless

In shapeless, the Nat type represents a way to encode natural numbers at a type level. This is used for example for fixed size lists. You can even do calculations on type level, e.g. append a list of N elements to a list of K elements and get back a…

scala numbers compiler-optimization shapeless

asked Jan 22 '14 at 23:15

Rüdiger Klaehn

11,945
2
36
55

148

votes

5 answers

Why does the enhanced GCC 6 optimizer break practical C++ code?

GCC 6 has a new optimizer feature: It assumes that this is always not null and optimizes based on that. Value range propagation now assumes that the this pointer of C++ member functions is non-null. This eliminates common null pointer checks but…

c++ gcc compiler-optimization undefined-behavior

asked Apr 27 '16 at 14:45

boot4life

4,292
5
20
40
