
I have to raise 10 to the power of a double a lot of times.

Is there a more efficient way to do this than with the math library pow(10,double)? If it matters, my doubles are always negative, between -11 and -5.

I assume pow(double,double) uses a more general algorithm than is required for pow(10,double) and might therefore not be the fastest method. Given some of the answers below, that might have been an incorrect assumption.

As for the why, it is for logarithmic interpolation. I have a table of x and y values. My object has a known x value (which is almost always a double).

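double Dbeta(struct Data *diffusion, double per){
  double frac;
  int i = 1;  /* assumed starting index; the original snippet did not declare i */
  /* find the first table entry with x[i] >= per (assumes per lies within the table range) */
  while(per>diffusion->x[i]){
      i++;
  }
  /* fractional position of per between x[i-1] and x[i] */
  frac = (per-diffusion->x[i-1])/(diffusion->x[i]-diffusion->x[i-1]);
  /* interpolate linearly in log10(y), then convert back; log10DB is assumed to be a table defined elsewhere */
  return pow(10,log10DB[i-1] + frac * (log10DB[i]-log10DB[i-1]));
}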

This function is called a lot of times. I have been told to look into profiling, so that is what I will do first.

I have just been told I could have used natural logarithms instead of base 10, which is obviously right. (my stupidity sometimes amazes even myself.)

After replacing everything with natural logarithms, everything runs a bit faster. With profiling (which is a new word I learned today) I found out that 39% of my run time is spent in the exp function, so for those who wondered whether this part was in fact the bottleneck in my code: it was.

Kvaestr
  • The standard library functions are almost guaranteed to be very optimized. And the compiler can even generate code inline as a further optimization. Please [edit] your question to tell us why the standard `pow` isn't "efficient" enough, and what benchmarking or measurements you have made. And you remembered to build with optimizations enabled before doing any benchmarking or measuring? – Some programmer dude Oct 22 '20 at 07:15
  • 2
    Do the `double` powers actually have integer values? – Weather Vane Oct 22 '20 at 07:21
  • I understand the standard functions are as optimized as they can be. However, the pow function is a very general double to the power double. If there is an algorithm that is more efficient for my specific purpose, I would like to know it. As for benchmarks, I'm rather new to this. I suspect this is the bottleneck, my code takes long to run, and I would like for it to be faster. – Kvaestr Oct 22 '20 at 07:24
  • 1
    The double powers usually don't have integer values, though occasionally they might. – Kvaestr Oct 22 '20 at 07:27
  • You might want to do some research about *profiling* and how it can be used to find actual bottlenecks in your code. – Some programmer dude Oct 22 '20 at 07:32
  • Sometimes you can guess what might be taking a long time in a program but the way to be sure is to use a profiler. You also want to be sure you're running an optimized release build. If you're not sure of that or you don't know how to profile then you have some things to investigate. – Retired Ninja Oct 22 '20 at 07:32
  • 2
    And will the second argument to `pow` always be an integer? Will you ever pass e.g. `-6.8` (or similar) as the second argument? Or will it always only be `-5`, `-6`, `-7`, `-8`, `-9`, `-10` or `-11` (and *nothing* else in between)? – Some programmer dude Oct 22 '20 at 07:33
  • 3
    "I have to raise 10 to the power of a double a lot of times" This is rather unusual, can you show some of your code? – n. 'pronouns' m. Oct 22 '20 at 07:53
  • 3
    `I suspect this is the bottleneck` Don't suspect - profile and be sure, otherwise you may just be wasting time. Remember to print out a copy of [rules of optimizations](https://wiki.c2.com/?RulesOfOptimizationClub). – KamilCuk Oct 22 '20 at 07:58
  • 3
    if the exponents are always integers or belong to a limited set of known values then just use a lookup table – phuclv Oct 22 '20 at 08:03
  • 1
    @Kvaestr if the function is called a lot of times, try vectorizing it. Most modern compilers have pretty good auto-vectorization support. For example with [`_mm256_pow_pd`](https://stackoverflow.com/q/36636159/995714) you can calculate 4 powers at the same time. Alternatively use some libraries for that: [pow for SSE types](https://stackoverflow.com/q/25936031/995714), [SIMD math libraries for SSE and AVX](https://stackoverflow.com/q/15723995/995714), [Mathematical functions for SIMD registers](https://stackoverflow.com/q/40475140/995714) – phuclv Oct 22 '20 at 09:24
  • OK why base 10? Natural logarithms and exponents would work just as well. You could use exp(x * log(10)) trick but why not start with natural logarithms and exponents in the first place? – n. 'pronouns' m. Oct 22 '20 at 11:47
  • Sometimes my stupidity amazes even myself. Obviously I could have used natural logarithms. – Kvaestr Oct 22 '20 at 12:19

2 Answers


For pow(10.0, n) it should be faster to set c = log(10.0), which you can compute once, then use exp(c*n), which should be significantly faster than pow(10.0, n) (which is basically doing that same thing internally, except it would be calculating log(10.0) over and over instead of just once). Beyond that, there probably isn't much else you can do.

Tom Karzes
  • `which is basically doing that` Oh, I didn't know that; [it really does do that!](https://code.woboq.org/userspace/glibc/sysdeps/ieee754/dbl-64/e_pow.c.html#366) – KamilCuk Oct 22 '20 at 07:58

Yes, the pow function is slow (roughly 50x the cost of a multiply, for those asking for benchmarks).

  • By some log/exponents trickery, we can express 10^x as

    10^x = exp(log(10^x)) = exp(x * log(10)).
    

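    So you can implement 10^x with exp(x * M_LN10), which should be more efficient than pow. (M_LN10 is a POSIX constant from math.h, not part of ISO C, so some compilers need a feature macro or a hand-written constant.)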

  • If double accuracy isn't critical, use the float version of the function expf (or powf), which should be more efficient than the double version.

  • If rough accuracy is OK, precompute a table over the [-11, -5] range and do a quick lookup with linear interpolation.

Some benchmarks (using glibc 2.31):

Benchmark                Time
---------------------------------
pow(10, x)               15.54 ns
powf(10, x)               7.18 ns
expf(x * (float)M_LN10)   3.45 ns

Pascal Getreuer
  • Thank you, very helpful. It might be accurate enough to use float, so I will try that as well. I think the lookup table would be too inaccurate. – Kvaestr Oct 22 '20 at 08:25