def f1(x):
    for i in range(1, 100):
        x *= 2
        x /= 3.14159
        x *= i**.25
    return x
def f2(x):
    for i in range(1, 100):
        x *= 2 / 3.14159 * i**.25
    return x
Both functions compute the same thing, but f1 takes about 3x longer to do so, even with @numba.njit. Can Python be made to recognize the equivalence at compile time, the way dis shows it already optimizing in other respects, e.g. by folding constant expressions?
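For illustration, dis already reveals one such optimization applied to f2's loop body:

import dis

# CPython's compiler folds 2 / 3.14159 into a single constant
# (~0.63662) before the function ever runs, so f2's bytecode carries
# one LOAD_CONST for it and performs no division at run time.
dis.dis(f2)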
Note: I'm aware that floating-point arithmetic is order-sensitive, so the two functions may produce slightly different outputs. If anything, though, the extra separate in-place edits to the array values are less accurate, so fusing them would be a 2-in-1 optimization.
import numpy as np
from numba import njit

x = np.random.randn(10000, 1000)
%timeit f1(x.copy()) # 2.68 s ± 50.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit f2(x.copy()) # 894 ms ± 36.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit njit(f1)(x.copy()) # 2.59 s ± 65.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit njit(f2)(x.copy()) # 901 ms ± 41.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
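For completeness: numba does expose a fastmath option, which enables LLVM's fast-math flags and thereby permits reassociating floating-point operations, i.e. exactly the licence a compiler needs to rewrite f1 into f2. Whether it actually performs this particular rewrite here is what I'd like to know; a minimal sketch of how it would be applied:

from numba import njit

# fastmath=True relaxes strict IEEE-754 ordering, allowing LLVM to
# reassociate/combine the three in-place updates in f1's loop body.
# Untested here: this may or may not recover f2's speed.
f1_fast = njit(fastmath=True)(f1)
f1_fast(x.copy())  # first call compiles; benchmark subsequent calls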